Saturday, December 10, 2016

Why Resharper is Evil

There was a post once upon a time called "Why Resharper is Evil", and a person named Hadi commented on that post that Resharper is highly configurable and thus what it does is not 'evil'. I would like to respond to that comment here.

The things that Resharper does are evil, and for one very simple reason. Companies are rolling out Resharper as a part of their development environment and increasingly *requiring* its use. However, it is inevitably set up by the IT division, which installs the defaults, because they're IT - they don't know any better.

New developers are not born with good coding habits or practices. They learn them, just like everything else. However, if on their first job they get a default Resharper installed and are required to use it (or even if they are not required), they will accept it as the 'correct' way to write code. They'll learn their coding standards and style from the tool. Thus, they will learn bad habits.

However, most of the people who know that from experience are retired, in administrative positions, or otherwise indisposed - that is, not actively coding. I love coding. I've passed on more promotion opportunities than you can count because I love coding. So I have the experience AND still do the work.

THESE ARGUMENTS HAVE BEEN DONE. They've been hashed out before, and thrown out because they don't work. The fact that kids are militantly aggressive in their belief that they already know everything is not germane. Every kid thinks they know everything until they're about 30 and they finally grow up. I was no different, although I figured out what an idiot I was when I was about 25 - I've always been a bit ahead of the curve. Once they figure out that they've been an idiot, then they spend ten years trying to catch up.

Using var for everything? BAD. Why? Because any context you lose in a program is bad. It will come back and bite you sooner or later. Throwing it out on purpose is just dumb. Throwing it out because a program that is riding on popular opinion tells you to is worse. Twenty percent of your time is spent coding. The other eighty percent is spent debugging. I don't care how many characters of typing you save, you aren't saving anything - you WILL spend more time later on as a consequence. Sure, you can mouse over the function declaration, and if it is properly documented, IntelliSense will tell you the return type - IF it's properly documented. Of course, if you could see the return type in front of your eyes, you wouldn't need to spend that five seconds looking it up, would you? And trust me, you WILL spend more time looking it up than you could have ever conceivably spent typing it, no matter how long and baroque the type definition is.

I've seen the over-use of 'var' justified using the DRY principle. However, that principle categorically states that "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." You will note that there is no place in that definition where it mentions 'not repeating the same sequence of keystrokes if you can avoid it.' More to the point, if you've got three different variable declarations in the same function, all 'var', all assigned to DIFFERENT TYPES, then you have explicitly VIOLATED the DRY principle, because you have deliberately thrown away half of the 'knowledge' - the type of the variable. Your code is now declaratively ambiguous.

Similarly with naming conventions. I know that Hungarian notation is currently 'out of style'. However, the problem has never been Hungarian notation. The problem is that a whole legion of styles of it have popped out of the ether, determined and designed by people who NEVER READ THE BOOK. They just looked at what Hungarian notation appeared to be doing and adapted something Hungarian-like that was more to their personal coding predilections, and then called it Hungarian.

Hungarian notation itself, however, is perfectly valid and highly useful. A naming convention that allows you to have a property MyValue and a local variable or parameter myValue that refer to different and perhaps unequal variants of the same data, and that can be confused by simply 'blowing a shift' (not completely depressing the shift key)? That's EVIL. Because the code will compile fine, run fine, and it's possible that the bug won't manifest for years. When the bug finally does surface, it will be a bugger to track down precisely because NOTHING LOOKS WRONG. With a properly designed Hungarian notation, if that happens it ALWAYS LOOKS EXPLICITLY WRONG. So much so that if you use Hungarian notation, and you use a field instead of an accessor where both exist, you generally comment the reason why - precisely so that somebody who comes after you, seeing that it looks wrong, won't 'fix' it and break - by accident - what you did on purpose.

So throwing out Hungarian Notation as 'evil' based on some poor implementations is exactly the same as holding up a bunch of badly made pizzas as 'proof' that all pizza is lousy food.

The same extends to many other things. Because a thing 'could' be readonly in a specific context by no means even suggests that it should be. That a condition might 'appear' redundant does not of necessity mean that it is. The fact that a code block 'could' be collapsed into a more concise form does not mean it should be - in fact, that is frequently the exact opposite of the truth. The fact that you 'could' in-line a variable declaration with its initial assignment is true and also almost always a bad thing.

Even things that are clearly redundant are not necessarily evil. If you put in a test for a condition that should never occur, that is called *DEFENSIVE CODING*. It *should* never occur. But after five or ten other people have been wading through your code doing other stuff, maybe eventually it will become possible, or even easy. But even if that happens, because you were wise enough to put the redundant check in there in the first place, the problem that would otherwise exist gets nipped in the bud because an exception comes flying out that says, in unkind terms, "HEY!!! THIS ISN'T EVER SUPPOSED TO HAPPEN! WHAT THE HELL IS GOING ON HERE!?"
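
Here's a minimal sketch of what I mean, in C# (the names are made up for illustration):

// nCount "can't" be negative here - today. Guard it anyway.
if( nCount < 0 )
{
    throw new InvalidOperationException( "nCount is negative - this should never happen." );
}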

You may have heard of 'assertions'. These were things that people would put in their code that would check that conditions were being met as they should be. Normally, these assertions would compile out of release builds, but I know that sometimes those debug builds make it out there, because I see messages of the form: "Assertion Failed: " Yep, that happened. And that was an assertion. So that was NEVER SUPPOSED TO FAIL! That's why you can't trap assertions. They're not supposed to be trapped. They're supposed to be a figurative slap in the developer's face. "Wake up, for crying out loud! You're screwing this up, here. Fix it before somebody sees what an idiot you are!" That's why redundancy isn't a bad thing. Redundancy would be bad if humans were perfect. We're not, so it's not.
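
In C#, that mechanism survives as Debug.Assert, which compiles away in release builds because it is marked [Conditional("DEBUG")]. A minimal sketch, with oAccount standing in for whatever you're checking:

using System.Diagnostics;

// Slaps the developer in debug builds; vanishes entirely from release builds.
Debug.Assert( oAccount != null, "oAccount should always be resolved by this point." );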

The compiler is going to squib out the same executable no matter how you write. The code you write is not for the compiler's use. It is for YOUR use. And it is not for your use NOW. It is for your (and everybody else's) use LATER. The fact is that pretty much every line of code that will ever be written will be debugged many times more often than it is written or re-written, and thus, if you have to err, it is best to err to the benefit of the debugger - not the editor or yourself. Thus, in the long run, verbose is almost invariably the Best Thing. That, however, is something that you don't learn without lots and lots of experience. Which, of course, will be harder to get now, because now everybody's got Resharper tut-tutting them even if they *try* to do the Right Thing.

It seems human nature that every new generation must repeat all the same stupid mistakes for themselves. I accept that. What Resharper has done is codify and automate that stupidity for the ages - and make it easy to not bother learning. When it finally comes back on everybody, they can conveniently deny their own complicity by declaring, "Resharper told me to!" That is the very canonical definition of EVIL.

All my life I've gone by a rule - if I can look at something I wrote even six months ago and *not* ask myself, "What was I thinking?" then I've stopped learning. What I see now are young programmers devoted to Resharper who crank out the exact same thing year after year and remain convinced that it's done right. They've stopped learning. Resharper didn't help them. It didn't make them better. It switched their brains off.

If it were up to me, I would forbid the use of Resharper by any programmer who hadn't been coding for at least ten years *without* it. By then, they'll at least have a clue about how to configure the thing, and an idea of what to ignore and when.

Is Resharper Evil? It's the one thing responsible for more and worse code than anything that has come out of computer science since before the invention of the Atanasoff-Berry Computer. So that would be "all time" as computer science time is reckoned.

Saturday, December 7, 2013

On Using Var

There is a lot of debate going around about how to use the keyword 'var' in C#.  Var is not precisely a variant type.  What var means is that you're leaving it up to the compiler to assign the thing an appropriate type when it's compiling the code.  This is as opposed to 'object', which is effectively a variant type, in that every single type that exists in C# has a class representation that is descended from the object class, so you can assign literally anything to an object variable.

The difference is exemplified in that, if you use a var declaration, the resulting variable will have such properties and methods as are appropriate to the type that the compiler assigned to the variable thus declared.  If you assign the same thing to a variable declared as an object, then the resulting variable will only have such properties and methods as are appropriate to an object until such time as you coerce the variable into something compatible, for example with a type-cast.
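
A minimal sketch of the difference, using List<int> purely as an example type:

var oList = new List<int>();        // compiler infers List<int> - the full interface is available
oList.Add( 42 );                    // fine

object oBoxed = new List<int>();    // declared as object - only object's members are visible
// oBoxed.Add( 42 );                // will not compile
((List<int>)oBoxed).Add( 42 );      // you must coerce it first, for example with a type-cast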

Now, the var type didn't always exist in C#.  C# is a typed language.  Typed languages are far better - there are considerably fewer ways to screw up.  So in a typed language, a thing like 'var' serves no function.

Var was introduced along with LINQ.  Basically, C# introduced a mechanism by which you could construct a block of code where the type of the result was either unknown or difficult to determine at coding-time.  So they stuck the var type in there so you could intercept the results of those constructions and save them for further use.

That is the canonical and One True Use of the var type.  You use it when you don't know what type will be ultimately assigned to the variable.  That has always been how I've used it, and it is the axiomatically correct way to use it.
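
For example, a LINQ query that projects into an anonymous type produces a result whose type literally has no name you could write out (aoPeople here is a made-up collection whose elements have Name and Age members):

var oAdults = from oPerson in aoPeople
              where oPerson.Age >= 18
              select new { oPerson.Name, oPerson.Age };

There is no type name you could put in front of oAdults - var is the only way to intercept that result.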

The var type, however, has been co-opted since to be used for literally everything as a coding convenience.  Why bother figuring out the type when the compiler will do it for you automatically?

Well, here's a bit of wisdom for you, kids.  The code you write isn't for the compiler.  Nope.  It's for you.  And it's not for you right now.  It's for you later.  And perhaps for a lot of other people later.  The compiler's going to reduce it into binary gobbledy-gook that the computer can read.  Once that's done, the computer couldn't care less about your code - it's so much junk.

No, you write that stuff out and carefully save it to disk and back it up to a source code repository precisely because *YOU* will need it - later.  Or if not you, then somebody else.  That's why it exists.

Now, if you use the var type for all declarations everywhere, what exactly are you saving?  A lot of the time you're not saving time (type var instead of int - typing var takes fractionally longer because the 'v' is an uncommon character), but sometimes you are - it's faster to type 'var' than 'IEnumerable<AccountGroup>'.  However, you're saving that time now.

What's going to happen later?  Well, if you come back to that line later, it's probably because you're trying to find out why that part of the code is not working.  But it's been maybe a day, or a week, or six months since you wrote it.  Back when you wrote it, you knew exactly what it was, but now?  Not so sure, now.  So now you have to grab your mouse and mouse over it or its assignment and find out from the IntelliSense (or whatever) what type is being assigned to it.

So you saved yourself five seconds typing in a long type name when you used the var originally, and now you've just lost the time you saved the VERY FIRST TIME you looked at it again.  You saved nothing.

In fact you've saved less than nothing, because you might write that once, but have to debug or analyse it, over the life of that program, four, ten, maybe fifty times.  All the time you saved writing it you burned up on the first time you looked at it.  Every other time you look at it, you have to WASTE that time, because you have to look it up AGAIN.

It's called 'false economy' in the vernacular.  It's like shopping around to various gas stations for an hour to find a place where you'll save $0.01 per gallon.  If you've got a 20 gallon tank, you save a whole twenty cents.  Driving around for an hour burned off a gallon of gas, which will cost you a couple of dollars to replace.  So you spent a couple of dollars to save twenty cents.  False economy.

In the coding example, you saved five seconds up front, but you're going to wind up spending 20 seconds, or a minute, or two minutes over the life of the program to compensate for that initial saving of five seconds.  That's just one variable.  If you used five hundred variables like that, you saved 2500 seconds, or a whole 42 minutes or so, but over the course of debugging that program for its useful lifetime, you're going to wind up wasting anywhere from 3 hours to 2 days making up for that short-sightedness.

If you program for a living, you're basically setting yourself up to be a candidate for the least efficient developer in the world.  I'd like anybody to explain to me why that's a 'best practice'.

I've heard people defend the practice because they claim that it obeys the DRY principle (Don't Repeat Yourself).  It doesn't actually.

DRY comes from "The Pragmatic Programmer" (Hunt and Thomas).  The principle, as described IN THAT BOOK is summarized with: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."

You'll notice that nowhere in that does it say anything about saving keystrokes. In fact, another principle in the same book is "Keep Knowledge in Plain Text".  If you have a variable, it has three pieces of 'knowledge' contained in it:


  1. Data type
  2. Value
  3. Purpose


The value is intrinsic.  The purpose is generally given by the variable name.  The data type you must supply.  But by using the var type, you are explicitly and DELIBERATELY throwing that data out of the 'Plain Text' representation of that variable.  This incidentally also violates the DRY principle, because what you are going to wind up with is a number of variables that have no given type associated with them, which makes them ambiguous within the context of other such variables.  Thus, that piece of knowledge is no longer 'unambiguous' which, you will note, is right there in the statement of the DRY principle.
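
To see that ambiguity in the flesh, consider three declarations written the way the var-for-everything crowd writes them (the getter functions are made up for illustration):

var count = GetCount();     // int?  long?  uint?  You can't tell from here.
var name  = GetName();      // probably a string - probably.
var items = GetItems();     // some collection of... something.

Three identical-looking declarations, three different types, and the 'Plain Text' carries none of that knowledge.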

Can there be anything so wrong as justifying an action by citing a principle from a book that does not itself apply, when the action not only violates that principle but OTHER PRINCIPLES in the same book?

Probably not.


Copyright ©2013 by David Wright. All rights reserved.

Monday, June 24, 2013

Coding Standards - The Wright Way - Writing Code

It had to get to this eventually, now, didn't it?  No problemo.  I have only a few rules, and they make eminent  sense.  You may have noticed that I tend not to be very rabid and frothing about most things, so you've got wiggle-room, too.

Organization

First up, organization.  People - a lot of people - just slap their code down in the order it comes into their head.  I guess they like digging round like gophers all the time.  I like to put a bit of organization on my code, so that if I'm looking for something, I know approximately where it's going to be, and I can make use of outlining in an effective manner to manage my display.

So, in a generic class definition, in any language (and in OOP, everything is a class by fiat), I order my elements as follows, with regions specified as shown (assuming C#, of course):

<library declarations/>

<namespace>
    <namespace delegate declarations/>

    <class definition> <ascendants/><interfaces/>
        #region Constants
        <class-local constants/>
        #endregion

        #region Enumerations
        <enumerations/>
        #endregion

        #region Delegates
        <class-local delegate declarations/>
        #endregion

        #region Static Fields
        <class static fields/>
        #endregion

        #region Static Accessors
        <class static accessors (class properties)/>
        #endregion

        #region Fields
        <instance fields/>
        #endregion

        #region Accessors
        <instance accessors (properties)/>
        #endregion

        #region Static Constructors
        <class static constructors/>
        #endregion

        #region Constructors and Destructors
        <constructors/>
        <destructors/>
        #endregion

        #region Static Methods
        <class static methods/>
        #endregion

        #region Methods
        <instance methods/>
        #endregion

        #region Event Handlers
        <event-handler methods/>
        #endregion

        #region Events
        <event declaration>
            <Event/>
            <OnEvent method/>
        </event declaration>
        #endregion

        #region Interface Implementations
        <interface implementation>
            #region <Interface Name>
            <implementation/>
            #endregion
        </interface implementation>
        #endregion
    </class definition>
</namespace>

Now, you will note that you definitely don't use all of these all the time, and that's fine - leave out the ones you don't use at any given time.  This just shows you, for each, where it goes in relation to the others.  You'll also note that some interface implementations - IComparable, for example - consist of a single element, in that case a CompareTo() definition.  Others may be composed of multiple types of things.  The scheme above seems to suggest that all of the properties and methods of a specific interface should show up in the bottom group.  That's exactly what it suggests.

Now, the next complaint will be that this involves a lot of extra non-functional coding.  I regard it as implicit documentation, but if it helps you at all, you could actually write a macro that you click when you're writing a new class definition that automatically slaps all of those #region..#endregion groups in there.  They're a lot quicker to delete if you don't need them than they are to type if you do.

I've also read rants that suggest that #region/#endregion constructs are evil because they are used by default as border tags for text folding.  Bushwah.  If you don't like text folding, then turn it off - you can do that.  They're first and foremost a GROUPING construction, not a folding construction.  The primary use just happens to lend itself to the secondary use.  Grouping is very useful in code.  I've lost a lot of hair to having to basically scan through an entire file looking for a method because the stuff is all just scribbled out in the order it was scribbled out, so the methods are everywhere and nowhere at the same time.  I've also seen this taken too far, where there is, for example, a region that is sub-divided by a region for each access modifier (public, protected, private).  That might be going too far, but if that works for you and it makes things clear for you, I'm not going to call it bad.

Variable Declarations

Now, how should you declare variables?  You don't actually have to declare them all at the top anymore, you know, and usually it's clearer if you don't declare things until just before you use them for the first time.  I'm not particularly militant either way.  Should you declare multiple things on the same line?  Generally not - it's messy, but in some cases it's a heck of a lot more convenient, so use your best judgement, depending on the situation.

Should you declare different types of variables in clusters?  Sure, if you want.  If you think that's inane, then don't.  I've never actually known it to be relevant either way.

Scope

How about scope?  OK, little-known rule for you younger folk:  In languages like C, C++ and C#, the braces that are used to block statements together actually define stack-frames.  Didn't know that, huh?  So if you declare a variable inside braces of a for() {} loop, in principle that variable is only in scope for the duration of that for() {} loop.  Once you exit the loop, the variable is out of scope again.  Cool, huh?

In C++ that can be a very handy thing to know.  In C#, not so much because for reasons nobody has ever adequately explained, they built the compiler so that while it respects the scope of the variable for the purposes of use, it doesn't respect the scope for the purposes of naming.  I suppose they thought that was too much responsibility for our poor, feeble, malnourished little idiot brains.  Consequently, this scoping phenomenon, while it can be used to control variable access effectively, also and simultaneously eliminates the variable name from the pool of names available to you from that point in the function to the end.  Which means that even I don't use it much.
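
Here is a minimal sketch of the phenomenon:

void Demo()
{
    {
        int nTemp = 1;              // in scope only inside these braces
        Trace.WriteLine( nTemp );
    }

    // nTemp is out of scope here, so you can't use it...
    int nTemp = 2;                  // ...but this STILL won't compile (error CS0136) -
                                    // the name is spoken for until the end of the function
}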

Oddly enough, variables declared within a looping construct respect the scope of the declaration both for scope and naming, so if you write something like this, it's fine:

void Function( string[] asStrings )
{
    for( int nIndex = 0; nIndex < asStrings.Length; nIndex++ )
        Trace.WriteLine( asStrings[ nIndex ] );

    for( int nIndex = 0; nIndex < asStrings.Length; nIndex++ )
        Debug.WriteLine( asStrings[ nIndex ] );
}

Go figure.  But as a rule, I declare one variable per line, and declare variables just before the code block in which they first appear.  That way it's clear, and if you accidentally refer to a variable before it is first used, at least the compiler is going to kick you in the shin and force you to take a look at what you're doing, which might prevent you from making a mistake.

Statement Structure

I'm not going to say much about statement structure.  If you follow the suggestions provided about spacing, bracing and what-not, things should be fine.  Only one caveat, here, and offenders will know who they are.

Use one statement at a time, OK?  That is, none of this garbage:

if( this.GetCount() < this.TotalValue() ) bResult = ( this.CheckValue( this.GetCount() ) != oVal.oFactor.GetInt() );

Ugh.  Write that as:

int nCount = this.GetCount();
int nTotal = this.TotalValue();
bool bResult = false;

if( nCount < nTotal )
{
    int nCheck = this.CheckValue( nCount );
    Factor oFactor = oVal.oFactor;
    int nInt = oFactor.GetInt();
    bResult = ( nCheck != nInt );
}

Why?  The first one is hard to read, hard to understand, and HARD TO DEBUG.  Every intermediate result is being held anonymously on the stack and you have no idea what they are unless you poke around a lot more deeply than you have to.  The second one is clear, easy to understand and easy to debug because you can see EVERY INTERMEDIATE VALUE.

At the end of the day, the compiler is going to generate the exact same code either way.  That first abortion proves nothing, but it does tell me that you aren't interested in writing even halfway decent code.  There is nothing cool in daisy-chain obfuscation.  The only possible motive for that is to make other people's lives miserable, or because you value speed over clarity.  If either one is your game, then nobody wants you on the team, buddy.  Nobody.

If you're doing it because you think it makes you look 'hot' and 'sophisticated' as a programmer, no it doesn't.  It makes you look like a noob, because it highlights, in stark relief, that you have absolutely no idea what is going on inside that "magical 'puter thingy".  Go take a class or something and come back when you have a vague barking clue.

Let me tell you something about speed, OK?  Just so you know.  Most of the time you spend writing programs is spent debugging.  Raw typing speed counts for very little, and I can type upwards of 120 words per minute, so trust me on that.

If you sacrifice a little time writing more verbose code, the amount of time you will save in debugging will more than make it up, and you'll wind up being A) more effective B) more efficient, and C) more team-friendly.  Those three things by themselves will make up for 60 IQ points of genius in a team environment.  Those three things will see a self-styled hot-shot programmer shown the door over somebody who is perhaps half as gifted, but considerably more effective, efficient, and team-friendly as a programmer.  It's not all about speed.

On the other hand, if you really are a gifted hot-shot, and you're also effective, efficient, and team-friendly, then nobody's ever going to show you the door - unless they're walking you out while they shut the business down, in which case chances are excellent that's not your fault, and they'll be willing to say so in their letter of recommendation.

Re-Throwing Exceptions

This particular section is peculiar to C#, just so you know, and it has to do with what happens when exceptions are re-thrown.  I just got certified in C# and I found this out during my studies. In general, I just use throw; anyway, so it was never really important.

In any case, if you are re-throwing an exception from within a catch{} statement, the way in which you do it is relevant.  If you use a straight throw; then you get what you expect - the exception that was caught is re-thrown unaltered.  However, if you catch it, give it a name, and then re-throw it BY NAME, for example throw ex; then the runtime treats that as a brand-new throw point.  The exception's stack trace will be reset to reflect the fact that it was thrown from within the catch{} block.  That is, if it originated in the try{} block, you lose that part of the stack trace.
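
A minimal sketch, with DoWork and Log standing in for whatever you're actually doing:

try
{
    DoWork();
}
catch( Exception ex )
{
    Log( ex );
    throw;          // re-thrown as-is - the stack trace still points into DoWork()
    // throw ex;    // re-thrown by name - the stack trace now starts HERE
}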

A subtle but important difference.  Keep it in mind.


Copyright ©2013 by David Wright. All rights reserved.

Sunday, June 23, 2013

Coding Standards - The Wright Way - Naming Conventions

Now we're into it.  Lots of people get really firmly attached to their particular naming convention, and generally not for a very good reason.  To a degree, I'm guilty of that, too.  I started using Hungarian Notation a long time ago, and I still favor it.  However, what I use now is not really Hungarian Notation.  It's a variant that has seen a good many tweaks and modifications over the years, some of my own invention, some ideas that I got from others.

Why Hungarian Notation?  Well, because even though most languages are strongly typed, I like it for the following reasons:


  1. I don't have to go back to the declaration to find out what the thing is - I can see that.
  2. If I have several variables that refer to the same data but in different forms, I don't have to come up with contrived and baroque ways to distinguish them.  I can distinguish them literally by type.  That means that the association of the data is retained in the name, and not lost because I had to jump through a dozen hoops because it changes data type a couple of times.
  3. It makes it clear from the code whether up-casting or down-casting can be implicit or must be explicit.
  4. With things like IntelliSense, if I've forgotten what the thing is and only know vague particulars about it, I can still quickly narrow the scope of candidates to a very small field - by the prefix.
  5. If you name variables and such with Hungarian Notation, you will absolutely NEVER have a name-collision with your compiler/programming language, because NONE OF THEM USE IT.

Now a lot of people pooh-pooh Hungarian Notation simply because it is Hungarian Notation.  "No real programmer uses that any more!  Harrumph!"  Bullshit.  What works works because it works.  I don't care if Charles Babbage came up with the idea.  If it's good it's good.  And if you're one of those people who insists that new is always better, I have exactly two words for you:  "New Coke".

One of the more baroque reasons for not using Hungarian Notation that I've come across is, oddly, that it's not Hungarian Notation.  That is not a reason for not using Hungarian Notation.  That's a reason for not using the things that are like Hungarian Notation where the association is by name only and the thing really violates pretty much all of the principles of Hungarian Notation.

What I do notice frequently about the above group, though, is that even though they firmly pooh-pooh Hungarian Notation, they frequently cobble together contrived and inefficient variants of it on the spot in their code because they find themselves in need of just such a mechanism!  Instead of using ad hoc mechanisms invented on the spur of the moment to meet a particular need, isn't it better just to use the thing pro forma and have the problem solved before it occurs?  Clarity, remember?

Basic Rule

Now, the Basic Rule of naming conventions.  This should never be disregarded, and if you choose to do so, you do so at your own peril.  Many compilers will enforce it, so they won't give you much choice, but some are a little more lenient, and in those cases, you have to impose the discipline on yourself.  So what you don't want to do:
  1. Never use a character in a name that has a special meaning in any other context.
I really shouldn't have to say that, but I see that happening every bloody day and it sticks in my craw something awful.

Recommendations

Now, besides that, there are a few tried and true recommendations.  Here they are:
  1. If your compiler or language imposes all-upper-case or all-lower-case, then you're stuck with it.  If neither is true, then you have the freedom to choose, so use BOTH cases.  I'm talking about Title-case, or Pascal-case as it's sometimes called. That is, if the thing you are naming is a first name, then you should use a variable name like FirstName.
  2. Do Not Ever Use Camel-Case for Anything, Period.  I don't know who came up with that, but it was a dumb idea then and it hasn't aged well.  Camel case is like Title-case but the first character is not capitalized: firstName.  Ugh.  I will explain that in more detail later, but for now, just take it as read.
  3. Lay off the underscores.  Lots of people use them, and it's a bad idea for one very simple reason.  That character is in an awkward place on the keyboard, and there are a dozen ways to arrange your life so it is not useful, so why add the pain when there is absolutely no gain?
  4. If it is possible, restrain your naming convention to using alphabetic characters only.  That's just for the sake of speed.  All of this comes together, I promise.
  5. Don't use single-character names unless the name is naming a variable that is being used in a mathematical equation and that's how the variable appears in the equation.  So if you're calculating the conversion of mass to energy, you can use rE = rM * rC * rC;, but if you're naming a generic index in a loop, then call it nIndex, not nI.

Name Composition

How do you compose a name?  Well, what does the thing represent, or do?  That's the name.  So FirstName, LastName, FullName and Age are all OK.  A function that combines FirstName and LastName into a complete name might be declared as string MakeFullName( string xsFirstName, string xsLastName );.  Some people get caught up in rigid Noun-Verb orders and nonsense like that.  I say if you have Nouns and Verbs, then put them in an order that makes sense.  If under the circumstances FullNameGet makes more sense than GetFullName, then go with it.  Rigid rules, in my experience, ultimately lead to nonsense names that have to be nonsense because that's what the rule says they have to be.  Rules aren't meant for that kind of thing.

You may choose the order based entirely on an artificial environmental constraint.  Perhaps your IDE keeps a list of classes, properties, and methods and by default sorts them in canonical alphabetic order in that list.  For the sake of convenience, you might choose to go noun-verb so that things are associated by name in that list, or verb-noun so that they are associated by function.  Either is a valid reason. Do what works best.  Just one caveat, though.  If you're in an environment where many people work on the same code-base, then get together with them and arrive at a consensus about how you want that list to sort.  Everybody will have to deal with it, so everybody should get a shot at airing their concerns.

Having said that, how about using abbreviations?  Well, if you can safely use the abbreviation without losing clarity, then by all means save yourself the typing time.  I'm not going to stand here and demand that you name a variable SizeInMilliMeters instead of MmSize just because I slapped an arbitrary rule on you that said, "You can't use abbreviations".  That's nonsense.  If you don't know that 'mm' is short for Millimeter then you've got bigger problems than I'm inclined to help you with.


Bringing It Together

Now, you're going to say that it looks like I'm advocating the exclusive use of Title-case.  You got it right the first time!  Move to the front of the class.  If you have the freedom of mixed-case, then there is no reason to use anything else.  TitleCase makes things clear to read and it's really fast to type.  Couple that with Hungarian Notation, and what you've got is a naming convention that depends virtually entirely on the alphabet alone, is fast and easy to type, clear and expressive.  What else do you really need?  Nothing.

Now, what should you name like this?  Anything not covered below, so Class/struct declarations, method definitions, delegate declarations, enumeration definitions, and event definitions.  Now there is a certain amount of leeway, of course.  If you are writing a container class and you want the Count and Capacity properties to be Count and Capacity for the sake of consistency with other container classes, I'm not going to rag on you about 'not following the standard'.  Consistency contributes to clarity, too you know.  And while it would be nice if everybody agreed on something, that's not happening at least in my lifetime, so live and let live.

One benefit of that leniency, however, is that if you have a container class, for example, and you define a property Count that does exactly what Count does in every other container class, but you need another, similar one that is almost but not exactly like Count, you can provide a different one.  Count for the property that behaves just like Count, and nCount for the special variant that does something just slightly different, thus gaining the benefit of both while sacrificing neither.
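
For example, a sketch (fnCount and fnDeleted are made-up backing fields):

public int Count                    // behaves exactly like every other container's Count
{
    get { return this.fnCount; }
}

public int nCount                   // the almost-but-not-quite variant
{
    get { return this.fnCount - this.fnDeleted; }
}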

Camel-case and Loathing

I mentioned I would explain my loathing of CamelCase.  One very popular coding standard espouses the following ridiculous doctrine:  Use CamelCase for fields and parameters, and TitleCase for Properties and Methods.  Why is that ridiculous?  Think of it.  You have a class.  It has a field which convention requires you to name "firstName".  It has a property called, by convention, 'FirstName' that reads the variable out as stored, but when a value is passed in, the property code validates the contents of the value before it assigns it to the field.  This is not the least bit unlikely as a scenario - in fact it is quite likely.

What you have now is this:  Code where whether or not the contents of that field get validated depends on whether or not you accidentally miss the shift key.  That's an incredibly easy mistake to make and very common - I do it a dozen times a day, and I'm pretty good.  My problem is that I type really fast, so sometimes I don't actually hit the keys in the order my brain intended, so maybe I intended to Hold-Shift-Press-F, but what my hands actually did was Press-F-Hold-Shift, and voila, I have a lower-case 'f' where I meant to type an upper-case 'F'.

The problem now being that, if we're talking about that firstName field up there, now the value assigned to it won't be validated.  That bug might sit in that code for ten months and never show up once.  Not until an invalid value is passed in.  Now you will have invalid data lurking in your program that every other stitch of code assumes is good data - because it's supposed to have been validated, after all.  Yet, looking at the code, it's really hard to see that there's anything wrong with it.  And even now that you've got bad data in there it might not actually throw up any red flags!  Your program might pass through bad data for years without complaint, and it's entirely possible that the bug is detected, if it ever is detected, completely by accident.  That's B.A.D. as in Broken As Designed.
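
Here is that scenario in code, as a minimal sketch:

private string firstName;           // the field, named per that convention

public string FirstName             // the property, ditto
{
    get { return this.firstName; }
    set
    {
        if( string.IsNullOrEmpty( value ) )
            throw new ArgumentException( "A first name is required." );

        this.firstName = value;
    }
}

public void Load( string xsName )
{
    this.firstName = xsName;        // blew the shift - validation silently skipped
    // this.FirstName = xsName;     // what was actually intended
}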

You might spend hours looking for that bug, especially if it lurks a long time before it manifests itself.  The problem will be worse if somebody else is debugging it, because they don't know that you intended to use the property instead of the field - something that, after ten months or so, even you won't be sure of anymore.  It's a bug that should never have occurred in the first place, and the coding standard actually makes it possible.  Coding standards should not encourage bugs that otherwise would not exist.  That is precisely the opposite of what a coding standard should do.

Hungarian Notation - My Convention

I mentioned that I favor Hungarian notation, but I also noted that what I use now is not like the classical version.  Languages evolve.  Data types evolve.  Coding standards should evolve, too.  So what I use is based on Hungarian Notation, but different in many respects.  I'm not going to say this is the One True Way - I don't think like that.  If you've got a better idea, then go ahead.  Just make sure you do it for a reason, not just because you demand that you have to be different.  Here is the summary:

The prefixes for types that I use are fairly extensive, because there are lots of new types and lots of new ways to use them.  So I'll go by category, starting with first-level modifiers.  These go at the immediate left of the name of the object.

Basic Types

  • Signed Integers:  n   (nIndex)  (from iNteger)
  • Unsigned Integers: u  (uByte) (from Unsigned integer)
  • Real Numbers: r  (rPi) (from Real number)
  • Characters: c  (cChar) (from Character)
  • Pointers: p (pPointer) (from Pointer)
All of these may optionally be modified with the byte-length of the type, so Signed Integers can be n, or n1 (byte), n2 (short), n4 (int), or n8 (long).  Unsigned Integers likewise.  Reals can be r, r4, r8, or r16.  Chars can be c, c1, c2, or c4.  Pointers can be p, p4 or p8.  Sometimes it is useful to make the distinction, sometimes it is not.  Do whichever is clear under the circumstances.
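
In C#, those prefixes work out to declarations like these (a sketch):

int    nIndex   = 0;                // n  - signed integer
byte   u1Flags  = 0;                // u1 - one-byte unsigned integer
double r8Ratio  = 0.0;              // r8 - eight-byte real number
char   cInitial = 'D';              // c  - character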

You will recognize that in some languages there are identities to be found, for example in C++, u1 can be the same as c1, u2 can be the same as c2, and u4 can be the same as c4.  Use whatever clearly expresses your intent.  For example, say you have a string, which you wish to convert to an array, which you then want to stream as a byte-stream (for an admittedly contrived example).  You might convert sString to acString, and then to au1Bytes.  Internally, acString and au1Bytes might be identical structures.  In fact, in C++ there is a good chance that all three are identical structures.  That's fine.  If they are, the compiler will optimize out the difference and likely treat them as the same entity until their contents diverge, at which time it will make them separate copies, but if the contents don't diverge, the compiler will likely simply treat one as an alias for the other until it has a reason to do otherwise.  As a programmer, you're interested in CLARITY.  If the optimizer finagles your code to make it more efficient, well, that's what it's there for.

Non-Integral Types

  • Strings: s (sFirstName) (from String)
  • DateTime: d (dToday) (from Datetime)
  • Extended Reals (like the C# decimal type): m  (mOnomatopoeia) (from deciMal)
  • Instantiated Interfaces: i (iArray) (from Interface)
  • Instantiated Classes/Structures: o (oMyObject) (from Object)
  • Generic Types: t (tClass) (from Type)
  • Delegates: g (gCallback) (from deleGate)
  • Enumerations: e (eEnum) (from Enumeration)
I put delegates in there specifically for C#, because they are taken as different than in C++, but in C++ if you feel more comfortable calling them pointers and treating them that way, I'm not going to fault you for that - they are.

Array Modifier

  • Array: a (acString) (from Array)
An array modifier may optionally be modified with a number of dimensions, so for example a two-dimensional array of byte might be called a2uBytes.  You may also repeat the 'a' to represent the number of dimensions in a sparse array, so you might have aauBytes to name a two-dimensional sparse array of byte.  If those are required for clarity, then by all means use them.  If they do not lend any clarity, then do not - just tag it as an array, so auBytes.  Of course, this can be combined with integral modifiers, so you could have a4u8Matrix, which describes a four-dimensional array of 8-byte unsigned int, or aan4Indices, which indicates a two-dimensional sparse array of four-byte signed integers.  Use whatever you need to be as clear as you need to be.
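
In C# terms, a sketch:

byte[,] a2u1Pixels = new byte[ 16, 16 ];    // two-dimensional array of one-byte unsigned integers
int[][] aanIndices = new int[ 4 ][];        // two-dimensional sparse (jagged) array of signed integers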

Again, don't feel like you need to get nuts.  These distinctions are sometimes very useful to have, and at other times they are just meaningless overhead.  You'll know you have a multi-dimensional array generally from the usage (anNumbers[ 10, 10 ]) and the same goes for a sparse array (anNumbers[ 10 ][ 10 ]).  Clarity - rule 1 - do whatever makes the code more clear.

Collection Modifier

  • Collection: c (csString) (from Collection)
This is a new one.  Historically, I've just treated collections like arrays, but as the utility of collection classes expands, that identity doesn't really apply well anymore.  So I've adopted a new modifier to indicate a collection class, as distinct from an Array.  This can apply to any type of collection, be that a List, Dictionary, Stack, Queue, Bag or what have you.

Context: Variable/Constant/Input Parameter/In-Out Parameter

As you may or may not know, a property isn't actually a variable - it's a function.  I mention that up-front so as to avoid confusion.  The next modifiers are prefixed to the above modifiers to indicate the context in which a variable exists.  Specifically, whether or not you are intended to CHANGE IT.

  • Class Constants: k (knMaxCount) (from Konstant)
    • This applies to both constants and statics.  Enumerations require no such prefix because they're enumerations - which means that for practical purposes they qualify as a type definition and thus may be plain TitleCase, like public enum State { None, First, Second, Third };.
  • Instance Variables/Fields: f (fau1Buffer) (from Field)
    • These are garden-variety instanced field variables.  The corresponding property (if there is one) will probably be declared with exactly the same name, but without the leading 'f'.  Thus, if you refer to it as 'fau1Buffer', you are referring directly to the instance variable, but if you refer to 'au1Buffer', then you are referring to it through the accessor (property), and thus are using any additional code that implies.
  • Input Parameters: x (xsName) (from eXclusive)
    • This is prefixed to parameter declarations of any type that are being passed into a function with the intention that they are not modified.  
    • This is an important distinction, because in C# for example, the default is to pass instances of classes by reference which means, in principle, that if you modify that instance, you have modified it outside the scope of the function.  Normally, that's bad - and those bugs are tough to track down.  
    • So the leading 'x' is a specific statement both to the calling function and inside the declaring function that when the function exits, that value has not been changed - guaranteed.  Basically, it means 'read-only' or 'constant' if you prefer.
  • Input/Output Parameters: m (mmFactor) (from Modifiable)
    • I've used this prefix deliberately alongside the type prefix 'm' (decimal) to illustrate that they serve different functions and sit in different places, and that's OK.  The 'm' meaning 'input/output parameter' is always followed by either an array indicator or a type indicator, so there is no risk of confusion.  
    • This is used, as 'x' is used, in the declaration of a parameter.  The purpose of the 'm' in this case is to indicate specifically that the parameter in question may be modified by the function, which means that the value after the function exits will not of necessity be the same as it was when it was passed into the function.  
    • It is not mandatory that the value change - that's not the point.  The point is that from both the calling function and inside the declaring function, it is clear that the value can be modified, and indeed probably should be.
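
Putting the contexts together, a sketch (the class and its members are made up for illustration):

public class DataBuffer
{
    private const int knMaxCount = 1024;        // k - class constant

    private byte[] fau1Buffer;                  // f - instance field

    // x - input parameter (guaranteed unchanged), m - in/out parameter (may change)
    public void Fill( string xsSource, ref int mnOffset )
    {
        this.fau1Buffer = new byte[ DataBuffer.knMaxCount ];
        mnOffset = mnOffset + xsSource.Length;  // the caller sees this change
    }
}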

Interface Elements/Components

This comes up a lot more frequently these days, and is one of the places where people fall back on cobbled-together variants of Hungarian Notation because they need something.  Well, here is a set to start you off.  These are three letters long instead of one because A) they are a specific type of object that is used in a specific context, and B) there are a lot of them and more by the day.  There may be a time when four are needed, but three allow for 17,576 combinations, so there's time.  These likewise go in front, as in btnOK.

  • Panel: pnl (from PaNeL)
  • Form: frm (from FoRM)
  • Menu: mnu (from MeNU)
  • Context Menu: mnc (from MeNu Context)
  • Menu Item: mni (from MeNu Item)
  • Button: btn (from BuTtoN)
  • Status Bar: sts (from STatuS)
  • Text Field: txt (from TeXT)
  • Numeric Field (Roller): nmr (from NuMeric Roller)
  • Drop-down-list: ddl (from Drop-Down List)
  • List: lst (from LiST)
  • Grid: grd (from GRiD)
  • Table: tbl (from TaBLe)
  • View: vew (from ViEW)
  • PictureBox: pic (from PICture)
  • Image: img (from IMaGe)
  • Icon: ico (from ICOn)
  • Bitmap: bmp  (from BitMaP)
  • Dialog: dlg (from DiaLoG)
  • Thumb: thm (from THumB)
  • Scroll-bar: scb (from SCroll Bar)
  • Cursor: csr (from CurSoR)
  • Header: hdr (from HeaDeR)
  • Footer: ftr (from FooTeR)
  • Interface Element: ele (from ELEment)
    • This is kind of a catch-all for interface elements that don't already appear in the list above, and in fact if you'd rather just use 'ele' as the prefix for all interface elements, by all means do.  I find it useful to know precisely which are which, at least for the common ones.
  • Component: cmp (from CoMPonent)
    • This is a catch-all for those 'drop-on' interface elements that aren't actually interface elements as such.  Things like the C# Timer class, which can be instanced in the form designer, but has absolutely no corresponding visual representation.


Calling Conventions

I'm going to toss a note in here about calling conventions.  This applies specifically to object-oriented languages, and I feel strongly about it, so it's worth mentioning.

If you're referring to something that is a constant, property, enumeration or method associated with a class, then when you call it you should always directly prefix it with the parent class.  Thus, MyClass.MyEnumeration.  If you are referring to something that is a property or member of an instance of a class, then you should always directly prefix it with the instance name of the instance.  Thus oObjectInstance.ToString().  That's not difficult - some of you will be saying, "Uh, yeah, dude.  How else can you do it?"  Sure.  That's the obvious part.

Now here is the tricky bit.  This applies even inside the class or instance in question.  That is, if there is code inside a class definition that makes reference to a property or method of that class, then it should refer to it using the keyword 'this'. Thus this.ToString(), or this.GetType().  If you are referring to the enumeration MyClass.MyEnumeration in the code inside MyClass that appears in an instance-specific method, then you still use the class name.  Thus this.eEnumValue = MyClass.MyEnumeration.One;.

Why?  Clarity.  Make it starkly clear what you are referring to.  You might say, "It doesn't matter, I know which ones are local variables and which ones are instance properties!"  Yeah.  You do.  Right now.  Look at that same code in two years and you won't be so sure.  So you'll have to check.  If you always do it this way, then there is never any question about what you're doing.  If a variable in a function is not referred to with the prefix 'this', then it's not an instance value - it's on the local stack.  Now you know the precise scope of the value - whether it will be in effect instance-wide or just in the body of the function you're actually in.
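
Spelled out in a short sketch:

public class MyClass
{
    public enum MyEnumeration { None, One, Two };

    private MyEnumeration eEnumValue;

    public void Reset()
    {
        this.eEnumValue = MyClass.MyEnumeration.One;    // class-level member: class-name prefix
        string sText = this.ToString();                 // instance member: 'this' prefix
        Trace.WriteLine( sText );                       // sText has no prefix - clearly on the local stack
    }
}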

Clarity first.  Always Clarity.

Summary

Don't get the impression that this list is by any means exhaustive.  I change newer parts of it occasionally, and invent things as they become necessary.  This is already considerably more elaborate than the first formalized version of this schema I learned from a fellow whose full name I won't use without permission, but you know who you are, PM.  That was in turn more sophisticated than the version of it I had been using up to that point, which I adapted when I read about Hungarian Notation, which modified an earlier version of it I had picked up from UseNet back in the early '80s before I found out it had a real name.

Things change, and over the past 30+ years, this has changed a lot.  This will change more, of necessity.  Computer science is still very young.  There will perhaps come a time when schemes like this are no longer even relevant.  That's progress.  For now, it works, and I'll continue to adapt it and use it until it stops working or I find something better.


Copyright ©2013 by David Wright. All rights reserved.

Coding Standards - The Wright Way - Formatting

Formatting is a lot more important than it seems.  If you don't believe that, then!justLOOKat,this-mess?here.  You get the point, I trust.

Most languages in use today are C-like in structure.  Block delineators like {} or BEGIN...END or some such thing, explicit end-of-line markers (or implied ones) so that statements can span multiple lines, and indentation allowed.  For languages that don't have any of those things, you're on your own, but as a bit of advice, I'd advise you to move into the 21st century, for dog's sake.

Bracing Style

Now the management of block delineators is called 'bracing style', and there are lots of them.  At least a half-dozen that I know of personally that are in use, and another half-dozen that I've heard of but never actually seen in use beyond sample code.  There is a reason many fall by the wayside.  Lots of them are junk.  A few stick around, though.

Of the kinds I know and have actually seen used here is a sample:

Whitesmith


if( <expression> )
    {
    <statement>;
    <statement>;
    }


OTB (or 1TBS or One True Brace Style or Kernighan and Ritchie or K&R)

if( <expression> ) {
    <statement>;
    <statement>;
}


PICO

if( <expression> )
{ <statement>;
<statement>; } 

GNU

if( <expression> )
    {
        <statement>;
        <statement>;
    }


Allman or Block

if( <expression> )
{
    <statement>;
    <statement>;
}


Most of them are just "I do it that way" styles - they have no particular benefit.  Many of them are designed to conserve one thing - screen real-estate.

On conserving screen real-estate, that was a valid concern back in 'the day' when K&R were coding.  Even a high-end monitor had 24 visible text lines to play with and a line length of 80 characters per.  In a vi editor, 2-4 of those were chewed up by the editor itself, leaving 20 for actual coding.  Screen real-estate was a big deal.  Nowadays, you've got upwards of 40 lines available in most code editors with something in excess of 120 characters per line - more if you've got good vision and can use smaller fonts.  If you still feel a deep-seated need to conserve screen real-estate, then whatever you're doing, you're doing it wrong.

I've always used the Allman/Block style myself, for one very simple reason.  It's nice and sparkling clear.  I used it even back in the days when I was dealing with 24x80 monitors - yes, screen real-estate was an issue, but to me the clarity of the code was a more important consideration.  That opinion has not changed.

I've tried (and sometimes had to use) the K&R/OTB style popularized in JavaScript, and while I can see the intent of the original use, it is no longer important in that respect and it should have died out a long time ago from lack of use.  The rest are either screen-conserving to the point of being unclear, or unnecessarily baroque.

Almost any editor will natively support block bracing, most do that by default, and the ones that don't can be tweaked that way or used that way with minimal effort.  So use block bracing:

Function/Method Bodies (Including Accessors/Properties)

Function/Method bodies should ALWAYS be code blocks, and thus ALWAYS be enclosed in braces, regardless of length.  That is, none of this:

void main()
   MyFunction(); 


instead use:

void main()
{
    MyFunction();
}

Control Constructs

In simple control constructs, if the body of the construct has only one line, and that line is not itself a control construct, then braces are optional, but if the control constructs are nested, then brace them individually - if you don't, it is very easy to construct control trees where the actual flow is not evident from the indentation.  So:

if( <expression> )
    <statement> 


or

if( <expression> )
{
    <statement>
}


or

if( <expression> )
{
    while( <expression> )
        <statement>
}


BUT NOT

if( <expression> )
    while( <expression> )
        <statement> 


In the simplest case, the latter will work, but at even low levels of complexity, the nesting can be confused or lost, and those can be fiendishly difficult bugs to track down, especially since there is generally nothing obviously wrong with the code unless you examine it very carefully.

Special Cases - Switch Statements

In case clauses of a switch() statement, you may use braces in individual case clauses or not, at your discretion.  Clarity is important, but there is no need to go nuts.  Switch...case statements are by their nature fairly clear already.  I would recommend that if the body of the case clause is longer than five lines or so, then making a code block is probably a good idea, so use braces.  If it's only two or three lines, why bother unless you think it's important (if for example there is a comment)?  But the switch() statement should always be a code block, so this is fine:

switch( nVar )
{
    case 0:
       return 9;
    case 55:
       print( "I'm a duck" );
       break;
    case 99:
    {
       print( "I'm a moose" );
       <other lines of code>
       break;
    }
    default:
    {
       // Nothing to do
     
       return -1;
    }
}


As shown, indent the case statements one tab from the switch, and the body of each case statement one tab from the case.

To Tab or Not To Tab

For indenting, I've actually had university students say that their professors instructed them to use spaces for indenting, or to use tabs but set the editor to convert tabs to spaces.  This reinforces my opinion that university professors really need to get out of the classroom more.

That was a valid procedure 'way back in ancient times when not all editors expanded tabs correctly.  I haven't seen an editor like that in 20 years.  Set your editor to use tabs, and to KEEP THE TABS.

Why?  Remember that rap about multiple coders?  If you keep the tabs, then if I like to indent 4 characters, Fred likes to use 2 and Jane likes to use 8, each of our editors will expand the tabs as far as we have the editor set to expand them, and we'll all see what we want to see regardless of who wrote the code.  At the end of the day, the code will all be indented uniformly in the proper way according to local settings, and you won't wind up with any of this junk:

public int MyFunc( int xnParameter )
{
  int nReturnValue = 0;

      switch( xnParameter )
    {
                case 0:
        nReturnValue = 42;
             break;
      case 1:
                                   nReturnValue = 16;
                            break;
                                           case 2:
                    nReturnValue = 88;
                break;
          default:
                                  nReturnValue = -1;
                         break;
                   }

         return nReturnValue;
  } 
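
For contrast, here is the same function indented consistently - one tab per level, rendered here at four spaces per tab:

public int MyFunc( int xnParameter )
{
    int nReturnValue = 0;

    switch( xnParameter )
    {
        case 0:
            nReturnValue = 42;
            break;
        case 1:
            nReturnValue = 16;
            break;
        case 2:
            nReturnValue = 88;
            break;
        default:
            nReturnValue = -1;
            break;
    }

    return nReturnValue;
}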


'Nuff said.

Spacing

Now, on the use of spacing in statements.  There are a number of places where spacing matters:

In brackets ():
  • If the brackets are empty, place them together: void MyFunction()
  • If the brackets are not empty, then put a space after the opening bracket and before the closing bracket: ( int xnParameter )
  • If the brackets are enclosing a type-specification (as in a type-cast), no spaces (to make it visibly distinct): (float)nValue
For square brackets, as are commonly used for array dimensions, the same rules apply.  Together if empty or spaces after the opening square bracket and before the closing square bracket if they are not empty: int[] anMyArray = new int[ 2 ];

For angle brackets, as used in C++ and C# for Template/Generic types, no spaces (to make it visibly distinct): void MyGenericFunction<T>( T xoParameter )

If a comma or semi-colon is used as a separator, no space before it, one space after it:

void MyGenericFunction<T, P>( T xoParam1, P xoParam2 )

If a comma is used as a thousands separator, no space before or after it, just like regular people: 1,000,000

Periods used as decimal points or type/member separators, no space before or after:

System.Web.HttpCookie 

or

1.05

(In French the use of commas and periods is generally reversed in numbers, so the same rules apply anyway:  1.000,50)

If there is an explicit end-of-line designator, no space before it:  int nMyVar = 0;

No leading or trailing spaces on lines.  Leading whitespace should be tabs only, and there should be no trailing whitespace at all.  It's wasted space that will only force some schmoe to reformat the code eventually, because it'll screw up a cut/paste operation.
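
If your editor supports EditorConfig (many do, natively or via a plugin), a minimal sketch that enforces both rules mechanically rather than by willpower:

# .editorconfig - keep the tabs, strip the trailing whitespace
root = true

[*.cs]
indent_style = tab
trim_trailing_whitespace = true
insert_final_newline = true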

If you use Labels (and they should seldom be used, but I'm realistic - sometimes they're the best solution), then place the label flush against the left-hand-side of the page.

Put spaces before and after all operators, and use brackets freely in expressions to make precedence clear - even if it is implied, and especially if the desired precedence is different from the default.  Assume nobody knows the implicit precedence by heart:

int nMyVar = 5 + ( 3 - ( 18 / 2 ) + 5 );

The exceptions are the auto-increment and auto-decrement operators, which need not have a space between them and their associated variable:  nValue--;
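
Pulling those rules together in one place - the function and names below are invented purely for illustration:

public float AverageOf( int[] xanValues )
{
    int nTotal = 0;

    for( int nIndex = 0; nIndex < xanValues.Length; nIndex++ )
    {
        nTotal = nTotal + xanValues[ nIndex ];
    }

    // The cast hugs its brackets; everything else gets breathing room.
    return (float)nTotal / xanValues.Length;
}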


In-line Void Function Bodies

There is much debate on this.  Some say "NO DAMN WAY" and others say "Why not?"  I say, go ahead if you can keep it clear, but be careful, so:

public class MyList : List<int>
{
    public MyList(): base() {}
    public MyList( int xnCapacity ): base( xnCapacity ) {}
    public MyList( IEnumerable<int> xanValue ): base( xanValue ) {}
}

is clear enough - the constructors call the base constructors in the normal way and do nothing else.

But in some cases, it is very bad.  For example in a catch() statement, if you use an empty code block you are swallowing an exception.  If it is manifestly clear why you are doing it (ThreadAbortException), then no problem, but if it's even possible to be unclear, then at least EXPLAIN YOURSELF.

You should never swallow an exception willfully - it's Bad Juju.  If you've got a reason, leave a note giving that reason, so that when somebody else comes along (or you're looking at it a year from now), nobody changes the handling just because it looks pro forma and, not knowing the original reason, introduces a problem that didn't exist before.

try
{
}
catch( ThreadAbortException ) {}
catch( Exception )
{
   // I'm not doing anything with this exception because this is just an example.
}

or even use a hybrid:

try
{
}
catch( ThreadAbortException ) { /* Automatically thrown on Abort() - no need to handle */ }
catch( Exception )
{
   // I'm not doing anything with this exception because this is just an example.
}

And, of course, if you're just going to re-throw, you don't have to really explain that (except perhaps to say why you have the catch() there in the first place):

try
{
}
catch( Exception ) { throw; }  // Why??

but this is obvious enough:

try
{
}
catch( Exception xoError )
{
    Log( xoError );
    throw;    // a bare throw preserves the original stack trace
}


Commenting

Some people get really rabid and frothing about commenting.  Let me be blunt.  If you're coding well, what you are doing will be evident even to juniors - most of the time.  Commenting every line of code is just wasting time, and those who espouse such practice do it out of dogmatic adherence to ancient strictures that have long since fallen away.  Back in the days when an identifier could only hold 4 letters, commenting was important, because without it most code was nearly unintelligible.  Now we have long identifiers, and it's not as important, since the identifiers can be self-describing.

Having said that, if what you are doing is even a little unclear or tricky, or your algorithm is unusual or complicated, then explain yourself - step-by-step if that's what it takes to be clear.  If it's a really weird algorithm, you might want a fairly large block explaining it in general PLUS step-by-step comments as you code the algorithm to describe the steps as they occur.  What you require to be clear is what is right.  Not what some schmoe tells you because he's perpetually stuck in the '60s.

On commenting functions, I generally go with the principle that all public members should at least have cursory commenting to describe function, parameters, and return values.  The more complicated the function, the more commenting is appropriate.  For private or protected members, comment if you deem it necessary.  However, if you're using standard Form event handlers in C# for example, then:

private void MyOKButton_Click( object xoSender, EventArgs xoArguments ) { ... }

doesn't leave much to the imagination, does it?  Any comment you add to that will only be reiterating what you have so thoughtfully coded quite clearly, unless that handler looks that way artificially and it doesn't actually do what it clearly appears to be doing - handling the Click event of a Button called "MyOKButton".

In Visual Studio, there is an auto-documenting comment feature that can be turned on; it is very handy and I highly recommend using it.  In that case, make sure that you comment anything that is not clear from the FUNCTION PROTOTYPE - operating on the assumption that the prototype may be all the person can see, because they might be reading documentation generated from the comments.
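
In C#, these are the triple-slash XML documentation comments - type /// above a member and the IDE fills in the skeleton.  A minimal sketch, with a member and names invented for illustration:

/// <summary>
/// Calculates the retail price of an item including sales tax.
/// </summary>
/// <param name="xnBasePrice">The pre-tax price of the item.</param>
/// <param name="xnTaxRate">The tax rate as a fraction, e.g. 0.07 for 7%.</param>
/// <returns>The price including tax, rounded to two decimal places.</returns>
public decimal PriceWithTax( decimal xnBasePrice, decimal xnTaxRate )
{
    return Math.Round( xnBasePrice * ( 1 + xnTaxRate ), 2 );
}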

In that environment, also keep in mind that your function summaries are what appear as the tool-tip when the built-in IntelliSense system tries to supply automatic help for your function - so take advantage of it.  In the long run it will save you and your associates much time for very little pain.  Such systems may appear in other IDEs as well - where available, take advantage of them.  Over-commenting has never killed anybody, and is preferable to under-commenting.


Copyright ©2013 by David Wright. All rights reserved.

Coding Standards - The Wright Way - Introduction

Much is said about coding standards.  First thing to get out of the way: I think they're a Good Thing.  Second thing:  Most of the ones widely publicized are dreck.

What is a coding standard good for?  Well, if you've got a number of people working collaboratively on something, either one-shot or ongoing projects, it is useful if everybody codes the same way - it makes everything easier for everybody to understand, which makes everything more efficient.  Even if you're the only one who does the programming, and you do it all for yourself, sticking with a good coding standard is good for just you, because when you come back to that code in six months or six years, it will still make sense to you and it won't be painful to make your eyes look at it.  At least not unnecessarily painful.

What should a good coding standard do?  First and foremost, it should promote clarity and legibility.  Second, and nearly as important, it should promote interchangeability of code.  Third, it should as much as possible improve the comprehensibility of code by adding as much implicit documentation as reasonably achievable.  Fourth, it should not be unduly intrusive or inordinately easy to screw up.  Fifth, and least important, it should be as convenient as possible while remaining compatible with the preceding four goals.

What is the One Best Coding Standard?  The God Standard, I suppose.  The one that god used when he wrote the universe.  I have no idea where you can find it, or if it would be applicable in contemporary programming languages, but experience is an excellent teacher, and I'm gonna bet that god has more than anybody.

As I said in the introduction, I don't believe in the One True Path.  No such critter, no such place.  So I'm not going to foist the One True Way on you here.  I'm going to spell out a few things I've learned that have proved, over many years, to be useful in fulfilling all of the above criteria.  It may be that you don't agree with some of it, or all of it.  I don't particularly care.  What I would challenge you with, though, is this:  Before you discard any part of this out of hand, ask yourself honestly what your reason is.

I don't do anything capriciously.  I've been doing this for over 30 years, and I'm gonna bet that means I've got a lot more experience than you.  Experience isn't the be-all and end-all, but it does have advantages that you can't get anywhere else.  You see a lot of things with a lot of experience.  You use a lot of things, too.  Over time, you find out which things work well and which things don't.  You sometimes stumble upon things that work very well that maybe nobody has ever thought of before.  If you ask yourself that question honestly, you're probably going to answer yourself with one of the excuses I used to use all the time:

A) "I know what I'm doing."
B) "I was taught this way, so I'm not going to change."
C) "I don't see an immediate benefit, so it's not worth the trouble."
D) "Who is this guy anyway?"

None of them are valid reasons.  All of them are excuses. All of them are BS.  I used to use them myself.  Over time, I grew up - learned to stop lying to myself.  Learned to be a little more open-minded and a little less narrowly focused.  Learned to look at things less immediate than coffee-time.

Having said that, I see a lot of 'coding standards'.  I'm forced to use some of them.  Most of them are crap.  Most of them boil down to one of three reasons for existing:

1) Somebody with a university degree researched the topic thoroughly for a year or two and wrote a paper.  Since it came from a university, it must be good - right?
2) Somebody else does it that way, and they do a lot of programming, so it must be good - right?
3) Somebody called themselves an 'expert' and said it was the best way, so it must be good - right?

As you may have guessed, none of the above are reasons at all.  They're rationalizations - no more than that.  Maybe the coding standard you use has a lot of peer support.  Here is a clue for you.  Lots of teenagers are sure they know everything, and lots of other teenagers agree with them.  Lots of teenagers are still idiots.  Lots of them don't realize that until they're in their 30's.  Some of them never learn.  The hubris of the young is a universal constant.  The fact that lots of people do it that way means only one thing - it's handy.  Convenience has never equated to quality.  Ever.

So, on to the meat and potatoes, as it were...


Copyright ©2013 by David Wright. All rights reserved.

Thursday, June 13, 2013

On status, 'checking in', interruptions, and various other forms of making damned sure nothing gets done.

I hate being interrupted.  I program. It requires concentration. Some of the problems are pretty thorny, and require a lot of thought.  The profession itself requires rigor, discipline, and not a little precision.  So I hate being interrupted.

I know a lot of you feel the same, so if you want to, print this out and surreptitiously drop it on your manager's desk.  Maybe they'll learn, maybe not, but it's worth a try:

Interruptions


Regrettably, there is a certain type of middle manager who just doesn't actually believe they are doing their job unless they are poking.  They scale all the way from the ones who poke every five minutes, to the ones who poke every hour, to the ones who poke every couple of hours, or a couple of times a day.  Some only poke a couple of times a week.  The 'Saints', I call them.

The poking covers the gamut, too, from the dyed-in-the-wool bullies who stand there and berate you verbally, if not physically, to the passive-aggressive ones who just drop by for a 'chat' from time to time, to the officious ones who require 'status updates'.  Now this is offensive in a couple of different ways.

One is the implicit assumption that I'm not doing my job.  Like when they turn their back, I work for three minutes and then magically teleport to a mystical Shangri-la where fairies feed me chocolate-covered strawberries, and I get mystically teleported back just before I'm going to get poked again.

Another is the totally irrational idea that somehow, by interrupting me, they are in fact getting me to work harder, or be more 'focused', or work better.  I'd like to be in the operating theater when they're getting open-brain surgery some time, so that I could poke the surgeon every few minutes and ask him, "Hey, are you doing alright?  Are you being the best brain surgeon you can be right this very second?"  I'd just like to see how they come out of that operation.  Better, normal, dysfunctional, semi-comatose, or catastrophically brain dead.  I'm betting one of the last three.

Another is the ridiculous notion that they can get me to work faster.  I only have one speed.  Flat-out.  If the deadline is short, I can perhaps rearrange things to get some things done first, and if time is short, trust me - I feel the pressure more than they do.  But everything is going to take as long as it takes. Sorry, but that's just how life is.  I can't get N + 1 things done in the same amount of time as N things.  If I could, I would have cancer cured in five seconds flat while simultaneously solving every single problem in the world and getting rich.  Since I haven't done that yet, the theory ain't panning out, can we agree on that?

So I'd like to explain what that poke does.  Just so you know.

I'm focused, concentrating, and working at top speed. When you poke me, you are forcibly redirecting my attention AWAY from what I am working on to you.  What that has done is derail my train of thought, like you'd nuked the tracks.  At this point, it doesn't matter anymore if your poke is going to last 3 seconds or fifteen minutes.  The damage is done.

Now, depending on how upset I am, I will no longer be productive for as long as I'm being poked plus anywhere from fifteen to thirty minutes - perhaps longer.  Why so long?  BECAUSE YOU DE-RAILED THE GODDAMNED TRAIN!  Now I have to pick it up, get it settled back on the tracks, dust it off, fix it, get in, start it up, and get it back up to speed again.  If what I was working on was very difficult, that will take longer than if what I was doing was comparatively simple, but it's not going to be instant, no matter how simple it is.  The very best you can possibly hope for is that, by sheer coincidence, the train had just pulled into a station when you interrupted.  In that case, it will take me about a minute to realize that it's in the station, maybe another two to recall which direction I was heading next, and I'm off.  That isn't likely, though.  It's rare as hell.  Certainly nothing any sane person would bet on.

If I am more upset, then I will lose even more time because I'm trying to do that while I'm fuming, so it's kind of like trying to work in a dense fog.  It's harder to see stuff, harder to move with precision, harder to focus - just harder to do everything.

If your hope was to get me working faster because there is an approaching dead-line, well, congratulations, what you have actually achieved is that the result of my efforts will now be delayed by <time for nudge to occur> + <time to get train of thought back on-track> x <level of difficulty> x <fuming factor>.  You've delayed what you need to have done quickly, and it's your own damn fault because I sure as hell didn't ask you to interrupt me.

There is exactly one reason why you should interrupt me, if you must do so.  If, and ONLY IF you need me to stop what I'm doing and switch to doing something else.  In that case, I have to change trains anyway, and there's no way I'll know I have to do that unless you tell me to, so interrupt away.

If you feel you must have status reports, then schedule a meeting at a specific time each day when I can tell you how things are going, preferably first thing in the morning.  Once the status meeting is over, then GO AWAY.  Because now I'm working.  If you schedule it in the middle of the day, then here's a news flash:  It's STILL INTERRUPTING.

And if you can put off that status meeting to once a week, so much the better.

Now in general I have more than one person poking me, so if you only interrupt twice a day, and are thinking, "that's not so bad", well, it ain't just you.  It's you, and that guy over there and that guy, and the guy over in the corner. All of you are doing it, and even if all of you are only doing it twice a day, that's still an interruption every goddamned hour, which means my productivity is being cut in half by your collective messing around.  I would be farther ahead if you all got together twice a day and ganged up on me.

All of you are absolutely certain that your particular thing is THE MOST IMPORTANT thing, but you know what?  Get together with them and work it out among yourselves, because if everything is a top-priority emergency, then it's all just normal stuff.  Figure out some priorities and get back to me when you have them sorted.  I've GOT WORK TO DO, DAMMIT!

This rant has been brought to you by the Geek-boy.  We now return you to our regularly scheduled programming.


Copyright ©2013 by David Wright. All rights reserved.