Monday, June 24, 2013

Coding Standards - The Wright Way - Writing Code

It had to get to this eventually, now, didn't it?  No problemo.  I have only a few rules, and they make eminent  sense.  You may have noticed that I tend not to be very rabid and frothing about most things, so you've got wiggle-room, too.

Organization

First up, organization.  People - a lot of people - just slap their code down in the order it comes into their head.  I guess they like digging round like gophers all the time.  I like to put a bit of organization on my code, so that if I'm looking for something, I know approximately where it's going to be, and I can make use of outlining in an effective manner to manage my display.

So, in a generic class definition, in any language (and in OOP, everything is a class by fiat), I order my elements as follows, with regions specified as shown (assuming C#, of course):

<library declarations/>

<namespace>
    <namespace delegate declarations/>

    <class definition> <ascendants/><interfaces/>
        #region Constants
        <class-local constants/>
        #endregion

        #region Enumerations
        <enumerations/>
        #endregion

        #region Delegates
        <class-local delegate declarations/>
        #endregion

        #region Static Fields
        <class static fields/>
        #endregion

        #region Static Accessors
        <class static accessors (class properties)/>
        #endregion

        #region Fields
        <instance fields/>
        #endregion

         #region Accessors
         <instance accessors (properties)/>
         #endregion

         #region Static Constructors
         <class static constructors/>
         #endregion

         #region Constructors and Destructors
         <constructors>
         <destructors>
         #endregion

         #region Static Methods
         <class static methods/>
         #endregion

         #region Methods
         <instance methods/>
         #endregion

         #region Event Handlers
         <event-handler methods/>
         #endregion

         #region Events
         <event declaration>
         <Event/>
         <OnEvent method/>
         </event declaration>
         #endregion

         #region Interface Implementations
         <interface implementation>
         #region <Interface Name>
         <implementation/>
         #endregion
         </interface implementation>
         #endregion
    </class definition>
</namespace>

Now, you will note that you definitely don't use all of these all the time, and that's fine - leave out the ones you don't use at any given time.  This just shows you for each where they go in relation to each other.   You'll also note that for some interface implementations, for example IComparable, they consist of a single element, in that case a CompareTo() definition.  In others, they may be composed of multiple types of things.  The scheme above seems to suggest that all of the properties and methods of a specific interface should show up in the bottom group.  That's exactly what it suggests.

Now, the next complaint will be that this involves a lot of extra non-functional coding.  I regard it as implicit documentation, but if it helps you at all, you could actually write a macro that you click when you're writing a new class definition that automatically slaps all of those #region..#endregion groups in there.  They're a lot quicker to delete if you don't need them than they are to type if you do.

I've also read rants that suggest that #region/#endregion constructs are evil because they are used by default as border tags for text folding.  Bushwah.  If you don't like text folding, then turn it off - you can do that.  They're first and foremost a GROUPING construction, not a folding construction.  The primary use just happens to lend itself to the secondary use.  Grouping is very useful in code.  I've lost a lot of hair to having to basically scan through an entire file looking for a method because the stuff is all just scribbled out in the order it was scribbled out, so the methods are everywhere and nowhere at the same time.  I've also seen this taken too far, where there is, for example, a region that is sub-divided by a region for each access modifier (public, protected, private).  That might be going too far, but if that works for you and it makes things clear for you, I'm not going to call it bad.

Variable Declarations

Now, how should you declare variables?  You don't actually have to declare them all at the top anymore, you know, and usually it's clearer if you don't declare things until just before you use them for the first time.  I'm not particularly militant either way.  Should you declare multiple things on the same line?  Generally not - it's messy, but in some cases it's a heck of a lot more convenient, so use your best judgement, depending on the situation.

Should you declare different types of variables in clusters?  Sure, if you want.  If you think that's inane, then don't.  I've never actually known it to be relevant either way.

Scope

How about scope?  OK, little-known rule for you younger folk:  In languages like C, C++ and C#, the braces that are used to block statements together actually define stack-frames.  Didn't know that, huh?  So if you declare a variable inside braces of a for() {} loop, in principle that variable is only in scope for the duration of that for() {} loop.  Once you exit the loop, the variable is out of scope again.  Cool, huh?

In C++ that can be a very handy thing to know.  In C#, not so much because for reasons nobody has ever adequately explained, they built the compiler so that while it respects the scope of the variable for the purposes of use, it doesn't respect the scope for the purposes of naming.  I suppose they thought that was too much responsibility for our poor, feeble, malnourished little idiot brains.  Consequently, this scoping phenomenon, while it can be used to control variable access effectively, also and simultaneously eliminates the variable name from the pool of names available to you from that point in the function to the end.  Which means that even I don't use it much.

Oddly enough, variables declared within a looping construct respect the scope of the declaration both for scope and naming, so if you write something like this, it's fine:

Function()
{
    for( int nIndex = 0; nIndex < asStrings.Length; nIndex++
        Trace.WriteLine( asStrings[ nIndex ];

    for( int nIndex = 0; nIndex < asStrings.Length; nIndex++ )
        Debug.WriteLine( asStrings[ nIndex ];
}

Go figure.  But as a rule, I declare one variable per line, and declare variables just before the code block in which they first appear.  That way it's clear, and if you accidentally refer to a variable before it is first used, at least the compiler is going to kick you in the shin and force you to take a look at what you're doing, which might prevent you from making a mistake.

Statement Structure

I'm not going to say much about statement structure.  If you follow the suggestions provided about spacing, bracing and what-not, things should be fine.  Only one caveat, here, and offenders will know who they are.

Use one statement at a time, OK?  That is, none of this garbage:

if( this.GetCount() < this.TotalValue() ) bResult = this.CheckValue( this.GetCount() ) != oVal.oFactor.GetInt() );

Ugh.  Write that as:

int nCount = this.GetCount();
int nTotal = this.TotalValue();
bool bResult = false;

if( nCount < nTotal )
{
    int nCheck = this.CheckValue( nCount );
    Factor oFactor = oVal.oFactor;
    int nInt = oFactor.GetInt();
    bResult = ( nCheck != nInt )
}

Why?  The first one is hard to read, hard to understand, and HARD TO DEBUG.  Every intermediate result is being held anonymously on the stack and you have no idea what they are unless you poke around a lot more deeply than you have to.  The second one is clear, easy to understand and easy to debug because you can see EVERY INTERMEDIATE VALUE.

At the end of the day, the compiler is going to generate the exact same code either way.  That first abortion proves nothing, but it does tell me that you aren't interested in writing even halfway decent code.  There is nothing cool in daisy-chain obfuscation.  The only possible motive for that is to make other people's lives miserable, or because you value speed over clarity.  If either one is your game, then nobody wants you on the team, buddy.  Nobody.

If you're doing it because you think it makes you look 'hot' and 'sophisticated' as a programmer, no it doesn't.  It makes you look like a noob, because it highlights, in stark relief, that you have absolutely no idea what is going on inside that "magical 'puter thingy".  Go take a class or something and come back when you have a vague barking clue.

Let me tell you something about speed, OK?  Just so you know.  Most of the time you spend writing programs is spent debugging.  Raw typing speed counts for very little, and I can type upwards of 120 words per minute, so trust me on that.

If you sacrifice a little time writing more verbose code, the amount of time you will save in debugging will more than make it up, and you'll wind up being A) more effective B) more efficient, and C) more team-friendly.  Those three things by themselves will make up for 60 IQ points of genius in a team environment.  Those three things will see a self-styled hot-shot programmer shown the door over somebody who is perhaps half as gifted, but considerably more effective, efficient, and team-friendly as a programmer.  It's not all about speed.

On the other hand, if you really are a gifted hot-shot, and you're also effective, efficient, and team-friendly, then nobody's ever going to show you the door unless they're walking you out while they shut the business down, chances are excellent that's not your fault - and they'll be willing to say so in their letter of recommendation.

Re-Throwing Exceptions

This particular section is peculiar to C#, just so you know, and it has to do with what happens when exceptions are re-thrown.  I just got certified in C# and I found this out during my studies. In general, I just use throw; anyway, so it was never really important.

In any case, if you are re-throwing an exception from within a catch{} statement, the way in which you do it is relevant.  If you use a straight throw; then you get what you expect - the exception that was caught is re-thrown unaltered.  However, if you catch it, give it a name, and then re-throw it BY NAME, for example throw ex; then the exception will be copied.  And since there is no way for the compiler to know the exception it's throwing did not originate inside the catch{} block, the exception's stack trace will reflect the fact that it was thrown from within the catch{} block.  That is, if it originated in the try{} block, you lose that part of the stack trace.

A subtle but important difference.  Keep it in mind.


Copyright ©2013 by David Wright. All rights reserved.

No comments:

Post a Comment