Saturday, December 7, 2013

On Using Var

There is a lot of debate going around about how to use the keyword 'var' in C#.  Var is not a variant type.  A var declaration means you're leaving it up to the compiler to work out an appropriate type for the variable when it compiles the code.  This is as opposed to 'object', which effectively is a variant type: every type in C# descends from the object class, so you can assign literally anything to an object variable.

The difference shows up in what you can do with the result.  If you use a var declaration, the resulting variable has all the properties and methods appropriate to the type the compiler inferred for it.  If you assign the same thing to a variable declared as an object, the resulting variable has only the properties and methods of an object, until you coerce it into something more specific, for example with a type-cast.
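
To make the distinction concrete, here's a minimal sketch (List<string> is just an arbitrary example type):

    using System;
    using System.Collections.Generic;

    class VarVsObject
    {
        static void Main()
        {
            // With var, the compiler infers List<string>, so the variable
            // carries the full List<string> interface:
            var names = new List<string>();
            names.Add("Alice");                  // fine

            // With object, only object's own members are available:
            object other = new List<string>();
            // other.Add("Bob");                 // compile error - object has no Add
            ((List<string>)other).Add("Bob");    // works once you cast

            Console.WriteLine(names.Count);      // prints 1
        }
    }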

Now, the var keyword didn't always exist in C#; it arrived with C# 3.0.  C# is a statically typed language, and typed languages are far better - there are considerably fewer ways to screw up.  So in a typed language, a thing like 'var' would ordinarily serve no function.

Var was introduced along with LINQ.  LINQ gave C# a mechanism for constructing expressions whose result type is either unknown or difficult to write out at coding time - in the extreme case, queries that project onto anonymous types, which have no name you could declare at all.  So var was put in so you could capture the results of those constructions and save them for further use.
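
For instance - and this is only a sketch, with made-up data - a LINQ projection onto an anonymous type produces a result whose type literally has no name you could write out, so var is genuinely required:

    using System;
    using System.Linq;

    class LinqSketch
    {
        static void Main()
        {
            var orders = new[]
            {
                new { Customer = "Smith", Amount = 40.0 },
                new { Customer = "Jones", Amount = 75.0 },
            };

            // The Select below yields an anonymous type.  There is no type
            // name to declare the result with, so var is the only option.
            var bigOrders = orders
                .Where(o => o.Amount > 50.0)
                .Select(o => new { o.Customer, Discounted = o.Amount * 0.9 });

            foreach (var o in bigOrders)
                Console.WriteLine("{0}: {1}", o.Customer, o.Discounted);
        }
    }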

That is the canonical and One True Use of the var type: you use it when you don't know what type will ultimately be assigned to the variable.  That has always been how I've used it, and it is the axiomatically correct way to use it.

Since then, however, the var type has been co-opted as a coding convenience for literally everything.  Why bother figuring out the type when the compiler will do it for you automatically?

Well, here's a bit of wisdom for you, kids.  The code you write isn't for the compiler.  Nope.  It's for you.  And it's not for you right now - it's for you later.  And perhaps for a lot of other people later.  The compiler's going to reduce it into binary gobbledygook that the computer can read.  Once that's done, the computer couldn't care less about your code - it's so much junk.

No, you write that stuff out and carefully save it to disk and back it up to a source code repository precisely because *YOU* will need it - later.  Or if not you, then somebody else.  That's why it exists.

Now, if you use the var type for all declarations everywhere, what exactly are you saving?  A lot of the time you're not saving anything - 'var' is no quicker to type than 'int'.  Sometimes you are: it's certainly faster to type 'var' than 'IEnumerable<AccountGroup>'.  But notice that you're saving that time now.
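
To illustrate with the post's own example type (AccountGroup and the repository here are hypothetical stand-ins):

    using System.Collections.Generic;

    class AccountGroup { /* hypothetical domain type */ }

    class Repository
    {
        public IEnumerable<AccountGroup> GetAccountGroups()
        {
            return new List<AccountGroup>();   // stub for illustration
        }
    }

    class FalseEconomySketch
    {
        static void Demo(Repository repository)
        {
            // The five seconds you save now:
            var groups = repository.GetAccountGroups();

            // The declaration that still answers "what is this?" six
            // months from now, with no hovering over anything:
            IEnumerable<AccountGroup> sameGroups = repository.GetAccountGroups();
        }
    }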

What's going to happen later?  Well, if you come back to that line at all, it's probably because you're trying to find out why that part of the code isn't working.  But it's been maybe a day, or a week, or six months since you wrote it.  Back when you wrote it, you knew exactly what it was, but now?  Not so sure, now.  So you have to grab your mouse, hover over the variable or its assignment, and find out from IntelliSense (or whatever) what type is being assigned to it.

So you saved yourself five seconds typing in a long type name when you used the var originally, and now you've just lost the time you saved the VERY FIRST TIME you looked at it again.  You saved nothing.

In fact you've saved less than nothing, because you might write that line once, but over the life of that program you'll have to debug or analyse it four, ten, maybe fifty times.  All the time you saved writing it, you burned up the first time you looked at it.  Every other time you look at it, you WASTE that time, because you have to look it up AGAIN.

It's called 'false economy' in the vernacular.  It's like shopping around at various gas stations for an hour to find a place where you'll save $0.01 per gallon.  If you've got a 20-gallon tank, you save a whole twenty cents.  Driving around for an hour burned off a gallon of gas, which will cost you a couple of dollars to replace.  So you spent a couple of dollars to save twenty cents.  False economy.

In the coding example, you saved five seconds up front, but you're going to wind up spending twenty seconds, or a minute, or two minutes over the life of the program to compensate for that initial saving.  And that's just one variable.  If you used five hundred variables like that, you saved 2500 seconds - a whole 42 minutes or so - but over the course of debugging that program for its useful lifetime, you're going to wind up wasting anywhere from three hours to two days making up for that short-sightedness.

If you program for a living, you're basically setting yourself up to be a candidate for the least efficient developer in the world.  I'd like anybody to explain to me why that's a 'best practice'.

I've heard people defend the practice by claiming that it obeys the DRY principle (Don't Repeat Yourself).  It doesn't, actually.

DRY comes from "The Pragmatic Programmer" (Hunt and Thomas).  The principle, as described IN THAT BOOK, is summarized as: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."

You'll notice that nowhere in that does it say anything about saving keystrokes. In fact, another principle in the same book is "Keep Knowledge in Plain Text".  If you have a variable, it has three pieces of 'knowledge' contained in it:

  1. Data type
  2. Value
  3. Purpose

The value is intrinsic.  The purpose is generally given by the variable name.  The data type you must supply - except that by using var, you are explicitly and DELIBERATELY throwing that data out of the 'plain text' representation of the variable.  This incidentally also violates the DRY principle, because what you wind up with is a number of variables that have no stated type, which makes them ambiguous in the context of other such variables.  Thus, that piece of knowledge is no longer 'unambiguous' - a word which, you will note, is right there in the statement of the DRY principle.
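
As a small sketch of what that means in practice (GetRetryLimit is a hypothetical method):

    class KnowledgeSketch
    {
        static int GetRetryLimit() { return 3; }   // hypothetical stand-in

        static void Demo()
        {
            // All three pieces of knowledge sit in plain text on one line:
            // type (int), value (3), purpose (retry count, from the name).
            int retryCount = 3;

            // With var, the type has been dropped from the plain-text
            // representation; a reader must go look up what GetRetryLimit()
            // returns in order to recover it.
            var retryLimit = GetRetryLimit();
        }
    }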

Can there be anything quite so wrong as justifying an action by citing a principle that doesn't actually apply, when the action not only violates that principle but OTHER PRINCIPLES in the same book?

Probably not.


Copyright ©2013 by David Wright. All rights reserved.