Do we need code comments any more?

Picture 7Gah. I was looking at some code last week to figure out why a layout was broken in one of our services. There was no technical documentation and no comments in the source file. It’s not unusual in the world of software, but for me at least it makes it impossible to quickly figure out what’s going on. It’s not how I liked to write code, it’s certainly not how I was taught but lately I hear the argument that modern code is self-documenting. This thinking has its roots in the object orientated programming community and has been further reinforced by followers of the agile methodology. The basic idea is that comments are just more “code” that needs to be written, re-factored and maintained while well structured code is clear, syntax checked by compilers and functionally validated by unit tests. As such, inserting dirty, unstructured text into pure, elegant code is a sign of a weak developer, and/or bad code and should be frowned upon.

Trying to reconcile this in my head I was thinking back to some of the earliest code I wrote – for the ZX Spectrum in Z80 Assembler. This kind of code directly maps to the operations of the processor and although you write it in text rather than binary, it’s pretty much the same thing. Let’s jump in with a contrived example:

// tea.asm
// AL 01/04/1992
// each ring is 4 octets
// 2: address of pan
// 1: temperature 0-10, 0 = off, 10 = very hot
// 1: spare
STOVE:
    DEFS 4*4, 0 // stove has 4 rings

// kettle is 2 octets
// 1: volume of water in litres
// 1: current temperature in degrees c
KETTLE:
    DEFB 0, 0

START:
    // initialize kettle and place on stove
    LD A, (KETTLE)
    ADD A, 5
    LD (KETTLE), A
    LD IX, STOVE
    // set ring to medium
    LD (IX+0), KETTLE
    LD (IX+2), 3
    // turn on heat
    LD A, 4
    PUSH IX
    PUSH A
    CALL HEAT
WHILE:
    LD A, (KETTLE+1)
    CP A, 100 // has water boiled?
    JP NC, DONE 
    CALL WAIT_1000
    JP WHILE

DONE:
    HALT

Without comments it’s almost impossible to figure out what’s going on in an assembler source file. At the time of writing though, even with no comments, it probably would have been as clear as day. “Any good programmer could figure it out” I would probably tell myself.

But then I became an intern and moved to C. This was much easier than assembler, a higher level of abstraction, yet always seemed to come with enormous function header explanations of purpose/parameters/authors, many lines of explanation for algorithms, and comments for each bug that was fixed along the way.

/**  Module: tea.c
  *  Author: Alex Linde
  *    Date: 1st April 1996
  * Purpose: Fill a kettle then select a free ring and boil some water.
  */
#define OFF   0
#define LOW   1
#define MED   2
#define HIGH  3

#define RINGS 4
#define WATER 5

#define BOIL  100  // 100 degrees c
#define DELAY 1000 // 1 second

struct PAN {
    unsigned char vol; // volume in litres
    signed char temp; // temperature in degrees c
};

struct RING {
    PAN * pan; // pointer to pan on this ring, or NULL for empty ring
    unsigned char temp; // temperature 0 - 3 where 0 is off
    unsigned char spare;
};
// get the water ready for some tea
int main() {
    struct PAN k;
    struct RING s[RINGS]; // X rings on our stove
    s[0].temp = HIGH; // set temp to high
    k.vol = WATER; // put X litres in the kettle
    s[0].pan = k;
    heat(&s[0],RINGS); // start heating
    // wait until water boils
    while (k.temp < BOIL) {
        sleep(DELAY);
    }
}

Yes it’s clearer – any many people wrote their C code with few or even no comments at all – compared to assembler it was self-describing! As an intern though, trawling through Ralph’s code was always a challenge – he used comments as a way to highlight something clever or something that he felt was out of the ordinary. Compared to Ryan’s code, it took me much, much longer to figure out what was going on and thus, much longer to fix a bug.

So now then, we get C++ or any other object orientated language. We model the world and describe it using classes, methods and attributes. Are we there now?

struct Volume {
  private Volume(float ml) { this.volume = ml; }
  public static Volume& fromLitres(float litres) const
    { return Volume(litres*1000.0f); }
  public Volume operator+(const Volume& rhs) const
    { return Volume(this.ml + rhs.ml); }
  public bool operator>(const Volume& rhs) const
    { return this.ml > rhs.ml; }
  public void operator=(const Volume& rhs) const
    { this.ml = rhs.ml; }
  public Volume& operator+=(const Volume& rhs)
    { this.ml += rhs.ml; return *this; }
  private int ml;
};

class Pan {
  private Volume capacity;
  private Volume volume;
  private float temperature;
  public Pan(const Volume& capacity) {
    this.capacity = capacity;
  }
  public void add(const Volume& l) {
    if (l + volume > capacity) {
      volume = capacity;
    }
    volume += l;
  }
  public bool isBoiling() const
    { return temperature >= 100.0f; }
}

struct Ring {
  public enum Temp {
    OFF = 0,
    LOW,
    MED,
    HIGH
  };
  public void adjustHeat(Temp t) { this.temperature = t; }
  public void place(Pan &p) { this.pan = &p; }
  private Pan * pan;
  private Temp temperature;
};

class template<int T> Stove {
  private Ring & rings[T];
  public Ring& nextAvailableRing() {
    for (unsigned int i = 0; i < T; ++i) {
      if (NULL == rings[i].pan) {
        return rings[i];
      }
    }
    throw NoFreeRingException();
  }
  public void heat() { ... }
}
void main() {
  Stove<4> s;
  Pan k(Volume::fromLitres(5.0f));
  k.add(Volume::fromLitres(1.0f));
  Ring & r = s.nextAvailableRing();
  r.adjustHeat(Ring::HIGH);
  r.place(k);
  s.heat();
  while (!k.isBoiling()) {
    sleep(1000);
  }
}

Sure, my example is contrived but you take my point. I’ve modeled my system, encapsulated the methods and the data and in theory it should be much easier to read. In fact, it is compared to my assembler example at the start it is but this doesn’t fully address my concern. I was looking through a page of code written by a professional developer – in this case building a data model from a number of external sources – and I couldn’t figure out what was going on.

The problem is that the code documents the actual behaviour, not the intent – and as the person looking at it later is likely to be adding functionality or fixing bugs, knowing what each part of the code was intended to do is very helpful. Second, much as we would like all our developers to be fully trained in agile refactoring and object orientated design they are not – during a crunch period code just fell out of their brain onto the page. The problem is, and to use a word that wasn’t part of my vocabulary before our new CTO arrived, this introduces technical debt.

When somebody else comes along and has to add that new image to the page, fix a bug with the old layout, or figure out why it falls over at 11,000 queries per second they are going to have to spend hours if not days figuring out why you did it the way you did. Then, if it turns out to be too difficult to figure out – or they just don’t think they have the time – they avoid fixing it and just add more code. They write another parallel function or module and before you know it you have the big ball of fail. With no comments.

So, as far as I can tell the problem is not whether or not we use comments – if you have perfectly structured object-orientated code with each method performing one task, named something other than doStuff(); then you’re probably good at a source level. If not, and you don’t really understand how to refactor it to make it that way, then just stick in plenty of comments so that someone can tidy it up later. After all, the English language is a higher level of abstraction again.

I think the real problem is in the overall software project quality. In my case there was no other technical documentation at all so it didn’t make any difference how the source files were documented. There were so many files that it’s almost impossible to figure out where to look in the first place.

You may find a reference to the thing  you’re looking for – but if you don’t know why it was done the way it was, then what effect will changing it have? Agile processes are not an excuse to write code with no design or documentation – in a lot of ways they’re just mini-waterfalls that keep splashing down until we get to where we’re trying to go. So fix the software project quality and the question of comments vs. no comments will go away.

Advertisements

2 responses to this post.

  1. Posted by Ed Voas on September 12, 2009 at 12:09 am

    While yes, code in an object-oriented environment can be a bit simpler to figure out, how many times do these environments still have long functions, or functions where you see some odd piece of code and you’re wondering what the hell it does. For these situations, you should always comment it so that someone else or future you remembers why you decided to add 1 to the width of some object. “Oh, right, gdiplus clips when it shouldn’t, I forgot.” Just because you’re using the lastest in OOL doesn’t mean you can’t write screwed up code that noone else can follow.

    My rule of thumb is to leave simple breadcrumbs to explain what might confound me later. Or to tell someone else “hey, this is like this for a reason, so don’t remove this”. Some people like to write books about what the next piece of code is for. I find this worse than writing two simple lines. I read those book-like comments and my eyes glaze over and I still don’t know what the deal is. So shorter is better.

    I consider “comments” in the code to be very local. They don’t explain architecture, nor should they necessarily. They are just to clarify something that might not be immediately clear. That said, I use file and function header comments to actually document why the class exists, and perhaps how to call a method (javadoc-style docs, etc.).

    Reply

    • I like your breadcrumbs analogue – that’s probably the best way of thinking about it.. but I’m not sure that “hey man, don’t remove this” is a breadcrumb 😉 I think we all agree that JavaDoc is documentation embedded for convenience, not actually a code comment.

      Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: