Monthly Archives: May 2010

When Fields are Initialized, or “Lies Reflector Told Me”

The other day a coworker came to me with a Tricky Language Question. He and another chap had just finished working through a bug that had arisen due to a misunderstanding of C# constructor and field initialization order. The question?

In a derived class, when does field initialization occur, relative to the derived and base constructor code?

Specifically, what does this output?

using System;

class Print
{
    public Print(string message)
    {
        Console.Out.WriteLine(message);
    }
}

class Base
{
    public Print baseField = new Print("Base Field");
    public Base()
    {
        new Print("Base Constructor");
    }
}

class Derived: Base
{
    public Print derivedField = new Print("Derived Field");
    public Derived()
    {
        new Print("Derived Constructor");
    }
}

class Program
{
    static void Main()
    {
        new Derived();
    }
}

I don’t often give much thought to the “field vs. base class constructor” thing, but I knew that the Base constructor would be called before the Derived constructor, and I’d seen disassembled code in Reflector that showed field initialization as if it were the first code executed in a constructor. My guess was:

  • Base Field
  • Base Constructor
  • Derived Field
  • Derived Constructor

“Not so,” said my coworker. The actual order is

  • Derived Field
  • Base Field
  • Base Constructor
  • Derived Constructor

The reason for this is given in C# Language Specification section 17.10.3, Constructor execution:

Variable initializers are transformed into assignment statements, and these assignment statements are executed before the invocation of the base class instance constructor. This ordering ensures that all instance fields are initialized by their variable initializers before any statements that have access to that instance are executed.

“What’s the problem here?” you may be wondering – the Base code doesn’t know anything about the Derived fields, so why go out of our way to make sure the Derived field initializers run before the Base constructor?

Virtual methods

Virtual methods are the problem. If a virtual method is defined in Base and overridden in Derived, the overridden method may reference the new fields added to Derived. If the virtual method is called from the Base constructor, then we need those fields to be initialized before the constructor is called. Initializing fields even before calling base class constructors ensures that this is so.

Or does it? What if the field I’m accessing in an overridden method in the derived class doesn’t have a field initializer, that method is called from the base constructor, and the field value is set in the derived constructor? In this case, the field won’t be initialized before the method is called – it will still have the default value for its type.

So how do we safely call virtual methods in constructors? We don’t. You can’t guarantee what code is going to go into a derived class’s virtual method, so you never know what’s going to happen.
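
The hazard isn’t unique to C#. Here is a minimal sketch of the same trap in Python. The class and attribute names are made up for illustration, and since Python has no field initializers, a class-level default of None stands in for C#’s default field value.

```python
class Base:
    def __init__(self):
        # Calling an overridable method from the constructor: the override
        # runs before Derived.__init__ has finished its own setup.
        self.seen_during_base_init = self.describe()

    def describe(self):
        return 'base'


class Derived(Base):
    label = None  # stand-in for the default value of an uninitialized field

    def __init__(self):
        super().__init__()    # describe() is called in here...
        self.label = 'ready'  # ...but label is only assigned afterwards

    def describe(self):
        return self.label


d = Derived()
print(d.seen_during_base_init)  # None
print(d.describe())             # ready
```

The override runs, but it sees the placeholder value rather than the one the derived constructor eventually assigns.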

Back to Reflector

Remember a few paragraphs ago when I said that Reflector told me that field initialization acted like it was an assignment statement at the beginning of a constructor? Well, I did, and I wanted to see whether I was misremembering, so I compiled my sample code and threw the assembly into Reflector. Here’s what I saw:

Derived Class Constructor

I felt somewhat vindicated – this matched my memory. For a lark, I took this code (and the matching code Reflector showed me for the Base class), compiled it, ran it, and got:

  • Base Field
  • Base Constructor
  • Derived Field
  • Derived Constructor

The more I thought about this, though, the worse I felt. How could Reflector let me down like this? Isn’t it just looking at the IL and translating into C#? I poked around a little more, and instead of just double-clicking on the Derived constructor, I right-clicked on the Derived class node in the navigation tree and picked Disassemble. Lo and behold:

Disassembled Derived Class

So, Reflector does know what’s going on – you just have to ask nicely. To recap,

  • if you know which Reflector action to choose,
  • you remember about field initializers running before even base class constructors, and
  • you keep careful track of virtual methods called from constructors

Reflector can tell you what’s going on in your code. Forget any of those things, and you’re lost.

Quickly make editable diagrams with yUML

I’m always on the lookout for convenient tools for creating diagrams that can be used for software development. Balsamiq is my tool of choice for UI mockups – it’s great for whipping up stylized interfaces in very little time.

Balsamiq does a great job of simplifying mockup creation, since it takes a lot of choices away from the user – you don’t have the ability to change fonts or line thicknesses or colours. It really lets you focus on the aspects of the interface that matter when you’re just getting started – the types and relative positions of the screen elements.

Recently I discovered yUML – an online tool for creating class diagrams, activity diagrams, and use case diagrams. It’s quite simple and produces attractive results. The best part is that you just specify the relationship between components – you don’t have to position them yourself.

Using an example from the site, you can create a diagram that shows that:

  • a single customer aggregates several orders, each of which
  • uses 0 or 1 PaymentMethods, and
  • is composed of some number of LineItems

by entering this code:

[Customer]+1->*[Order]
[Order]++1-items >*[LineItem]
[Order]-0..1>[PaymentMethod]

And out pops this diagram:
yUML Order Class Diagram

Over the years, I’ve become accustomed to using WYSIWYG tools to create written documents and images, but I often miss text-based tools. They:

  • give consistent, repeatable results,
  • allow easy diffing between versions, and
  • don’t encourage time-wasting as we fiddle to adjust every single pixel or line break

so I was really happy to find yUML – even though the syntax takes a little getting used to, it makes it easy to generate diagrams with a minimum of fuss.

One wrinkle I’ve had using tools like Balsamiq and yUML is going back to modify diagrams after I’ve saved them off as a PNG. The last time I posted a Balsamiq PNG, I resorted to embedding the text that represented the diagram in the post comments. Actually, don’t bother clicking that link – I’m sure I had the source saved in the post as an HTML comment, but I don’t see it there, not even in edit mode.

I’ve unwittingly demonstrated a point I was trying to make – many of these design tools don’t allow you to save the “source” of the diagram, and it could be lost. This isn’t a disaster for simple diagrams like the one above, but it could be very inconvenient for larger ones. Another problem is that typing out the code, pasting it into the conversion tool (in yUML’s case a web page), and converting it and saving the result can become tedious as you make many adjustments.

To address some of these inconveniences, I created a tiny Python script that accepts a class diagram description, hits the yUML website to create an image from it, and saves the image to disk, embedding the diagram “source code” in the PNG’s iTXt chunk:


import urllib
import urllib2

import png

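def add_yuml_to_png(yuml, in_stream, out_stream):
    # Copy the PNG signature through unchanged.
    signature = png.read_signature(in_stream)
    out_stream.write(signature)

    # Copy every chunk up to, but not including, IEND.
    for chunk in png.all_chunks(in_stream):
        if chunk.chunk_type == 'IEND':
            break
        chunk.write(out_stream)

    # Add the diagram source as an iTXt chunk just before IEND.
    itxt_chunk = png.iTXtChunk.create('yuml', yuml)
    itxt_chunk.write(out_stream)

    # `chunk` still refers to the IEND chunk that ended the loop; write it last.
    chunk.write(out_stream)

def create(yuml, output_filename):
    # The diagram description goes into the URL, so it must be quoted.
    baseUrl = 'http://yuml.me/diagram/scruffy/class/'
    url = baseUrl + urllib.quote(yuml)

    original_png = urllib2.urlopen(url)
    output_file = open(output_filename, 'wb')
    try:
        add_yuml_to_png(yuml, original_png, output_file)
    finally:
        # Close the file even if the download or rewrite fails.
        output_file.close()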

if __name__ == '__main__':
    import sys
    sys.exit(create(*sys.argv[1:3]))
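
The script above is Python 2. On Python 3, urllib.quote lives at urllib.parse.quote (and urllib2.urlopen at urllib.request.urlopen). As a rough sketch of just the URL-building step, the helper below (build_yuml_url is a hypothetical name, not part of the original script) also joins a multi-line description with commas. That is an assumption based on the comma-separated form that read_yuml_from_png.py prints back.

```python
from urllib.parse import quote

# Same base URL as the script above.
BASE_URL = 'http://yuml.me/diagram/scruffy/class/'


def build_yuml_url(description):
    """Return the yUML image URL for a class diagram description."""
    # Collapse a multi-line description into one comma-separated string.
    lines = [line.strip() for line in description.splitlines() if line.strip()]
    # quote() percent-encodes brackets, '>' and spaces so they survive in a URL.
    return BASE_URL + quote(', '.join(lines))


print(build_yuml_url('[Customer]+1->*[Order]\n[Order]-0..1>[PaymentMethod]'))
```

Building the URL in a separate function keeps the network call out of the way, so the quoting can be checked without hitting the yUML site.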

The png module is a very rudimentary PNG handling module that I wrote just for this script. There are ready-made Python PNG modules out there, but I thought they’d be too heavy to pull in for this, and that it’d be fun to write the PNG-handling code. It was.
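
For a sense of how little is needed, here is a minimal Python 3 sketch of building an uncompressed iTXt chunk with only the standard library. It follows the chunk layout in the PNG specification. It is not the png module from this post, and build_itxt_chunk is a hypothetical name.

```python
import struct
import zlib


def build_itxt_chunk(keyword, text):
    """Return the bytes of an uncompressed iTXt chunk."""
    key = keyword.encode('latin-1')
    if not 1 <= len(key) <= 79:
        raise ValueError('keyword must be 1 to 79 bytes long')
    data = (
        key + b'\x00'          # keyword, null-terminated
        + b'\x00'              # compression flag: 0 = uncompressed
        + b'\x00'              # compression method: 0
        + b'\x00'              # empty language tag, null-terminated
        + b'\x00'              # empty translated keyword, null-terminated
        + text.encode('utf-8') # the text itself, not null-terminated
    )
    chunk_type = b'iTXt'
    # The CRC covers the chunk type and data, but not the length field.
    crc = zlib.crc32(chunk_type + data) & 0xffffffff
    return struct.pack('>I', len(data)) + chunk_type + data + struct.pack('>I', crc)


chunk = build_itxt_chunk('yuml', '[Customer]+1->*[Order]')
print(len(chunk))
```

Every PNG chunk has the same envelope (length, type, data, CRC), which is why copying chunks through unchanged and slipping one more in before IEND is enough.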

Later on, when you want to adjust the diagram, you can run the following command on the PNG:

read_yuml_from_png.py yuml_order_example.png
[Customer]+1->*[Order], [Order]++1-items>*[LineItem], [Order]-0..1>[PaymentMethod]

And here’s read_yuml_from_png.py:

import png

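def read(pngFilename):
    yuml = '<<no yuml found>>'
    pngFile = open(pngFilename, 'rb')
    try:
        png.read_signature(pngFile)
        # Look for the first iTXt chunk whose keyword is 'yuml'.
        for chunk in png.all_chunks(pngFile):
            if chunk.chunk_type == 'iTXt':
                chunk = png.iTXtChunk(chunk)
                if chunk.keyword == 'yuml':
                    yuml = chunk.text
                    break
    finally:
        pngFile.close()
    return yuml

if __name__ == '__main__':
    import sys
    # Print to stdout; passing the string to sys.exit() would write it to
    # stderr and exit with a non-zero status.
    print(read(sys.argv[1]))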

and out pops the original class diagram description.
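
The reading side can be sketched without the png module too. The following is a minimal Python 3 version that works on the file’s bytes, assuming the chunk layout from the PNG specification. The function names are hypothetical, and this is an illustration rather than the code behind read_yuml_from_png.py.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def iter_chunks(png_bytes):
    """Yield (chunk_type, data) for each chunk, checking each CRC."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError('not a PNG file')
    pos = 8
    while pos + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack('>I4s', png_bytes[pos:pos + 8])
        end = pos + 12 + length  # 4 length + 4 type + data + 4 CRC
        if end > len(png_bytes):
            raise ValueError('truncated chunk')
        data = png_bytes[pos + 8:end - 4]
        (crc,) = struct.unpack('>I', png_bytes[end - 4:end])
        if zlib.crc32(chunk_type + data) & 0xffffffff != crc:
            raise ValueError('bad CRC in chunk %r' % chunk_type)
        yield chunk_type, data
        pos = end


def read_itxt(png_bytes, keyword):
    """Return the text of the first iTXt chunk with this keyword, or None."""
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type != b'iTXt':
            continue
        key, rest = data.split(b'\x00', 1)
        if key.decode('latin-1') != keyword:
            continue
        compression_flag = rest[0]  # rest[1] is the compression method
        language_tag, rest = rest[2:].split(b'\x00', 1)
        translated_keyword, text = rest.split(b'\x00', 1)
        if compression_flag:
            text = zlib.decompress(text)
        return text.decode('utf-8')
    return None
```

Calling read_itxt(open('yuml_order_example.png', 'rb').read(), 'yuml') would then return the stored description, provided the file was written with an iTXt chunk under that keyword.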

The png module is kind of long to paste here, but you can get all of the source for this post from my Google Code project (direct link to source).