Parsing C for fun and profit

On a hobby project I’m working on, I’m using pre-formatted binary data that’s loaded directly into memory, with optional pointer fixup. This is a very nice way to load your data as it keeps the runtime dead simple: just load everything in a single read.
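To make "pointer fixup" concrete, here is a minimal sketch in Python. It assumes a hypothetical blob format where every pointer slot is a 64-bit little-endian value holding an offset from the start of the blob, with a separate list saying where those slots are. The function name `fixup_pointers` and the relocation list are illustrative and not taken from the project.

```python
import struct

def fixup_pointers(blob, relocations, base_address):
    """Turn blob-relative offsets into absolute addresses, in place.

    Assumption: each pointer slot is an 8-byte little-endian integer.
    """
    for slot in relocations:
        (relative,) = struct.unpack_from("<Q", blob, slot)
        struct.pack_into("<Q", blob, slot, base_address + relative)

# Toy blob: one pointer slot at offset 0 that refers to two floats at offset 8.
blob = bytearray(struct.pack("<Q", 8) + struct.pack("<2f", 1.0, 2.0))

# Pretend the blob was loaded at address 0x10000 and patch its pointer.
fixup_pointers(blob, relocations=[0], base_address=0x10000)
print(hex(struct.unpack_from("<Q", blob, 0)[0]))  # 0x10008
```

A real loader would do the same walk in C right after the single read, which is why the runtime side stays so small.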

I’m experimenting with various ways to create this binary data and right now Python is on the test bench. I wrote a packaging library that lets me generate data for various C types, taking alignment and padding into consideration. Consider the following structure, which holds a bunch of data including a count and a pointer to the first element of an array:

struct foo {
    u32 flags;
    float x, y, z;
    u32 bar_count;
    const float *bar_data;
};
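To see where alignment and padding come into play for this structure, one option is to mirror it with the standard `ctypes` module and print the field offsets. This is only an illustration and assumes `u32` is a typedef for a 32-bit unsigned integer; the class name `Foo` is made up for the example.

```python
import ctypes

class Foo(ctypes.Structure):
    # Mirrors struct foo, assuming u32 is a 32-bit unsigned integer.
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("x", ctypes.c_float),
        ("y", ctypes.c_float),
        ("z", ctypes.c_float),
        ("bar_count", ctypes.c_uint32),
        ("bar_data", ctypes.POINTER(ctypes.c_float)),
    ]

# Print each field's offset and size using the platform's native layout.
for name, _ in Foo._fields_:
    field = getattr(Foo, name)
    print(f"{name:10} offset={field.offset:2} size={field.size}")
print("sizeof(struct foo) =", ctypes.sizeof(Foo))
```

On a typical 64-bit platform the five 4-byte fields end at offset 20, so the compiler inserts 4 bytes of padding to place `bar_data` on an 8-byte boundary. A serializer has to reproduce exactly that gap.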

To create binary data in this format using my slab package in Python, I can do this:

bars = [1.0, 2.0, ...]
datum = Sequence([
    U32(flags),
    Float(1), Float(2), Float(3), # x y z
    U32(len(bars)),
    Pointer(Sequence([Float(x) for x in bars]))])
serialize(datum, ...) # create binary file

This is pretty neat and handles alignment and relocation points internally, but it is error-prone to maintain, since the Python code has to be kept in sync with the C header by hand. So I looked into parsing the actual header to generate wrappers for the slab library. Looking around for parsers, I ran into pycparser, which seems like a very capable C parser.
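As a rough idea of what pycparser hands back, the sketch below parses the structure from a string and lists its members. It is not the code behind the slab wrappers. The `typedef` for `u32` is an assumption added so the snippet parses on its own, and `struct_fields` is a hypothetical helper. Note that pycparser expects preprocessed C, so a real header with `#include` lines or comments would need to go through a preprocessor first.

```python
from pycparser import c_parser, c_ast

SOURCE = """
typedef unsigned int u32;

struct foo {
    u32 flags;
    float x, y, z;
    u32 bar_count;
    const float *bar_data;
};
"""

def struct_fields(source, struct_name):
    """Return (field name, type name, is_pointer) for each struct member."""
    ast = c_parser.CParser().parse(source, filename="<header>")
    for node in ast.ext:
        # A bare `struct foo { ... };` shows up as a Decl whose type is a Struct.
        struct = getattr(node, "type", None)
        if isinstance(struct, c_ast.Struct) and struct.name == struct_name:
            break
    else:
        raise KeyError(struct_name)

    fields = []
    for member in struct.decls:
        declared = member.type
        is_pointer = isinstance(declared, c_ast.PtrDecl)
        if is_pointer:
            declared = declared.type  # step through the pointer to the pointee
        # declared is now a TypeDecl wrapping an IdentifierType
        fields.append((member.name, " ".join(declared.type.names), is_pointer))
    return fields

for field in struct_fields(SOURCE, "foo"):
    print(field)
```

This simple walk only handles scalar members and single-level pointers, which is all this structure needs. Arrays, nested structs and function pointers would require more cases.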

It was surprisingly easy to create and consume my header files with pycparser and I pieced together a dynamic setup so I can basically do this to generate the same data:

bars = [1.0, 2.0, ...]
pack = import_header('header.h', types=('foo',))
datum = pack.foo(flags, 1, 2, 3, len(bars),
                 Pointer(Sequence([Float(x) for x in bars])))
serialize(datum, ...) # create binary file

This is a clear improvement in readability, and also in error checking: whenever the header changes in a way that breaks the serialization code, you get an error from the tool instead of silently writing bad data. Say what you want about Python (I know I do!) but sometimes it is fantastic what you can achieve in a few hours of coding in it!
