A Unit of Analogy

Porting Lemon to Zig: A YACC Shave

On Memory and Policy

In the last chapter, I hit on the basic strategy for the port: the mother of all printf debugging, essentially. The shortest path to being able to dump the internal state of both programs, is what I should have realized at the beginning: port the parser, first, stopping only to fill in code and types which the parser needs.

The Go port of Lemon had the luxury of ignoring anything in the original relating to memory. Garbage collection is great: bytes show up on demand, and disappear at some point in the indefinite future, once you’ve stopped paying attention to them.

Zig doesn’t work that way, and C doesn’t either. Both languages require manual memory management, and without any sort of RAII magic, whether of the C++ or Rust1 variety. I was to discover, however, that the approach to allocation differs between C and Zig, in ways which appear minor until one sets out to port from the former to the latter, at which point they become significant and consequential.

Not to the point where it’s at all difficult to translate from one to another: declare the essential C functions for memory management as calls to a global allocator, leave room to pass a type as the first argument, and you’d be most of the way (but not all the way) there. But an idiomatic translation requires a bit more than this.

Before we go any further, I must share with you a bit of code from a version of Lemon. I started the port with an older version of Lemon which doesn’t have this code; the commit history shows it to be of fairly recent vintage.

Behold! Memory-safe C, via the simple expedient of brute force, courtesy of D. Richard Hipp.

/*
** Compilers are getting increasingly pedantic about type conversions
** as C evolves ever closer to Ada....  To work around the latest problems
** we have to define the following variant of strlen().
*/
#define lemonStrlen(X)   ((int)strlen(X))

/*
** Header on the linked list of memory allocations.
*/
typedef struct MemChunk MemChunk;
struct MemChunk {
  MemChunk *pNext;
  size_t sz;
  /* Actually memory follows */
};

/*
** Global linked list of all memory allocations.
*/
static MemChunk *memChunkList = 0;

/*
** Wrappers around malloc(), calloc(), realloc() and free().
**
** All memory allocations are kept on a doubly-linked list.  The
** lemon_free_all() function can be called prior to exit to clean
** up any memory leaks.
**
** This is not necessary.  But compilers and getting increasingly
** fussy about memory leaks, even in command-line programs like Lemon
** where they do not matter.  So this code is provided to hush the
** warnings.
*/
static void *lemon_malloc(size_t nByte){
  MemChunk *p;
  /* if( nByte<0 ) return 0; -- size_t is unsigned */
  p = malloc( nByte + sizeof(MemChunk) );
  if( p==0 ){
    fprintf(stderr, "Out of memory.  Failed to allocate %lld bytes.\n",
            (long long int)nByte);
    exit(1);
  }
  p->pNext = memChunkList;
  p->sz = nByte;
  memChunkList = p;
  return (void*)&p[1];
}
static void *lemon_calloc(size_t nElem, size_t sz){
  void *p = lemon_malloc(nElem*sz);
  memset(p, 0, nElem*sz);
  return p;
}
static void lemon_free(void *pOld){
  if( pOld ){
    MemChunk *p = (MemChunk*)pOld;
    p--;
    memset(pOld, 0, p->sz);
  }
}
static void *lemon_realloc(void *pOld, size_t nNew){
  void *pNew;
  MemChunk *p;
  if( pOld==0 ) return lemon_malloc(nNew);
  p = (MemChunk*)pOld;
  p--;
  if( p->sz>=nNew ) return pOld;
  pNew = lemon_malloc( nNew );
  memcpy(pNew, pOld, p->sz);
  return pNew;
}

/* Free all outstanding memory allocations.
** Do this right before exiting.
*/
static void lemon_free_all(void){
  while( memChunkList ){
    MemChunk *pNext = memChunkList->pNext;
    free( memChunkList );
    memChunkList = pNext;
  }
}

There is but little one can do with this extract other than to admire it. The blend of wit and, dare I say it, peevishness, in the comments! It’s really quite exquisite.

It must be said: Lemon has leaks. Some of them documented2! A comment from our primary source:

    /* free(x2a->tbl); // This program was originally written for 16-bit
    ** machines.  Don't worry about freeing this trivial amount of memory
    ** on modern platforms.  Just leak it. */

The effect this had on me was much like the feeling one would experience catching one’s pastor sneaking a cigarette behind the barn. This is drh! Easily the author of C code most acclaimed for the robustness and correctness of his work! If SQLite were to leak a single byte, drh would move Heaven and Earth to plug it. Let there be no doubt.

x2a->tbl is… four pointers. It’s the data-storing component of an associative array, discarded every time the array doubles in size. This one (there are several) happens to hold symbols, which are deduplicated and interned, that role played by this associative array.

It is allocated with 128 slots. Pikchr, my reference grammar, has 141 symbols, meaning these four words leak… once. These table heads are not the only leaks in the program, but they’re all this sort of inconsequential frippery.

This, Dear Reader, is called engineering. Lemon has a garbage collector, known as the operating system, and it will quite faithfully reap every page lent out to the process when it comes time to exit. Since Lemon is a straight-line program, one which opens a file or two, does some work, emits a file or a few, then quits, no possible harm can come of dropping a byte or two on the floor along the way. Most allocators can’t even count as low as 32 bytes: give it back if you want, they won’t know what to do with it, at least until one of its neighbors is also returned.

SQLite can make no such assumptions. It is commonly embedded in freestanding systems, with nothing we would recognize as an operating system, and no leeway at all to leak memory: we may be certain that there are cases where it remains resident in memory indefinitely, for as long as the device is powered on. I am entirely sure that both memory leaks, and fragmentation, are carefully considered and accounted for in its design.

At the time of writing, my version of Lemon also leaks3. Not in this particular spot: I swapped the associative array implementations for ArrayListHashMaps, which do much the same thing.

I was intrigued to notice that the Go port faithfully implemented every line of the associative arrays. That gave me pause, in fact. Why? Go has maps, why not use them? My conclusion was that Golemon is a means, to a means, to an end: the goal was Pikchr diagrams in a blog engine, not a Go-emitting port of Lemon, or even a Go port of Pikchr. Every deviation, and there are a few in that port, every deviation comes at a cost, by adding to the programmer’s global state. When the goal is a port which functions well enough to compile one grammar, for one other program, which needs to work just well enough to produce some diagrams, the quickest route there involves rigorously minimizing that global state.

This was an invaluable check on my hubris. My priorities are different: this port is a means to an end, and that end involves transforming and maintaining the initial port, so points expended on translating to an idiomatic form were to be paid up front or later, but would be paid regardless.

In that very first push, before pivoting to the parser, I had taken a few steps down a different road. One of the structs I translated is called action, so Action in Zig-ese, and there are various functions called Action_new, Action_sort, and so on. I wrote these so they were Action.new, Action.sort, and so on, as is normal in Zig. I resolved not to do any more of that, not until the port was feature-complete. If every single Symbol pointer is called sp, well, that’s what I’m going to call it. Refactoring a statically typed language with the assistance of a language server is short work, remembering what I decided to call this or that and translating back and forth while debugging? That, is a recipe for pain.

The terseness has actually rubbed off on me somewhat. But that’s a story for later.

But I still haven’t answered the question: other than trivial allocations vanishing into the Abyss on occasion, where does Lemon store the bytes? The answer is, to a large degree, within a few interning pools, implemented as associative arrays, and several freelists, which create a hundred or so data allocations at a time, putting them on a freelist, then doling them out on demand. Any returned data structure is pushed onto the head of the freelist, then immediately reused the next time one is needed. All of these data structures are nodes in one form of linked list or another, so they have a field free to reuse in constructing the list, making this a fine approach, sparing of allocations, flexible, and making life easy if one should happen to want to go ahead and free them once things are done cooking.

In Zig, we have a type of allocator which does exactly this, a MemoryPool. A bit more general, in fact, since it doesn’t need (and won’t make use of) a typed linked list field on the structure it allocates. It simply casts them to a constructed type of the same size, which has an appropriately-aligned pointer field to the next item on the freelist. It does preheats, which is identical to the preloading done in the original. It struck me as a satisfactory replacement, and indeed, it was.

For the interning tables, I made little data structures which hold an ArrayHashMap and an Allocator, with an intern method which does the needful. It’s actually an ArrayHashMapUnmanaged, but the standard library is moving away from having the managed variant be the default, and perhaps even existing, so to spare myself maintenance down the road, I just concocted them manually.

All of these were made to serve the same interface as in the original, through the same function names, with as similar a signature as I could easily manage. The chief difference is that, since most of the calls can allocate, they can also return an error. Emulating the C practice of returning a null pointer would be painful and pointless, so I traded try statements for null checks and carried on.

As for where to put them, well. Let’s start with another comment:

/* ... (LEMON uses no global variables and makes little use of
** static variables.  Fields in the following structure can be thought
** of as begin[sic] global variables in the program.) */

Once again, this is eminently defensible. The judicious use of global4 state in an application, if and when it makes things easier to reason about, was once uncontroversial, and should be so today.

So I made my own little use of static variables, but making them all threadlocal. Not because this distinction is very important, this being a single-threaded program without logic to benefit from changing that, but because it’s greppable. The main reason not to use global-ish state is that it inhibits use of the program as a library, and I see no reason for the successor program to lemon.zig to be structured that way. But it can certainly wait until the time is ripe.

I did not, however, go so far as to declare a global allocator. It’s easy enough to load up the threadlocals at the start of main, toss in some defer statements, and carry on to the end.

The remainder of the memory is strings, almost entirely if not entirely. So I put an allocator on Lemon, the ‘following structure’ quoted above, and called that good.

I was working these things out as I plowed ahead with the parser, and none of this proved difficult or came around to bite me. My advice to anyone porting a C program to Zig is to just emulate how it does it as best you can, tacking allocators on to structs when it makes sense to do so, rely on std.process.cleanExit() and the DebugAllocator to help out with any problems you run into, and:

Don’t be afraid to switch to process.exit(0) for awhile, if the reports are getting in the way of getting the job done. Leaks aren’t the end of the world.

So memory was not a problem: Zig and C are amicable on that front, Zig has excellent features built right in to help get it right, and barring the last footnote5 it’s perfectly straightforward to get from the one to the other via the shortest path.

In the next chapter, the penultimate, I’ll cover the hard part. Zig differs sharply from C in a few consequential matters, chief among them: nullability.

  1. It must be said, or at least, I’m going to say it: porting a program like this to Rust would be hideously painful. There’s a reason the slogan is “rewrite it in Rust”, because one does not simply port a C program to the Rust language. Can someone write a nice LALR(1) generator in Rust, with a similar or identical interface? Yes, of course, and the Lemon codebase would be a useful reference in doing so. Can you say “ope, looks like we pass a mutable pointer into this function”, and just, y’know, do it? No. No Sir, you cannot.

    1

  2. Note that this was removed from the later code, which is fair, after all, it’s no longer true. Nothing leaks.

    2

  3. At the time of publication, it does not. Why that is will be made clear as the story progresses, but briefly: Zig makes it easy enough to be worthwhile to do.

    3

  4. The distinction made between ‘static’ and ‘global’ state in the comment is of actual importance in C, although perhaps not in the case of a main application which uses only the standard library. But it would have mattered at the beginning, when Lemon was linked together from parts. In any case, Zig has only container-local state, equivalent to static in most ways, and completely so unless marked pub. So when we say global, that’s what we mean.

    4

  5. In C, it’s malloc‘s job to keep track of the size of an allocation. So free just takes a pointer, and however much memory was handed out with that pointer is however much gets freed. In Zig, it’s userspace’s job to keep track of that, and tell the allocator how much space you’re freeing. This can matter, because it’s not uncommon for C code to allocate a chunk all at once, and turn it into a collection of data structures, often linked together in various exotic ways. My advice: just do the allocations separately, accept that there might hypothetically be an efficiency cost, and maybe make a note to swing back and go to all that extra effort. Zig can do this for you, it just doesn’t want to. I think the tradeoff matrix here falls in Zig’s favor, but not in an uncomplicated way, and elaborating that would be at least a short post of its own, not just a long footnote.

    5

Part Four: The Angle of Attack
Part Six: Nulls and Slices and Signs