Notes on compiling protocol codecs

When we wrote RELOAD-04, we specified it using a protocol description language based on the one used to specify TLS (RFC 4346). The intention was that when it came time to do an implementation, we would be able to write a compiler that would take the spec as input and automatically emit encoders and decoders. I chose the TLS language because I knew it well and because I already had a YACC grammar on hand from when I'd tried the same thing for TLS (though that didn't work out that well).

Based on that experience, this time I wrote the PDU descriptions with compilation in mind, so I was fairly confident I could make that approach work. Moreover, I used the first pass of the compiler as the basis for s2b, so when I cleaned up the PDUs for RELOAD-04, I had a pretty good idea of what would compile and what wouldn't, and it made sense to do a compiler (s2c) for the RELOAD coding party. Even with that background, I quickly discovered mistakes, some in my choice of language constructs to use, and some in my compiler design.


Single Base Class
Because I was compiling into C++, I decided to use a semi-object-oriented methodology. There's a single abstract base class called PDU that defines pure virtual encoder and decoder functions:

class PDU {
   public:
      virtual void encode(std::ostream& out) const=0;
      virtual void decode(std::istream& in)=0;
};
(note: I'm eliding a bunch of stuff here for simplicity).

The idea here is that as long as you know you have an instance of PDU or any subclass, you can encode it or decode it without worrying about the exact subclass you have an instance of. This actually worked out pretty well, though it's not clear how much of an advantage it really was, since I could have machine generated all the functions with the same signature anyway.
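As a rough illustration of what that buys you, here is a minimal sketch of a leaf class plugged into a cut-down PDU base. The class U16PDU, the two helper functions, and the choice of a big-endian 16-bit integer are all assumptions made up for this example; none of it is actual s2c output.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Cut-down stand-in for the PDU base class described above.
class PDU {
public:
    virtual ~PDU() = default;
    virtual void encode(std::ostream& out) const = 0;
    virtual void decode(std::istream& in) = 0;
};

// Hypothetical leaf class: a 16-bit integer in network byte order.
class U16PDU : public PDU {
public:
    std::uint16_t mValue = 0;

    void encode(std::ostream& out) const override {
        out.put(static_cast<char>(mValue >> 8));
        out.put(static_cast<char>(mValue & 0xff));
    }

    void decode(std::istream& in) override {
        char buf[2];
        if (!in.read(buf, 2)) throw std::runtime_error("short read");
        mValue = static_cast<std::uint16_t>(
            (static_cast<unsigned char>(buf[0]) << 8) |
            static_cast<unsigned char>(buf[1]));
    }
};

// Callers only need the base-class interface, not the exact subclass.
std::string encodeToString(const PDU& pdu) {
    std::ostringstream out;
    pdu.encode(out);
    return out.str();
}

void decodeFromString(PDU& pdu, const std::string& wire) {
    std::istringstream in(wire);
    pdu.decode(in);
}

// Usage example: round-trip a value through the base-class interface.
void roundTripExample() {
    U16PDU sent;
    sent.mValue = 0x1234;
    const std::string wire = encodeToString(sent);

    U16PDU received;
    decodeFromString(received, wire);
    assert(received.mValue == sent.mValue);
}
```

The helpers at the bottom are the only place where the single base class matters; as noted above, identically named non-virtual functions would serve nearly as well for generated code.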

Representing Variant Structures
Any sensible language needs to support some sort of variant structure. Here's an example from the RELOAD spec:

enum {data (128), ack (129), (255)} FramedMessageType;

struct {            
  FramedMessageType       type;

  select (type) {
    case data:
      uint24              sequence;
      opaque              message<0..2^24-1>;

    case ack:
      uint24              ack_sequence;
      uint32              received;
  };
} FramedMessage;

It seemed natural to use inheritance and polymorphism here. So, in this case I would autogenerate a class called s2c::FramedMessage and then two derived classes s2c::FramedMessage__data and s2c::FramedMessage__ack. So, if you had a structure that includes a FramedMessage, we'd generate something like:

class Container {
  FramedMessage *framedMessage;
};

To encode you'd instantiate a member of the subclass and then assign it to the slot in the container structure, and just call encode(). Decoding is trickier because the decoder needs to figure out what type it's going to instantiate before it instantiates it (you can't just instantiate the base class). So, this means that the decoder needs to read the type before it instantiates the subclass and decodes the rest of the structure.
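A minimal sketch of that decode path, assuming a one-byte type field and big-endian integers. The subclasses are named FramedMessageData and FramedMessageAck here rather than FramedMessage__data and FramedMessage__ack, because identifiers containing a double underscore are reserved in C++. The helpers writeUint and readUint, the static decode() factory, and the use of std::string for the opaque field are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

enum class FramedMessageType : std::uint8_t { data = 128, ack = 129 };

// Big-endian integer helpers (hypothetical, not part of s2c).
void writeUint(std::ostream& out, std::uint32_t v, int nbytes) {
    assert(nbytes >= 1 && nbytes <= 4);
    for (int shift = 8 * (nbytes - 1); shift >= 0; shift -= 8)
        out.put(static_cast<char>((v >> shift) & 0xff));
}

std::uint32_t readUint(std::istream& in, int nbytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < nbytes; ++i) {
        const int c = in.get();
        if (c == std::istream::traits_type::eof())
            throw std::runtime_error("short read");
        v = (v << 8) | static_cast<std::uint32_t>(c);
    }
    return v;
}

// Never instantiated directly; it only gives containers one type to hold.
class FramedMessage {
public:
    virtual ~FramedMessage() = default;
    virtual FramedMessageType type() const = 0;
    void encode(std::ostream& out) const {
        writeUint(out, static_cast<std::uint8_t>(type()), 1);
        encodeBody(out);
    }
    // Reads the discriminator first, then picks the subclass to build.
    static std::unique_ptr<FramedMessage> decode(std::istream& in);
protected:
    virtual void encodeBody(std::ostream& out) const = 0;
    virtual void decodeBody(std::istream& in) = 0;
};

class FramedMessageData : public FramedMessage {
public:
    std::uint32_t mSequence = 0;  // uint24 on the wire
    std::string mMessage;         // opaque<0..2^24-1>: 3-byte length prefix
    FramedMessageType type() const override { return FramedMessageType::data; }
protected:
    void encodeBody(std::ostream& out) const override {
        writeUint(out, mSequence, 3);
        writeUint(out, static_cast<std::uint32_t>(mMessage.size()), 3);
        out.write(mMessage.data(), static_cast<std::streamsize>(mMessage.size()));
    }
    void decodeBody(std::istream& in) override {
        mSequence = readUint(in, 3);
        mMessage.assign(readUint(in, 3), '\0');
        if (!in.read(&mMessage[0], static_cast<std::streamsize>(mMessage.size())))
            throw std::runtime_error("short read");
    }
};

class FramedMessageAck : public FramedMessage {
public:
    std::uint32_t mAckSequence = 0;  // uint24 on the wire
    std::uint32_t mReceived = 0;
    FramedMessageType type() const override { return FramedMessageType::ack; }
protected:
    void encodeBody(std::ostream& out) const override {
        writeUint(out, mAckSequence, 3);
        writeUint(out, mReceived, 4);
    }
    void decodeBody(std::istream& in) override {
        mAckSequence = readUint(in, 3);
        mReceived = readUint(in, 4);
    }
};

std::unique_ptr<FramedMessage> FramedMessage::decode(std::istream& in) {
    std::unique_ptr<FramedMessage> msg;
    switch (static_cast<FramedMessageType>(readUint(in, 1))) {
        case FramedMessageType::data: msg.reset(new FramedMessageData); break;
        case FramedMessageType::ack:  msg.reset(new FramedMessageAck);  break;
        default: throw std::runtime_error("unknown FramedMessageType");
    }
    msg->decodeBody(in);
    return msg;
}
```

The asymmetry is visible in the interface: encode() is an ordinary virtual call on whatever subclass the caller built, while decode() has to be a static factory because no object exists until the type byte has been read.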

Anyway, when I showed the compiler output to other people on the project, their heads exploded. They especially hated the fake superclass, which was never instantiated but was just there to hold the subclass. My next plan was to use a union, which is the C way, but when we tried that, the C++ compiler choked if any of the struct arms contained classes themselves because it didn't know which constructors to call. We settled for wasting a little space and using a series of structs, like so:

typedef enum {
   data = 128,
   ack = 129
} FramedMessageType;

class FramedMessageStruct : public PDU {
   FramedMessageType             mType;

   struct mData_ {
        UInt32                        mSequence;
        resip::Data                   mMessage;
   } mData;
   struct mAck_ {
        UInt32                        mAckSequence;
        UInt32                        mReceived;
   } mAck;
};

To encode, you just fill in the mType value and then the appropriate substruct, and the encoder figures out what to do. The decoder automatically reads the type value off the wire, then fills it into the struct and automatically fills in the right arm. [Note, by the way, the names of the structs, e.g., mAck_. I would ordinarily have not bothered to name the types, as opposed to the fields in the container structure, but we found the Windows compiler choked when they were anonymous.]

The downside of this technique is that it doesn't defend you from programmer error. If you screw up checking the type and then look at the wrong arm, you just get nonsense, which is bad. The C++ technique would have been safer (though you'd still get errors only at runtime as opposed to compile time) but as I said, it made people's heads explode. This is a lot easier to read, if a bit grosser.
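Here is a minimal sketch of what the encoder and decoder for such a struct might look like, together with the wrong-arm hazard. It assumes a one-byte type field and big-endian integers, drops the PDU base class for brevity, and uses std::string as a stand-in for resip::Data; writeUint, readUint and wrongArmExample are hypothetical names, not generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

typedef std::uint32_t UInt32;

enum FramedMessageType { data = 128, ack = 129 };

// Big-endian integer helpers (hypothetical, not part of s2c).
void writeUint(std::ostream& out, UInt32 v, int nbytes) {
    assert(nbytes >= 1 && nbytes <= 4);
    for (int shift = 8 * (nbytes - 1); shift >= 0; shift -= 8)
        out.put(static_cast<char>((v >> shift) & 0xff));
}

UInt32 readUint(std::istream& in, int nbytes) {
    UInt32 v = 0;
    for (int i = 0; i < nbytes; ++i) {
        const int c = in.get();
        if (c == std::istream::traits_type::eof())
            throw std::runtime_error("short read");
        v = (v << 8) | static_cast<UInt32>(c);
    }
    return v;
}

class FramedMessageStruct {
public:
    FramedMessageType mType = data;
    struct mData_ {
        UInt32 mSequence = 0;     // uint24 on the wire
        std::string mMessage;     // std::string stands in for resip::Data
    } mData;
    struct mAck_ {
        UInt32 mAckSequence = 0;  // uint24 on the wire
        UInt32 mReceived = 0;
    } mAck;

    // The caller fills in mType and one arm; the switch picks what to write.
    void encode(std::ostream& out) const {
        writeUint(out, mType, 1);
        switch (mType) {
            case data:
                writeUint(out, mData.mSequence, 3);
                writeUint(out, static_cast<UInt32>(mData.mMessage.size()), 3);
                out.write(mData.mMessage.data(),
                          static_cast<std::streamsize>(mData.mMessage.size()));
                break;
            case ack:
                writeUint(out, mAck.mAckSequence, 3);
                writeUint(out, mAck.mReceived, 4);
                break;
            default:
                throw std::runtime_error("bad mType");
        }
    }

    // Reads the type off the wire, then fills in only the matching arm.
    void decode(std::istream& in) {
        const UInt32 wireType = readUint(in, 1);
        switch (wireType) {
            case data: {
                mData.mSequence = readUint(in, 3);
                std::string buf(readUint(in, 3), '\0');
                if (!in.read(&buf[0], static_cast<std::streamsize>(buf.size())))
                    throw std::runtime_error("short read");
                mData.mMessage = buf;
                break;
            }
            case ack:
                mAck.mAckSequence = readUint(in, 3);
                mAck.mReceived = readUint(in, 4);
                break;
            default:
                throw std::runtime_error("unknown FramedMessageType");
        }
        mType = static_cast<FramedMessageType>(wireType);
    }
};

// Usage example, including the hazard described above: nothing stops a
// caller from reading the arm that was not decoded.
void wrongArmExample() {
    FramedMessageStruct sent;
    sent.mType = ack;
    sent.mAck.mAckSequence = 5;
    sent.mAck.mReceived = 9;
    std::ostringstream out;
    sent.encode(out);

    FramedMessageStruct got;
    std::istringstream in(out.str());
    got.decode(in);
    assert(got.mType == ack && got.mAck.mReceived == 9);
    assert(got.mData.mSequence == 0);  // wrong arm: compiles, yields a default
}
```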

Variant Selection
A related problem is matching up the type values to which form of the variant to decode/encode. Encoding is actually the easier version, since you can create a synthetic field which indicates which struct arm you're using and make people fill in both that and the type field to be encoded on the wire. The compiler knows which one to choose.

Decoding is more difficult because you need to read the type field off the wire and then decode based on it. This is easy in XDR, because all variants (called unions) come with a discriminator which is encoded on the wire right before the structure itself. But the TLS language lets you select on any field you want, regardless of whether they are encoded on the wire immediately before. In fact, they don't need to be encoded at all; you can switch on conceptual fields like the protocol version.

When I was writing the PDU structures, I hadn't quite decided how I was going to handle this in the compiler. I had some rough idea that I would have simple cases where I could infer from context and then complicated cases where the decoder or encoder would execute some callback which would serve as an oracle to tell it what to do. When I actually went to write the part of the compiler that would generate the code for these structures, this started to seem like a real bad idea, especially since I hadn't been real careful to make sure that the discriminators were close by in the protocol to the things they discriminated. Here's a good example:

enum { reserved(0), single_value(1), array(2),
       dictionary(3), (255)} DataModel;

select (DataModel) {
  case single_value:
    DataValue             single_value_entry;

  case array:
    ArrayEntry            array_entry;

  case dictionary:
    DictionaryEntry       dictionary_entry;

  /* This structure may be extended */
} StoredDataValue;

Clearly, this discriminates on the DataModel, but that just tells us the type, not the field we're supposed to examine. That's carried in some other part of the protocol message, and in different places depending on which message we're dealing with. That's clearly a mistake. I fixed this by taking every place in the protocol where I had a naked select (aka variant) and turning it into a struct with a select inside. The rule then becomes that the discriminator must be a field in the enclosing struct. For instance, the above structure becomes:

struct {
  DataModel            model;

  select (model) {
    case single_value:
      DataValue             single_value_entry;

    case array:
      ArrayEntry            array_entry;

    case dictionary:
      DictionaryEntry       dictionary_entry;

    /* This structure may be extended */
  } ;
} StoredDataValue;

Arguably, this is a little less elegant because we actually already know from context what the type of the variant is, but on the other hand it makes the structure more self-contained and it makes the compiler a lot easier.

Length Fields
Another problem in automatic generation is length fields. In a number of cases, we'd like to have a field that describes the length of the entire structure (or, actually the rest of the structure).

struct {
  AddressType             type;
  uint8                   length;
  select (type) {
    case ipv4_address:
       IPv4AddrPort       v4addr_port;
    case ipv6_address:
       IPv6AddrPort       v6addr_port;

    /* This structure can be extended */
  } ;
} IpAddressAndPort;

For instance, in the above structure, you could argue you don't need a length field because we know how long each of the variants is. But you might want to write a decoder which could handle new variants without choking, perhaps by ignoring them. This only works if you know how long they are, which requires a length field. But the compiler doesn't know that the length field means that. As far as it knows, this is just a random integer in the protocol. We could force the programmer to set the length field, but that's another opportunity for them to screw up and it's not really practical if the rest of the structure is of variable length, since the programmer would have to know too much about the encoding to know how to set it.

It's far better to have the compiler generate the length handling code automatically. One natural approach would be to have a heuristic where any variable named length would trigger this behavior, but heuristics like this scare me. I finally settled for a compromise using a decoration; if you add .auto_len to the end of the identifier of any variable, the compiler decides it's a length field and does two things:

  • When it's encoding, it writes a length field which has as its value the length of the remainder of the structure after the length.
  • When it's decoding, it enforces the same constraint on the rest of the structure by making an input substream of the appropriate length.

This isn't perfect, but it's pretty good because otherwise . isn't permitted in identifiers, so it's pretty hard to trigger this behavior by accident. The more serious problem is that this decoration doesn't make sense as part of the protocol description in the document, so now you've introduced a variation between the protocol standard and what you feed the compiler, which gets in the way of just compiling the standard. Eventually, I'll probably deal with this by decorating the protocol description in the XML source, and then have the program that I use to extract the protocol elements from the document also strip out the decoration in the version that gets published. Of course, this only works because I'm writing both the spec and the implementation; anyone else will need to figure things out for themselves, which is kind of a bummer, but it's not clear how to solve that without making the base language substantially richer.
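The two .auto_len behaviors can be sketched roughly as follows for IpAddressAndPort. This is an illustration under several assumptions, not the code s2c emits: the code point kIpv4Address = 1, the layout of the IPv4 arm (a uint32 address followed by a uint16 port), the mUnknownBody field, and the helper names are all invented here, and only the IPv4 arm is handled to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Big-endian integer helpers (hypothetical, not part of s2c).
void writeUint(std::ostream& out, std::uint32_t v, int nbytes) {
    assert(nbytes >= 1 && nbytes <= 4);
    for (int shift = 8 * (nbytes - 1); shift >= 0; shift -= 8)
        out.put(static_cast<char>((v >> shift) & 0xff));
}

std::uint32_t readUint(std::istream& in, int nbytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < nbytes; ++i) {
        const int c = in.get();
        if (c == std::istream::traits_type::eof())
            throw std::runtime_error("short read");
        v = (v << 8) | static_cast<std::uint32_t>(c);
    }
    return v;
}

std::string readBytes(std::istream& in, std::size_t n) {
    std::string bytes(n, '\0');
    if (!in.read(&bytes[0], static_cast<std::streamsize>(n)))
        throw std::runtime_error("short read");
    return bytes;
}

const std::uint8_t kIpv4Address = 1;  // assumed code point

struct IpAddressAndPort {
    std::uint8_t mType = kIpv4Address;
    std::uint32_t mV4Addr = 0;   // assumed layout of the IPv4 arm
    std::uint16_t mV4Port = 0;
    std::string mUnknownBody;    // raw bytes of a variant we don't understand

    void encode(std::ostream& out) const {
        // Encode the remainder first so the length can be filled in for us.
        std::ostringstream body;
        if (mType == kIpv4Address) {
            writeUint(body, mV4Addr, 4);
            writeUint(body, mV4Port, 2);
        } else {
            body << mUnknownBody;
        }
        const std::string bytes = body.str();
        if (bytes.size() > 255)
            throw std::length_error("remainder does not fit a uint8 length");
        writeUint(out, mType, 1);
        writeUint(out, static_cast<std::uint32_t>(bytes.size()), 1);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }

    void decode(std::istream& in) {
        mType = static_cast<std::uint8_t>(readUint(in, 1));
        // The substream holds exactly the bytes the length field covers.
        const std::string bytes = readBytes(in, readUint(in, 1));
        std::istringstream sub(bytes);
        if (mType == kIpv4Address) {
            mV4Addr = readUint(sub, 4);
            mV4Port = static_cast<std::uint16_t>(readUint(sub, 2));
            if (sub.peek() != std::istringstream::traits_type::eof())
                throw std::runtime_error("trailing bytes inside length");
        } else {
            mUnknownBody = bytes;  // unknown variant: keep it and move on
        }
    }
};
```

Because the decoder consumes exactly the number of bytes the length field announces, an unknown variant leaves the outer stream positioned at the next field, which is the extensibility property the length field was meant to provide.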

Type Mappings for Variable-Length Arrays
The final problem I want to talk about is how to handle variable length arrays. The protocol description language allows you to specify these like so:

Destination via_list<0..2^16-1>;

The natural mapping for this into C++ is to an STL vector, like so:

std::vector<Destination>  mDestinationList;

This works fine when we're working with a list of structures, but less well with an opaque string of bytes, since that gets turned into a std::vector<unsigned char>, which really isn't the best representation for this. I ended up doing some special-case code which converts it into a Data, which is reSIProcate's "opaque blob" class. There's sort of a tradeoff here, since this is a lot more convenient for the programmer, but the more you special-case this kind of stuff, the more it cuts into the value proposition of using a compiler, since you're not working on an infinite number of protocols and so there's an overhead issue. In this case, though, variable-length opaque vectors are all over the protocol, so this makes sense as an optimization.
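To make the two mappings concrete, here is a minimal sketch of a generic vector codec next to the opaque special case. In the TLS description language a variable-length vector is preceded by its length in bytes, using as many bytes as the declared maximum requires (two for <0..2^16-1>). Everything else is assumed for illustration: the one-field Destination, the helper names, and std::string standing in for reSIProcate's Data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Big-endian integer helpers (hypothetical, not part of s2c).
void writeUint(std::ostream& out, std::uint32_t v, int nbytes) {
    assert(nbytes >= 1 && nbytes <= 4);
    for (int shift = 8 * (nbytes - 1); shift >= 0; shift -= 8)
        out.put(static_cast<char>((v >> shift) & 0xff));
}

std::uint32_t readUint(std::istream& in, int nbytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < nbytes; ++i) {
        const int c = in.get();
        if (c == std::istream::traits_type::eof())
            throw std::runtime_error("short read");
        v = (v << 8) | static_cast<std::uint32_t>(c);
    }
    return v;
}

std::string readBytes(std::istream& in, std::size_t n) {
    std::string bytes(n, '\0');
    if (!in.read(&bytes[0], static_cast<std::streamsize>(n)))
        throw std::runtime_error("short read");
    return bytes;
}

// Writes a byte-length prefix of lenBytes bytes, then the bytes themselves.
void writeLengthPrefixed(std::ostream& out, const std::string& bytes, int lenBytes) {
    if (lenBytes < 4 && bytes.size() >= (std::size_t(1) << (8 * lenBytes)))
        throw std::length_error("vector too long for its length prefix");
    writeUint(out, static_cast<std::uint32_t>(bytes.size()), lenBytes);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Hypothetical one-field element type standing in for Destination.
struct Destination {
    std::uint16_t mId = 0;
    void encode(std::ostream& out) const { writeUint(out, mId, 2); }
    void decode(std::istream& in) { mId = static_cast<std::uint16_t>(readUint(in, 2)); }
};

// General case: T foo<0..N> maps to std::vector<T>.
template <typename T>
void encodeVector(std::ostream& out, const std::vector<T>& items, int lenBytes) {
    std::ostringstream body;
    for (const T& item : items) item.encode(body);
    writeLengthPrefixed(out, body.str(), lenBytes);
}

template <typename T>
std::vector<T> decodeVector(std::istream& in, int lenBytes) {
    std::istringstream sub(readBytes(in, readUint(in, lenBytes)));
    std::vector<T> items;
    while (sub.peek() != std::istringstream::traits_type::eof()) {
        T item;
        item.decode(sub);
        items.push_back(item);
    }
    return items;
}

// Special case: opaque foo<0..N> maps to a single blob, not a vector of bytes.
void encodeOpaque(std::ostream& out, const std::string& blob, int lenBytes) {
    writeLengthPrefixed(out, blob, lenBytes);
}

std::string decodeOpaque(std::istream& in, int lenBytes) {
    return readBytes(in, readUint(in, lenBytes));
}
```

The special case costs only two extra functions here, but in a compiler it also means an extra branch in the type mapper, which is the overhead tradeoff described above.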


Have you ever considered ASN.1 notation along with some binary encoding (e.g. DER) for APDUs? I think it fits almost all needs designers have, and there are already encoders/decoders available. The ASN.1 standard is huge and only a small subset of it is actually really useful though.

Sure, I know all about ASN.1--that's where I got my first experience with this kind of compiler, and the name 's2c' is an homage to Brian Korver's 'a2c' compiler. The problem is that ASN.1 has such a bad reputation inside the IETF that it's not politically practical to design a protocol in ASN.1 any more outside of the few groups that already use it.

I told you ... you should have used XDR. ;-)
