One of the features of the new serialization format (codenamed MACAW) is a description of the “shape” of the incoming data. It tells the parser whether there are mechanically relevant aspects of the data that can be deconstructed up front, and if so what to do with them. This allows the client to make sense of the data it is consuming, and also to “read ahead” and request more data before processing has even started.
It isn’t entirely clear initially what should be included here. The byte layout of a GIF? That doesn’t meet our test of being “mechanically relevant”, meaning something that is important at the protocol level. The layout of an Ed25519 signature? That’s mechanically relevant, but very specific: the parser shouldn’t need to know anything about signature algorithms to do its job of taking bytes off the wire and packaging up their shapes.
How about a list of hashes? That’s a shape that is both mechanically relevant and also at the right level of abstraction. It allows the first stage of interpretation to proceed without needing to know the gnarly details of byte configurations for arbitrary blobs, while still giving it enough structure to handle making requests for more information and linking together data as it is received.
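To make that concrete, here is a minimal sketch of a first-stage parser that knows only two shapes: an opaque blob and a list of hashes. Everything specific in it is an assumption made for illustration and not part of MACAW as described here: the one-byte tag, the tag values, the 32-byte hash length, and the names `parse_shape` and `read_ahead`.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

HASH_LEN = 32          # assumed digest size; no hash function is specified
TAG_OPAQUE = 0x00      # hypothetical tag: bytes the parser does not look inside
TAG_HASH_LIST = 0x01   # hypothetical tag: body is a sequence of fixed-size hashes


@dataclass(frozen=True)
class Opaque:
    payload: bytes


@dataclass(frozen=True)
class HashList:
    hashes: Tuple[bytes, ...]


Shape = Union[Opaque, HashList]


def parse_shape(data: bytes) -> Shape:
    """Split off the shape tag and package the body accordingly."""
    if not data:
        raise ValueError("empty input")
    tag, body = data[0], data[1:]
    if tag == TAG_OPAQUE:
        # A GIF or a signature would land here: no protocol-level structure.
        return Opaque(body)
    if tag == TAG_HASH_LIST:
        if len(body) % HASH_LEN != 0:
            raise ValueError("hash list body is not a whole number of hashes")
        chunks = (body[i:i + HASH_LEN] for i in range(0, len(body), HASH_LEN))
        return HashList(tuple(chunks))
    raise ValueError(f"unknown shape tag: {tag:#04x}")


def read_ahead(shape: Shape) -> List[bytes]:
    """Hashes the client could request before any further interpretation."""
    return list(shape.hashes) if isinstance(shape, HashList) else []
```

The parser never inspects what the hashes point to. It only learns that they are hashes, which is enough to start fetching the data behind them and to link pieces together as they arrive.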
What about specific lists of hashes, perhaps ones that are mechanically relevant? Is it enough to have a single generic list of hashes, or does that need to be special-cased to make specific shapes in cases where the hashes in the list have mechanically relevant interpretations? The right answer seems to include a certain amount of polymorphism, where the generic list of hashes represents a fully parametrically polymorphic list type (including heterogeneous lists, no less), and the specializations are themselves parametrized over equivalence classes of shapes. This yields a good blend of power and flexibility while fitting into a convenient byte-sized wrapper, and makes the macaws happy.
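One way to picture the generic-plus-specialized arrangement is a single descriptor byte whose low bits name the class of shape the list elements belong to, with one value reserved for "anything goes". This is only a sketch under loud assumptions: the bit layout, the `ShapeClass` members, and the `HashListOf` type are all hypothetical, and the real encoding may look nothing like this.

```python
from dataclasses import dataclass
from enum import IntEnum


class ShapeClass(IntEnum):
    """Hypothetical equivalence classes of element shapes."""
    ANY = 0        # generic list: elements may have any shape, mixed freely
    OPAQUE = 1     # every hash points at an uninterpreted blob
    HASH_LIST = 2  # every hash points at another list of hashes


LIST_FLAG = 0x80   # assumed: high bit marks "this is a list of hashes"


@dataclass(frozen=True)
class HashListOf:
    element: ShapeClass = ShapeClass.ANY

    def to_byte(self) -> int:
        """Pack the descriptor into one byte."""
        return LIST_FLAG | int(self.element)

    @staticmethod
    def from_byte(b: int) -> "HashListOf":
        """Unpack a descriptor byte; unknown classes raise ValueError."""
        if not 0 <= b <= 0xFF:
            raise ValueError("descriptor must fit in one byte")
        if not b & LIST_FLAG:
            raise ValueError("not a list-of-hashes descriptor")
        return HashListOf(ShapeClass(b & 0x7F))

    def accepts(self, other: "HashListOf") -> bool:
        """True if a list described by `other` fits where `self` is expected."""
        return self.element is ShapeClass.ANY or self.element == other.element
```

In this sketch the generic list accepts every specialization, while a specialized list accepts only lists of its own class, which is the sense in which the specializations refine the generic type.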
— Dann