We present Infra, a new baseline medium for representing data. With Infra, arbitrarily complex structured data can be encoded, viewed, edited, and processed, all while remaining in an efficient non-textual form. It is suitable for the full range of information modalities, from free-form input to compact schema-conforming structures. With its own equivalent of a text editor and text-field widget, Infra is designed to target the domain currently dominated by flat character strings while simultaneously enabling the expression of sub-structure, inter-reference, dynamic dependencies, abstraction, computation, and context (metadata).
Existing metaformats fit neatly into two categories: they are either textual for human readability (such as XML and JSON) or binary for compact serialization (such as Thrift and Protocol Buffers). Infra, in contrast, unifies these two paradigms. To retain the desirable properties of binary formats, Infra has no textual representation, and yet it is designed to be easily read and authored by end users.
We show how the organization Infra brings to data makes a new non-textual programming paradigm viable. Programs that modify data can now be embedded into the data itself. Furthermore, these programs can often be authored by demonstration. We argue that Infra can be used to improve existing software projects and that bringing direct authoring and human readability to a binary data paradigm could have rippling ramifications across the computing landscape.