Spaten, File Format Specification


Spaten is a modern geo data format that aims to resolve issues that arise from using legacy serialization methods and simplify workflows.

Goals

The goals of Spaten are:

Spaten aims to be flexible and future-proof. It incorporates versioning and feature flags, so the underlying structure can be replaced in the future without breaking backwards compatibility. Note that this does not ensure forward compatibility, because older decoders might not be able to unmarshal newer messages. Nonetheless, encoders and decoders should be able to determine compatibility easily, and changes to the base structure will be made conservatively.

The block structure enables parallelized serialization and deserialization. This removes the need for meta files, such as those that accompany Shapefiles.

File Structure

File Structure Graphic

File Header

The file header starts with a 4-byte cookie, which makes it easy to identify a Spaten file even without a proper file extension. The next 4 bytes define the file format version. It is currently 0 and is not planned to change any time soon.
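A decoder might validate the header along the following lines. This is only a minimal sketch: the cookie value `SPAT` and the little-endian byte order are assumptions made for illustration, since this section states neither, and `read_file_header` is a hypothetical helper name.

```python
import io
import struct

# Assumptions for illustration only: neither the cookie bytes nor the
# byte order are stated in this section of the specification.
ASSUMED_COOKIE = b"SPAT"
SUPPORTED_VERSION = 0


def read_file_header(stream):
    """Read the 8-byte file header and return the file format version."""
    header = stream.read(8)
    if len(header) != 8:
        raise ValueError("file too short for a Spaten header")
    cookie = header[:4]
    (version,) = struct.unpack("<I", header[4:])  # assumed little-endian
    if cookie != ASSUMED_COOKIE:
        raise ValueError("cookie mismatch, probably not a Spaten file")
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported file format version {version}")
    return version


if __name__ == "__main__":
    sample = io.BytesIO(ASSUMED_COOKIE + struct.pack("<I", 0))
    print(read_file_header(sample))
```

Rejecting unknown versions up front is one way for a decoder to determine compatibility before it touches any block.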

Block

After the file header, there can be an arbitrary number of blocks. There should be at least one block.

Block Header

Body Length

The block starts with a 4-byte body length field, which gives the length of the body in bytes.

Flags

The next 2 bytes are reserved for flags. No flag values are specified yet.

Compression

The following byte is reserved for compression information. Currently there are no compression methods specified.

Message Type

The next byte defines the body serialization. The allowed values are:

Integer Value    Message Type
0                v0, based on Protocol Buffers
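Taken together, the four fields above form an 8-byte block header. The following sketch shows how it could be unpacked. Little-endian byte order is an assumption, and `BlockHeader` and `parse_block_header` are hypothetical names, not part of the specification.

```python
import struct
from typing import NamedTuple

# "<" = assumed little-endian, no padding.
# I = 4-byte body length, H = 2-byte flags,
# B = 1-byte compression, B = 1-byte message type.
BLOCK_HEADER = struct.Struct("<IHBB")

MESSAGE_TYPE_V0 = 0  # the only message type listed in the table above


class BlockHeader(NamedTuple):
    body_length: int
    flags: int
    compression: int
    message_type: int


def parse_block_header(data: bytes) -> BlockHeader:
    """Unpack the fixed-size block header from the start of `data`."""
    if len(data) < BLOCK_HEADER.size:
        raise ValueError("truncated block header")
    header = BlockHeader(*BLOCK_HEADER.unpack_from(data))
    if header.message_type != MESSAGE_TYPE_V0:
        raise ValueError(f"unknown message type {header.message_type}")
    return header


if __name__ == "__main__":
    raw = BLOCK_HEADER.pack(12, 0, 0, MESSAGE_TYPE_V0)
    print(parse_block_header(raw))
```

The sketch leaves the flags and compression values uninterpreted, because no values are specified for them yet.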

Block Body

The body is a blob whose internal encoding depends on the Message Type value.
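Because each block header carries the length of its body, a reader can walk a file block by block without decoding the bodies. The sketch below illustrates this under stated assumptions: little-endian byte order, a stream already positioned just after the 8-byte file header, and a hypothetical function name `iter_blocks`.

```python
import io
import struct

# Assumed little-endian layout: body length, flags, compression, message type.
BLOCK_HEADER = struct.Struct("<IHBB")


def iter_blocks(stream):
    """Yield (message_type, body) for every block left in `stream`."""
    while True:
        raw = stream.read(BLOCK_HEADER.size)
        if not raw:
            return  # clean end of file
        if len(raw) < BLOCK_HEADER.size:
            raise ValueError("truncated block header")
        body_length, _flags, _compression, message_type = BLOCK_HEADER.unpack(raw)
        body = stream.read(body_length)
        if len(body) < body_length:
            raise ValueError("truncated block body")
        yield message_type, body


if __name__ == "__main__":
    data = BLOCK_HEADER.pack(3, 0, 0, 0) + b"abc"
    for message_type, body in iter_blocks(io.BytesIO(data)):
        print(message_type, len(body))
```

Since the bodies come back as opaque blobs, they could be handed to separate workers for decoding, which is one way to use the parallelism that the block structure allows.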

Body Serializations

v0

v0 is based on the established Protocol Buffers serialization format. Protocol Buffers messages have definition files, which describe the fields that can or must be included in a serialized message and the types of those fields. See the Proto Definition File.

The root element is Body, which can contain block-wide metadata in the Tag structure (e.g. the reference system). It also contains an array of Feature.

Every Feature has a geometry type enumeration (Point, Line, Polygon). The geom field is a byte blob, which contains a geometry in the specified serialization format. The only currently supported geometry serialization is WKB, as specified by the OGC.
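To illustrate what the geom blob holds, here is a minimal sketch that decodes a 2D WKB point using only the standard library. It follows the OGC WKB layout: one byte-order byte, a 4-byte geometry type, then two 8-byte doubles. The function name `decode_wkb_point` is hypothetical. Extracting the geom bytes from a Feature message is not shown, because that requires code generated from the Proto Definition File.

```python
import struct

WKB_POINT = 1  # geometry type code for Point in OGC WKB


def decode_wkb_point(geom: bytes):
    """Decode a 2D WKB point and return its (x, y) coordinates."""
    if len(geom) < 21:
        raise ValueError("WKB point needs at least 21 bytes")
    byte_order = geom[0]
    if byte_order == 0:
        endian = ">"  # XDR, big-endian
    elif byte_order == 1:
        endian = "<"  # NDR, little-endian
    else:
        raise ValueError(f"invalid WKB byte order {byte_order}")
    # After the byte-order byte: uint32 geometry type, then x and y doubles.
    geom_type, x, y = struct.unpack_from(endian + "Idd", geom, 1)
    if geom_type != WKB_POINT:
        raise ValueError(f"not a WKB point (geometry type {geom_type})")
    return x, y


if __name__ == "__main__":
    blob = struct.pack("<BIdd", 1, WKB_POINT, 13.4, 52.5)
    print(decode_wkb_point(blob))
```

In practice a decoder would more likely pass the blob to an existing WKB library. The sketch only shows that the geometry is self-describing once the blob has been extracted.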

Changelog

License

The contents of this specification are licensed under the Creative Commons Attribution-ShareAlike 4.0 International license. However, code that results from implementing the spec is not restricted.

References