Binary data: semantics

CAVEAT: This page is still in development and shouldn’t be considered authoritative. (Bug reports welcome!)

This page contains detailed semantics for the binary data spec.

TODO

  • eliminate 64-bit ints (defer to ES7 for value types)
  • primitive types: new throws, call converts
  • array types: new overloaded just like typed arrays spec; call takes a size argument and produces a new array type
  • struct types: new and call behave identically
  • layout is little-endian
  • always expose buffer except for types that contain pointer types, in which case .buffer is not there
  • pointer types
    • ObjectPointer, StringPointer
    • no .buffer
    • these are “opaque” types; infect their container
    • no .byteLength
    • put a note: for dynamic types, can use indirection through an ObjectPointer
  • names StructType and ArrayType should keep their names (not be struct and array) and be used with new
  • try Ref(T) and cursors API for arrays

Notation

This spec uses italicized names for datatypes of ECMAScript values, i.e., values that can be stored in variables and object properties and passed to and returned from functions. E.g.:

  • UInt32 - integer values in the range [0, 2^32)
  • Int32 - integer values in the range [-2^31, 2^31)
  • Any - any ECMAScript value

This spec uses bold names for datatypes of spec-internal data structures. E.g.:

  • string - internal strings, such as the value of the [[Class]] property
  • reference[t] - a reference into the program store (aka heap) to an internal value of type t
  • reference[Object] - a reference into the program store (aka heap) to an ECMAScript value of type Object

Data blocks

This spec introduces a new, spec-internal block datatype, intuitively representing a contiguously allocated block of binary data. Blocks are not ECMAScript values and appear only in the program store (aka heap).

A block is one of:

  • a number-block
  • an array-block[t, n]
  • a struct-block[t1, ..., tn]

A number-block is one of:

  • an unsigned-integer; i.e., one of uint8, uint16, uint32, or uint64
  • a signed-integer; i.e., one of int8, int16, int32, or int64
  • a floating-point; i.e., one of float32 or float64

A uintk is an integer in the range [0, 2^k). An intk is an integer in the range [-2^(k-1), 2^(k-1)). A floatk is a floating-point number representable as a k-bit IEEE 754 value.
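The ranges above can be checked mechanically. The following is a minimal sketch in plain JavaScript, not part of this spec: the helper names isUintK, isIntK, and isFloat32 are hypothetical, and the integer helpers are only meaningful for k up to 32, since larger values are not all exactly representable as ECMAScript numbers.

```javascript
// Hypothetical helpers (not defined by the spec) that test membership in the
// number-block domains described above.

// uintk: integers in [0, 2^k). Intended for k <= 32 in this sketch.
function isUintK(x, k) {
  return Number.isInteger(x) && x >= 0 && x < 2 ** k;
}

// intk: integers in [-2^(k-1), 2^(k-1)).
function isIntK(x, k) {
  return Number.isInteger(x) && x >= -(2 ** (k - 1)) && x < 2 ** (k - 1);
}

// float32: numbers that survive a round trip through 32-bit IEEE 754.
// Math.fround rounds its argument to the nearest float32 value.
function isFloat32(x) {
  return typeof x === "number" && (Number.isNaN(x) || Math.fround(x) === x);
}
```

For example, 255 is a uint8 but 256 is not, and 0.5 is exactly representable as a float32 while 0.1 is not.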

An array-block[t, n] is an ordered sequence of n blocks of homogeneous block type t. Each element of the array is stored at an independently addressable location in the program store, and multiple Data objects may contain references to the element.

A struct-block[t1, ..., tn] is an ordered sequence of n blocks of heterogeneous types t1 to tn, respectively. Each field of the struct is stored at an independently addressable location in the program store, and multiple Data objects may contain references to the field.

The spec also introduces a datatype of Data objects, which are ECMAScript values that encapsulate references to block data in the program store. Every Data object has the following properties:

  • [[Class]] = “Data”
  • [[Value]] : reference[block] – a reference to a block in the program store
  • [[DataType]] : reference[Type] – a reference to a Type object describing this object’s data block

Data types

Every data type supports the following properties:

  • [[Class]] = “DataType” – the constant class name of all Type objects
  • [[DataType]] : string – a string determining the variant of Type, for compatibility between data objects across multiple execution environments (such as multiple windows, module loaders, etc.)
  • [[Convert]](val : Any) → reference[block] – converts an ECMAScript value to a reference to a binary data block
  • [[IsSame]](t : Type) – compare data types for equality
  • bytes : UInt32 – the logical size of blocks of this type, in bytes

Type

There is a built-in Type constructor which serves as a “base class” for types:

function Type()

The “abstract” base constructor for data types, whose prototype is an ancestor of all block type prototypes. Calling the Type constructor always raises an error; it is provided solely for access to its prototype.

Type.[[Prototype]] = Function.prototype

Type.prototype = Data

Convenience method that produces the same result as new ArrayType(this, n).

Data objects

There is a built-in Data constructor which serves as a “base class” for data objects:

function Data()

The “abstract” base constructor for data objects, whose prototype is an ancestor of all data object prototypes. Calling the Data constructor always raises an error; it is provided solely for access to its prototype.

Data.[[Prototype]] = Function.prototype

Data.array(n : UInt32 | Int64 | UInt64) → ArrayType
    TODO: specify this

Data.prototype.[[Prototype]] = Object.prototype

Data.prototype.constructor = Data

Data.prototype.update(val : Any) → Void
    If !IsObject(this) || this.[[Class]] != “Data”
        Throw TypeError
    Let R ?= this.[[DataType]].[[Convert]](val)
    Copy Dereference(R) into this.[[Value]]
    Return undefined

Numeric data

There are several pre-defined data type objects describing numeric binary data.

var uint8, uint16, uint32, uint64 : Type
var int8, int16, int32, int64 : Type
var float32, float64 : Type

Let t be one of the above built-in data type objects.

t.[[Prototype]] = Function.prototype

t.[[DataType]] = the name of t as a string (e.g., “uint8” for uint8)

t.[[Convert]](val : Any) → reference[block]
    Let R = a reference to a new number-block of the appropriate size for t
    If val = true
        R := 1
    Else If val = false
        R := 0
    Else If val is an ECMAScript number in the domain of t
        R := val
    Else If val is a UInt64 or Int64 and val.[[Value]] is in the domain of t
        R := val.[[Value]]
    Else Throw TypeError
    Return R
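The conversion steps above can be modeled in ordinary JavaScript. This is a hedged sketch rather than the spec's object model: a type is represented by a hypothetical plain descriptor with an inDomain predicate, the function returns the converted number instead of a reference to a number-block, and the UInt64/Int64 branch is omitted.

```javascript
// Hypothetical descriptors standing in for the built-in type objects.
const uint8 = {
  name: "uint8",
  inDomain: (x) => Number.isInteger(x) && x >= 0 && x < 256,
};
const int16 = {
  name: "int16",
  inDomain: (x) => Number.isInteger(x) && x >= -32768 && x < 32768,
};

// Models the numeric [[Convert]] steps: booleans map to 1 and 0, in-domain
// numbers pass through unchanged, and everything else is a TypeError.
function convertNumber(type, val) {
  if (val === true) return 1;
  if (val === false) return 0;
  if (typeof val === "number" && type.inDomain(val)) return val;
  // Strings, objects, and out-of-range numbers are rejected, not coerced.
  throw new TypeError("cannot convert value to " + type.name);
}
```

Note that, unlike the cast operation, this conversion never coerces: convertNumber(uint8, 256) and convertNumber(uint8, "7") both throw in this sketch.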

t.[[IsSame]](u : Type)
    Return t.[[DataType]] = u.[[DataType]]

t.[[Cast]](val : Any) → number-block
    Let V = t.[[Convert]](val)
    If !IsError(V)
        Return Dereference(V.value)
    If val = Infinity or val = NaN
        Return 0
    If val is an ECMAScript number
        Return t.[[CCast]](val)
    If val is a UInt64 or Int64
        Return t.[[CCast]](val.[[Value]])
    If val is a numeric string, possibly prefixed by “0x” or “0X” for uints or /(-)?(0[xX])?/ for ints
        Let V = ParseNumber(val)
        Return t.[[CCast]](V)
    Throw TypeError

t.[[CCast]](n : number) → number-block
    TODO: do roughly what C does
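Since [[CCast]] is still unspecified, the following is only one possible reading of "roughly what C does". It assumes that the conversions the standard typed arrays apply on element stores are an acceptable approximation: integer types truncate toward zero and wrap modulo 2^k, NaN and the infinities become 0 for integer types, and float32 rounds to the nearest representable value. The names BACKING and cCast are hypothetical, and 64-bit types are left out.

```javascript
// Assumption: typed-array element stores approximate a C-style cast.
const BACKING = {
  uint8: Uint8Array, uint16: Uint16Array, uint32: Uint32Array,
  int8: Int8Array, int16: Int16Array, int32: Int32Array,
  float32: Float32Array, float64: Float64Array,
};

function cCast(typeName, n) {
  const Ctor = BACKING[typeName];
  if (Ctor === undefined) throw new TypeError("unknown type " + typeName);
  const cell = new Ctor(1);
  cell[0] = n; // the element store performs the numeric conversion
  return cell[0];
}
```

Under this reading, cCast("uint8", 257) yields 1 and cCast("int8", 200) yields -56. C itself leaves out-of-range floating-point to integer conversions undefined, so any concrete choice here goes beyond what C guarantees.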

t.[[Call]](val : Any) → Number
    Let x ?= t.[[Cast]](val)
    Let R = a reference to a new number-block with value x
    Return t.[[Reify]](R)

t.[[Reify]](R : reference[number-block])
    Let x = Dereference(R)
    If t.[[DataType]] = “uint64”
        Return a new UInt64 with [[Value]] x
    If t.[[DataType]] = “int64”
        Return a new Int64 with [[Value]] x
    Return x

For each built-in type object t with suffix k (e.g., for uint32, k = 32):

t.bytes = k / 8

Arrays

Programmers can create array block-type objects, which describe fixed-length sequences of block data of homogeneous block-type, using the following constructor.

function ArrayType(elementType : Type, length : UInt32 | Int64 | UInt64) → Type

ArrayType.prototype = a singleton function object provided by this library

ArrayType.prototype.[[Prototype]] = Type.prototype

ArrayType.prototype.constructor = ArrayType

ArrayType.prototype.repeat(val : Any) → ArrayData
    If !IsObject(this) || this.[[Class]] != “DataType” || this.[[DataType]] != “array”
        Throw TypeError
    Let V = this.[[Construct]]()
    For each integer i in [0, this.[[Length]])
        Let R ?= this.[[ElementType]].[[Convert]](val)
        Copy Dereference(R) into the ith element of V.[[Value]]
    Return V

ArrayType.prototype.prototype = a singleton non-function object provided by this library

ArrayType.prototype.prototype.[[Prototype]] = Data.prototype

ArrayType.prototype.prototype.constructor = ArrayType.prototype

ArrayType.prototype.prototype.forEach = the initial value of the standard Array.prototype.forEach

ArrayType.prototype.prototype.subarray
    TODO: specify this

Array types

Let elementType be a Type object and length a non-negative integer. Then it is possible to define a new array type object t using the ArrayType constructor:

t = new ArrayType(elementType, length)

The resulting array block-type object t has the following properties:

t.[[Class]] = “DataType”

t.[[Prototype]] = ArrayType.prototype

t.[[DataType]] = “array”

t.[[ElementType]] = elementType

t.[[Length]] = length

t.[[Convert]](val : Any) → reference[block]
    If IsObject(val) && val.[[Class]] = “Data”
        If val.[[DataType]].[[IsSame]](t)
            Return val.[[Value]]
        Throw TypeError
    If !IsObject(val)
        Throw TypeError
    Let u = t.[[ElementType]]
    Let n = t.[[Length]]
    Let L ?= val.[[Get]](“length”)
    If L not in UInt32 || L !== n
        Throw TypeError
    Let R = a reference to a new array-block of the appropriate size for n elements of type u
    For each integer i in [0, n)
        Let V ?= val.[[Get]](i)
        Let W ?= u.[[Convert]](V)
        Copy Dereference(W) into the ith member of R
    Return R
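The array-like branch of the conversion above can be sketched as follows. This is an illustration under stated assumptions, not the spec's algorithm verbatim: the block is modeled as a plain JavaScript array, the fast path for existing Data objects is omitted, and convertElement and convertUint8 are hypothetical stand-ins for u.[[Convert]].

```javascript
// Converts an array-like value into a fresh "block" (a plain array here).
function convertArray(length, convertElement, val) {
  if (val === null || (typeof val !== "object" && typeof val !== "function")) {
    throw new TypeError("expected an object");
  }
  // The source length must match the declared length exactly.
  if (val.length !== length) throw new TypeError("length mismatch");
  const block = new Array(length);
  for (let i = 0; i < length; i++) {
    block[i] = convertElement(val[i]); // may throw, aborting the conversion
  }
  return block;
}

// Hypothetical element conversion for uint8.
function convertUint8(v) {
  if (v === true) return 1;
  if (v === false) return 0;
  if (Number.isInteger(v) && v >= 0 && v < 256) return v;
  throw new TypeError("not a uint8");
}
```

Because every element goes through its own conversion, a single out-of-domain element causes the whole conversion to fail.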

t.[[IsSame]](u : Type)
    Return u.[[DataType]] = “array” &&
      t.[[ElementType]].[[IsSame]](u.[[ElementType]]) &&
      t.[[Length]] = u.[[Length]]

t.[[Construct]](val : Any) → Data
    (described below)

t.[[Reify]](R : reference[array-block])
    Let V = a new Data object with
        V.[[DataType]] = t
        V.[[Value]] = R
    Return V

t.prototype = a singleton non-function object created when t was created

t.prototype.[[Prototype]] = ArrayType.prototype.prototype

t.prototype.constructor = t

t.prototype.fill(val : Any) → Void
    If !IsObject(this) || this.[[Class]] != “Data” || !this.[[DataType]].[[IsSame]](t)
        Throw TypeError
    For each integer i in [0, t.[[Length]])
        Let R ?= t.[[ElementType]].[[Convert]](val)
        Copy Dereference(R) into the ith element of this.[[Value]]
    Return undefined

t.elementType = elementType

t.length = length

t.bytes = elementType.bytes × length

Array objects

Given an array type object such as t above, it is possible to construct new array data objects:

a = new t()
a = new t(val)

The resulting array data object a has the following properties:

a.[[Class]] = “Data”

a.[[Prototype]] = t.prototype

a.[[Value]] = a newly allocated array-block of the appropriate size for t, initialized to all zeroes
    If val is defined
        Let R ?= t.[[Convert]](val)
        Copy Dereference(R) into a.[[Value]]

a.[[DataType]] = t

a.length = t.[[Length]]

For each i in [0, t.[[Length]]):

get a.i()
    Let R be a reference to the ith element of a.[[Value]]
    Return t.[[ElementType]].[[Reify]](R)

set a.i(x)
    Let R ?= t.[[ElementType]].[[Convert]](x)
    Copy Dereference(R) into the ith member of a.[[Value]]
    Return undefined
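To illustrate the indexed accessors, here is a hedged sketch of an array data object with uint8 elements. The function name makeUint8ArrayData is hypothetical, the block is modeled by a Uint8Array, and the setter inlines a simplified version of the uint8 conversion rather than calling a real [[Convert]].

```javascript
// Builds an object with one accessor pair per index, as in the get/set
// definitions above.
function makeUint8ArrayData(length) {
  const block = new Uint8Array(length); // zero-initialized, like a.[[Value]]
  const a = {};
  Object.defineProperty(a, "length", { value: length, enumerable: true });
  for (let i = 0; i < length; i++) {
    Object.defineProperty(a, i, {
      enumerable: true,
      get() { return block[i]; }, // reify: a plain number for uint8
      set(x) {
        // Simplified convert step: reject values outside the uint8 domain.
        if (x === true) x = 1;
        else if (x === false) x = 0;
        if (!Number.isInteger(x) || x < 0 || x > 255) {
          throw new TypeError("not a uint8");
        }
        block[i] = x;
      },
    });
  }
  return a;
}
```

In this sketch a failed assignment such as a[0] = 256 throws and leaves the underlying block unchanged, which is the behavior the setter steps imply.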

Structs

Programmers can create struct type objects, which describe fixed-length sequences of data of heterogeneous types, using the following constructor.

function StructType(fields : { x1: Type, ..., xn: Type }) → Type

StructType.prototype.[[Prototype]] = Type.prototype

StructType.prototype.constructor = StructType

Struct types

Let fields be an ECMAScript value. Then it is possible to define a new struct type object t using the StructType constructor:

t = new StructType(fields)

TODO: this should enumerate the object instead of iterate an array

Semantics
    Let n ?= fields.[[Get]](“length”)
    If n is not in UInt32
        Throw TypeError
    For each i in [0, n)
        Let V ?= fields.[[Get]](i)
        Let L ?= V.[[Get]](“length”)
        If L !== 2
            Throw TypeError
        Let si ?= V.[[Get]](0)
        Let ti ?= V.[[Get]](1)
    Let t be a new object with t.[[Fields]] = [ (s0, t0), ..., (s(n-1), t(n-1)) ] and the properties below
    Return t
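The field-list validation in the steps above can be sketched in JavaScript. The function name parseFields is hypothetical, and the sketch follows the current array-of-pairs format rather than the object enumeration mentioned in the TODO; field types are not validated here.

```javascript
// `fields` is an array-like of [name, type] pairs.
function parseFields(fields) {
  const n = fields.length;
  if (!Number.isInteger(n) || n < 0 || n > 0xffffffff) {
    throw new TypeError("length is not a UInt32");
  }
  const result = [];
  for (let i = 0; i < n; i++) {
    const pair = fields[i];
    if (pair.length !== 2) throw new TypeError("field " + i + " is not a pair");
    result.push([pair[0], pair[1]]);
  }
  return Object.freeze(result); // plays the role of t.[[Fields]]
}
```

A call such as parseFields([["x", "uint8"], ["y", "uint32"]]) yields a frozen list of two pairs, using strings as placeholder types.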

t.[[Class]] = “DataType”

t.[[Prototype]] = StructType.prototype

t.[[DataType]] = “struct”

t.[[Convert]](val : Any) → reference[block]
    If IsObject(val) && val.[[Class]] = “Data”
        If val.[[DataType]].[[IsSame]](t)
            Return val.[[Value]]
        Throw TypeError
    If !IsObject(val)
        Throw TypeError
    Let names ?= EnumerateOwn(val)
    If names != { X | (X, u) in t.[[Fields]] }
        Throw TypeError
    Let R = a reference to a new struct-block of the appropriate size for t.[[Fields]]
    For each (X, u) in t.[[Fields]]
        Let V ?= val.[[Get]](X)
        Let W ?= u.[[Convert]](V)
        Copy Dereference(W) into field X of R
    Return R
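The object branch of the struct conversion requires the value's own property names to match the field names exactly. A minimal sketch follows, under these assumptions: the block is modeled as a plain object, EnumerateOwn is approximated by Object.keys, the fast path for existing Data objects is omitted, and convertStruct is a hypothetical name.

```javascript
// `fields` is a list of [name, convert] pairs, where convert stands in for
// u.[[Convert]].
function convertStruct(fields, val) {
  if (val === null || typeof val !== "object") {
    throw new TypeError("expected an object");
  }
  const expected = fields.map(([name]) => name).sort();
  const actual = Object.keys(val).sort(); // own enumerable string keys
  const sameNames = expected.length === actual.length &&
    expected.every((name, i) => name === actual[i]);
  if (!sameNames) throw new TypeError("property names do not match the fields");
  const block = {};
  for (const [name, convert] of fields) {
    block[name] = convert(val[name]); // may throw, aborting the conversion
  }
  return block;
}
```

Both missing and extra properties are rejected, so {x: 1} and {x: 1, y: 2, z: 3} each fail against a two-field struct with fields x and y.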

t.[[IsSame]](u : Type)
    Return t === u
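Struct types therefore compare by identity, whereas array types compare structurally (same element type and length). The difference can be sketched with hypothetical plain descriptors standing in for Type objects; the helper names arrayOf, structOf, and isSame are assumptions of this example.

```javascript
// Hypothetical descriptors standing in for Type objects.
const uint8 = { kind: "uint8" };
const arrayOf = (elementType, length) => ({ kind: "array", elementType, length });
const structOf = (fields) => ({ kind: "struct", fields });

function isSame(t, u) {
  if (t.kind === "array") {
    // Structural: same variant, same element type, same length.
    return u.kind === "array" &&
      isSame(t.elementType, u.elementType) &&
      t.length === u.length;
  }
  if (t.kind === "struct") return t === u; // nominal: identity only
  return t.kind === u.kind;                // numeric: compare variant names
}
```

With these definitions, two separately created arrayOf(uint8, 4) descriptors are the same type, while two separately created struct descriptors with identical fields are not.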

t.[[Construct]](val : Any) → Data
    (described below)

t.[[Reify]](R : reference[struct-block])
    Let V = a new data object with
        V.[[DataType]] = t
        V.[[Value]] = R
    Return V

t.prototype.constructor = t

t.fields = a frozen array representing t.[[Fields]], in the same format as the fields constructor argument above

t.bytes = t0.bytes + ... + t(n-1).bytes
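Taken together, the three bytes rules (k / 8 for numeric types, element size times length for arrays, and the sum of field sizes for structs) define a simple recursion. The sketch below uses hypothetical plain descriptors and a hypothetical bytesOf helper, and assumes no padding between struct fields, as the sum formula implies.

```javascript
// Sizes in bytes of the numeric types (k / 8); 64-bit integers omitted.
const SIZES = {
  uint8: 1, uint16: 2, uint32: 4,
  int8: 1, int16: 2, int32: 4,
  float32: 4, float64: 8,
};

function bytesOf(t) {
  if (t.kind === "array") return bytesOf(t.elementType) * t.length;
  if (t.kind === "struct") {
    // Sum of the field sizes, in declaration order, with no padding.
    return t.fields.reduce((sum, [, fieldType]) => sum + bytesOf(fieldType), 0);
  }
  return SIZES[t.kind];
}
```

For instance, a struct with a uint8 field and a float64 field occupies 9 bytes under these rules, and an array of three such structs occupies 27.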

Struct objects

Given a struct type object such as t above, it is possible to construct new struct data objects:

s = new t()

The resulting struct data object s has the following properties:

s.[[Class]] = “Data”

s.[[Prototype]] = t.prototype

s.[[Value]] = a newly allocated struct-block of the appropriate size for t, initialized to all zeroes

s.[[DataType]] = t

s.constructor = t

For each (“fi”, ui) in t.[[Fields]]:

get s.fi()
    Let R be a reference to the ith field of s.[[Value]]
    Return ui.[[Reify]](R)

set s.fi(x)
    Let R ?= ui.[[Convert]](x)
    Copy Dereference(R) into the ith field of s.[[Value]]
    Return undefined
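The field accessors can be illustrated with a DataView over a small buffer. This is a sketch under stated assumptions, not the spec's mechanism: the struct has the hypothetical fields { x: uint8, y: uint32 }, fields are packed in declaration order with no padding, and storage is little-endian as the TODO list at the top of this page proposes. The names makePoint and checkRange are hypothetical.

```javascript
// Builds a struct data object backed by a 5-byte zero-initialized block.
function makePoint() {
  const view = new DataView(new ArrayBuffer(1 + 4));
  const s = {};
  Object.defineProperty(s, "x", {
    enumerable: true,
    get() { return view.getUint8(0); },
    set(v) { view.setUint8(0, checkRange(v, 0xff)); },
  });
  Object.defineProperty(s, "y", {
    enumerable: true,
    get() { return view.getUint32(1, true); }, // true = little-endian
    set(v) { view.setUint32(1, checkRange(v, 0xffffffff), true); },
  });
  return { s, view };
}

// Simplified stand-in for the numeric [[Convert]] domain check.
function checkRange(v, max) {
  if (!Number.isInteger(v) || v < 0 || v > max) {
    throw new TypeError("out of range");
  }
  return v;
}
```

After s.y = 0x01020304, the byte at offset 1 of the block holds 0x04 and the byte at offset 4 holds 0x01, which is the little-endian layout assumed here.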

Prototype hierarchy

To help visualize the prototype inheritance relationship, the following diagram illustrates a case of an array type object A and an array block a created by A.

TODO

  • overload Type functions to accept (ArrayBuffer[, uint[, uint]])
  • compatibility with typed arrays:
    • Data object properties: buffer, byteLength, and byteOffset
    • buffer throws for Data objects that weren’t explicitly wrapped around a buffer
    • byteOffset is 0 for Data objects that weren’t explicitly wrapped around a buffer
  • allow specifying endianness when explicitly wrapping an ArrayBuffer
  • DataView compatibility story
 
harmony/binary_data_semantics.txt · Last modified: 2012/12/06 20:41 by dherman
 