Monday 28 September 2009

IETF and Squeezing the Meta Dictionary

In the last few months I've struggled to find the right direction to take Argot. I've looked at reviving the Personal Browser concept, investigated SCTP and a few other things. These are all good research areas for Argot; however, they take the focus away from the core Argot idea. I've now returned to the core of Argot with a renewed focus driven by the 6lowapp IETF working group.

The 6lowapp IETF working group is being formed to develop the application protocols that will form the basis for the “Internet of Things”. Argot was originally created with small embedded systems in mind; in fact, in October 2005 I blogged about reducing an Argot RPC server to 7kb. While Argot can solve problems in other domains, the “Internet of Things” is the best fit for the problems it does solve.

The current plan is to develop an IETF Internet Draft (I-D) which provides the rationale for Argot, describes the technical problems it solves and provides a specification. In addition, I plan on developing an example service using Contiki. There's a lot of work to be done before the 19th October. Of course, there's no guarantee that Argot will become an RFC; however, I should at the very least receive some good feedback that helps Argot fit into the application stack being developed.

An important part of developing for embedded systems is size, so I've been squeezing the Argot meta data and developing ways to allow the Argot protocol to work on the smallest of devices. In doing so, I've also been improving the meta dictionary and removing a few niggling constraints.

The first change to help size was the introduction of the uvint28 data type. This is an unsigned variable length integer with up to 28 bits of integer data. It uses the high bit of each octet as a continuation bit. The integer can be between 1 and 4 octets. The 28 comes from the fact that the normal 32 bit integer loses 4 bits of precision to the continuation bits. This type has replaced the uint16 (unsigned 16 bit integer) in the meta dictionary and in doing so removes many zero bytes. It also removes the limitation of the 16 bit integer. The meta dictionary when encoded after this change is 985 bytes long and includes 29 data types.
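To make the encoding concrete, here is a minimal sketch of uvint28 in Python. It assumes the most significant 7-bit group is written first (my reading of the bigendian attribute on the type) and that the continuation bit is set on every octet except the last. The function names are made up for this example and are not part of any Argot library.

```python
def encode_uvint28(value: int) -> bytes:
    """Encode 0 <= value < 2**28 into 1 to 4 octets (assumed big-endian groups)."""
    if not 0 <= value < (1 << 28):
        raise ValueError("uvint28 holds at most 28 bits")
    groups = [value & 0x7F]          # lowest 7 bits, emitted last
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()                 # most significant group first (assumption)
    # Set the continuation bit on every octet except the final one.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_uvint28(data: bytes, pos: int = 0):
    """Return (value, next_position) for the uvint28 starting at data[pos]."""
    value = 0
    for i in range(4):
        octet = data[pos + i]
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:         # high bit clear: this was the last octet
            return value, pos + i + 1
    raise ValueError("uvint28 is limited to 4 octets")
```

With this scheme a small value such as 8 takes one octet where a uint16 always takes two, which is where the saved zero bytes come from, while values up to 2^28 - 1 still fit in four octets.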

The next change was to introduce a meta.cluster type. This is a group definition and allows each name to refer to a cluster. This allows meta.name to use cluster references instead of recording the full class names for each definition. This change introduces a few new types; however, overall it removes a lot of duplicate information. The result is that the meta dictionary is now 888 bytes long and includes 35 data types. I'm not superstitious; however, it's pretty cool that the end result is 888 bytes long, a very lucky number in Chinese.
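As a rough illustration of how cluster references shorten names, the sketch below rebuilds full names from (group id, relative name) pairs. The five pairs are read from the encoded dictionary at the end of this post; the table layout and the full_name function are hypothetical and only exist for this example.

```python
# id -> (group id, relative name), as read from the encoded dictionary.
# Entry 1 is the base cluster, which has no parent and no name of its own.
NAMES = {
    1: (None, ""),
    4: (1, "meta"),
    5: (4, ".id"),
    22: (4, ".attribute"),
    23: (22, ".size"),
}


def full_name(entry_id: int) -> str:
    """Walk the group references up to the base cluster and join the parts."""
    group, name = NAMES[entry_id]
    if group is None:
        return name
    return full_name(group) + name


print(full_name(5))    # meta.id
print(full_name(23))   # meta.attribute.size
```

Each definition stores only its own short suffix plus a one-octet group reference, so the "meta" and "meta.attribute" prefixes are written once instead of once per type.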

[Edit: After doing some testing I discovered an off-by-one bug which was causing an extra byte of data in strings. The meta dictionary is now 859 bytes long. Not as cool as the 888 byte length, but it is even shorter, which is great!]

The end result of these changes should allow a full service description to use as little as 3kb of data and 3kb of code. In the coming weeks this will be confirmed and tested using the same mechanisms developed back in 2005. That is, using a Java client with a full Argot protocol stack and a cut-down, purpose-built embedded Argot stack.

Depending on the application and size of device, 3kb of data may still be too large. To resolve this, I've been looking at the Argot type resolution protocol. Instead of storing the Argot meta data on the device, the device can simply report a URL or other host that contains the Argot meta data. This should in effect allow the device to report the full message definition of its services in less than 1kb of code and data.

For those interested, here are the new meta dictionary data definitions.


(library.list [

/* BASE_ID 1 */

(library.entry
(library.base)
(meta.cluster))

/* UINT8_ID 2 */

(library.entry
(library.definition meta.name:"uint8" meta.version:"1.3")
(meta.atom uvint28:8 uvint28:8
[ (meta.attribute.size uvint28:8)
(meta.attribute.integer)
(meta.attribute.unsigned)
(meta.attribute.bigendian) ] ))

/* UVINT28_ID 3 */

(library.entry
(library.definition meta.name:"uvint28" meta.version:"1.3")
(meta.atom uvint28:8 uvint28:32
[ (meta.attribute.size uvint28:28)
(meta.attribute.integer)
(meta.attribute.unsigned)
(meta.attribute.bigendian) ] ))

/* META_GROUP_ID 4 */

(library.entry
(library.name meta.name:"meta" )
(meta.cluster))


/* META_ID_ID 5 */

(library.entry
(library.definition meta.name:"meta.id" meta.version:"1.3")
(meta.reference #uvint28))


/* META_CLUSTER_ID 6 */

(library.entry
(library.definition meta.name:"meta.cluster" meta.version:"1.3")
(meta.sequence []))

/* META_ABSTRACT_MAP_ID 7 */

(library.entry
(library.definition meta.name:"meta.abstract_map" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference #meta.id))
]))

/* META_ABSTRACT_ID 8 */

(library.entry
(library.definition meta.name:"meta.abstract" meta.version:"1.3")
(meta.sequence [
(meta.array
(meta.reference #uint8)
(meta.reference #meta.abstract_map))]))


/* U8UTF8_ID 9 */

(library.entry
(library.definition meta.name:"u8utf8" meta.version:"1.3")
(meta.encoding
(meta.array
(meta.reference #uint8)
(meta.reference #uint8))
u8utf8:"UTF-8"))


/* META_NAME_ID 10 */

(library.entry
(library.definition meta.name:"meta.name" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"group" (meta.reference #meta.id))
(meta.tag u8utf8:"name" (meta.reference #u8utf8))
]))


/* META_VERSION_ID 11 */

(library.entry
(library.definition meta.name:"meta.version" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"major" (meta.reference #uint8))
(meta.tag u8utf8:"minor" (meta.reference #uint8))
]))

/* META_DEFINITION_ID 12 */

(library.entry
(library.definition meta.name:"meta.definition" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.cluster)
(meta.abstract_map #meta.atom)
(meta.abstract_map #meta.abstract)
(meta.abstract_map #meta.abstract_map)
(meta.abstract_map #meta.expression)
]))


/* META_EXPRESSION_ID 13 */


(library.entry
(library.definition meta.name:"meta.expression" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.reference)
(meta.abstract_map #meta.tag)
(meta.abstract_map #meta.sequence)
(meta.abstract_map #meta.array)
(meta.abstract_map #meta.envelope)
(meta.abstract_map #meta.encoding)
]))

/* META_REFERENCE_ID 14 */

(library.entry
(library.definition meta.name:"meta.reference" meta.version:"1.3")
(meta.sequence [(meta.reference #meta.id)]))

/* META_TAG_ID 15 */

(library.entry
(library.definition meta.name:"meta.tag" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"name"
(meta.reference #u8utf8))
(meta.tag u8utf8:"data"
(meta.reference #meta.expression))]))


/* META_SEQUENCE_ID 16 */

(library.entry
(library.definition meta.name:"meta.sequence" meta.version:"1.3")
(meta.array
(meta.reference #uint8)
(meta.reference #meta.expression)))


/* META_ARRAY_ID 17 */

(library.entry
(library.definition meta.name:"meta.array" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #meta.expression))
(meta.tag u8utf8:"data" (meta.reference #meta.expression))]))

/* META_ENVELOPE_ID 18 */

(library.entry
(library.definition meta.name:"meta.envelope" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #meta.expression))
(meta.tag u8utf8:"type" (meta.reference #meta.expression)) ]))


/* META_ENCODING_ID 19 */

(library.entry
(library.definition meta.name:"meta.encoding" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"data" (meta.reference #meta.expression))
(meta.tag u8utf8:"encoding" (meta.reference #u8utf8))]))


/* META_ATOM_ID 20 */

(library.entry
(library.definition meta.name:"meta.atom" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"min_bit_length" (meta.reference #uvint28))
(meta.tag u8utf8:"max_bit_length" (meta.reference #uvint28))
(meta.tag u8utf8:"attributes"
(meta.array
(meta.reference #uint8)
(meta.reference #meta.atom_attribute)))]))

/* META_ATOM_ATTRIBUTE_ID 21 */

(library.entry
(library.definition meta.name:"meta.atom_attribute" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.attribute.size)
(meta.abstract_map #meta.attribute.integer)
(meta.abstract_map #meta.attribute.unsigned)
(meta.abstract_map #meta.attribute.bigendian)
]))

/* META_ATTRIBUTE_CLUSTER_ID 22 */

(library.entry
(library.name meta.name:"meta.attribute" )
(meta.cluster))


/* META_ATTRIBUTE_SIZE_ID 23 */

(library.entry
(library.definition meta.name:"meta.attribute.size" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #uvint28))
]))


/* META_ATTRIBUTE_INTEGER_ID 24 */

(library.entry
(library.definition meta.name:"meta.attribute.integer" meta.version:"1.3")
(meta.sequence []))


/* META_ATTRIBUTE_UNSIGNED_ID 25 */

(library.entry
(library.definition meta.name:"meta.attribute.unsigned" meta.version:"1.3")
(meta.sequence []))


/* META_ATTRIBUTE_BIGENDIAN_ID 26 */

(library.entry
(library.definition meta.name:"meta.attribute.bigendian" meta.version:"1.3")
(meta.sequence []))


/* DICTIONARY 27 */

(library.entry
(library.name meta.name:"dictionary")
(meta.cluster))

/* DICTIONARY_BASE 28 */

(library.entry
(library.definition meta.name:"dictionary.base" meta.version:"1.3")
(meta.sequence []))

/* DICTIONARY_NAME 29 */

(library.entry
(library.definition meta.name:"dictionary.name" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"name" (meta.reference #meta.name))
]))

/* DICTIONARY_DEFINITION 30 */

(library.entry
(library.definition meta.name:"dictionary.definition" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference #meta.id))
(meta.tag u8utf8:"version" (meta.reference #meta.version))
]))

/* DICTIONARY_RELATION 31 */
(library.entry
(library.definition meta.name:"dictionary.relation" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference #meta.id))
]))

/* DICTIONARY_LOCATION 32 */

(library.entry
(library.definition meta.name:"dictionary.location" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #dictionary.base)
(meta.abstract_map #dictionary.name)
(meta.abstract_map #dictionary.definition)
(meta.abstract_map #dictionary.relation)
]))

/* DICTIONARY_DEFINITION_ENVELOPE_ID 33 */

(library.entry
(library.definition meta.name:"dictionary.definition_envelope" meta.version:"1.3")
(meta.envelope
(meta.reference #uvint28)
(meta.reference #meta.definition)))

/* DEFINITION_ENTRY_ID 34 */

(library.entry
(library.definition meta.name:"dictionary.entry" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference #meta.id))
(meta.tag u8utf8:"location" (meta.reference #dictionary.location))
(meta.tag u8utf8:"definition" (meta.reference #dictionary.definition_envelope))]))

/* DICTIONARY_ENTRY_LIST_ID 35 */

(library.entry
(library.definition meta.name:"dictionary.entry_list" meta.version:"1.3")
(meta.array
(meta.reference #uvint28)
(meta.reference #dictionary.entry )))

])


And just for fun, here's the meta dictionary encoded. The encoding is a mixture of hex and ASCII. ASCII is only used for a-z characters.


23 01 1c 01 06 02 1e 01 05 u i n t 8 01 03 09 14 08 08 04 17 08 18 19 1a 03 1e 01 07 u
v i n t 2 8 01 03 09 14 08 1c 04 17 08 18 19 1a 04 1d 01 04 m e t a 01 06 05 1e 04
03 2e i d 01 03 02 0e 03 06 1e 04 08 2e c l u s t e r 01 03 02 10 00 07 1e 04 0d 2e
a b s t r a c t _ m a p 01 03 08 10 01 0f 02 i d 0e 05 08 1e 04 09 2e a b s
t r a c t 01 03 07 10 01 11 0e 02 0e 07 09 1e 01 06 u 8 u t f 8 01 03 0c 13 11 0e
02 0e 02 05 U T F 2d 8 0a 1e 04 05 2e n a m e 01 03 13 10 02 0f 05 g r o u p 0e
05 0f 04 n a m e 0e 09 0b 1e 04 08 2e v e r s i o n 01 03 14 10 02 0f 05 m a j
o r 0e 02 0f 05 m i n o r 0e 02 0c 1e 04 0b 2e d e f i n i t i o n 01 03 07
08 05 06 14 08 07 0d 0d 1e 04 0b 2e e x p r e s s i o n 01 03 08 08 06 0e 0f 10 11
12 13 0e 1e 04 0a 2e r e f e r e n c e 01 03 04 10 01 0e 05 0f 1e 04 04 2e t a g
01 03
n c e 01 03 07 10 01 11 0e 02 0e 0d 11 1e 04 06 2e a r r a y 01 03 12 10 02 0f 04 s
i z e 0e 0d 0f 04 t y p e 0e 0d 12 1e 04 09 2e e n v e l o p e 01 03 12 10 02
0f 04 s i z e 0e 0d 0f 04 t y p e 0e 0d 13 1e 04 09 2e e n c o d i n g 01 03
16 10 02 0f 04 d a t a 0e 0d 0f 08 e n c o d i n g 0e 09 14 1e 04 05 2e a t o
m 01 03 7 10 03 0f 0e m i n _ b i t _ l e n g t h 0e 03 0f 0e m a x _ b
i t _ l e n g t h 0e 03 0f 0a a t t r i b u t e s 11 0e 02 0e 15 15 1e 04
0f 2e a t o m _ a t t r i b u t e 01 03 06 08 04 17 18 19 1a 16 1d 04 0a 2e a
t t r i b u t e 01 06 17 1e 16 05 2e s i z e 01 03 0a 10 01 0f 04 s i z e 0e
03 18 1e 16 08 2e i n t e g e r 01 03 02 10 00 19 1e 16 09 2e u n s i g n e d
01 03 02 10 00 1a 1e 16 0a 2e b i g e n d i a n 01 03 02 10 00 1b 1d 01 0a d i c
t i o n a r y 01 06 1c 1e 1b 05 2e b a s e 01 03 02 10 00 1d 1e 1b 05 2e n a m
e 01 03 0a 10 01 0f 04 n a m e 0e 0a 1e 1e 1b 0b 2e d e f i n i t i o n 01 03
15 10 02 0f 04 n a m e 0e 0a 0f 07 v e r s i o n 0e 0b 1f 1e 1b 09 2e r e l a
t i o n 01 03 0f 10 02 0f 02 i d 0e 05 0f 03 t a g 0e 09 20 1e 1b 09 2e l o c a
t i o n 01 03 06 08 04 1c 1d 1e 1f 21 1e 1b 14 2e d e f i n i t i o n _ e n
v e l o p e 01 03 05 12 0e 03 0e 0c 22 1e 1b 06 2e e n t r y 01 03 1e 10 03 0f 02
i d 0e 03 0f 04 n a m e 0e 20 0f 0a d e f i n i t i o n 0e 21 23 1e 1b 0b 2e
e n t r y _ l i s t 01 03 07 10 01 11 0e 03 0e 22
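
For anyone who wants to poke at the dump, here is a small Python sketch that reads the opening bytes (the entry count and the first two entries). It is my reading of the dump rather than a reference decoder: it assumes a two-character token is a hex octet and a one-character token is a literal character, that every integer in this prefix fits in a single octet, and that the location ids 0x1c, 0x1d and 0x1e correspond to dictionary.base, dictionary.name and dictionary.definition (ids 28, 29 and 30 in the listing). All function names are invented for the example.

```python
BASE, NAME, DEFINITION = 0x1C, 0x1D, 0x1E   # ids 28, 29, 30 in the listing

PREFIX = "23 01 1c 01 06 02 1e 01 05 u i n t 8 01 03 09 14 08 08 04 17 08 18 19 1a"


def dump_to_bytes(dump: str) -> bytes:
    """Turn the mixed hex/ASCII dump back into raw octets."""
    out = bytearray()
    for token in dump.split():
        if len(token) == 2:
            out.append(int(token, 16))   # hex octet
        elif len(token) == 1:
            out.append(ord(token))       # literal character (assumption)
        else:
            raise ValueError(f"unexpected token {token!r}")
    return bytes(out)


def read_entries(data: bytes):
    """Return (declared entry count, list of entries found in data)."""
    pos = 0

    def byte() -> int:
        nonlocal pos
        value = data[pos]
        pos += 1
        return value

    def text() -> str:
        nonlocal pos
        length = byte()                  # u8utf8: one length octet, then UTF-8
        value = data[pos:pos + length].decode("utf-8")
        pos += length
        return value

    count = byte()
    entries = []
    while pos < len(data):
        entry = {"id": byte()}
        location = byte()
        if location in (NAME, DEFINITION):
            entry["group"] = byte()
            entry["name"] = text()
        if location == DEFINITION:
            entry["version"] = (byte(), byte())
        elif location not in (BASE, NAME):
            raise ValueError(f"unknown location id {location:#x}")
        size = byte()                    # envelope size, then the definition body
        entry["body"] = data[pos:pos + size]
        pos += size
        entries.append(entry)
    return count, entries


count, entries = read_entries(dump_to_bytes(PREFIX))
print(count)        # number of entries declared in the header
for entry in entries:
    print(entry)
```

The body of the second entry starts with 0x14 (20, meta.atom), followed by the two bit lengths and the four attributes, which lines up with the uint8 definition in the listing above.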