Friday 29 August 2008

Binary Object REst Distributed (BORED) system - Part 4 - Constraints & Assumptions

At this point I've introduced the BORED idea and the blueprint, and provided a rough 0.1 version of the protocol. The next step is to test the message structure against various constraints & assumptions of REST.

The first constraints and assumptions to be tested are those defined by REST itself. They lay the groundwork for the protocol and determine what the request/response headers must contain.

Lossless Communication Stream
The very first assumption is that the solution will operate on a lossless bi-directional communication channel that supports streams (i.e. TCP). This assumes the transport will take care of the connection set-up and tear down. The transport will ensure that the data is received in order and provides a byte stream interface. This is a rather obvious assumption to make, however, it is important to get the basics right.

For embedded devices we will assume that if a device doesn't support TCP, then another transport protocol will be provided. If the messages are small enough, the protocol should also operate on a UDP-style network protocol. The protocol may also operate on an asynchronous transport such as message queuing and email systems.

Client-Server
The second part of the requirements is that of client-server. This is REST's first requirement. Fielding describes client-server as:
"The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client."
The client-server style requires that the client send request data to the server and that the server reply with response data. An initial definition of the protocol's request and response data follows. The request:
preamble - BORED
version
...
;

The response structure is the same:
preamble - BORED
version
...
;

The request and response headers look the same. Each contains a preamble that notifies the receiver that the message is using the BORED protocol. The preamble also serves as a resynchronisation point: if the receiver is out of sync with the sender, it can scan for the preamble to find the start of the next message. The client sets the version of the protocol it is using. The server sets the version to the version it is currently using. The server must not respond with a version that is greater than the client's.
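The header rules above can be sketched in a few lines of Python. This is only an illustration: the post does not fix a byte layout here, so the ASCII preamble bytes, the one-byte version field, and the function names `encode_header`, `resync` and `response_version` are all assumptions made for the example.

```python
PREAMBLE = b"BORED"  # the marker named in the post; its byte encoding is an assumption


def encode_header(version: int) -> bytes:
    """Build a header: preamble followed by a one-byte version (hypothetical layout)."""
    if not 0 <= version <= 255:
        raise ValueError("version must fit in one byte")
    return PREAMBLE + bytes([version])


def resync(buffer: bytes) -> bytes:
    """Skip bytes until the next preamble so parsing can restart there.

    Returns b"" when no preamble is present. A real parser would also keep
    a possible partial preamble at the tail of the buffer.
    """
    index = buffer.find(PREAMBLE)
    return buffer[index:] if index >= 0 else b""


def response_version(client_version: int, server_version: int) -> int:
    """The server must not answer with a version greater than the client's."""
    return min(client_version, server_version)
```

Taking the minimum of the two versions is one simple way to satisfy the "not greater than the client's" rule; the post does not say how a server should behave when it cannot speak the client's version at all.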

Asynchronous Client-Server
One of the interesting parts of Fielding's dissertation is the REST mismatch with HTTP. Fielding states:
"HTTP/1.1, though defined to be independent of the transport protocol, still assumes that communication takes place on a synchronous transport. It could easily be extended to work on an asynchronous transport, such as e-mail, through the addition of a request identifier. Such an extension would be useful for agents in a broadcast or multicast situation, where responses might be received on a channel different from that of the request. Also, in a situation where many requests are pending, it would allow the server to choose the order in which responses are transferred, such that smaller or more significant responses are sent first."
To support asynchronous requests, a request identifier needs to be added to the request and response data structures. The request:
preamble - BORED
version
request identifier
...
;
and response:
preamble - BORED
version
request identifier
...
;
The request identifier is set by the client. The server must respond with the same request identifier in the response. This allows a client and server to use a single channel and interleave requests and responses. This improves channel usage and reduces latency, which leads to a better user experience. Using a single channel for multiple requests also aligns well with the trend toward CPUs containing many cores. Many threads can be assigned to a single channel.
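A minimal sketch of how a client might use the request identifier to match interleaved responses is shown here. The class name `PendingRequests`, the sequential integer identifiers, and the in-memory dictionary are assumptions for illustration; the post only requires that the server echo back whatever identifier the client chose.

```python
import itertools


class PendingRequests:
    """Client-side table of requests that are still waiting for a response."""

    def __init__(self):
        self._next_id = itertools.count(1)  # hypothetical scheme: 1, 2, 3, ...
        self._pending = {}

    def send(self, description):
        """Allocate a request identifier and remember what was asked."""
        request_id = next(self._next_id)
        self._pending[request_id] = description
        return request_id

    def receive(self, request_id):
        """Match a response to its request using the echoed identifier."""
        return self._pending.pop(request_id)

    def outstanding(self):
        """Number of requests still awaiting a response."""
        return len(self._pending)
```

Because the lookup is keyed on the identifier, responses can arrive in any order, which is what lets the server send smaller or more significant responses first.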

The response can come from either a cache, server proxy, or server containing the object. The important thing is that by introducing a request identifier the protocol no longer needs to conform strictly to synchronous request/response semantics.

Specifying a "request identifier" is a rather simplistic approach to allowing asynchronous request/response message processing. One problem with this approach is that the server has no way of letting the client know how many messages it is able to process at one time. A possible solution to this would be for the server to respond with how many message slots it has available, i.e. the response:
preamble - BORED
version
available request slots
request identifier
For a server with constrained resources the request slots value may always be 1. Using the response message to provide the number of request slots requires that the client receive at least one response before it can know how many requests it can send. A simple solution to this would be that the server notifies the client upon initial request. This will need to be explored further in the future.
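One possible reading of the request slots idea, sketched from the client's side, is a simple credit counter. The class name `SlotTracker`, the default of one slot before any response has been seen, and the choice to overwrite the count with each advertised value are assumptions; the post leaves the exact semantics open.

```python
class SlotTracker:
    """Tracks how many more requests the client believes it may send."""

    def __init__(self, initial_slots: int = 1):
        # Assumption: before any response arrives, behave as if the server
        # has a single slot, like the constrained server described above.
        self.slots = initial_slots

    def try_send(self) -> bool:
        """Consume one slot if available; return False when the client must wait."""
        if self.slots <= 0:
            return False
        self.slots -= 1
        return True

    def on_response(self, available_slots: int) -> None:
        """Adopt the slot count advertised in the latest response."""
        self.slots = available_slots
```

With this interpretation a client starts conservatively and only ramps up once the server has advertised spare capacity.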

The other feature suggested by Fielding is that an asynchronous request could receive its response on a different channel from the one the request was sent on. To allow this, additional optional headers could be provided to specify a "return address" and "time to live". The "time to live" allows the client to specify how long it is willing to wait for a response. If the server is unable to provide a response before the given time, it should drop the request and not deliver the response. This type of feature is added to the protocol via the optional headers because it is likely to be used rarely.
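The "time to live" rule could be checked on the server along the following lines. The `OptionalHeaders` type, its field names, the use of seconds as the unit, and the `should_deliver` helper are hypothetical; the post names the two headers but does not define their format.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptionalHeaders:
    """Hypothetical container for the two optional headers named in the post."""
    return_address: Optional[str] = None  # where the response should be sent
    time_to_live: Optional[float] = None  # seconds the client will wait (assumed unit)


def should_deliver(headers: OptionalHeaders, received_at: float,
                   now: Optional[float] = None) -> bool:
    """Return False once the time to live has elapsed, so the response is dropped."""
    if headers.time_to_live is None:
        return True  # no limit requested by the client
    if now is None:
        now = time.monotonic()
    return (now - received_at) < headers.time_to_live
```

Passing `now` explicitly keeps the check easy to test; a server would normally leave it unset and rely on the monotonic clock.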

Asynchronous requests and responses bring a number of new challenges that must be explored. How well each of these ideas works in BORED will be tested when implementing the protocol.
