I collect transport-layer protocols. NONE MORE DORK.
See also under Low-Level Protocols.
For general notes on flow control (strictly speaking, congestion control) in IP networks (essential in large, heterogeneous networks, like the Internet), see RFC 2581, RFC 2309, RFC 3448, and, above all, RFC 2914.
Anyway, in order of increasing obscurity ...
|Specification||RFC 793 (amended by RFC 1122)|
The protocol which transports 99% of the world's network data (not counting phone calls).
There are a bunch of specifications extending RFC 793; the only one which officially updates it is RFC 3168, which adds support for ECN, but there are specs for high-performance options, SACK, DSACK, and a bunch of other stuff.
The single big weakness of TCP, to my mind, is that it's a stream-oriented protocol, when almost all application protocols are message-oriented in some way (the only one i can think of that isn't is telnet). This means that every application-layer protocol has to provide its own messaging sublayer (usually an implicit one), which is a lot of wasted effort. Also, the invisibility of the message boundaries to the TCP layer means it can't use them to organise its transmissions, so you end up with hacks like Nagle's algorithm to make it work smoothly. Yes, being a stream fits naturally with the unix programming model, but then the unix programming model is cracked anyway.
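To make the 'messaging sublayer' point concrete, here is a minimal sketch of one common way applications bolt message boundaries onto a byte stream: a length prefix in front of each message. The 4-byte big-endian prefix and the names `frame`, `read_exact` and `read_messages` are assumptions made for illustration, not taken from any particular protocol, and an in-memory `io.BytesIO` stands in for the TCP connection.

```python
import io
import struct

# Hypothetical framing: each message is preceded by a 4-byte big-endian length.
_LEN = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so its boundary survives the stream."""
    return _LEN.pack(len(message)) + message


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, looping because a stream may return short reads."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("stream ended mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_messages(stream):
    """Yield whole messages from a byte stream until it ends cleanly."""
    while True:
        header = stream.read(_LEN.size)
        if not header:
            return  # clean end of stream, between two messages
        if len(header) < _LEN.size:
            header += read_exact(stream, _LEN.size - len(header))
        (length,) = _LEN.unpack(header)
        yield read_exact(stream, length)


# Stand-in for a TCP connection: three messages arrive as one run of bytes.
wire = io.BytesIO(frame(b"HELO") + frame(b"") + frame(b"a longer message"))
messages = list(read_messages(wire))
```

Every application protocol that wants messages ends up writing something like `read_exact` and `read_messages` for itself, which is the wasted effort in question.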
Another weakness of TCP is its setup overhead. TCP carries out an exchange of packets (the 'three-way handshake') before the endpoints get to exchange data. In addition, the flow control algorithm for TCP involves a 'slow start', where transmission starts slowly, and ramps up to the capacity of the route over time. These factors combine to mean that a TCP connection does not become efficient until quite a number of packets in; whilst this is not a problem for long-lived connections (as used by connection-oriented application layer protocols, or those making large transfers), it makes TCP very unwieldy for short-lived connections, as used by many service protocols (like DNS, SNMP, etc).
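A deliberately crude model shows how the handshake and slow start add up. It assumes one round trip for the handshake before any data flows, an initial window of one segment that doubles every round trip, no packet loss and no receiver-window limit. The function name and its defaults are hypothetical, and real stacks differ in the details.

```python
def round_trips_to_deliver(segments: int, initial_window: int = 1,
                           handshake_rtts: int = 1) -> int:
    """Round trips until `segments` full-sized segments have been delivered.

    Idealised: no loss, no receiver-window limit, and the window doubles
    every round trip (pure slow start, never leaving it).
    """
    if segments < 0 or initial_window < 1:
        raise ValueError("segments must be >= 0 and initial_window >= 1")
    rtts = handshake_rtts      # assumed cost of the three-way handshake
    window = initial_window    # segments that may be sent in this round trip
    remaining = segments
    while remaining > 0:
        remaining -= window
        window *= 2
        rtts += 1
    return rtts


one_segment = round_trips_to_deliver(1)
hundred_segments = round_trips_to_deliver(100)
```

Under these assumptions a single-segment request costs two round trips where a datagram exchange would cost one, while a hundred segments cost only eight, so the overhead falls mostly on short-lived connections.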
RFC 1644 specifies a modification of TCP (which never really took off) which allows a TCP connection to start carrying data earlier, partially overcoming the setup overhead.
The lack of message demarcation is addressed by my modest proposal for sequenced packets over ordinary TCP.
The protocol which transports the other 1% of the world's traffic.
UDP's killer problem is its unreliability; messages are guaranteed to be delivered intact if at all, but there's no guarantee that they'll actually be delivered. Other problems are lack of in-order delivery, lack of duplicate prevention, lack of connections, and the practical limitation of message size to the network layer MTU (anything bigger depends on IP fragmentation). If you don't need those, though, UDP is boss.
RFC 3828 specifies UDP Lite, a minor modification of UDP which allows delivery of damaged messages. This may be useful for error-tolerant application layer protocols, such as streaming audio or video protocols.
Big, scary protocol with more options than you can shake a stick at. It was essentially designed as a successor to TCP, although it's not intended to replace it. The major changes, from the application point of view, are that it provides a message-oriented connection, and messages can optionally be delivered out of order ('order of arrival') in a fairly flexible way. Other changes include multiplexing of several streams of messages within a connection, multihoming of connections (so connections can be spread over several networking interfaces at either end), and bundling of multiple messages into a single network-layer packet. Internally, SCTP uses more complex mechanisms for flow control and validation than TCP.
SCTP messages can be larger than the network layer MTU.
See also RFC 3286 for a gentle introduction to SCTP.
|Specification||'The IL Protocol' (Plan 9 Manual)|
This is the transport-layer protocol used for RPC in the Plan 9 operating system. It's used to transport a reliable, duplicate-free, ordered stream of smallish (up to MTU sized) messages from one host to another. IL doesn't really have any flow control, although a rudimentary form could probably be added, using the information used for reliable delivery.
IL packets sit inside IP packets (with protocol number 40 = 0x28), and look like this:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Packet Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Packet Type  |    Special    |          Source Port          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Destination Port        |            ?!?!?!
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Sequence Identifier                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Sequence Acknowledgement                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where the fields are:
If i were designing IL2, i'd reform the checksum (CRC16 over the whole IL packet, with only the checksum set to zero, plus a pseudo-header as in TCP), drop the packet length (it's available from the network layer, dammit!), and shuffle the fields to lose the padding. But i'm not.
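The header as drawn in the diagram can be sketched with Python's `struct` module. This follows the diagram literally, so the two bytes marked '?!?!?!' are treated as padding; that, the network byte order, and the packet length counting the header are all assumptions, and the names `ILHeader`, `pack_header` and `unpack_header` are made up for illustration. Check the Plan 9 manual before trusting any of it on a real wire.

```python
import struct
from typing import NamedTuple

# Layout read off the diagram: two 16-bit fields, two 8-bit fields, two
# 16-bit ports, two bytes of assumed padding ("?!?!?!"), two 32-bit
# sequence fields. "!" means network byte order with no implicit alignment.
IL_HEADER = struct.Struct("!HHBBHH2xII")


class ILHeader(NamedTuple):
    checksum: int
    length: int
    packet_type: int
    special: int
    source_port: int
    dest_port: int
    seq_id: int
    seq_ack: int


def pack_header(header: ILHeader) -> bytes:
    """Serialise the header fields in diagram order."""
    return IL_HEADER.pack(*header)


def unpack_header(data: bytes) -> ILHeader:
    """Parse the header from the front of a packet, ignoring any payload."""
    return ILHeader(*IL_HEADER.unpack_from(data))


# Placeholder values throughout; packet_type=1 is not a claim about the
# real numbering of the packet types, and the checksum is left at zero.
payload = b"hello"
header = ILHeader(checksum=0, length=IL_HEADER.size + len(payload),
                  packet_type=1, special=0, source_port=5012,
                  dest_port=5013, seq_id=1000, seq_ack=41)
raw = pack_header(header) + payload
parsed = unpack_header(raw)
```

Dropping the packet length and the padding, as suggested for IL2, would amount to removing one `H` and the `2x` from the format string.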
Anyway, the packet structure and the meanings of the fields are all fairly straightforward (ie fairly similar to TCP!). There are four things to explain: the use of sequence numbers, the different types of packet, the handshake and closing exchanges, and the reliability mechanism.
Sequence numbers are easy: every message (not every byte, as in TCP) in a connection has a unique one (unique within each side of the connection, that is - the 5-tuple (source address, source port, destination address, destination port, sequence number) globally uniquely identifies a message), with the first message having an arbitrary number (not zero, please, to give some protection against packets from dead connections), and each subsequent message having a number one higher than the previous one. A packet carrying a message bears the sequence number of that message as an identifier; packets not bearing messages (for which, see below) use the next number due to be assigned to a message. Every packet (with the exception of an opening sync packet) also carries an acknowledgement, which is the sequence number of the last message successfully received by the sender, where 'successfully' means 'intact, and with all preceding messages also successfully received'. Sequence numbers are the basis of IL's reliability mechanism, for which, see below.
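The acknowledgement rule ('intact, and with all preceding messages also successfully received') can be sketched from the receiving side as follows. The class and its names are hypothetical, it ignores wrap-around of the 32-bit sequence space, and holding on to out-of-order messages rather than discarding them is an assumption about the implementation, not something the spec is quoted as requiring.

```python
class Receiver:
    """Tracks the cumulative acknowledgement for one side of a connection."""

    def __init__(self, first_expected: int):
        # Highest sequence number received with nothing missing before it;
        # starts just below the first message expected from the peer.
        self.ack = first_expected - 1
        self.held = {}        # out-of-order messages waiting for a gap to fill
        self.delivered = []   # messages released to the application, in order

    def receive(self, seq: int, message: bytes) -> int:
        """Accept one intact message and return the acknowledgement to send."""
        if seq <= self.ack or seq in self.held:
            return self.ack   # duplicate: nothing changes
        self.held[seq] = message
        # Advance over every message that is now contiguous with the past.
        while self.ack + 1 in self.held:
            self.ack += 1
            self.delivered.append(self.held.pop(self.ack))
        return self.ack


receiver = Receiver(first_expected=1000)
acks = [
    receiver.receive(1000, b"a"),  # in order
    receiver.receive(1002, b"c"),  # arrives early, 1001 still missing
    receiver.receive(1001, b"b"),  # fills the gap
    receiver.receive(1001, b"b"),  # duplicate
]
```

Note that the acknowledgement never moves past a gap, however much has arrived beyond it.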
There are seven packet types: sync, data, dataquery, ack, query, state and close.
Only data and dataquery packets carry messages; the other types of packets do not.
The opening handshake for IL is as follows:
The spec is hazy on what to do if packets get lost. I am by no means a transport-layer protocol expert, but my thinking is:
AIUI, sync packets don't carry messages. I don't see why they couldn't, though, and this would allow a fast T/TCP style setup.
Closing a connection is as follows:
Again, packet loss must be considered:
Finally, reliability. The key thing is that each end of a connection keeps track of which messages the other end has received, by maintaining an awareness of the acknowledged sequence number. During normal, rapid two-way traffic, this occurs simply through the exchange of data packets, which carry a sequence acknowledgement. If only one end is actively sending, then the other end should periodically send an ack packet (using an ack timeout, reset on sending of any kind of packet), purely to communicate its sequence acknowledgement. This is highly straightforward.
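On the sending side, 'maintaining an awareness of the acknowledged sequence number' might look something like the sketch below. The class and method names are hypothetical, sequence wrap-around is ignored, and what to do with messages that stay unacknowledged is left out on purpose.

```python
from collections import OrderedDict


class Sender:
    """Remembers sent messages until the peer acknowledges them."""

    def __init__(self, initial_seq: int):
        self.next_seq = initial_seq    # number the next message will bear
        self.unacked = OrderedDict()   # seq -> message, oldest first

    def send(self, message: bytes) -> int:
        """Assign the next sequence number and remember the message."""
        seq = self.next_seq
        self.unacked[seq] = message
        self.next_seq += 1
        return seq

    def on_ack(self, ack: int) -> None:
        """The peer has everything up to and including `ack`; forget those."""
        while self.unacked and next(iter(self.unacked)) <= ack:
            self.unacked.popitem(last=False)

    def outstanding(self) -> list:
        """Sequence numbers sent but not yet acknowledged."""
        return list(self.unacked)


sender = Sender(initial_seq=7000)
for text in (b"one", b"two", b"three"):
    sender.send(text)
sender.on_ack(7001)   # the acknowledgement can ride on any incoming packet
still_waiting = sender.outstanding()
```

Because the acknowledgement is cumulative, a single number from the peer clears every earlier message at once, whether it arrives on a data packet or a bare ack.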
It's when packets go missing that things get interesting. Two mechanisms come into play:
As an optimisation, a host can send a dataquery packet; this is simply a packet which is both a data and a query - it carries a message, and asks for a state packet to be sent back.
I don't understand why ack and state are separate. Maybe it's so the querier can know that the packet is a response to its query, and not just a delayed acknowledgement.
There is probably a hell of a lot more information about the state of the network and the peer that can be wrung out of these exchanges by a clever implementation. Suggestions on a postcard to Bell Labs, please!
|Ordered?||Y (sort of)|
Is realtime. For media stuff.
RTP runs on top of UDP, or another protocol; sort of a transport decorator. It claims to be "a new style of protocol following the principles of application level framing and integrated layer processing proposed by Clark and Tennenhouse". Make of that what you will.