User:Graham.Fountain/Deterministic Ethernet Fault Tolerant Network Concept

The deterministic Ethernet fault tolerant network (DEFTNet) concept has been developed and patented by BAE Systems PLC (European patent: EP2721785A1; US application: US20140310354) as a means of proving that data sent between standard IEEE 802.3 Ethernet network interfaces over a packet switched network (PSN) is transported reliably, within a deadline, and only between the expected source and destination(s) of the data. The use of standard IEEE 802.3 Ethernet interfaces was considered an extremely significant goal in the development of this concept. Its description as a "concept", as opposed to an alternative protocol, is intended to reflect this support for standard interfaces, and thus for computing equipment fitted with them. The DEFTNet concept is primarily intended for avionic applications, e.g. in mission systems, essentially as a means to replace MIL-STD-1553B and STANAG 3910 with a modern PSN providing very much higher bandwidth and connectivity suitable for use in Integrated modular avionics (IMA).

If implemented correctly, the proofs of reliability given for a DEFTNet compliant network will be tolerant of faults in or malicious actions by any other equipment connected to the network. By transmitting data over two identical (dual redundant) networks, the transport of these transmissions will also be tolerant of faults in any one switch in the path between source and destination. These data may, therefore, be transmitted using unassured transport protocols such as UDP, with a probability of success (limited by the physical layer bit error rates, etc.) that is sufficiently high for a firm real-time system such as IMA.

As with several other protocols providing deterministic data transport (though not between standard Ethernet interfaces), the data requiring these properties is treated as a critical data class, with any other data treated as a non-critical class. The critical data is necessarily connection oriented. The non-critical data may be connection oriented, using any means of identification that is compatible with Ethernet, or may be connectionless.

History
The DEFTNet concept is based on the application to Ethernet of techniques developed with Asynchronous Transfer Mode (ATM), using features of that protocol. These methods, as applied to ATM, were described in principle in UK submissions to ASAAC II in the 1990s. Their application to Ethernet was in response to Ethernet actually becoming the truly ubiquitous network protocol that ATM was only promised to become. However, it was seen as highly important that this process should retain full compatibility with standard IEEE 802.3 Ethernet interfaces (IEEE 802.3-2002 and later) and that no additional hardware functionality should be involved.

Basis of Proof
Reliable, timely data transport is made provable by preventing any source of network congestion from affecting the critical data class. This is done in four steps: applying bandwidth management in the network switches (UPC/NPC, i.e. usage and network parameter control in ATM terms) to limit the data that can need queueing on any critical connection; specifying the minimum size of the buffer available for queueing the data routed through a switch output; calculating the buffer requirement from the applied limits; and showing that the buffer requirement is necessarily no greater than the available capacity.

This calculation of the buffer requirement also allows the calculation of the maximum delay that can be caused in the queue. This will be the time required to empty the queue from its maximum level at the transmission rate of the associated output, allowing for interframe gaps, preambles, and start frame delimiters. These buffering delays will be the only components of the end-to-end delay that cannot be predicted from the specifications or measurement of the performance of the network interfaces and switches. Hence, once the buffering delays are calculated, the maximum end-to-end delay can also be calculated and compared with the deadline requirement for the critical data to show that this is always met.
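The delay calculation described above can be illustrated with a minimal sketch. It assumes the maximum queue content has already been derived from the policing limits and is supplied as a list of frame sizes; the function names, the 1 Gbit/s rate and the example figures are hypothetical and are not part of the DEFTNet concept. The 8-byte preamble plus start frame delimiter and the 12-byte minimum interframe gap are the standard IEEE 802.3 values.

```python
PREAMBLE_SFD_BYTES = 8     # 7-byte preamble + 1-byte start frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum interframe gap, expressed in byte times


def max_queue_delay_s(queued_frame_bytes, link_bps):
    """Time to empty a queue holding the given frames at the output line rate.

    queued_frame_bytes: sizes of the frames at the queue's maximum level
    (assumed to be known from the buffer requirement calculation).
    """
    wire_bytes = sum(
        size + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES
        for size in queued_frame_bytes
    )
    return wire_bytes * 8 / link_bps


def meets_deadline(fixed_delays_s, queue_delays_s, deadline_s):
    """Compare the worst-case end-to-end delay with the deadline.

    fixed_delays_s: delays known from specification or measurement
    (interfaces, switch forwarding, propagation); queue_delays_s: one
    maximum buffering delay per switch output on the path.
    """
    return sum(fixed_delays_s) + sum(queue_delays_s) <= deadline_s


# Hypothetical example: 10 maximum-size VLAN-tagged frames on a 1 Gbit/s output.
delay = max_queue_delay_s([1522] * 10, 1_000_000_000)
```

With these assumed figures, each frame occupies 1542 byte times on the wire, so the queue drains in about 123 microseconds; two such hops plus fixed delays can then be checked against a deadline with `meets_deadline`.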

The DEFTNet concept is based on calculations of the maximum buffer requirements from the maximum traffic that can pass through the bandwidth management functions. If the buffer space provided by the switch is enough for the maximum critical data that the switch can physically be required to queue (i.e. a hard bound, not one based on any probabilistic or statistical analysis of how the sources behave relative to one another), then these queues will never overflow in any possible situation. If these queues are managed by tail-drop only and the transmission of critical data has simple priority over non-critical data, then there can be no loss of critical data from them, and all that passes through the bandwidth management functions will be transmitted. The only remaining source of loss of critical data should then be in the physical layers, primarily due to bit errors. These can be made random and very infrequent by the use of optical media, etc.

Hence, the reliability of the transmissions of critical data being delivered within their deadlines cannot be affected by transmissions of other data, critical or non-critical.

The transport of critical data can also be protected from the effects of faults and failures in the network by the use of several networks in parallel. Generally, these component networks would be identical copies, with each switch and each link between connected components replicated. As a result, any fault or failure in one component network cannot prevent reliable, timely delivery over the other or others. This requires all connected components that use critical data to provide multiple, independent network interfaces. These components must transmit copies of all critical data through these multiple interfaces and consolidate all critical data received through them.
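Consolidation at the receiver could look something like the following minimal sketch. It assumes that every critical frame carries a per-connection sequence number, which is an assumption made here for illustration and is not stated in the description of the concept; the class and method names are hypothetical. The first copy of each frame is delivered and any later copy from another network is discarded.

```python
class Consolidator:
    """Deliver the first copy of each critical frame; discard later copies.

    Assumption: frames carry a (connection_id, seq) pair. The set of seen
    sequence numbers grows without bound here; a real implementation would
    need a bounded window and a rule for sequence number wrap-around.
    """

    def __init__(self):
        self._seen = {}  # connection id -> set of sequence numbers delivered

    def receive(self, connection_id, seq, payload):
        seen = self._seen.setdefault(connection_id, set())
        if seq in seen:
            return None  # copy already delivered via another network
        seen.add(seq)
        return payload


# Hypothetical example: frame 1 arrives on both networks, frame 2 on one only.
consolidator = Consolidator()
first = consolidator.receive(7, 1, b"a")      # delivered
duplicate = consolidator.receive(7, 1, b"a")  # discarded
late = consolidator.receive(7, 2, b"b")       # delivered
```

Tracking individual sequence numbers, rather than only the highest one seen, means a frame lost on one network is still delivered when its copy arrives later over the other.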

Traffic Policing in the network
The bandwidth management algorithm used, and whether this is applied to the byte or frame rate, is not important; however, the use of either the leaky or token bucket algorithm, applied as traffic policing of the frame rates, is recommended. While the exact nature of the bandwidth management algorithm is not important, the method for calculating the buffer requirement must exactly reflect the mechanism used. That is, if the proofs needed are that critical data can never be lost or excessively delayed by network congestion, then these methods must allow for the maximum critical data that the bandwidth management functions permit.

This bandwidth management will require sources of critical data to limit their transmissions so that they are not dropped by the network. Policing of the frame rate is recommended because excessive frame rates are assumed to have a greater impact on the real-time performance of the destinations of critical data, e.g. through interrupt requests, than the excess memory requirements. The token or leaky bucket algorithm requires that traffic arrive with a maximum jitter around a nominal rate. When applied to frame rates, this requires that the frames arrive no earlier than the jitter tolerance before their expected arrival, which is calculated from the last arrival to conform to this requirement. This jitter tolerance in the traffic policing must allow for all sources of delay variation in the transmission and upstream transport of the frame, i.e. variations in scheduling and delay. Hence, the source must transmit data frames at a nominal rate and with a jitter no greater than the jitter tolerance of the traffic policing function less any variation in transport delay between the source and that function.
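The conformance rule described above matches the virtual scheduling form of the leaky bucket algorithm and can be sketched as follows. This is a minimal illustration, not the policing function of any particular switch; the class name, parameter names and the use of integer microseconds are assumptions made for the example.

```python
class FramePolicer:
    """Frame-rate policer using the virtual scheduling leaky bucket form.

    interval_us: nominal time between frames on the connection.
    tolerance_us: how much earlier than expected a frame may arrive.
    """

    def __init__(self, interval_us, tolerance_us):
        self.interval_us = interval_us
        self.tolerance_us = tolerance_us
        self._expected_us = None  # expected arrival time of the next frame

    def conforms(self, arrival_us):
        """Return True if the frame may pass, False if it must be dropped."""
        if self._expected_us is None:
            # First frame on the connection always conforms.
            self._expected_us = arrival_us + self.interval_us
            return True
        if arrival_us < self._expected_us - self.tolerance_us:
            # Too early: drop, and leave the expected time unchanged so that
            # it is still derived from the last conforming arrival.
            return False
        self._expected_us = max(arrival_us, self._expected_us) + self.interval_us
        return True


# Hypothetical example: one frame per 1000 us with 100 us jitter tolerance.
policer = FramePolicer(interval_us=1000, tolerance_us=100)
results = [policer.conforms(t) for t in (0, 950, 1850, 1900, 5000)]
```

In this example the frame at 1850 is rejected because it arrives more than 100 microseconds before its expected time of 2000, while the frame at 1900 is accepted.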

The location for this scheduling of transfers is not specified by the DEFTNet concept, i.e. it is not required to be a function of the network interface. It may, therefore, be implemented in the higher level scheduling function that would normally be part of any firm real-time system component, and may be performed using any method that allows the sources to transmit within the limits imposed by the network. Hence, the sources are not restricted to using the cyclic executive, etc., and may be pre-emptively scheduled, etc. However, the scheduling does have to be applied to the individual data frames.

Support for standard Ethernet interfaces
Support for standard Ethernet interfaces meeting IEEE 802.3-2002 and later is obtained by using IEEE 802.1Q VLANs to identify the critical connections, implying that at least one VLAN Id must be reserved for non-critical traffic. Separation of the critical and non-critical traffic relies on the network switches providing separate queues for the critical and non-critical data, with the option to use several different queues for non-critical data. These are commonly provided by switches supporting IEEE 802.1Q CoS. However, the critical data must be identified from its VLAN Id, not the PCP value. The critical data queue associated with a switch output must also be serviced with simple priority over all others associated with the same output, and must be managed by tail-drop only.
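The queueing behaviour required of a switch output can be sketched as below. This is a simplified model under stated assumptions: queue capacities are counted in frames rather than bytes, there is a single non-critical queue, and the class name, VLAN Ids and capacities are hypothetical.

```python
from collections import deque


class OutputPort:
    """Switch output with a critical and a non-critical queue.

    Frames are classified by VLAN Id only (never by PCP value), the
    critical queue has simple priority, and both queues use tail-drop.
    """

    def __init__(self, critical_vlan_ids, critical_capacity, noncritical_capacity):
        self.critical_vlan_ids = set(critical_vlan_ids)
        self._critical = deque()
        self._noncritical = deque()
        self._critical_capacity = critical_capacity
        self._noncritical_capacity = noncritical_capacity

    def enqueue(self, vlan_id, frame):
        """Queue a frame; return False if it was tail-dropped."""
        if vlan_id in self.critical_vlan_ids:
            queue, capacity = self._critical, self._critical_capacity
        else:
            queue, capacity = self._noncritical, self._noncritical_capacity
        if len(queue) >= capacity:
            return False  # tail-drop: the arriving frame is discarded
        queue.append(frame)
        return True

    def dequeue(self):
        """Return the next frame to transmit, critical data first."""
        if self._critical:
            return self._critical.popleft()
        if self._noncritical:
            return self._noncritical.popleft()
        return None


# Hypothetical example: VLAN Ids 100 and 101 carry critical connections.
port = OutputPort({100, 101}, critical_capacity=2, noncritical_capacity=1)
port.enqueue(1, "n1")
port.enqueue(100, "c1")
next_frame = port.dequeue()  # "c1", although "n1" was queued first
```

In a compliant network the critical capacity would be at least the calculated buffer requirement, so the tail-drop branch would never be taken for critical frames that have passed policing.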

Support for standard interfaces that do not or cannot support VLANs may be provided by additional functionality in the switches, e.g. access control. For example, rather than apply a single VLAN Id to all untagged frames ingressing a port, packet inspection may be used to identify critical connections from, e.g. the destination IP address and port value – these parameters being transmitted with all frames associated with transmissions on a critical connection. The treatment of ARP frames to and from the IP destination as critical data would, however, be essential in this.
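A classification rule of this kind might be sketched as follows. The representation of a frame as a dictionary of already-parsed header fields, the field names, and the example address and port are all hypothetical; only the EtherType values for IPv4 (0x0800) and ARP (0x0806) are standard.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def is_critical(frame, critical_endpoints):
    """Classify an untagged frame as critical or non-critical.

    frame: dict of parsed header fields (hypothetical representation).
    critical_endpoints: set of (destination IP, destination port) pairs
    that identify the critical connections.
    """
    if frame["ethertype"] == ETHERTYPE_ARP:
        # ARP to or from a critical IP destination is treated as critical.
        critical_ips = {ip for ip, _port in critical_endpoints}
        return (
            frame["arp_sender_ip"] in critical_ips
            or frame["arp_target_ip"] in critical_ips
        )
    if frame["ethertype"] == ETHERTYPE_IPV4:
        return (frame["dst_ip"], frame.get("dst_port")) in critical_endpoints
    return False


# Hypothetical example: one critical connection to 10.0.0.5, port 5000.
endpoints = {("10.0.0.5", 5000)}
data_frame = {"ethertype": ETHERTYPE_IPV4, "dst_ip": "10.0.0.5", "dst_port": 5000}
arp_frame = {"ethertype": ETHERTYPE_ARP, "arp_sender_ip": "10.0.0.9", "arp_target_ip": "10.0.0.5"}
```

Both `data_frame` and `arp_frame` would be classified as critical here, whereas a frame to the same address on a different port would not.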

Use of Ethernet Switches
The main intention in the development of the DEFTNet concept was the support of Ethernet interfaces, so that COTS components could be leveraged in systems using this concept. However, the switches require functions that are not covered by Ethernet.

The method of proving that data is reliably delivered within its deadline assumes that the only queueing in the switch applied to critical data is directly associated with the output, i.e. there are no input buffers and no distributed buffering. It applies, therefore, to wire speed switches that provide either output buffers or shared, centralized buffers for these output queues. Wire speed operation and shared, centralized buffers are largely the norm in COTS Ethernet switches.

Traffic policing in the switches is an increasingly common feature of Ethernet switches. The niche application of Software Defined Networking (SDN) is one where traffic policing and the access control features are seen as significant. Hence it is possible that these requirements can be met by switches that implement version 1.3 or later of the OpenFlow SDN protocol.

There has been some evaluation of this version of OpenFlow by the Computer Science Dept. of Loughborough University. However, this was primarily in relation to the functionalities needed of switches for DEFTNet, and did not address the question of modelling or calculating the buffer requirement or buffering delays.