User:ConorMcD1/Recursive Inter-Network Architecture (RINA)

The Recursive InterNetwork Architecture (RINA) is a computer network architecture that unifies distributed computing and telecommunications. RINA's fundamental principle is that computer networking is just Inter-Process Communication or IPC. RINA reconstructs the overall structure of the Internet, forming a model that comprises a single repeating layer, the DIF (Distributed IPC Facility), which is the minimal set of components required to allow distributed IPC between application processes. RINA supports inherently and without the need of extra mechanisms mobility, multi-homing and Quality of Service, provides a secure and programmable environment, motivates for a more competitive marketplace and allows for a seamless adoption.

History and Motivation
The principles behind RINA, were first presented by John_Day_(computer_scientist) in his book “Patterns in Network Architecture: A return to Fundamentals”. This work is a start afresh, taking into account lessons learned in the 35 years of TCP/IP’s existence, as well as the lessons of OSI’s failure and the lessons of other network technologies of the past few decades, such as CYCLADES, DECnet or Xerox Network Systems.

From the early days of telephony to the present, the telecommunications and computing industries have evolved significantly. However, they have been following separate paths, without achieving full integration that can optimally support distributed computing; the paradigm shift from telephony to distributed applications is still not complete. Telecoms have been focusing on connecting devices, perpetuating the telephony model where devices and applications are the same. A look at the current Internet protocol suite shows many symptoms of this thinking :


 * The network routes data between interfaces of computers, as the public switched telephone network switched calls between phone terminals. However, it is not the source and destination interfaces that wish to communicate, but the distributed applications.
 * Applications have no way of expressing their desired service characteristics to the network, other than choosing a reliable (Transmission Control Protocol) or unreliable (User Datagram Protocol) type of transport. The network assumes that applications are homogeneous by providing only a single quality of service.
 * The network has no notion of application names, and has to use a combination of the interface address and transport layer port number to identify different applications. In other words, the network uses information on “where” an application is located to identify “which” application this is. Every time the application changes its point of attachment, it seems different to the network, greatly complicating multi-homing, mobility, and security.

Several attempts have been made to propose architectures that overcome the current Internet limitations, under the umbrella of the Future Internet research efforts. However, most proposals argue that requirements have changed and therefore the Internet is no longer capable to cope with them. While it is true that the environment in which the technologies that support the Internet today live is very different from when they where conceived in the late 1970s, changing requirements are not the only reason behind the Internet's problems related to multihoming, mobility, security or QoS to name a few. The root of the problems may be attributed to the fact the current Internet is based on a tradition focused on keeping the original ARPANET demo working and fundamentally unchanged, as illustrated by the following paragraphs.

1972. Multi-homing not supported by the ARPANET. In 1972 the Tinker Air Force Base wanted connections to two different IMPs (Interface Message Processors, the predecessors of today's routers) for redundancy. ARPANET designers realized that they couldn't support this feature because host addresses were the addresses of the IMP port number the host was connected to (borrowing from telephony). To the ARPANET two interfaces of the same host had different addresses therefore it had no way of knowing that they belong to the same host. The solution was obvious, as in operating systems, a logical address space naming the nodes (hosts and routers) was required on top of the physical interface address space. However, the implementation of this solution was left for future work, and it is still not done today: “IP addresses of all types are assigned to interfaces, not to nodes”. As a consequence, routing table sizes are orders of magnitude bigger than they would need to be, and multi-homing and mobility are complex to achieve, requiring both special protocols and point solutions.

1978. Transmission Control Protocol (TCP) split from the Internet Protocol (IP). Initial TCP versions performed the error and flow control (current TCP) and relaying and multiplexing (IP) functions in the same protocol. In 1978 TCP was split from IP, although the two layers had the same scope. This would not be a problem if: i) the two layers were independent and ii) the two layers didn't contain repeated functions. However none of both is right: in order to operate effectively IP needs to know what TCP is doing. IP fragmentation and the workaround of MTU discovery that TCP does in order to avoid it to happen is a clear example of this issue. In fact, as early as in 1987 the networking community was well aware of the IP fragmentation problems, to the point of considering it harmful. However, it was not understood as a symptom that TCP and IP were interdependent and therefore splitting it into two layers of the same scope was not a good decision.

1981. Watson's fundamental results ignored. Richard Watson in 1981 provided a fundamental theory of reliable transport, whereby connection management requires only timers bounded by a small factor of the Maximum Packet Lifetime (MPL). Based on this theory, Watson et al. developed the Delta-t protocol in which the state of a connection at the sender and receiver can be safely removed once the connection-state timers expire without the need for explicit removal messages. And new connections are established without an explicit handshaking phase. On the other hand, TCP uses both explicit handshaking as well as more limited timer-based management of the connection’s state. Had TCP incorporated Watson's results it would be more efficient, robust and secure, eliminating the use of SYNs and FINs and therefore all the associated complexities and vulnerabilities to attack (such as SYN flood).

1983. Internetwork layer lost, the Internet ceases to be an Internet. Early in 1972 the International Network Working Group (INWG) was created to bring together the nascent network research community. One of the early tasks it accomplished was voting an international network transport protocol, which was approved in 1976. A remarkable aspect is that the selected option, as well as all the other candidates, had an architecture composed of 3 layers of increasing scope: data link (to handle different types of physical medias), network (to handle different types of networks) and internetwork (to handle a network of networks), each layer with its own addresses. In fact when TCP/IP was introduced it run at the internetwork layer on top of the Network Control Program and other network technologies. But when NCP was shut down, TCP/IP took the network role and the internetwork layer was lost. As a result, the Internet ceased to be an Internet and became a concatenation of IP networks with an end-to-end transport layer on top. A consequence of this decision is the complex routing system required today, with both intra-domain and inter-domain routing happening at the network layer or the use of NAT, Network Address Translation, as a mechanism for allowing independent address spaces within a single network layer.



1983. First opportunity to fix addressing missed. The need for application names and distributed directories that mapped application names to internetwork addresses was well understood since mid 1970s. They were not there at the beginning since it was a major effort and there were very few applications, but they were expected to be introduced once the “host file” was automated (the host file was centrally maintained and mapped human-readable synonyms of addresses to its numeric value). However, application names were not introduced and DNS, the Domain Name System, was designed and deployed, continuing to use well-known ports to identify applications. The advent of the web and HTTP caused the need for application names, introducing URLs. However the URL format ties each application instance to a physical interface of a computer and a specific transport connection (since the URL contains the DNS name of an IP interface and TCP port number), making multi-homing and mobility very hard to achieve.

1986. Congestion collapse takes the Internet by surprise. Despite the fact that the problem of congestion control in datagram networks had been known since the very beginning (in fact there had been several publications during the 70s and early 80s, ) the congestion collapse in 1986 caught the Internet by surprise. What is worse, it was decided to adopt the congestion avoidance scheme from Ethernet networks with a few modifications, but it was put in TCP. The effectiveness of a congestion control scheme is determined by the time-to-notify, i.e. reaction time. Putting congestion avoidance in TCP maximizes the value of the congestion notification delay and its variance, making it the worst place it could be. Moreover, congestion detection is implicit, causing several problems: i) congestion avoidance mechanisms are predatory: by definition they need to cause congestion to act; ii) congestion avoidance mechanisms may be triggered when the network is not congested, causing a downgrade in performance.

1992. Second opportunity to fix addressing missed. In 1992 the Internet Architecture Board (IAB) produced a series of recommendations to resolve the scaling problems of the IPv4 based Internet: address space consumption and routing information explosion. Three types of solutions were proposed: introduce CIDR (Classless Inter-Domain Routing) to mitigate the problem, design the next version of IP (IPv7) based on CLNP (ConectionLess Network Protocol) and continue the research into naming, addressing and routing. CNLP was an OSI based protocol that addressed nodes instead of interfaces, solving the old multi-homing problem introduced by the ARPANET, and allowing for better routing information aggregation. CIDR was introduced but the IETF didn't accept an IPv7 based on CLNP. IAB reconsidered its decision and the IPng process started, culminating with IPv6. One of the rules for IPng was not to change the semantics of the IP address, which continues to name the interface perpetuating the multi-homing problem.

There are still more wrong decisions that have resulted in long-term problems for the current Internet, such as:


 * In 1988 IAB recommended using the Simple Network Management Protocol (SNMP) as the initial network management protocol for the Internet to later transition to the object-oriented approach of the Common Management Information Protocol (CMIP) . SNMP was a step backwards in network management, justified as a temporal measure while the required more sophisticated approaches were implemented, but the transition never happened.
 * Since IPv6 didn’t solve the multi-homing problem and naming the node was not accepted, the major theory pursued by the field is that the IP address semantics are overloaded with both identity and location information, and therefore the solution is to separate the two, leading to the work on [[Locator/Identifier Separation Protocol] (LISP). However all approaches based on LISP have scaling problems because i) it is based on a false distinction (identity vs. location) and ii) it is not routing packets to the end destination (LISP is using the locator for routing, which is an interface address; therefore the multi-homing problem is still there).
 * The recent discovery of bufferbloat due to the use of large buffers in the network. Since the beginning of the 80s it was already known that the buffer size should be the minimal to damp out transient traffic bursts, but no more since buffers increase the transit delay of packets within the network.
 * The inability to provide efficient solutions to security problems such as authentication, access control, integrity and confidentiality, since they were not part of the initial design. As stated in “experience has shown that it is difficult to add security to a protocol suite unless it is built into the architecture from the beginning”.

Terminology

 * Application Entity. A task within a DAP directly involved with exchanging application information with other DAPs.
 * Common Distributed Application Process (CDAP). CDAP enables distributed applications to deal with communications at an object level, rather than forcing applications to explicitly deal with serialization and input/output operations. CDAP provides the application protocol component of a Distributed Application Facility (DAF) that can be used to construct arbitrary distributed applications, of which the DIF is an example. CDAP provides a straightforward and unifying approach to sharing data over a network without having to create specialized protocols.
 * Distributed Application Facility (DAF). A collection of two or more cooperating DAPs in one or more processing systems, which exchange information using IPC and maintain shared state. In some Distributed Applications, all members will be the same, i.e. a homogeneous DAF, or may be different, a heterogeneous DAF.
 * Distributed Application Process (DAP). The instantiation of a computer program executing in a processing system intended to accomplish some purpose. A Distributed Application Process contains one or more tasks or Application-Entities, as well as functions for managing the resources (processor, storage, and IPC) allocated to this DAP.
 * Distributed IPC Facility (DIF), Layer. A collection of two or more DAPs cooperating to provide Interprocess Communication (IPC). A DIF is a DAF that does IPC. The DIF provides IPC services to Applications via a set of API primitives that are used to exchange information with the Application’s peer.
 * IPC Process (IPCP). An Application-Process, which is a member of a DIF and implements locally the functionality to support and manage IPC using multiple sub-tasks.
 * Processing System. The hardware and software capable of executing programs instantiated as DAPs that can coordinate with the equivalent of a “test and set” instruction, i.e. the tasks can all atomically reference the same memory.
 * Protocol Data Unit (PDU). The string of octets exchanged among the Protocol Machines (PM). PDUs contain two parts: the PCI, which is understood and interpreted by the DIF, and User-Data, that is incomprehensible to this PM and is passed to its user.
 * Resource Information Base (RIB). For the DAF, the RIB is the logical representation of the local repository of the objects. Each member of the DAF maintains a RIB. A Distributed Application may define a RIB to be its local representation of its view of the distributed application. From the point of view of the OS model, this is storage.
 * Service Data Unit (SDU). The amount of data passed across the (N)-DIF interface to be transferred to the destination application process. The integrity of an SDU is maintained by the (N)-DIF. An SDU may be fragmented or combined with other SDUs for sending as one or more PDUs.

Introduction to RINA
RINA is the result of an effort that tries to work out the general principles in computer networking that applies to everything. RINA is the specific architecture, implementation, testing platform and ultimately deployment of the theory. This theory is informally known as the Inter-Process Communication “IPC model” although it also deals with concepts and results that are generic for any distributed application and not just for networking.

The IPC model captures the common elements of distributed applications, called DAFs (Distributed Application Facilities), as illustrated in the Figure to the right. A DAF is composed by two or more Distributed Application Processes or DAPs, which collaborate to perform a task. These DAPs communicate using a single application protocol called CDAP (Common Distributed Application Protocol), which enables two DAPs to exchange structured data in the form of objects. All of the DAP’s externally visible information is represented by objects and structured in a Resource Information Base (RIB), which provides a naming schema and a logical organization to the objects known by the DAP (for example a naming tree). CDAP allows the DAPs to perform six remote operations on the peer’s objects (create, delete, read, write, start and stop).

In order to exchange information, DAPs need an underlying facility that provides communication services to them. This facility is another DAF whose task is to provide and manage Inter Process Communication services over a certain scope ; hence this DAF is called DIF: Distributed IPC Facility - the DIF can be thought of as a layer. A DIF enables a DAP to allocate flows to one or more DAPs, by just providing the names of the targeted DAPs and the characteristics required for the flow (bounds on data loss and delay, in-order delivery of data, reliability, etc.). DAPs may not trust the DIF they are using, therefore may decide to protect their data before writing it to the flow - for example using encryption - via the SDU (Service Data Unit) Protection module.



DIFs can also be the users of other underlying DIFs, creating in this way the recursive structure of the RINA architecture. The DAPs that are members of a DIF are called IPC Processes or IPCPs. They have the same generic DAP structure shown in Figure 1, plus some specific tasks to provide and manage IPC. These tasks, as shown in Figure 2, can be divided into three categories: data transfer, data transfer control and layer management. The elements are ordered in increasing complexity and frequency of use, with elements at the far left being used the most (per packet processing) but the least complex, and elements to the right being not often used, but very complex. All the layers provide the same functions and have the same structure and components, however these components are configured via policies in order to adapt to different operating environments.

As depicted in Figure 2 RINA networks are usually structured in DIFs of increasing scope, starting from the so-called lower layers and going up closer to the applications. A provider network can be formed by a hierarchy of DIFs multiplexing and aggregating traffic from upper layers into the provider’s backbone. None of the provider internal layers need to be externally visible. Multi-provider DIFs (such as the public Internet or others) float on top of the ISP layers. Only three types of systems are required: hosts (which contain applications), interior routers (systems that are internal to a layer) and border routers (systems at the edges of a layer, which go one layer up or down). In short, RINA has the following features:


 * It builds on a very basic premise, yet fresh perspective that networking is not a layered set of different functions but rather a single layer of distributed Inter-Process Communication (IPC) that repeats over different scopes. Each instance of this repeating IPC layer implements the same functions/mechanisms but policies are tuned to operate over different ranges of the performance space (e.g. capacity, delay, loss).


 * It is based on a comprehensive theory of networking; it does not represent another patch, or point-solution to a piece of the problem. RINA does not propose to simply add a new “session layer” to perform some extra functionality for bridging ISP networks. Instead it takes a clean slate approach and begins with a comprehensive general theory of IPC where the number of IPC layers (DIFs) may vary at different parts of the Internet depending on the range of the resource allocation problem that must be addressed. The greater the operating ranges in a network, the more IPC layers it may have. Thus configuring the appropriate number of IPC layers allows for more predictable services to be delivered to users.


 * This repeating structure scales indefinitely, thus it avoids current problems of growing routing tables, and supports features such as multi-homing and mobility, with little or no cost. By indefinitely we mean that the nature of RINA does not impose any limits. There may be physical limits and other constraints.


 * An application process using a DIF only knows the name of the destination application process. It has no knowledge of addresses and there are no so-called “well-known ports”. Joining a DIF requires that the new member must be authenticated according to the policies of this particular facility. This yields a far more secure architecture.


 * Stacking DIFs on top of each other allows networks to be built from smaller and more manageable layers of limited scope. This divide-and-conquer strategy gives providers more resource management options than just over-provisioning. It also provides the basis for operating subnetworks at much higher utilization than in the current Internet.


 * RINA leverages the well-known concept of separating mechanism from policy in operating systems . Applying this separation to network protocols allows a DIF to provide a common minimal set of mechanisms that once instantiated with the appropriate policies allows any transport solution to be realised . Not only the transport functions of a DIF benefit from this approach, but also other ones such as management, authentication or access control; making the DIF a fully-configurable container capable of effectively operating on top of heterogeneous physical medias and to provide differentiated levels of QoS to different types of applications.


 * DIFs can be configured to not only provide the fundamental services of the traditional networking lower layers but also the services of application relaying (e.g. mail distribution and similar services), transaction processing, content distribution and peer-to-peer. This removes the barrier created by the Transport Layer in the current Internet, opening potential new markets for ISPs to provide IPC services directly to their customers leveraging their expertise in resource management of lower layers.


 * It turns out that private networks (with private addresses) are the norm. IPC processes are identified by addresses internal to the DIF and public networks are simply a degenerate case of a private network. This lays the foundation for major competition and innovation and avoids the rigidness of the current Internet structure. There's not just a single network where everybody has to be attached to; with RINA network operators, service providers and users have a choice of which networks to provide and which networks to join.

Research Projects
From the publishing of the PNA book in 2008 until 2014 a lot of RINA research and development work has been done. There is a clear need for an international authority that coordinates the different ongoing activities, make sure their results are integrated in the basic reference model but at the same time are able to incorporate new knowledge or fix inconsistencies. An informal group known as the Pouzin Society (PSOC) – named after Louis Pouzin, the inventor of datagrams and connectionless networking - has been taking this role.

BU Research Team
The RINA research team at Boston University is lead by John Day. BU has been awarded a number of grants from the National Science Foundation in order to continue investigating the fundamentals of RINA, develop an open source prototype implementation over UDP/IP for Java and experiment with it on top of the GENI infrastruct. BU is also a member of the Pouzin Society and an active contributor to the FP7 IRATI and PRISTINE projects. In addition to this, BU has incorporated the RINA concepts and theory in their computer networking courses.

FP7 IRATI
IRATI is an FP7-funded project with 5 partners: i2CAT, Nextworks, iMinds, Interoute and Boston University, whose main goal is to produce an open source RINA implementation for the Linux OS on top of Ethernet,. FP7 IRATI has already open-sourced the first release of the RINA implementation, called as the project “IRATI”. The implementation will be further enhanced by the PRISTINE and IRINA projects.

FP7 PRISTINE
PRISTINE is an FP7-funded project with 15 partners: WIT-TSSG, i2CAT, Nextworks, Telefonica I+D, Thales, Nexedi, B-ISDN, Atos, University of Oslo, Juniper Networks, Brno University, IMT-TSP, CREATE-NET, iMinds and UPC; whose main goal is to explore the programmability aspects of RINA to implement innovative policies for congestion control, resource allocation, routing, security and network management.

GEANT3+ Open Call winner IRINA
IRINA was funded by the GEANT3+ open call, and is a project with four partners: iMinds, WIT-TSSG, i2CAT and Nextworks. The main goal of IRINA is to study the use of the Recursive InterNetwork Architecture (RINA) as the foundation of the next generation NREN and GÉANT network architectures. IRINA builds on the open source RINA prototype developed by the FP7 IRATI project. IRINA will compare RINA against current networking state of the art and relevant clean-slate architecture under research; perform a use-case study of how RINA could be better used in the NREN scenarios; and showcase a laboratory trial of the study.