History of network traffic models

Design of robust and reliable networks and network services relies on an understanding of the traffic characteristics of the network. Throughout history, different models of network traffic have been developed and used for evaluating existing and proposed networks and services.

Demands on computer networks are not entirely predictable. Performance modeling is necessary for deciding the quality of service (QoS) level. Performance models in turn, require accurate traffic models that have the ability to capture the statistical characteristics of the actual traffic on the network. Many traffic models have been developed based on traffic measurement data. If the underlying traffic models do not efficiently capture the characteristics of the actual traffic, the result may be the under-estimation or over-estimation of the performance of the network. This impairs the design of the network. Traffic models are hence, a core component of any performance evaluation of networks and they need to be very accurate.

“Teletraffic theory is the application of mathematics to the measurement, modeling, and control of traffic in telecommunications networks. The aim of traffic modeling is to find stochastic processes to represent the behavior of traffic. Working at the Copenhagen Telephone Company in the 1910s, A. K. Erlang famously characterized telephone traffic at the call level by certain probability distributions for arrivals of new calls and their holding times. Erlang applied the traffic models to estimate the telephone switch capacity needed to achieve a given call blocking probability. The Erlang blocking formulas had tremendous practical interest for public carriers because telephone facilities (switching and transmission) involved considerable investments. Over several decades, Erlang’s work stimulated the use of queuing theory, and applied probability in general, to engineer the public switched telephone network. Teletraffic theory for packet networks has seen considerable progress in recent decades. Significant advances have been made in long-range dependence, wavelet, and multifractal approaches. At the same time, traffic modeling continues to be challenged by evolving network technologies and new multimedia applications. For example, wireless technologies allow greater mobility of users. Mobility must be an additional consideration for modeling traffic in wireless networks. Traffic modeling is an ongoing process without a real end. Traffic models represent our best current understanding of traffic behavior, but our understanding will change and grow over time.”

Network traffic models usage
Measurements are useful and necessary for verifying the actual network performance. However, measurements do not have the level of abstraction that makes traffic models useful. Traffic models can be used for hypothetical problem solving whereas traffic measurements only reflect current reality. In probabilistic terms, a traffic trace is a realization of a random process, whereas a traffic model is a random process. Thus, traffic models have universality. A traffic trace gives insight about a particular traffic source, but a traffic model gives insight about all traffic sources of that type. Traffic models have three major uses. One important use of traffic models is to properly dimension network resources for a target level of QoS. It was mentioned earlier that Erlang developed models of voice calls to estimate telephone switch capacity to achieve a target call blocking probability. Similarly, models of packet traffic are needed to estimate the bandwidth and buffer resources to provide acceptable packet delays and packet loss probability. Knowledge of the average traffic rate is not sufficient. It is known from queuing theory that queue lengths increase with the variability of traffic. Hence, an understanding of traffic burstiness or variability is needed to determine sufficient buffer sizes at nodes and link capacities. A second important use of traffic models is to verify network performance under specific traffic controls. For example, given a packet scheduling algorithm, it would be possible to evaluate the network performance resulting from different traffic scenarios. For another example, a popular area of research is new improvements to the TCP congestion avoidance algorithm. It is critical that any algorithm is stable and allows multiple hosts to share bandwidth fairly, while sustaining a high throughput. Effective evaluation of the stability, fairness, and throughput of new algorithms would not be possible without realistic source models. A third important use of traffic models is admission control. In particular, connection oriented networks such as ATM depends on admission control to block new connections to maintain QOS guarantees. A simple admission strategy could be based on the peak rate of a new connection; a new connection is admitted if the available bandwidth is greater than the peak rate. However, that strategy would be overly conservative because a variable bit-rate connection may need significantly less bandwidth than its peak rate. A more sophisticated admission strategy is based on effective bandwidths. The source traffic behavior is translated into an effective bandwidth between the peak rate and average rate, which is the specific amount of bandwidth required to meet a given QoS constraint. The effective bandwidth depends on the variability of the source.

Network traffic models steps
Traffic modeling consists of three steps: Parameter estimation is based on a set of statistics (e.g. mean, variance, density function or auto covariance function, multifractal characteristics) that are measured or calculated from observed data. The set of statistics used in the inference process depends on the impact they may have in the main performance metrics of interest.
 * (i) selection of one or more models that may provide a good description of the traffic type
 * (ii) estimation of parameters for the selected models
 * (iii) statistical testing for election of one of the considered models and analysis of its suitability to describe the traffic type under analysis.

Network traffic models parameter
In recent years several types of traffic behavior, that can have significant impact on network performance, were discovered: long-range dependence, self-similarity and, more recently, multifractality. There are two major parameters generated by network traffic models: packet length distributions and packet inter-arrival distributions. Other parameters, such as routes, distribution of destinations, etc., are of less importance. Simulations that use traces generated by network traffic models usually examine a single node in the network, such as a router or switch; factors that depend on specific network topologies or routing information are specific to those topologies and simulations. The problem of packet size distribution is fairly well-understood today. Existing models of packet sizes have proven to be valid and simple. Most packet size models do not consider the problem of order in packet sizes. For example, a TCP datagram in one direction is likely to be followed by a tiny ACK in the other direction about half of one Round-Trip Time (RTT) later. The problem of packet inter-arrival distribution is much more difficult. Understanding of network traffic has evolved significantly over the years, leading to a series of evolutions in network traffic models.

Self-similar traffic models
One of the earliest objections to self-similar traffic models was the difficulty in mathematical analysis. Existing self-similar models could not be used in conventional queuing models. This limitation was rapidly overturned and workable models were constructed. Once basic self-similar models became feasible, the traffic modeling community settled into the “detail” concerns. TCP’s congestion control algorithm complicated the matter of modeling traffic, so solutions needed to be created. Parameter estimation of self-similar models was always difficult, and recent research addresses ways to model network traffic without fully understanding it. When self-similar traffic models were first introduced, there were no efficient, analytically tractable processes to generate the models. Ilkka Norros devised a stochastic process for a storage model with self-similar input and constant bit-rate output. While this initial model was continuous rather than discrete, the model was effective, simple, and attractive. All self-similar traffic models suffer from one significant drawback: estimating the self-similarity parameters from real network traffic requires huge amounts of data and takes extended computation. The most modern method, wavelet multi-resolution analysis, is more efficient, but still very costly. This is undesirable in a traffic model. SWING uses a surprisingly simple model for the network traffic analysis and generation. The model examines characteristics of users, Request-Response Exchanges (RREs), connections, individual packets, and the overall network. No attempt is made to analyze self-similarity characteristics; any self-similarity in the generated traffic comes naturally from the aggregation of many ON/OFF sources. The Pareto distribution process produces independent and identically distributed (IID) inter-arrival times. In general if X is a random variable with a Pareto distribution, then the probability that X is greater than some number x is given by P(X > x) = (x/x_m)-k for all x ≥ x_m where k is a positive parameter and x_m is the minimum possible value of Xi The probability distribution and the density functions are represented as: F(t) = 1 – (α/t)β where α,β ≥ 0 & t ≥ α f(t) = βαβ t-β-1 The parameters β and α are the shape and location parameters, respectively. The Pareto distribution is applied to model self-similar arrival in packet traffic. It is also referred to as double exponential, power law distribution. Other important characteristics of the model are that the Pareto distribution has infinite variance, when β ≥ 2 and achieves infinite mean, when β ≤ 1. The Weibull distributed process is heavy-tailed and can model the fixed rate in ON period and ON/OFF period lengths, when producing self-similar traffic by multiplexing ON/OFF sources. The distribution function in this case is given by: F(t) = 1 – e-(t/β)α t > 0 and the density function of the weibull distribution is given as: f(t) = αβ-α tα-1 e -(t/β)α t > 0 where parameters β ≥ 0 and α > 0 are the scale and location parameters respectively. The Weibull distribution is close to a normal distribution. For β ≤ 1 the density function of the distribution is L-shaped and for values of β > 1, it is bell-shaped. This distribution gives a failure rate increasing with time. For β > 1, the failure rate decreases with time. At, β = 1, the failure rate is constant and the lifetimes are exponentially distributed. The Autoregressive model is one of a group of linear prediction formulas that attempt to predict an output y_n of a system based on previous set of outputs {y_k} where k < n and inputs x_n and {x_k} where k < n. There exist minor changes in the way the predictions are computed based on which, several variations of the model are developed. Basically, when the model depends only on the previous outputs of the system, it is referred to as an auto-regressive model. It is referred to as a Moving Average Model (MAM), if it depends on only the inputs to the system. Finally, Autoregressive-Moving Average models are those that depend both on the inputs and the outputs, for prediction of current output. Autoregressive model of order p, denoted as AR(p), has the following form: Xt = R1 Xt-1 + R2 Xt-2 + ... + Rp Xt-p + Wt where Wt is the white noise, Ri are real numbers and Xt are prescribed correlated random numbers. The auto-correlation function of the AR(p) process consists of damped sine waves depending on whether the roots (solutions) of the model are real or imaginary. Discrete Autoregressive Model of order p, denoted as DAR(p), generates a stationary sequence of discrete random variables with a probability distribution and with an auto-correlation structure similar to that of the Autoregressive model of order p.[3] Regression models define explicitly the next random variable in the sequence by previous ones within a specified time window and a moving average of a white noise.[5] Transform-expand-sample (TES) models are non-linear regression models with modulo-1 arithmetic. They aim to capture both auto-correlation and marginal distribution of empirical data. TES models consist of two major TES processes: TES+ and TES–. TES+ produces a sequence which has positive correlation at lag 1, while TES– produces a negative correlation at lag 1.
 * Fractional Brownian motion:
 * SWING:
 * Pareto distribution process:
 * Weibull distribution process:
 * Autoregressive models:
 * Regression models:
 * TES models :

Non-self-similar traffic models
Early traffic models were derived from telecommunications models and focused on simplicity of analysis. They generally operated under the assumption that aggregating traffic from a large number of sources tended to smooth out bursts; that burstiness decreased as the number of traffic sources increased. One of the most widely used and oldest traffic models is the Poisson Model. The memoryless Poisson distribution is the predominant model used for analyzing traffic in traditional telephony networks. The Poisson process is characterized as a renewal process. In a Poisson process the inter-arrival times are exponentially distributed with a rate parameter λ: P{An ≤ t} = 1 – exp(-λt). The Poisson distribution is appropriate if the arrivals are from a large number of independent sources, referred to as Poisson sources. The distribution has a mean and variance equal to the parameter λ. The Poisson distribution can be visualized as a limiting form of the binomial distribution, and is also used widely in queuing models. There are a number of interesting mathematical properties exhibited by Poisson processes. Primarily, superposition of independent Poisson processes results in a new Poisson process whose rate is the sum of the rates of the independent Poisson processes. Further, the independent increment property renders a Poisson process memoryless. Poisson processes are common in traffic applications scenarios that consist of a large number of independent traffic streams. The reason behind the usage stems from Palm's Theorem which states that under suitable conditions, such large number of independent multiplexed streams approach a Poisson process as the number of processes grows, but the individual rates decrease in order to keep the aggregate rate constant. Traffic aggregation need not always result in a Poisson process. The two primary assumptions that the Poisson model makes are: 1. The number of sources is infinite 2. The traffic arrival pattern is random. In the compound Poisson model, the base Poisson model is extended to deliver batches of packets at once. The inter-batch arrival times are exponentially distributed, while the batch size is geometric. Mathematically, this model has two parameters, λ, the arrival rate, and ρ in (0,1), the batch parameter. Thus, the mean number of packets in a batch is 1/ ρ, while the mean inter-batch arrival time is 1/ λ. Mean packet arrivals over time period t are tλ/ ρ. The compound Poisson model shares some of the analytical benefits of the pure Poisson model: the model is still memoryless, aggregation of streams is still (compound) Poisson, and the steady-state equation is still reasonably simple to calculate, although varying batch parameters for differing flows would complicate the derivation. Markov models attempt to model the activities of a traffic source on a network, by a finite number of states. The accuracy of the model increases linearly with the number of states used in the model. However, the complexity of the model also increases proportionally with increasing number of states. An important aspect of the Markov model - the Markov Property, states that the next (future) state depends only on the current state. In other words, the probability of the next state, denoted by some random variable Xn+1, depends only on the current state, indicated by Xn, and not on any other state Xi, where i<n. The set of random variables referring to different states {Xn} is referred to as a Discrete Markov Chain. Another attempt at providing a bursty traffic model is found in Jain and Routhier’s Packet Trains model. This model was principally designed to recognize that address locality applies to routing decisions; that is, packets that arrive near each other in time are frequently going to the same destination. In generating a traffic model that allows for easier analysis of locality, the authors created the notion of packet trains, a sequence of packets from the same source, traveling to the same destination (with replies in the opposite direction). Packet trains are optionally sub-divided into tandem trailers. Traffic between a source and a destination usually consists of a series of messages back and forth. Thus, a series of packets go one direction, followed by one or more reply packets, followed by a new series in the initial direction. Traffic quantity is then a superposition of packet trains, which generates substantial bursty behavior. This refines the general conception of the compound Poisson model, which recognized that packets arrived in groups, by analyzing why they arrive in groups, and better characterizing the attributes of the group. Finally, the authors demonstrate that packet arrival times are not Poisson distributed, which led to a model that departs from variations on the Poisson theme. The packet train model is characterized by the following parameters and their associated probability distributions: The train model is designed for analyzing and categorizing real traffic, not for generating synthetic loads for simulation. Thus, little claim has been made about the feasibility of packet trains for generating synthetic traffic. Given accurate parameters and distributions, generation should be straightforward, but derivation of these parameters is not addressed.
 * Poisson distribution model:
 * Compound Poisson traffic models:
 * Markov and Embedded Markov Models:
 * Packet trains:
 * mean inter-train arrival time
 * mean inter-car arrival time
 * mean truck size (in the tandem trailer model)
 * mean train size.

Traffic models today
NS-2 is a popular network simulator; PackMimeHTTP is a web traffic generator for NS-2, published in 2004. It does take long-range dependencies into account, and uses the Weibull distribution. Thus, it relies on heavy tails to emulate true self-similarity. Over most time scales, the effort is a success; only a long-running simulation would allow a distinction to be drawn. This follows suggestions from where it is suggested that self-similar processes can be represented as a superposition of many sources each individually modeled with a heavy-tailed distribution. It is clear that self-similar traffic models are in the mainstream.