Traffic classification

Traffic classification is an automated process which categorises computer network traffic according to various parameters (for example, based on port number or protocol) into a number of traffic classes. Each resulting traffic class can be treated differently in order to differentiate the service implied for the data generator or consumer.

Typical uses
Packets are classified so that the network scheduler can process them differently. Once a traffic flow has been classified as using a particular protocol, a predetermined policy can be applied to it and to other flows, either to guarantee a certain quality (as with VoIP or media streaming services) or to provide best-effort delivery. Classification may be applied at the ingress point (the point at which traffic enters the network, typically an edge device) with a granularity that allows traffic management mechanisms to separate traffic into individual flows and to queue, police and shape them differently.

Classification methods
Classification is achieved by various means.

Port numbers

 * Fast
 * Low resource-consuming
 * Supported by many network devices
 * Does not inspect the application-layer payload, so it does not compromise users' privacy
 * Useful only for applications and services that use fixed port numbers
 * Easy to circumvent by changing the port number that the application uses
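
A minimal sketch of port-based classification, assuming a small hand-written lookup table. The table contents, the class names and the function name classify_by_port are illustrative assumptions, not taken from any particular device or product.

```python
# Hypothetical port-to-class table; real deployments use far larger lists.
PORT_CLASSES = {
    25: "email",
    53: "dns",
    80: "web",
    443: "web",
    5060: "voip-signalling",
}


def classify_by_port(src_port: int, dst_port: int) -> str:
    """Return the class of the first known port, checking the destination first."""
    for port in (dst_port, src_port):
        if port in PORT_CLASSES:
            return PORT_CLASSES[port]
    # Applications on non-standard ports end up here, which is the
    # main weakness of this method.
    return "unknown"
```

The lookup is a single dictionary access per packet, which is why the method is fast and cheap, but any application that moves to a port outside the table is reported as unknown.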

Deep Packet Inspection

 * Inspects the actual payload of the packet
 * Detects applications and services regardless of the port number on which they operate
 * Slow
 * Requires a lot of processing power
 * Signatures must be kept up to date, as the applications change very frequently
 * Encryption makes this method impossible in many cases

Matching bit patterns of data to those of known protocols is a simple, widely used technique. For example, the handshake phase of the BitTorrent protocol can be matched by checking whether a packet payload begins with a byte of value 19, followed by the 19-byte string 'BitTorrent protocol'.
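
The BitTorrent check described above can be sketched as a simple prefix test on the payload bytes. The function and constant names are illustrative assumptions; a real DPI engine would hold many such signatures and would also deal with TCP segmentation.

```python
BT_PSTR = b"BitTorrent protocol"  # the 19-byte protocol string


def looks_like_bittorrent_handshake(payload: bytes) -> bool:
    """True if the payload starts with the byte 19 followed by the protocol string."""
    return (
        len(payload) >= 1 + len(BT_PSTR)
        and payload[0] == len(BT_PSTR)  # length prefix, value 19
        and payload[1:1 + len(BT_PSTR)] == BT_PSTR
    )
```

For instance, looks_like_bittorrent_handshake(bytes([19]) + b"BitTorrent protocol" + bytes(48)) is true, while an HTTP request line is rejected.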

A comprehensive comparison of various network traffic classifiers, which depend on Deep Packet Inspection (PACE, OpenDPI, 4 different configurations of L7-filter, NDPI, Libprotoident, and Cisco NBAR), is shown in the Independent Comparison of Popular DPI Tools for Traffic Classification.

Statistical classification

 * Relies on statistical analysis of attributes such as byte frequencies, packet sizes and packet inter-arrival times.
 * Very often uses machine learning algorithms, such as k-means, naive Bayes, C4.5, C5.0, J48, or random forest
 * Fast technique (compared to deep packet inspection classification)
 * Can detect the class of previously unknown applications
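
As a rough illustration of the statistical approach, the sketch below extracts two flow attributes (mean packet size and mean inter-arrival time) and assigns a flow to the class with the nearest centroid. Nearest-centroid matching is used here only because it is short; it is not one of the algorithms listed above. All names are hypothetical, and the features are left unnormalised for brevity.

```python
from math import dist
from statistics import mean


def flow_features(sizes, timestamps_ms):
    """Return (mean packet size in bytes, mean inter-arrival time in ms)."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return (mean(sizes), mean(gaps) if gaps else 0.0)


def train_centroids(labelled_flows):
    """Average the feature tuples of each label: label -> centroid tuple."""
    grouped = {}
    for label, features in labelled_flows:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(features, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))
```

Because only packet sizes and timings are used, the payload is never read, so the same features remain available when the payload is encrypted.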

Encrypted traffic classification
Traffic today is increasingly complex and increasingly encrypted, so encrypted traffic has to be classified by means other than the classic approach (IP traffic analysis by probes in the core network). One way to achieve this is to perform the classification using traffic descriptors derived from connection traces on the radio interface.

The same classification problem arises for multimedia traffic. Methods based on neural networks, support vector machines, statistics, and nearest neighbors have generally been found to work well for this kind of classification, although in specific cases some methods outperform others; for example, neural networks work better when the whole observation set is taken into account.

Implementation
Both the Linux network scheduler and Netfilter contain logic to identify and mark or classify network packets.

Typical traffic classes
Operators often distinguish two broad types of network traffic: time-sensitive and best-effort.

Time-sensitive traffic
Time-sensitive traffic is traffic that the operator expects to deliver on time. This includes VoIP, online gaming, video conferencing, and web browsing. Traffic management schemes are typically tailored so that the quality of service of these selected uses is guaranteed, or at least prioritized over other classes of traffic. This can be accomplished by not shaping this traffic class, or by prioritizing sensitive traffic above other classes.

Best-effort traffic
Best-effort traffic is all other kinds of traffic. This is traffic that the ISP deems not to be sensitive to quality of service metrics (jitter, packet loss, latency). Typical examples are peer-to-peer and email applications. Traffic management schemes are generally tailored so that best-effort traffic gets whatever capacity is left after time-sensitive traffic has been served.
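
One simple way to give best-effort traffic "what is left" is strict priority queuing, sketched below under the assumption of exactly two classes. The class StrictPriorityScheduler and its method names are hypothetical; production schedulers usually add safeguards so that the lower class is not starved completely.

```python
from collections import deque


class StrictPriorityScheduler:
    """Always serve time-sensitive packets before best-effort ones."""

    # Service order: earlier names are drained first.
    ORDER = ("time-sensitive", "best-effort")

    def __init__(self):
        self.queues = {name: deque() for name in self.ORDER}

    def enqueue(self, packet, traffic_class):
        self.queues[traffic_class].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if all queues are empty."""
        for name in self.ORDER:
            if self.queues[name]:
                return self.queues[name].popleft()
        return None
```

A best-effort packet that arrived first still waits until every queued time-sensitive packet has been sent.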

File sharing
Peer-to-peer file sharing applications are often designed to use any and all available bandwidth, which impacts QoS-sensitive applications (like online gaming) that use comparatively small amounts of bandwidth. P2P programs can also suffer from download strategy inefficiencies, namely downloading files from any available peer regardless of link cost. The applications use ICMP and regular HTTP traffic to discover servers and download directories of available files.

In 2002, Sandvine Incorporated determined, through traffic analysis, that P2P traffic accounted for up to 60% of traffic on most networks. This showed, in contrast to previous studies and forecasts, that P2P had become mainstream.

P2P protocols can be, and often are, designed so that the resulting packets are harder to identify (to avoid detection by traffic classifiers), and with enough robustness that they do not depend on specific QoS properties in the network (in-order packet delivery, jitter, etc.; typically this is achieved through increased buffering and reliable transport, with the user experiencing increased download time as a result). The encrypted BitTorrent protocol, for example, relies on obfuscation and randomized packet sizes to avoid identification. File sharing traffic can be appropriately classified as best-effort traffic. At peak times, when sensitive traffic is at its height, download speeds will decrease. However, since P2P downloads are often background activities, this affects the subscriber experience little, so long as download speeds increase to their full potential when all other subscribers hang up their VoIP phones. Exceptions are real-time P2P VoIP and P2P video streaming services, which need permanent QoS and use excessive overhead and parity traffic to enforce this as far as possible.

Some P2P applications can be configured to act as self-limiting sources, serving as a traffic shaper configured to the user's (as opposed to the network operator's) traffic specification.

Some vendors advocate managing clients rather than specific protocols, particularly for ISPs. With per-client (that is, per-customer) management, a client who chooses to use their fair share of the bandwidth for P2P applications can do so, but an abusive application clogs only that client's own bandwidth and cannot affect the bandwidth used by other customers.
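
One possible reading of "fair share" is max-min fairness, sketched below. The source does not name a specific allocation scheme, so the choice of algorithm, the function name max_min_fair_share and the example figures are all assumptions made for illustration.

```python
def max_min_fair_share(capacity, demands):
    """Split capacity among clients: small demands are met in full,
    and the remainder is divided equally among the heavier users.

    demands maps client -> demanded rate; returns client -> allocated rate.
    """
    allocation = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        share = remaining / len(pending)
        client, demand = pending[0]
        if demand <= share:
            # This client wants less than an equal share: satisfy it fully.
            allocation[client] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining client wants more than an equal share.
            for heavy_client, _ in pending:
                allocation[heavy_client] = share
            break
    return allocation
```

With a capacity of 100 units and demands of 10, 30 and 200, the two light clients receive what they asked for and the heavy client is capped at the remaining 60, whatever protocol it is running.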