Universally unique identifier

A Universally Unique Identifier (UUID) is a 128-bit label used for information in computer systems. The term Globally Unique Identifier (GUID) is also used, mostly in Microsoft systems.

When generated according to the standard methods, UUIDs are, for practical purposes, unique. Their uniqueness does not depend on a central registration authority or coordination between the parties generating them, unlike most other numbering schemes. While the probability that a UUID will be duplicated is not zero, it is generally considered close enough to zero to be negligible.

Thus, anyone can create a UUID and use it to identify something with near certainty that the identifier does not duplicate one that has already been, or will be, created to identify something else. Information labeled with UUIDs by independent parties can therefore be later combined into a single database or transmitted on the same channel, with a negligible probability of duplication.

Adoption of UUIDs is widespread, with many computing platforms providing support for generating them and for parsing their textual representation.

History
In the 1980s, Apollo Computer originally used UUIDs in the Network Computing System (NCS). Later, the Open Software Foundation (OSF) used UUIDs for their Distributed Computing Environment (DCE). The design of the DCE UUIDs was partly based on the NCS UUIDs, whose design was in turn inspired by the (64-bit) unique identifiers defined and used pervasively in Domain/OS, an operating system designed by Apollo Computer. Later, the Microsoft Windows platforms adopted the DCE design as "Globally Unique IDentifiers" (GUIDs).

RFC 4122 registered a URN namespace for UUIDs and recapitulated the earlier specifications, with the same technical content. When RFC 4122 was published as a proposed IETF standard in July 2005, the ITU had also standardized UUIDs, based on the previous standards and early versions of RFC 4122. On May 7, 2024, RFC 9562 was published, introducing three new "versions" and clarifying some ambiguities.

Standards
UUIDs are standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE).

UUIDs are documented as part of ISO/IEC 11578:1996 "Information technology – Open Systems Interconnection – Remote Procedure Call (RPC)" and more recently in ITU-T Rec. X.667 | ISO/IEC 9834-8:2014.

The Internet Engineering Task Force (IETF) published the Standards-Track RFC 9562, produced by the "Revise Universally Unique Identifier Definitions Working Group", as a revision of RFC 4122. RFC 4122 is technically equivalent to ITU-T Rec. X.667 | ISO/IEC 9834-8, but is now obsolete.

Binary wire format
A UUID is a 128-bit label. Initially, Apollo Computer designed the UUID with the following wire format:


{| class="wikitable"
|+ The legacy wire format
! Name !! Offset !! Length !! Description
|-
| time_high || 0x00 || 4 octets / 32 bits
| rowspan="2" | The first 6 octets are the number of four-microsecond (μs) units of time that have passed since 1980-01-01 00:00 UTC. The time 2^48 × 4 μs after 1980 started was 2015-09-05 05:58:26.84262 UTC. Thus, the last time at which UUIDs could be generated in this original format was in 2015.
|-
| time_low || 0x04 || 2 octets / 16 bits
|-
| reserved || 0x06 || 2 octets / 16 bits || These octets are reserved for future use.
|-
| family || 0x08 || 1 octet / 8 bits || This octet is an address family.
|-
| node || 0x09 || 7 octets / 56 bits || These octets are a host ID in the form allowed by the specified address family.
|}
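The rollover date quoted for the legacy format can be checked with a few lines of arithmetic. The following is a minimal sketch; the constant and function names are illustrative and not part of any Apollo specification.

```python
from datetime import datetime, timedelta, timezone

# Epoch and tick size of the legacy timestamp, as described in the table.
NCS_EPOCH = datetime(1980, 1, 1, tzinfo=timezone.utc)
TICK_MICROSECONDS = 4
TIMESTAMP_BITS = 48  # time_high (32 bits) + time_low (16 bits)


def ncs_timestamp_to_datetime(ticks: int) -> datetime:
    """Convert a count of 4-microsecond ticks since 1980 into a UTC datetime."""
    return NCS_EPOCH + timedelta(microseconds=ticks * TICK_MICROSECONDS)


# First tick count that no longer fits in the 48-bit field.
rollover = ncs_timestamp_to_datetime(2 ** TIMESTAMP_BITS)
print(rollover.isoformat())  # 2015-09-05T05:58:26.842624+00:00
```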

Later, the UUID was extended by combining the legacy family field with the new variant field. Because the family field had only ever used values from 0 to 13, it was decided that a UUID with the most significant bit of that field set to 0 was a legacy UUID. This gives the following table for the family group:


{| class="wikitable"
|+ Family / variant field
! MSB 0 !! MSB 1 !! MSB 2 !! Legacy family field value range !! In hex !! Description
|-
| 0 || x || x || 0–127 (only 0–13 are used) || 0x00–0x7f || The legacy Apollo NCS UUID
|-
| 1 || 0 || x || 128–191 || 0x80–0xbf || OSF DCE UUID
|-
| 1 || 1 || 0 || 192–223 || 0xc0–0xdf || Microsoft COM / DCOM UUID
|-
| 1 || 1 || 1 || 224–255 || 0xe0–0xff || Reserved for future definition
|}
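The family / variant table translates directly into bit masks on octet 8 of the UUID. The sketch below is one possible way to write that test; the function name and the returned labels are illustrative, and the example UUID is an arbitrary value.

```python
import uuid


def classify_variant(octet8: int) -> str:
    """Classify a UUID by the most significant bits of octet 8 (the legacy family field)."""
    if not 0 <= octet8 <= 0xFF:
        raise ValueError("expected a single octet")
    if octet8 & 0x80 == 0x00:  # 0xxx xxxx
        return "Apollo NCS"
    if octet8 & 0xC0 == 0x80:  # 10xx xxxx
        return "OSF DCE"
    if octet8 & 0xE0 == 0xC0:  # 110x xxxx
        return "Microsoft COM/DCOM"
    return "Reserved"          # 111x xxxx


example = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(classify_variant(example.bytes[8]))  # octet 8 is 0xa7, so "OSF DCE"
```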

The legacy Apollo NCS UUID has the format described in the previous table. The OSF DCE UUID variant is described in RFC 9562. The Microsoft COM / DCOM UUID has its variant described in the Microsoft documentation.

To assist human reading, the groups are typically presented in hexadecimal, with the groups separated by the dash (-) symbol.

Textual representation
Because a UUID is a 128-bit label, it can be represented in different formats.

In most cases, UUIDs are represented as hexadecimal values. The most used format is the 8-4-4-4-12 format, xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, where every x represents 4 bits. Other well-known formats are the 8-4-4-4-12 format with braces, {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}, as in Microsoft's systems (e.g. Windows), or xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, where all hyphens are removed. In some cases, the 32 hexadecimal digits are also written with the "0x" prefix or the "h" suffix to indicate hexadecimal values. The format with hyphens was introduced with the newer variant system. Before that, the legacy Apollo format used a slightly different, dot-separated format. The first part is the time (time_high and time_low combined). The reserved field is skipped. The family field comes directly after the first dot, for example 0d (13 in decimal) for DDS (Data Distribution Service). The remaining parts, each separated with a dot, are the node bytes.

The lowercase form of the hexadecimal values is the generally preferred format. Specifically in some contexts such as those defined in ITU-T Rec. X.667, lowercase is required when the text is generated, but the uppercase version must also be accepted.

A UUID can be represented as a 128-bit integer. For example, the UUID 550e8400-e29b-41d4-a716-446655440000 can also be represented as 113059749145936325402354257176981405696. Note that the signed and unsigned interpretations of this integer differ if the first bit of the UUID is set to 1.

A UUID can be represented as a 128-bit binary number. For example, the UUID 550e8400-e29b-41d4-a716-446655440000 can also be represented as 01010101000011101000010000000000111000101001101101000001110101001010011100010110010001000110011001010101010001000000000000000000.
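These conversions can be reproduced with Python's standard `uuid` module. The sketch below uses the UUID whose bit pattern is the binary string quoted above; the variable names are illustrative.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

as_int = u.int                     # unsigned 128-bit integer
as_bits = format(as_int, "0128b")  # zero-padded 128-character binary string

# The signed (two's complement) reading differs only when the top bit is set.
as_signed = as_int - (1 << 128) if as_int >> 127 else as_int

print(as_int)
print(as_bits)
print(as_signed == as_int)         # True here, because the first bit is 0
print(uuid.UUID(int=as_int) == u)  # True: the integer form round-trips
```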

RFC 4122 registers the "uuid" URN namespace. This makes it possible to make URNs out of UUIDs, like urn:uuid:550e8400-e29b-41d4-a716-446655440000. The normal 8-4-4-4-12 format is used for this. It is also possible to make an OID URN out of a UUID, like urn:oid:2.25.113059749145936325402354257176981405696. In that case, the unsigned decimal format is used. The "uuid" URN is recommended over the "oid" URN.
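Both URN forms can be produced with Python's standard `uuid` module. This is a minimal sketch using an arbitrary example UUID; it assumes the OID arc 2.25 (the arc under which ITU-T Rec. X.667 places UUID-derived object identifiers), which the text above does not spell out.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

uuid_urn = u.urn                   # "urn:uuid:" followed by the 8-4-4-4-12 form
oid_urn = f"urn:oid:2.25.{u.int}"  # assumed arc 2.25, then the unsigned decimal value

print(uuid_urn)  # urn:uuid:550e8400-e29b-41d4-a716-446655440000
print(oid_urn)

# uuid.UUID() also accepts the urn:uuid: prefix when parsing.
parsed = uuid.UUID(uuid_urn)
print(parsed == u)  # True
```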

Variants
The variant field indicates the format of the UUID (and in case of the legacy UUID also the address family used for the node field). The following variants are defined:


 * The Apollo NCS variant (indicated by the one-bit pattern 0xxx₂) is for backwards compatibility with the now-obsolete Apollo Network Computing System 1.5 UUID format developed around 1988. Though different in detail, the similarity with modern UUIDv1 is evident. The variant bits in the current UUID specification coincide with the high bits of the address family octet in NCS UUIDs. Though the address family could hold values in the range 0..255, only the values 0..13 were ever defined. Accordingly, the bit pattern 0xxx₂ avoids conflicts with historical NCS UUIDs, should any still exist in databases. This variant defines "families" as its subtype.
 * The OSF DCE variant (10xx₂) covers what are referred to as RFC 4122/DCE 1.1 UUIDs, or "Leach–Salz" UUIDs, after the authors of the original Internet Draft. This variant defines "versions" as its subtype.
 * The Microsoft COM/DCOM variant (110x₂) is characterized in the RFC as "reserved, Microsoft Corporation backward compatibility" and was used for early GUIDs on the Microsoft Windows platform.
 * The Reserved variant space is not currently used by any specification.

Versions
The OSF DCE variant defines eight "versions" in the standard, and each version may be more appropriate than the others in specific use cases. The version is indicated by the value of the higher nibble (higher 4 bits, or higher hexadecimal digit) of the 7th byte of the UUID. In hex, this is the character after the second dash. For example, the UUID 550e8400-e29b-41d4-a716-446655440000 is version 4, because the digit after the second dash, in 41d4, is 4.
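The rule above can be checked in a few lines. This sketch reads the version both from the raw bytes and from the text form; `version_of` is an illustrative helper, and the sample UUID is an arbitrary value.

```python
import uuid


def version_of(text: str) -> int:
    """Return the version: the high nibble of the 7th byte (index 6)."""
    raw = uuid.UUID(text).bytes
    return raw[6] >> 4


sample = "550e8400-e29b-41d4-a716-446655440000"
print(version_of(sample))         # 4
print(sample.split("-")[2][0])    # 4, the character after the second dash
print(uuid.UUID(sample).version)  # 4, the standard library agrees
```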

Versions 1 and 6 (date-time and MAC address)
Version 1 concatenates the 48-bit MAC address of the "node" (that is, the computer generating the UUID) with a 60-bit timestamp, being the number of 100-nanosecond intervals since midnight 15 October 1582 Coordinated Universal Time (UTC), the date on which the Gregorian calendar was first adopted by the bulk of Europe, which at that time was dominated by Roman Catholic Spain. RFC 4122 states that the time value rolls over around 3400 AD, depending on the algorithm used, which implies that the 60-bit timestamp is a signed quantity. However, some software, such as the libuuid library, treats the timestamp as unsigned, putting the rollover time in 5236 AD. The rollover time as defined by ITU-T Rec. X.667 is 3603 AD.
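As a rough illustration of the timestamp arithmetic, the sketch below converts a count of 100-nanosecond intervals into a calendar date and evaluates the signed (2^59) and unsigned (2^60) rollover points. The helper name is illustrative, and sub-microsecond precision is discarded.

```python
import uuid
from datetime import datetime, timedelta

GREGORIAN_EPOCH = datetime(1582, 10, 15)  # start of the version-1 timestamp (UTC)


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert a count of 100 ns intervals since 1582-10-15 to a naive UTC datetime."""
    return GREGORIAN_EPOCH + timedelta(microseconds=ticks // 10)


u = uuid.uuid1()
print(ticks_to_datetime(u.time))  # roughly the current UTC time

signed_rollover = ticks_to_datetime(2 ** 59)    # timestamp read as a signed value
unsigned_rollover = ticks_to_datetime(2 ** 60)  # timestamp read as an unsigned value
print(signed_rollover.year, unsigned_rollover.year)  # 3409 5236
```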

A 13-bit or 14-bit "uniquifying" clock sequence extends the timestamp in order to handle cases where the processor clock does not advance fast enough, or where there are multiple processors and UUID generators per node. When UUIDs are generated faster than the system clock can advance, the lower bits of the timestamp field can be generated by incrementing them every time a UUID is generated, to simulate a high-resolution timestamp. With each version 1 UUID corresponding to a single point in space (the node) and time (intervals and clock sequence), the chance of two properly generated version-1 UUIDs being unintentionally the same is practically nil. Since the time and clock sequence total 74 bits, 2^74 (about 1.9 × 10^22, or 19 sextillion) version-1 UUIDs can be generated per node ID, at a maximal average rate of 163 billion per second per node ID.
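The capacity figures follow from the field widths. A short check, assuming the 14-bit clock sequence and one timestamp tick per 100 nanoseconds (the constant names are illustrative):

```python
TIMESTAMP_BITS = 60
CLOCK_SEQ_BITS = 14
TICKS_PER_SECOND = 10_000_000  # one tick every 100 ns

# Every (timestamp, clock sequence) pair yields a distinct UUID for a given node.
total_per_node = 2 ** (TIMESTAMP_BITS + CLOCK_SEQ_BITS)

# Per second: every tick combined with every clock sequence value.
max_rate_per_second = TICKS_PER_SECOND * 2 ** CLOCK_SEQ_BITS

print(f"{total_per_node:.3e}")      # 1.889e+22
print(f"{max_rate_per_second:,}")   # 163,840,000,000
```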

In contrast to other UUID versions, version-1 and -2 UUIDs based on MAC addresses from network cards rely for their uniqueness in part on an identifier issued by a central registration authority, namely the Organizationally Unique Identifier (OUI) part of the MAC address, which is issued by the IEEE to manufacturers of networking equipment. The uniqueness of version-1 and version-2 UUIDs based on network-card MAC addresses also depends on network-card manufacturers properly assigning unique MAC addresses to their cards, which like other manufacturing processes is subject to error. Additionally some operating systems permit the end user to customise the MAC address, notably OpenWRT.

Usage of the node's network card MAC address for the node ID means that a version-1 UUID can be tracked back to the computer that created it. Documents can sometimes be traced to the computers where they were created or edited through UUIDs embedded into them by word processing software. This privacy hole was used when locating the creator of the Melissa virus.

RFC 4122 does allow the MAC address in a version-1 (or 2) UUID to be replaced by a random 48-bit node ID, either because the node does not have a MAC address, or because it is not desirable to expose it. In that case, the RFC requires that the least significant bit of the first octet of the node ID be set to 1. This corresponds to the multicast bit in MAC addresses, and setting it serves to differentiate UUIDs where the node ID is randomly generated from UUIDs based on MAC addresses from network cards, which typically have unicast MAC addresses.
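A random node ID with the multicast bit set can be sketched as follows. `random_node_id` and `MULTICAST_BIT` are illustrative names; the `node` parameter of `uuid.uuid1` is part of Python's standard library.

```python
import secrets
import uuid

# Least significant bit of the first octet of a 48-bit node ID.
MULTICAST_BIT = 1 << 40


def random_node_id() -> int:
    """Return a random 48-bit node ID marked as not being a real unicast MAC address."""
    return secrets.randbits(48) | MULTICAST_BIT


node = random_node_id()
u = uuid.uuid1(node=node)
print(u, f"{node:012x}")
```

The last group of the printed UUID equals the node ID, and the second hexadecimal digit of that group is always odd because the multicast bit is set.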

Version 6 is the same as version 1 except that the timestamp bits are reordered so that the most significant bits come first. This lets systems sort UUIDs in order of creation, which was not possible with version 1.
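The reordering can be sketched as a conversion from a version-1 UUID. This is an illustrative helper rather than a standard-library function, and the sample value is an arbitrary version-1 UUID.

```python
import uuid


def v1_to_v6(u1: uuid.UUID) -> uuid.UUID:
    """Rearrange the timestamp of a version-1 UUID into version-6 order."""
    if u1.version != 1:
        raise ValueError("expected a version-1 UUID")
    timestamp = u1.time                         # 60-bit count of 100 ns intervals
    time_high = (timestamp >> 28) & 0xFFFFFFFF  # most significant 32 bits
    time_mid = (timestamp >> 12) & 0xFFFF       # next 16 bits
    time_low = timestamp & 0x0FFF               # least significant 12 bits
    value = (time_high << 96) | (time_mid << 80) | (0x6 << 76) | (time_low << 64)
    value |= u1.int & 0xFFFFFFFFFFFFFFFF        # variant, clock sequence and node unchanged
    return uuid.UUID(int=value)


v1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846")
print(v1_to_v6(v1))  # 1ec9414c-232a-6b00-b3c8-9f6bdeced846
```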

Version 2 (date-time and MAC address, DCE security version)
RFC 4122 reserves version 2 for "DCE security" UUIDs, but it does not provide any details. For this reason, many UUID implementations omit version 2. However, the specification of version-2 UUIDs is provided by the DCE 1.1 Authentication and Security Services specification.

Version-2 UUIDs are similar to version 1, except that the least significant 8 bits of the clock sequence are replaced by a "local domain" number, and the least significant 32 bits of the timestamp are replaced by an integer identifier meaningful within the specified local domain. On POSIX systems, local-domain numbers 0 and 1 are for user ids (UIDs) and group ids (GIDs) respectively, and other local-domain numbers are site-defined. On non-POSIX systems, all local domain numbers are site-defined.

The ability to include a 40-bit domain/identifier in the UUID comes with a tradeoff. On the one hand, 40 bits allow about 1 trillion domain/identifier values per node ID. On the other hand, with the clock value truncated to the 28 most significant bits, compared to 60 bits in version 1, the clock in a version 2 UUID will "tick" only once every 429.49 seconds, a little more than 7 minutes, as opposed to every 100 nanoseconds for version 1. And with a clock sequence of only 6 bits, compared to 14 bits in version 1, only 64 unique UUIDs per node/domain/identifier can be generated per 7-minute clock tick, compared to 16,384 clock sequence values for version 1. Thus, Version 2 may not be suitable for cases where UUIDs are required, per node/domain/identifier, at a rate exceeding about one every seven minutes.
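The numbers in this trade-off follow from the field widths, as this short check shows (the constant names are illustrative):

```python
TICKS_PER_SECOND = 10_000_000  # version-1 clock: one tick every 100 ns

# The low 32 bits of the timestamp are replaced, so the clock advances
# only once per 2^32 ticks.
v2_tick_seconds = 2 ** 32 / TICKS_PER_SECOND

domain_identifier_values = 2 ** 40  # 8-bit local domain + 32-bit identifier
v2_clock_sequence_values = 2 ** 6   # 14 bits minus the 8 replaced bits
v1_clock_sequence_values = 2 ** 14

print(f"{v2_tick_seconds:.4f} s")         # 429.4967 s
print(f"{v2_tick_seconds / 60:.2f} min")  # 7.16 min
print(f"{domain_identifier_values:,}")    # 1,099,511,627,776
print(v2_clock_sequence_values, v1_clock_sequence_values)  # 64 16384
```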

Versions 3 and 5 (namespace name-based)
Version-3 and version-5 UUIDs are generated by hashing a namespace identifier and name. Version 3 uses MD5 as the hashing algorithm, and version 5 uses SHA-1.

The namespace identifier is itself a UUID. The specification provides UUIDs to represent the namespaces for URLs, fully qualified domain names, object identifiers, and X.500 distinguished names; but any desired UUID may be used as a namespace designator.

To determine the version-3 UUID corresponding to a given namespace and name, the UUID of the namespace is transformed to a string of bytes, concatenated with the input name, then hashed with MD5, yielding 128 bits. Then 6 or 7 bits are replaced by fixed values: the 4-bit version (e.g. 0011₂ for version 3), and the 2- or 3-bit UUID "variant" (e.g. 10₂ indicating an RFC 4122 UUID, or 110₂ indicating a legacy Microsoft GUID). Since 6 or 7 bits are thus predetermined, only 121 or 122 bits contribute to the uniqueness of the UUID.

Version-5 UUIDs are similar, but SHA-1 is used instead of MD5. Since SHA-1 generates 160-bit digests, the digest is truncated to 128 bits before the version and variant bits are replaced.
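The procedure can be written out by hand and compared with Python's built-in `uuid.uuid3` and `uuid.uuid5`. This is a sketch for the 10₂ variant only; `name_based_uuid` is an illustrative helper and "example.com" is an arbitrary name.

```python
import hashlib
import uuid


def name_based_uuid(namespace: uuid.UUID, name: str, version: int) -> uuid.UUID:
    """Build a version-3 (MD5) or version-5 (SHA-1) UUID step by step."""
    data = namespace.bytes + name.encode("utf-8")
    if version == 3:
        digest = hashlib.md5(data).digest()
    elif version == 5:
        digest = hashlib.sha1(data).digest()
    else:
        raise ValueError("only versions 3 and 5 are name-based")
    value = int.from_bytes(digest[:16], "big")        # SHA-1 is truncated to 128 bits
    value = (value & ~(0xF << 76)) | (version << 76)  # overwrite the 4-bit version
    value = (value & ~(0x3 << 62)) | (0b10 << 62)     # overwrite the 2-bit variant
    return uuid.UUID(int=value)


by_hand = name_based_uuid(uuid.NAMESPACE_DNS, "example.com", 5)
print(by_hand)
print(by_hand == uuid.uuid5(uuid.NAMESPACE_DNS, "example.com"))  # True
```

Because the inputs fully determine the output, repeating the call with the same namespace and name always yields the same UUID.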

Version-3 and version-5 UUIDs have the property that the same namespace and name will map to the same UUID. However, neither the namespace nor the name can be determined from the UUID, even if one of them is specified, except by brute-force search. RFC 4122 recommends version 5 (SHA-1) over version 3 (MD5), and warns against use of UUIDs of either version as security credentials.

Version 4 (random)
A version 4 UUID is randomly generated. As in other UUIDs, 4 bits are used to indicate version 4, and 2 or 3 bits to indicate the variant (10₂ or 110₂ for variants 1 and 2 respectively). Thus, for variant 1 (that is, most UUIDs) a random version 4 UUID will have 6 predetermined variant and version bits, leaving 122 bits for the randomly generated part, for a total of 2^122, or 5.3 × 10^36 (5.3 undecillion), possible version-4 variant-1 UUIDs. There are half as many possible version 4, variant 2 UUIDs (legacy GUIDs) because there is one less random bit available, 3 bits being consumed for the variant.

Per RFC 4122, the seventh octet's most significant 4 bits indicate which version the UUID adheres to. This means that the first hexadecimal digit in the third group is always 4 in UUIDv4s. Visually, this looks like xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, where M is the UUID version field. The upper two or three bits of digit N encode the variant. For example, a random UUID of version 4, variant 2 has M equal to 4 and N equal to c or d.
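The fixed digits are easy to observe on a freshly generated value. A small sketch using Python's standard `uuid.uuid4`, which produces variant-1 UUIDs:

```python
import uuid

u = uuid.uuid4()
text = str(u)

version_digit = text[14]  # first digit of the third group (M)
variant_digit = text[19]  # first digit of the fourth group (N)

print(text)
print(version_digit)       # always 4
print(variant_digit)       # one of 8, 9, a, b for the 10xx bit pattern
print(f"{2 ** 122:.2e}")   # 5.32e+36 possible random values
```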

Version 7 (timestamp and random)
Version 7 UUIDs (UUIDv7) are designed for keys in high-load databases and distributed systems.

UUIDv7 begins with a 48-bit big-endian Unix Epoch timestamp with approximately millisecond granularity. The timestamp can be shifted by any time shift value. Directly after the timestamp follows the version nibble, which must have a value of 7. The variant bits have to be 10₂. The remaining 74 bits consist of an optional randomly seeded counter (at least 12 bits but no longer than 42 bits) and random data.
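The layout can be sketched as below. This is a simplified illustration that fills all 74 remaining bits with random data and omits the optional counter, so it does not guarantee ordering within the same millisecond; `uuid7_sketch` is an illustrative name, not a standard-library function.

```python
import secrets
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a version-7 style UUID: 48-bit timestamp, version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & (2 ** 48 - 1)) << 80  # 48-bit millisecond timestamp, big-endian
    value |= 0x7 << 76                       # version nibble
    value |= secrets.randbits(12) << 64      # 12 random bits (no counter in this sketch)
    value |= 0b10 << 62                      # variant bits
    value |= secrets.randbits(62)            # 62 random bits
    return uuid.UUID(int=value)


first = uuid7_sketch(1_700_000_000_000)
second = uuid7_sketch(1_700_000_000_001)
print(first, second)
print(first < second)  # True: a later timestamp sorts later
```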

Two counter rollover handling methods can be used together:


 * Seeding the most significant (leftmost) bit of the counter with zero, so that it acts as a guard bit.
 * Incrementing the timestamp ahead of the actual time and reinitializing the counter when it overflows.

In a DBMS, a UUIDv7 generator can be shared between threads (tied to a table or to a DBMS instance) or can be thread-local (with worse monotonicity, locality and performance).

Version 8 (custom)
Version 8 only has two requirements:


 * The variant bits have to be 10₂.
 * The version nibble has to be the value of 8.

Those requirements tell the system that it is a version 8 UUID. The remaining 122 bits are up to the vendor to customize. The difference from version 4 is that the 122 bits in version 4 are random, whereas in version 8 they follow vendor-specific rules.

Special UUIDs
The "nil" UUID is the UUID 00000000-0000-0000-0000-000000000000; that is, all bits set to zero.

The "max" UUID, sometimes also called the "omni" UUID, is the UUID ffffffff-ffff-ffff-ffff-ffffffffffff; that is, all bits set to one.

Encoding
The binary encoding of UUIDs varies between systems. Variant 1 UUIDs, nowadays the most common variant, are encoded in a big-endian format. For example, 00112233-4455-6677-8899-aabbccddeeff is encoded as the bytes 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff.

Variant 2 UUIDs, historically used in Microsoft's COM/OLE libraries, use a little-endian format for the first three components of the UUID and a big-endian format for the last two, so the encoding appears mixed-endian when compared with the string form. For example, 00112233-4455-6677-c899-aabbccddeeff is encoded as the bytes 33 22 11 00 55 44 77 66 c8 99 aa bb cc dd ee ff.
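Python's standard `uuid` module exposes both byte orders, which makes the difference easy to see. The sketch below applies both encodings to the same arbitrary value purely to contrast the layouts; `bytes` is the big-endian form and `bytes_le` the mixed-endian form.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

big_endian = u.bytes       # fields stored in the order they are written
mixed_endian = u.bytes_le  # first three fields byte-swapped, last two unchanged

print(big_endian.hex(" "))    # 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
print(mixed_endian.hex(" "))  # 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff

# Decoding must use the same convention that was used for encoding.
print(uuid.UUID(bytes_le=mixed_endian) == u)  # True
```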

Collisions
A collision occurs when the same UUID is generated more than once and assigned to different referents. In the case of standard version-1 and version-2 UUIDs using unique MAC addresses from network cards, collisions are unlikely to occur, with an increased possibility only when an implementation varies from the standards, either inadvertently or intentionally.

By contrast, collisions can occur even without implementation problems for version-1 and version-2 UUIDs that use randomly generated node IDs, for hash-based version-3 and version-5 UUIDs, and for random version-4 UUIDs, albeit with a probability so small that it can normally be ignored. This probability can be computed precisely based on analysis of the birthday problem.

For example, the number of random version-4 UUIDs which need to be generated in order to have a 50% probability of at least one collision is 2.71 quintillion, computed as follows:


 * $$n \approx \frac{1}{2} + \sqrt{\frac{1}{4} + 2 \times \ln(2) \times 2^{122}} \approx 2.71 \times 10^{18}.$$

This number is equivalent to generating 1 billion UUIDs per second for about 86 years. A file containing this many UUIDs, at 16 bytes per UUID, would be about 43 exabytes.

The smallest number of version-4 UUIDs which must be generated for the probability of finding a collision to be p is approximated by the formula


 * $$\sqrt{2 \times 2^{122} \times \ln\frac{1}{1 - p}}.$$

Thus, the probability to find a duplicate within 103 trillion version-4 UUIDs is one in a billion.
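The figures in this section can be reproduced from the approximation formula. A minimal sketch, with illustrative names, using `math.log1p` to keep precision for very small p:

```python
import math

RANDOM_BITS = 122
SPACE = 2 ** RANDOM_BITS  # number of possible version-4 variant-1 UUIDs


def uuids_for_collision_probability(p: float) -> float:
    """Approximate number of random UUIDs needed for a collision probability of p."""
    return math.sqrt(2 * SPACE * -math.log1p(-p))  # -log1p(-p) == ln(1 / (1 - p))


n_half = uuids_for_collision_probability(0.5)
n_billionth = uuids_for_collision_probability(1e-9)
seconds_per_year = 365.25 * 24 * 3600

print(f"{n_half:.3g}")                            # 2.71e+18
print(f"{n_half / 1e9 / seconds_per_year:.0f}")   # 86 years at 1 billion per second
print(f"{n_half * 16 / 1e18:.1f}")                # 43.4 exabytes at 16 bytes each
print(f"{n_billionth:.3g}")                       # 1.03e+14, about 103 trillion
```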

Collisions have occurred when manufacturers assign a default UUID to a product, such as a motherboard, and then fail to over-write the default UUID later in the manufacturing process. For example, UUID 03000200-0400-0500-0006-000700080009 occurs on many different units of Gigabyte-branded motherboards.

Uses
Significant uses include ext2/ext3/ext4 filesystem userspace tools (e2fsprogs uses libuuid provided by util-linux), LVM, LUKS encrypted partitions, GNOME, KDE, and macOS, most of which are derived from the original implementation by Theodore Ts'o. One of the uses of UUIDs in Solaris (using the Open Software Foundation implementation) is identification of a running operating system instance for the purpose of pairing crash dump data with a Fault Management Event in the case of a kernel panic.

The "partition label" and the "partition UUID" are both stored in the superblock. They are both part of the file system rather than of the partition. For example, ext2–4 contain a UUID, while NTFS or FAT32 do not. The superblock is a part of the file system and is thus fully contained within the partition; hence running dd if=/dev/sda1 of=/dev/sdb1 leaves both sda1 and sdb1 with the same label and UUID.

There are several flavors of GUIDs used in Microsoft's Component Object Model (COM):


 * IID – interface identifier; (The ones that are registered on a system are stored in the Windows Registry at [HKEY_CLASSES_ROOT\Interface] )
 * CLSID – class identifier; (Stored at [HKEY_CLASSES_ROOT\CLSID])
 * LIBID – type library identifier; (Stored at [HKEY_CLASSES_ROOT\TypeLib] )
 * CATID – category identifier; (its presence on a class identifies it as belonging to certain class categories, listed at [HKEY_CLASSES_ROOT\Component Categories] )

UUIDs are commonly used as a unique key in database tables. The NEWID function in Microsoft SQL Server version 4 Transact-SQL returns standard random version-4 UUIDs, while the NEWSEQUENTIALID function returns 128-bit identifiers similar to UUIDs which are committed to ascend in sequence until the next system reboot. The Oracle Database SYS_GUID function does not return a standard GUID, despite the name. Instead, it returns a 16-byte (128-bit) RAW value based on a host identifier and a process or thread identifier, somewhat similar to a GUID. PostgreSQL contains a UUID datatype and can generate most versions of UUIDs through the use of functions from modules. MySQL provides a UUID function, which generates standard version-1 UUIDs.

The random nature of standard UUIDs of versions 3, 4, and 5, and the ordering of the fields within standard versions 1 and 2, may create problems with database locality or performance when UUIDs are used as primary keys. For example, in 2002 Jimmy Nilsson reported a significant improvement in performance with Microsoft SQL Server when the version-4 UUIDs being used as keys were modified to include a non-random suffix based on system time. This so-called "COMB" (combined time-GUID) approach made the UUIDs non-standard and significantly more likely to be duplicated, as Nilsson acknowledged, but Nilsson only required uniqueness within the application. By reordering and encoding version 1 and 2 UUIDs so that the timestamp comes first, insertion performance loss can be averted.

Some web frameworks, such as Laravel, have support for "timestamp first" UUIDs that may be efficiently stored in an indexed database column. The result is a COMB UUID in version 4 format, but with the first 48 bits making up a timestamp laid out as in UUIDv1. More precisely specified formats based on the COMB UUID idea include:


 * "ULID", which drops the 4 bits used to indicate version 4, and uses a base32 encoding by default.
 * UUID versions 6 through 8, a formal proposal of three COMB UUID formats.