Nucleotide universal IDentifier

In molecular biology, the nucleotide universal IDentifier (nuID) is an identifier designed to uniquely and globally identify oligonucleotide microarray probes.

Background
Microarray oligonucleotide probes with identical sequences may carry different identifiers from one manufacturer to another, and even between versions of the same company's microarray. Sometimes the same identifier is reused for a completely different oligonucleotide, which creates ambiguity and can lead to misidentification of the genes hybridizing to that probe. It also makes it difficult to interpret data and to integrate different batches of data.

nuID was designed to solve these problems. It is a unique, non-degenerate encoding scheme that can serve as a universal representation of an oligonucleotide across manufacturers. The design rests on the observation that the raw sequence of the oligonucleotide is the true definition of a probe's identity. The encoding algorithm therefore transforms the sequence itself, uniquely and non-degenerately, into a compact identifier (a lossless compression). A redundancy check (checksum) is added to validate the integrity of the identifier. These two steps, encoding plus checksum, yield a nuID: a unique, non-degenerate, permanent, robust and efficient representation of the probe sequence. For commercial applications that require the sequence to remain confidential, an encryption scheme can be applied to the nuID. nuIDs have been used to annotate Illumina microarrays, and these annotations can be downloaded from the Bioconductor website. The scheme is also applicable more generally as a source-independent naming convention for oligomers.
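The two steps described above, encoding plus checksum, can be illustrated with a short Python sketch. This is not the lumi implementation and is not guaranteed to reproduce published nuIDs: the 64-character alphabet, the padding rule, the checksum modulus and the function names encode and decode are all assumptions chosen for illustration. The sketch packs each base into 2 bits, so three bases map to one base-64 character, and prepends one check character.

```python
# Illustrative sketch only: alphabet, padding rule and checksum are assumptions,
# not the published nuID specification.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789_."
)  # 64 symbols, one per 6-bit value
BASES = "ACGT"  # A=0, C=1, G=2, T=3 (2 bits per base)


def encode(seq: str) -> str:
    """Encode a nucleotide sequence as check character + base-64 body."""
    seq = seq.upper().replace("U", "T")
    if any(base not in BASES for base in seq):
        raise ValueError("sequence must contain only A, C, G, T or U")
    pad = (-len(seq)) % 3  # number of 'A's appended to reach a multiple of 3
    digits = [BASES.index(base) for base in seq] + [0] * pad
    # Three base-4 digits form one 6-bit value (0..63).
    codes = [
        digits[i] * 16 + digits[i + 1] * 4 + digits[i + 2]
        for i in range(0, len(digits), 3)
    ]
    # The check value carries both a checksum and the padding length (0..62).
    check = (sum(codes) % 21) * 3 + pad
    return ALPHABET[check] + "".join(ALPHABET[c] for c in codes)


def decode(ident: str) -> str:
    """Recover the sequence, raising ValueError if the identifier is invalid."""
    if not ident:
        raise ValueError("empty identifier")
    values = [ALPHABET.index(ch) for ch in ident]  # ValueError on bad characters
    check, codes = values[0], values[1:]
    checksum, pad = divmod(check, 3)
    if checksum != sum(codes) % 21:
        raise ValueError("checksum mismatch")
    digits = []
    for c in codes:
        digits += [c >> 4, (c >> 2) & 3, c & 3]
    if pad:
        if len(digits) < pad or any(digits[-pad:]):
            raise ValueError("invalid padding")
        digits = digits[:-pad]
    return "".join(BASES[d] for d in digits)
```

Under these assumptions a 50-base probe becomes an 18-character identifier (one check character plus 17 body characters), and decode rejects any string whose check character does not match its body. A checksum of this kind detects many corruptions but not all of them.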

The nuID scheme has three significant advantages over using the oligonucleotide sequence directly as an identifier. First, it is more compact because of the base-64 encoding. Second, it has built-in error detection and self-identification. Third, it can be encrypted when the sequences should not be disclosed. For more details, please refer to the nuID paper. Implementations of the nuID encoding and decoding algorithms can be found in the lumi package or at