Pseudo K-tuple nucleotide composition

The pseudo K-tuple nucleotide composition, or PseKNC, was extended from Chou's pseudo amino acid composition (PseAAC). Both PseAAC and PseKNC are vector descriptors, but the former represents protein or peptide sequences while the latter represents DNA or RNA sequences.

PseAAC was proposed by Kuo-Chen Chou to avoid completely losing the sequence-order information of protein and peptide sequences. To address the same problem for DNA and RNA sequences, the pseudo K-tuple nucleotide composition (PseKNC) was proposed. For the convenience of the scientific community, a freely available web server called PseKNC and an open-source package called PseKNC-General were developed in 2013 and 2014, respectively; they can convert large-scale sequence datasets into pseudo nucleotide compositions with numerous choices of physicochemical property combinations. PseKNC-General can generate several modes of pseudo nucleotide composition, including the conventional k-tuple nucleotide composition, the Moreau–Broto autocorrelation coefficient, the Moran autocorrelation coefficient, the Geary autocorrelation coefficient, Type I PseKNC and Type II PseKNC.
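
As an illustration of the simplest mode listed above, the conventional k-tuple nucleotide composition can be sketched as the normalized frequency of every possible k-mer in a sequence. The following is a minimal sketch only and is not the code of the PseKNC web server or of PseKNC-General; the function name `ktuple_composition`, its parameters, and the choice to skip k-mers containing characters outside the alphabet are assumptions made for this example. It produces a vector of 4^k components for a DNA sequence and does not include the additional pseudo components that carry sequence-order information.

```python
from itertools import product


def ktuple_composition(seq, k=2, alphabet="ACGT"):
    """Return the normalized k-mer frequencies of seq as a list.

    Hypothetical helper for illustration: the output has
    len(alphabet) ** k components, ordered lexicographically
    by k-mer (e.g. AA, AC, AG, AT, CA, ... for k=2).
    """
    seq = seq.upper()
    # Enumerate every possible k-mer over the alphabet in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one position at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous symbols (e.g. N) are skipped.
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    # Normalize counts so that the components sum to 1.
    return [counts[m] / total for m in kmers]
```

For example, `ktuple_composition("ACGTACGT", k=2)` yields a 16-component vector whose entries sum to 1. For RNA sequences, the alphabet argument could be set to "ACGU".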

Like PseAAC in computational proteomics and proteome analysis, PseKNC has been increasingly used in computational genomics and in various genome analyses.