Talk:Protein sequencing

DNA sequences
"However, it is rare that the DNA sequence of a newly isolated protein will be known, " - can someone who knows more about protein sequencing explain this please? As of 2008 we have 40 animal genomes with a total size of 40 Gb at hand, and thousands of bacteria have been sequenced. I don't see who is still working on proteins that don't have cDNA or genomic sequences available yet... --Maximilianh (talk) 23:14, 4 November 2008 (UTC)

Possible copyright issue
The content of this page appears to have been cut and pasted from somewhere, but I have been unable to find a source. Does anybody know where the text of this article came from, and why it is formatted so strangely (with many hard line breaks, hyphenated words broken over two lines, unwikified headings)? See Copyrights. --AlanBarrett 19:12, 6 Nov 2004 (UTC)
 * It seems well referenced. I took out the attempt at footnotes and made it more wiki-like; it still needs more work and clarification. --nixie 03:26, 9 Nov 2004 (UTC)

The content was cut and pasted from somewhere (a project I did for an assignment), but this is not a copyright violation because I wrote it and therefore own the copyright. You cannot find it anywhere else because I have not put it online anywhere else. At some point it might be put online by the online science journal "Phlogiston" of my school, Winchester College. Gingekerr 18:06, 6 Dec 2004 (UTC)


 * Meanwhile, this article was marked as a copyright violation from e-paranoids.com. I have reverted this; e-paranoids is a mirror of Wikipedia. --rbrwr± 12:35, 12 Dec 2004 (UTC)


 * I put the suspected copyright violation mark on. As others suggested, some of the artefacts made it look like a copy. Thanks for establishing that it is innocent. And thanks to the author for writing it and releasing it. A little more editing may be useful. I don't understand the word 'derivatised', there may be another way of expressing the same thing. Bobblewik (talk) 20:25, 17 Dec 2004 (UTC)
 * Derivatization is quite a specific biochemical process often used in analytical chemistry/biochemistry. It is the controlled conversion of a 'species' originally present in a sample into another form which allows its separation by chromatography or detection by another means; there's no alternate term to describe it. This describes how derivatization works in gas chromatography. --nixie 05:13, 19 Dec 2004 (UTC)

B and Z amino acids
I think something about the Edman sequencing ambiguities should be added:

B: D or N (Asp or Asn, aspartic acid or asparagine)

Z: E or Q (Glu or Gln, glutamic acid or glutamine)

Some protein databases, e.g. SwissProt, still contain B's and Z's. As many other databases import from SwissProt they also contain B's and Z's. E.g. from the IPI.human protein database:

>IPI:IPI00382474.1|SWISS-PROT:P01762 Tax_Id=9606 Ig heavy chain V-III region TRO
QVQLVQSGGGLVKPGGSLRLSCVASGFSFRDFYMSWIRZTPGKGLZWVSYIGGSGSTLYY
ADSVKGRFTISRDNAQKSLYLZMBSLRTZBTAVYYCAATBBFBWSTFSLBYWGZGBLVTV
SS

Refs: From : "During acid hydrolysis, asparagine and glutamine are deamidated to aspartate and glutamate, respectively, rendering the acid and its corresponding amide indistinguishable. Tryptophan is destroyed under the standard hydrolysis conditions, and so is cysteine although to a lesser extent."

I don't think they are needed here. B & Z are ambiguities that typically result from acid hydrolysis of proteins, not Edman degradation. They are now rarely used. Indeed, the above example has now been revised. The one-letter code (including ambiguities) is shown on the "Proteinogenic amino acid" page. ChiBeta (talk) 06:37, 4 January 2017 (UTC)
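
For anyone who still meets B and Z in older database entries, here is a minimal Python sketch of what the codes imply, using only the mapping given in this thread (B = D or N, Z = E or Q). The function names `ambiguous_positions` and `expand` are made up for illustration and do not come from any library or from the article.

```python
from itertools import product

# Ambiguity codes as given in this thread: B = D or N, Z = E or Q.
AMBIGUITY = {"B": "DN", "Z": "EQ"}


def ambiguous_positions(seq):
    """Return (index, code) pairs for every B or Z in seq (0-based index)."""
    return [(i, aa) for i, aa in enumerate(seq) if aa in AMBIGUITY]


def expand(seq):
    """List every unambiguous sequence consistent with seq.

    The number of results doubles with each B or Z, so this is only
    practical for short peptides.
    """
    choices = [AMBIGUITY.get(aa, aa) for aa in seq]
    return ["".join(p) for p in product(*choices)]


# A short fragment taken from the entry quoted above.
print(ambiguous_positions("WIRZTPGK"))
# A three-residue toy peptide (hypothetical) with one Z and one B.
print(expand("AZB"))
```

Because every B or Z doubles the number of candidates, full expansion is only reasonable for short stretches; for a whole entry it is usually enough to report the ambiguous positions.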

Broken refs
References #1 and #3 are broken:





Spelling convention (justifying recent revert)
I'm quite happy to accept "ioniZation" etc, where there is no classical root dictating the correct spelling (and also "color" etc where the classics side with the Americans) but not "hydrolyZe" - that is just plain wrong (anyway - how would you spell "hydrolysis"? "hydrolyzis" just looks stupid) Gingekerr 15:26, 4 August 2006 (UTC)

How can I locate the disulfide bonds?
I think it is very important to add a section about locating the disulfide bonds in this article.--Carbon arka (talk) 18:03, 20 May 2008 (UTC)

Current vs superseded technologies
Of the technologies described, only mass spectrometry (often assisted with bioinformatics) is widely used for protein identification (de novo sequencing) in the present day. Older techniques may still be used in special cases such as for quality control of recombinant proteins, e.g. protein pharmaceuticals. I think the article would be improved by providing a historical context for the techniques and by changing the Mass spectrometry section to focus more on the strategies for analysis of proteins and peptides and the type of information generated rather than technical details of ionization etc. which belong in the Mass spectrometry article. ChiBeta (talk) 02:15, 6 January 2017 (UTC)

Protein sequencing vs identification
Regarding protein sequencing using mass spectrometry, there is quite a big difference between the techniques used to identify a protein and those used to sequence one. Identifying a protein, whether from a complex mixture or a pure protein sample, does not require good coverage of the protein itself. Usually a handful of unique peptides matched to the protein sequence will make most scientists comfortable saying that the protein is in the sample, i.e. that the protein has been identified. The purpose of protein sequencing is to deduce every single amino acid in the primary sequence, and therefore 100% coverage of the protein is a prerequisite. Because of these different purposes, although the experimental processes are similar, the process for protein sequencing using mass spectrometry is usually more complicated. Iceblacktea (talk) 15:19, 21 December 2018 (UTC)
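
To make the coverage point concrete, here is a rough Python sketch of how sequence coverage could be computed from a set of matched peptides. It assumes exact substring matching against a known protein sequence, which is a simplification of what real search engines do. The function name `sequence_coverage` and the example sequence are hypothetical and only for illustration.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in protein covered by at least one peptide.

    Peptides are matched as exact substrings; every occurrence counts.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Look for further (possibly overlapping) occurrences.
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 16-residue sequence, not a real protein.
protein = "MKTAYIAKQRQISFVK"

# Two matched peptides: enough to argue the protein is present.
print(sequence_coverage(protein, ["TAYIAK", "QISFVK"]))      # 0.75

# Sequencing in the sense above needs every residue covered.
print(sequence_coverage(protein, ["MKTAYIAK", "QRQISFVK"]))  # 1.0
```

In this toy case the two peptides already support identification, while the four uncovered residues (here M, K, Q and R) are exactly what a sequencing experiment would still have to resolve.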