Talk:Protein Data Bank

Untitled
This is more of a word of warning: viewing structures via the PDB seems to work best with Netscape 4.7x. I have had no luck with IE, and some of the Chime-dependent display programs warn you that Netscape 6 won't work either. - David M

More detail and references needed in history section
From the article: "The PDB is a key resource in structural biology and is critical to more recent work in structural genomics." Some references here for work that uses the PDB in an interesting/important way would be good.

From the article: "Countless derived databases and projects have been developed to integrate and classify the PDB in terms of protein structure, protein function and protein evolution." Such as? Give some examples for the derived databases.

From the growth section: "The growth rate of the PDB has been the subject of fairly extensive analysis." Such as? This needs referencing. Why was it subject to extensive analysis, and where are these analyses?


 * I agree that these statements should be cited, but I don't agree that a lack of citation is a reason to 'dumb down' the article. I don't think any structural biologists or crystallographers would have too much trouble accepting the above facts. Nevertheless, it seems Wikipedians are always willing to take it upon themselves to dumb down an article, rather than to educate themselves! I'll try to find some references for these facts (or at least some citable examples) and reinstate them in the article. --Dan|(talk) 15:59, 23 May 2009 (UTC)

Raw data?
Knowing the amount of modeling that goes into a structure deposited in the PDB, I would hardly call the coordinate files in the PDB 'raw data'.

Protein Data Bank (file format) needed
A new page Protein Data Bank (file format) is needed, which should cross-link to Chemical file format and also use the proper category. JKW 15:58, 8 April 2006 (UTC)
 * Initial Protein Data Bank (file format) created and anything related to format discussions on the Protein Data Bank should be moved to this page. JKW 11:20, 22 April 2006 (UTC)
 * I agree we should move junk from the file format section on this article to the file format article. --Dan|(talk) 13:41, 7 March 2008 (UTC)

Rewrite
I believe all the concerns above have been considered in the revision of the article today. I think everything is referenced, though sometimes one reference covers an entire paragraph. --Christopher King (talk) 03:59, 5 January 2009 (UTC)
 * See my comment above in the section . --Dan|(talk) 16:00, 23 May 2009 (UTC)

Not public domain
I removed from the lead paragraph the claim that all information from PDB.com PDB.org is in the public domain. This claim is simply false. RCSB is partially responsible for this confusion, but it's important to note that nowhere do they indicate that the material is in the "public domain". Much of their material comes from a large variety of sources and there's no evidence that they even have the authority (much less the resources) to place it all into the public domain.

In particular, note these restrictions detailed at "Advisory for the Use of the PDB Archive" that make the content unacceptable for Wikipedia (and Commons):
 * "Redistribution of modified data files using the same file name as is on the FTP server is prohibited."
 * "The user assumes all responsibility for insuring that intellectual property claims associated with any data set deposited in the PDB archive are honored."

And note these restrictions at "Policies & References": --Danorton (talk) 05:07, 15 April 2009 (UTC)
 * "By using the materials available in the PDB archive, the user agrees to abide by the conditions described in the PDB Advisory Notice."
 * "Molecule of the Month illustrations are copyrighted. They are available for educational purposes, provided attribution is given to David S. Goodsell and the RCSB PDB. Molecule of the Month articles are copyrighted by the RCSB PDB and the authors of the article. Text can only be reprinted with permission, with attribution, and without the right to manipulate or change its content."


 * PDB.com has nothing to do with pdb.org or RCSB (I'm sure that was just a typo). However, I think I might challenge pdb.org about those claims of copyright (not on the "Molecule of the Month" images, but on the actual PDB files). They have no basis for copyright over PDB files. --Thorwald (talk) 08:03, 15 April 2009 (UTC)
 * The Protein Data Bank provides data under CC0 1.0 Universal (CC0 1.0) Public Domain Dedication:
 * https://www.wwpdb.org/about/usage-policies
 * It is public domain. 128.6.158.89 (talk) 18:00, 27 March 2024 (UTC)
 * Reference to CC0 public domain license added. Will let this sit for some time to give others an opportunity to review, but would eventually like to delete this Talk section now that it is out-of-date.
 * Also, N.B.: It's important not to mix up the PDB with the individual member organizations such as PDBe, PDBj, and RCSB PDB. The structural data in the *PDB* is public domain. However, as pointed out above, some items on the individual member sites may be copyrighted (such as the artwork and articles). But those should not be confused with the core data in the *PDB* archive. Wdpw (talk) 16:51, 16 April 2024 (UTC)

Need expansion/section for PDB Identifier
There needs to be more information about PDB Identifiers. I am a hobbyist editor and not a biochemist so I do not know the origins of the identifier or who started this naming format. Who came up with the identifier design and method? Who gets to decide what proteins get added to the series?

Are there "reserved regions" of the identifier for certain content, or are new sequences just added serially as it arrives from researchers?

As far as I can determine, it is a base-36 naming system using numbers 0-9 and letters A-Z. If it is limited to just four characters, as currently stated in the text of this article, that is only 1,679,616 possible PDB entries.

Does anyone seriously believe that there will never be more than 1.7 million protein structures found and mapped, across the entire history of life on the planet?

It would make more sense if additional powers/digits could be added as needed. A fifth digit would allow 60,466,176 total patterns, a sixth digit would allow 2,176,782,336 patterns, etc.

DMahalko (talk) 23:00, 20 May 2009 (UTC)


 * Hey DMahalko, your understanding of the code is correct. I don't know how it was decided on, but that is how it is. In the past, authors picked their own codes. These days they are automatically assigned by the submission software. The limited number of codes has been discussed on and off over the years on the PDB-L mailing list https://lists.sdsc.edu/mailman/listinfo.cgi/pdb-l (where I'm sure someone would answer any questions you have on the above). You may also like to search (or even improve) the unofficial PDB FAQ here: http://pdbwiki.org/index.php/PDB_FAQ HTH --Dan|(talk) 16:06, 23 May 2009 (UTC)


 * Actually PDBWiki has an article all about the "PDB code" http://pdbwiki.org/index.php/PDB_code --Dan|(talk) 16:08, 23 May 2009 (UTC)
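
For anyone who wants to check the arithmetic in this thread, here is a minimal Python sketch. It assumes, as the original comment does, that every position of the identifier can hold any of the 36 characters 0-9 and A-Z; the function name `id_capacity` is made up for illustration, and any extra rules the PDB may apply to identifiers are not modelled.

```python
import string

# Assumption taken from the comment above: the alphabet is the digits 0-9
# plus the letters A-Z, i.e. 36 symbols, all allowed in every position.
ALPHABET = string.digits + string.ascii_uppercase


def id_capacity(length: int, alphabet: str = ALPHABET) -> int:
    """Return how many distinct identifiers of `length` characters exist."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return len(alphabet) ** length


if __name__ == "__main__":
    # Print the capacity for four-, five- and six-character identifiers.
    for n in (4, 5, 6):
        print(f"{n} characters: {id_capacity(n):,} possible identifiers")
```

The three values printed match the figures quoted in the comment (36 to the 4th, 5th and 6th power).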

Physical Location
The physical location deserves to be mentioned or listed as a coordinates. I may be mistaken, but I think the sole location for the repository is at Rutgers University (http://www.biomaps.rutgers.edu/index.php?option=com_content&task=section&id=1&Itemid=2 third paragraph). I've seen a wing of a building at Rutgers marked Protein Data Bank: (40.524497°, -74.461634°). Pulu (talk) 18:37, 15 July 2009 (UTC)

Growth trend
The growth trend has a link towards the official PDB site.

However, that data starts in 1976, whereas the plot starts in 1972. From 1972 to 1976 it would thus have 0 entries, which does not make much sense to me.

If SEARCH was the original database, I have another source that states that the PDB started with 7 entries. Could additional verification be provided to explain why it starts at 1976, and which entries were the first ones? 80.110.81.222 (talk) 21:06, 30 May 2015 (UTC)

Assessment comment
Substituted at 03:28, 30 April 2016 (UTC)

A lot, then less, then the same again?
"100 registered structures milestone in 1982, the 1,000 in 1993, the 10,000 in 1999, and the 100,000 in 2014" What? 100, 1, 10 and then 100 again? Is this correct? Or was this written by somebody who does not know the ISO standard for thousands separators? — Preceding unsigned comment added by 78.67.250.137 (talk) 19:34, 21 June 2017 (UTC)