PDB entries with extended CCD IDs

Jose Duarte

Aug 1, 2023, 5:30:51 PM
to APIs @ RCSB PDB

wwPDB, in collaboration with the PDBx/mmCIF Working Group, plans to extend the length of accession codes (IDs) for PDB and Chemical Component Dictionary (CCD) entries. PDB entries containing these extended IDs will not be supported by the legacy PDB file format (see previous announcement).

CCD entries are currently identified by unique three-character alphanumeric IDs. At current growth rates, we anticipate running out of three-character IDs before 2024. After this point, wwPDB will issue five-character alphanumeric accession codes for CCD IDs in the OneDep system. To avoid confusion with current four-character PDB IDs, four-character codes will not be used. Owing to limitations of the legacy PDB file format, PDB entries containing the new five-character CCD IDs will only be distributed in PDBx/mmCIF format.

In addition, wwPDB has reserved a set of CCD IDs (01 - 99, DRG, INH, LIG) that will never be used in the PDB. These reserved codes can be used for new ligands during structure determination, so that the ligands can be identified as new upon deposition and added to the CCD during biocuration.
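As a rough illustration of the ID rules described above, here is a minimal Python sketch that sorts a CCD ID into reserved, legacy, or extended. The function name `classify_ccd_id` and the category labels are made up for this example. The sketch assumes IDs are uppercase alphanumeric and that existing IDs shorter than three characters count as legacy. The authoritative rules are defined by wwPDB, not by this snippet.

```python
import re

# Reserved IDs named in the announcement: 01 - 99, DRG, INH, LIG.
RESERVED_IDS = {f"{n:02d}" for n in range(1, 100)} | {"DRG", "INH", "LIG"}

# Assumption: IDs consist of uppercase letters and digits only.
_ALNUM = re.compile(r"[A-Z0-9]+")


def classify_ccd_id(ccd_id: str) -> str:
    """Return 'reserved', 'legacy', 'extended', or 'invalid' for a CCD ID."""
    ccd_id = ccd_id.strip().upper()
    if ccd_id in RESERVED_IDS:
        return "reserved"
    if not _ALNUM.fullmatch(ccd_id):
        return "invalid"
    if 1 <= len(ccd_id) <= 3:
        return "legacy"    # fits the three-character scheme
    if len(ccd_id) == 5:
        return "extended"  # new five-character scheme
    return "invalid"       # e.g. four-character codes, which will not be used
```

A check like this can flag entries that need PDBx/mmCIF: anything classified as "extended" cannot be written to the legacy PDB format.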

wwPDB is asking users and software developers to review their code, remove any current limitations on CCD ID lengths, and enable use of PDBx/mmCIF format files. Example files with extended PDB and/or CCD IDs are available on GitHub to assist with code revisions: https://github.com/wwPDB/extended-wwPDB-identifier-examples. To learn about PDBx/mmCIF, please visit https://mmcif.wwpdb.org/.
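One common source of length limits is fixed-column parsing. In the legacy PDB format the residue name occupies columns 18-20, so code that slices that field can never see a five-character ID. The sketch below instead reads component IDs from a PDBx/mmCIF `atom_site` loop by splitting on whitespace. The embedded fragment and the ID `A1XYZ` are illustrative placeholders, and the parser assumes one row per line with no quoted values containing spaces. Real code should use a proper mmCIF parser.

```python
# Hypothetical atom_site fragment; "A1XYZ" is an illustrative
# five-character CCD ID, not a reference to a real component.
EXAMPLE = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
HETATM 1 C1 A1XYZ B
HETATM 2 O1 A1XYZ B
ATOM 3 CA ALA A
"""


def comp_ids(mmcif_text: str) -> set[str]:
    """Collect label_comp_id values from a simple atom_site loop."""
    columns: list[str] = []
    found: set[str] = set()
    for line in mmcif_text.splitlines():
        line = line.strip()
        if line.startswith("_atom_site."):
            columns.append(line)  # remember the column order
        elif columns and line and not line.startswith(("loop_", "#", "_")):
            values = line.split()  # whitespace-delimited, no width limit
            if len(values) == len(columns):
                found.add(values[columns.index("_atom_site.label_comp_id")])
    return found


if __name__ == "__main__":
    print(sorted(comp_ids(EXAMPLE)))
```

Because the column is located by name and the values are split on whitespace, nothing in this approach depends on the ID being three characters long.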

Note that all RCSB APIs will use the extended CCD IDs as soon as they become available.

For any further information, please contact us at in...@wwpdb.org

Best regards

Jose

----
Jose Duarte
RCSB Protein Data Bank
San Diego Supercomputer Center 
UC San Diego