RCSB PDB has released a new AI-powered 3D Structure Similarity search, enabling faster and more scalable structural comparisons across both experimentally determined structures and Computed Structure Models (CSMs).
This update streamlines protein structure similarity searches through the RCSB Search API. The service identifies proteins with similar three-dimensional shapes using a machine learning method that represents macromolecular structures as embeddings in a high-dimensional vector space. Stored in vector databases, these embeddings enable efficient large-scale comparison of 3D structures and improve sensitivity for detecting structural similarity.
For methodological details, see the associated publication describing the embedding-based approach: doi.org/10.1093/bioinformatics/btag058.
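As a rough illustration of the embedding idea described above, the sketch below ranks a few toy vectors by cosine similarity to a query vector. It is not the RCSB implementation: the 4-dimensional vectors, the labels chain_A to chain_C, the helper names, and the choice of cosine similarity as the metric are all assumptions made for the example. Real embeddings come from the trained model described in the publication, and a vector database would typically use an index rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, database, top_k=3):
    """Score every stored vector against the query and keep the best top_k.

    This is a brute-force scan, used here only to keep the example small.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings: the labels and values are invented for
# illustration and do not correspond to real PDB entries or model output.
toy_database = {
    "chain_A": [0.9, 0.1, 0.0, 0.2],
    "chain_B": [0.8, 0.2, 0.1, 0.3],
    "chain_C": [0.0, 0.9, 0.4, 0.0],
}
toy_query = [1.0, 0.1, 0.0, 0.2]

hits = rank_by_similarity(toy_query, toy_database, top_k=2)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In this toy setup, vectors pointing in nearly the same direction as the query score close to 1, which is the sense in which nearby embeddings stand in for similar 3D shapes.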
The service now focuses exclusively on protein chains and assemblies. Searches for nucleic acids are no longer supported. For assemblies containing both proteins and nucleic acids, only the protein chains are considered during similarity comparison.
Additional API parameters are now available to control method-specific behavior. Please review the API reference to ensure the parameters are configured appropriately for your workflow.
A new version of the rcsb-api library has been released with support for the updated 3D similarity search functionality. See the library documentation for usage examples.
Known limitations are described in the corresponding RCSB documentation page.
Full documentation and example queries are available in the RCSB API documentation.