Friday, May 25, 2007

Proteins, Nucleic Acids and Small Molecules

Proteins and Nucleic Acids




  • The New PDB: Research Collaboratory for Structural Bioinformatics (RCSB)



  • PDBSELECT provides a representative list of high-resolution structures.
  • PDBFINDER is a database constructed by a Perl script from the PDB, DSSP and HSSP databases. Many of the fields in the PDBFINDER database are difficult to access from the original databases. Some information is retrieved from the original literature.
  • OBSTRUCT - A service for obtaining the largest non-redundant set of protein structures from the PDB, according to a crystallographic resolution and sequence identity specified by the user. NMR-derived structures can also be selected.
  • PROCHECK - Checks the stereochemical quality of a known protein structure, producing a number of graphical plots analysing its overall and residue-by-residue geometry.
  • Dali Domain Dictionary.
  • SCOP - Structural Classification of Proteins in UK and USA.
  • CATH - Protein Structure Classification.
  • FSSP: Fold classification based on Structure-Structure alignment of Proteins
  • HIV Protease Database originally from Alexander Wlodawer's Macromolecular Structure Laboratory at NCI-Frederick Cancer Research and Development Center.
  • Atlas of Protein Side-Chain Interactions - based on the printed atlas of Singh & Thornton (1992)
  • The Backbone Dependent Rotamer Library WebPage and the Protein Sidechain Webpage from Roland L. Dunbrack's group.
  • Molecules R US - from the NIH.
  • NCBI (National Center for Biotechnology Information) Structure Group.

    • MMDB: The Molecular Modeling Database - a compilation of all the three-dimensional structures of biomolecules in the Brookhaven Protein Data Bank, from crystallographic and NMR studies. (MMDB is a database of ASN.1-formatted records, not PDB-formatted records.)
    • Structures in MMDB have been compared with one another using VAST (Vector Alignment Search Tool).
    • Entrez (DNA/RNA + Protein + Structures + Medline subset).


  • The WWW Virtual Library: Biomolecules.
  • RNA Secondary Structures.
  • Image Library of Biological Macromolecules.
  • Swiss-3DIMAGE -- protein structure images (many stereo)
  • Antibody Resource Page - compilation of antibody sites and resources for locating antibodies.
  • Protein Structure Prediction Center - hosts the CASP web sites.
  • Molecular Information Agent (MIA) - a web server that searches biological databases to find the existing information about a macromolecule.
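
The kind of selection OBSTRUCT is described as doing above (filter by resolution, then remove entries that are too similar in sequence) can be illustrated with a small greedy sketch. This is not OBSTRUCT's actual algorithm, and a greedy pass is not guaranteed to find the largest possible set. The function name, the entry labels and the precomputed pairwise identity table are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List


def select_nonredundant(
    resolution: Dict[str, float],
    identity: Dict[FrozenSet[str], float],
    max_resolution: float,
    max_identity: float,
) -> List[str]:
    """Greedily pick entries that pass a resolution cutoff and are not
    too similar to any entry already picked (hypothetical helper)."""
    # Keep entries at or below the resolution cutoff, best resolution first.
    candidates = sorted(
        (pid for pid, res in resolution.items() if res <= max_resolution),
        key=lambda pid: (resolution[pid], pid),
    )
    chosen: List[str] = []
    for pid in candidates:
        # Pairs missing from the table are treated as unrelated (identity 0.0).
        if all(
            identity.get(frozenset((pid, other)), 0.0) <= max_identity
            for other in chosen
        ):
            chosen.append(pid)
    return chosen


# Made-up entries: resolution in angstroms, identity as a fraction.
resolution = {"A": 1.5, "B": 1.8, "C": 2.5, "D": 3.5}
identity = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.2,
    frozenset(("B", "C")): 0.3,
}
picked = select_nonredundant(resolution, identity, max_resolution=3.0, max_identity=0.5)
print(picked)
```

Here "D" fails the resolution cutoff and "B" is dropped for being too similar to the better-resolved "A".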

Small Molecules & Ligands




  • ReLiBase - “a database system for analysing receptor/ligand complexes deposited in the Protein Data Bank”. Search for complexes that contain a specific substructure, either by sketching a molecule or by typing a SMILES string. Alternatively, search for a specific interaction, among other options.
  • 2D and 3D Structural Information from the DTP (Developmental Therapeutics Program) at the National Cancer Institute. The NCI DIS 3D Database (DIS = Drug Information System) contains some 400,000 drugs. Structures for some 200,000 compounds can be downloaded.



  • NCI Data and Online Services, where you can search the NCI Database of compounds, convert file formats, and find public chemical data.
  • 127,000 NCI open database compounds built by Corina (35 MB file compressed, 200 MB uncompressed).
  • 140,000 plated compounds (75 MB file compressed, in MDL SD format).
  • 1,980 diversity set compounds (in MDL SD format) representing the chemical diversity of the plated compounds.
  • Milne, G.W.A., Nicklaus, M.C., Driscoll, J.S., Wang, S. and Zaharevitz, D. "The NCI Drug Information System 3D Database." J. Chem. Inf. Comput. Sci. 34:1219-1224 (1994).
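
Several of the NCI downloads above are in MDL SD format, where each record ends with a line containing only `$$$$` and the first line of a record is the molecule's title. A minimal sketch for splitting such a file into records is shown below. The function names and the sample text are made up for illustration, and the sample is not a complete, valid molfile.

```python
from typing import List


def split_sd_records(text: str) -> List[str]:
    """Split SD-format text into records, using '$$$$' lines as terminators."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Any text after the last '$$$$' is an unterminated record and is ignored.
    return records


def record_titles(records: List[str]) -> List[str]:
    """Return the first line (the title line) of each record."""
    return [rec.split("\n", 1)[0].strip() for rec in records]


# Toy input with the molfile bodies left out; not real NCI data.
SAMPLE = (
    "compound-1\n"
    "  (molfile body omitted)\n"
    "$$$$\n"
    "compound-2\n"
    "  (molfile body omitted)\n"
    "$$$$\n"
)

records = split_sd_records(SAMPLE)
titles = record_titles(records)
print(len(records), titles)
```

For the multi-megabyte files listed above, reading line by line from an open file instead of a single string would keep memory use down.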

  • ZINC - a free dockable database project for virtual screening sponsored by the Shoichet Laboratory, Pharmaceutical Chemistry, UCSF, and hosted by docking.org.
  • KEGG DRUG - a database of drugs approved in the USA and Japan, part of a bioinformatics resource named KEGG (Kyoto Encyclopedia of Genes and Genomes), produced by the Kanehisa Laboratories in the Bioinformatics Center of Kyoto University and the Human Genome Center of the University of Tokyo.
  • ChemBank - a freely available collection of data about small molecules, from the Institute For Chemistry and Cell Biology, part of Harvard Medical School.
  • The Three-Dimensional Drug Structure Databank - an NIH collection of experimental and approved therapeutic agents whose structures have been experimentally determined or built. There are not many structures right now (20 structures in March 1996).
  • CCDC: Cambridge Crystallographic Data Centre and the Cambridge Structural Database, CSD, of organic and metal-organic compounds.
  • SMILECAS database from Syracuse Research contains more than 103,000 SMILES codes and CAS numbers.
  • CACTVS Online SMILES Translator and Structure File Generator, really useful if you want to convert from a SMILES string into a 3D structure; it works even with many molecules. Check out all the other goodies from the Erlangen/Bethesda Data and Online Services.
  • Source: http://mgl.scripps.edu/people/gmm/
    Thanks to Sam Alidrin (Advanced Diploma in Bioinformatics, MKU) for sending me this link.


    Information science has been applied to biology to produce the field called Bioinformatics