Motivation: Proteomics is gearing up towards high-throughput methods for identifying and characterizing all of the proteins, protein domains and protein interactions in a cell and will eventually create more recorded biological information than the Human Genome Project. Each protein expressed in a cell can interact with various other proteins and molecules in the course of its function. A standard data specification is required that can describe and store this information in all its detail and allow efficient cross-platform transfer of data. A complete specification must be the basis for any database or tool for managing and analysing this information.
Results: We have defined a complete data specification in ASN.1 that can describe information about biomolecular interactions, complexes and pathways. Our group is using this data specification in our database, the Biomolecular Interaction Network Database (BIND). An interaction record is based on the interaction between two objects. An object can be a protein, DNA, RNA, ligand, molecular complex or an interaction. The description of an interaction encompasses cellular location, experimental conditions used to observe the interaction, conserved sequence, molecular location, chemical action, kinetics, thermodynamics and chemical state. Molecular complexes are defined as collections of more than two interactions that form a complex, with extra descriptive information such as complex topology. Pathways are defined as collections of more than two interactions that form a pathway, with additional descriptive information such as cell cycle stage. A request for proposals for a human-readable flat-file format that mirrors the BIND data specification is also tendered to interested parties.
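The three record types described under Results can be illustrated with a minimal Python sketch. It is not the BIND ASN.1 specification: every class and field name below is hypothetical and chosen only to mirror the wording of the abstract, and the descriptive fields are reduced to a few free-text placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ObjectKind(Enum):
    """Object types named in the abstract (labels are illustrative)."""
    PROTEIN = "protein"
    DNA = "DNA"
    RNA = "RNA"
    LIGAND = "ligand"
    MOLECULAR_COMPLEX = "molecular complex"
    INTERACTION = "interaction"


@dataclass
class InteractionObject:
    """One participant in an interaction (hypothetical structure)."""
    kind: ObjectKind
    name: str


@dataclass
class Interaction:
    """An interaction record: exactly two objects plus optional description."""
    object_a: InteractionObject
    object_b: InteractionObject
    # A subset of the descriptors listed in the abstract, kept as free text.
    cellular_location: Optional[str] = None
    experimental_conditions: List[str] = field(default_factory=list)
    chemical_action: Optional[str] = None


@dataclass
class InteractionCollection:
    """Shared base for complexes and pathways (an assumption of this sketch)."""
    interactions: List[Interaction]

    def __post_init__(self) -> None:
        # The abstract defines both as "more than two interactions".
        if len(self.interactions) <= 2:
            raise ValueError("a collection needs more than two interactions")


@dataclass
class MolecularComplex(InteractionCollection):
    """Interactions that form a complex, with e.g. topology information."""
    topology: Optional[str] = None


@dataclass
class Pathway(InteractionCollection):
    """Interactions that form a pathway, with e.g. cell cycle stage."""
    cell_cycle_stage: Optional[str] = None
```

In this sketch a complex or pathway is built by passing a list of at least three `Interaction` records; a shorter list raises `ValueError`, which reflects the "more than two interactions" wording above rather than any rule confirmed from `bind.asn`.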
Availability: The ASN.1 data specification for biomolecular interaction, molecular complex and pathway data is available at ftp://bioinfo.mshri.on.ca/pub/BIND/Spec/bind.asn. An interactive browser for this document is available through our homepage at http://bioinfo.mshri.on.ca/BIND/asn-browser/.