Setting the stage for the "SD" file for bioassay definitions and data: Building the BioAssay Research Database
Building on the success of the Molecular Libraries Program (MLP), the Broad Institute MLP team is co-leading, with the National Center for Advancing Translational Sciences (NCATS), an NIH-sponsored project across seven institutions to augment the data in PubChem by creating the BioAssay Research Database (BARD). The BARD platform standardizes the representation of bioassays in a next-generation repository and provides a user-friendly interface that supports sophisticated queries and data mining. Data originating from publicly funded chemical biology research efforts will be presented with appropriate context, including structured assay and result annotations. These annotations draw on relevant ontologies, for example the BioAssay Ontology, the Gene Ontology, and the Unit Ontology. We simplified the representation of these ontologies into a hierarchical data dictionary so that data producers can more easily create and upload projects, assays, and results, and we created two separate user interfaces for data consumers: the WebQuery interface and the Desktop client. The BARD WebQuery interface offers Google-like search with auto-suggest functionality for complex queries, such as retrieval of all assays and results for biological pathways such as “DNA repair” or “oxidative stress”, and presents this information in a rich user interface that includes spreadsheet support for structure-activity relationship analyses. Compounds, projects, and assays can be exported into an Amazon-like query cart for refining queries, and additional computations can be executed on datasets via community-developed plug-ins, including promiscuity analyses via the BioActivity Data Associative Promiscuity Pattern Learning Engine (BADAPPLE) and a CYP450 metabolism site prediction plug-in (http://www.farma.ku.dk/smartcyp/) using 2D structure fingerprints. Integration between the WebQuery and Desktop clients enables power users to initiate analyses in WebQuery and gain more insight via the Desktop client.
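As a rough illustration of the hierarchical data dictionary idea, the sketch below stores ontology-derived terms in a simple tree and looks up the path from the root to a term. The class name, function name, and every term shown are hypothetical and chosen only for illustration; they are not taken from the BARD schema or its actual dictionary contents.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DictionaryTerm:
    """One node in a hypothetical hierarchical data dictionary."""
    label: str
    source: Optional[str] = None  # ontology the term was drawn from, if any
    children: Dict[str, "DictionaryTerm"] = field(default_factory=dict)

    def add(self, label: str, source: Optional[str] = None) -> "DictionaryTerm":
        """Attach a child term under this node and return it."""
        child = DictionaryTerm(label, source)
        self.children[label] = child
        return child


def find_path(root: DictionaryTerm, label: str) -> Optional[List[str]]:
    """Depth-first search; return the labels from root to the matching term."""
    if root.label == label:
        return [root.label]
    for child in root.children.values():
        sub_path = find_path(child, label)
        if sub_path is not None:
            return [root.label] + sub_path
    return None


# Illustrative terms only (assumption): not the actual BARD dictionary.
root = DictionaryTerm("dictionary")
biology = root.add("biology")
process = biology.add("biological process", source="Gene Ontology")
process.add("DNA repair", source="Gene Ontology")
result = root.add("result")
result.add("concentration unit", source="Unit Ontology")

print(find_path(root, "DNA repair"))
```

A single tree of this kind lets a data producer pick terms by browsing one hierarchy, while each term can still record which source ontology it came from.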
Lastly, as industry and academia work together to innovate in small-molecule therapeutics, we have created an initial specification for the Assay Definition Standard. This standard, through the Assay Definition Format, has served as the file-transfer medium for data upload. We believe the chemical biology community now has an opportunity to use this standard to routinely transfer assay and result data within and between information systems and organizations.
This presentation will highlight the BARD platform, with a focus on the cumulative body of work that builds on the ChemAxon toolkit.