Trust, But Verify: On the Importance of Chemical Structure Curation in Cheminformatics and QSAR Modeling Research

publication · 7 years ago
by Denis Fourches, Eugene Muratov, Alexander Tropsha (University of North Carolina at Chapel Hill)
Standardizer JChem Base Naming
With the recent advent of high-throughput technologies for both compound synthesis and biological screening, there is no shortage of publicly or commercially available data sets and databases [1] that can be used for computational drug discovery applications (reviewed recently in Williams et al. [2]). The rapid growth of large, publicly available databases (such as PubChem [3] or ChemSpider [4], each containing more than 20 million molecular records), enabled by experimental projects such as NIH’s Molecular Libraries and Imaging Initiative [5], provides new opportunities for the development of cheminformatics methodologies and their application to knowledge discovery in molecular databases.
