The molecular mapping of atom-level properties (MOLMAP) descriptor is generated from the chemical bond descriptors of a molecule by a Kohonen self-organizing map using a specific algorithm. The bond descriptors comprise physicochemical properties of the chemical bond, such as the difference between the charges of the two bonded atoms, and topological properties, such as the number of heteroatoms connected to the two atoms. In this paper, MOLMAP descriptors were used to predict the mutagenicity of 4075 organic substances (2305 mutagens and 1770 nonmutagens in the Ames test). Random forests were used to build models with three kinds of descriptors: (1) MOLMAP descriptors of different sizes; (2) global molecular descriptors; and (3) the combination of MOLMAP descriptors and global molecular descriptors. The percentage of correct predictions in out-of-bag (OOB) cross-validation of the whole data set reached 85.4%. To test the stability of the prediction model, it was applied to a test set of 472 compounds collected from another database, of which 86.7% were predicted correctly. These results improve on those of previous work.
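The idea of turning a variable number of bond descriptors into a fixed-length molecular descriptor can be illustrated with a minimal sketch. It is not the MOLMAP algorithm used in the paper: it assumes a very small Kohonen map trained with a simple linear decay schedule, and it builds the descriptor by counting how many bonds of a molecule fall on each neuron. The function names (`train_som`, `winner`, `molmap`), the grid size, and the two-component bond vectors are all hypothetical, and the numeric values are invented for illustration only.

```python
import random

def winner(weights, vec):
    """Index of the neuron whose weights are closest (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], vec)))

def train_som(bonds, rows, cols, epochs=20, seed=0):
    """Train a tiny Kohonen map on bond descriptor vectors (lists of floats)."""
    rng = random.Random(seed)
    dim = len(bonds[0])
    # One weight vector per neuron, initialised at random in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    total_steps = epochs * len(bonds)
    step = 0
    for _ in range(epochs):
        order = bonds[:]
        rng.shuffle(order)
        for vec in order:
            frac = 1.0 - step / total_steps        # decays from 1 towards 0
            rate = 0.5 * frac                      # assumed learning-rate schedule
            radius = max(rows, cols) / 2.0 * frac  # assumed neighbourhood schedule
            wr, wc = divmod(winner(weights, vec), cols)
            for idx, w in enumerate(weights):
                r, c = divmod(idx, cols)
                # Update the winner and all neurons inside the shrinking radius.
                if max(abs(r - wr), abs(c - wc)) <= radius:
                    for k in range(dim):
                        w[k] += rate * (vec[k] - w[k])
            step += 1
    return weights

def molmap(weights, molecule_bonds):
    """Fixed-length descriptor: number of the molecule's bonds on each neuron."""
    desc = [0] * len(weights)
    for vec in molecule_bonds:
        desc[winner(weights, vec)] += 1
    return desc

# Invented bond vectors: [charge difference, heteroatom neighbours].
mol_a = [[0.05, 0.0], [0.40, 1.0], [0.10, 0.0]]
mol_b = [[0.35, 1.0], [0.45, 2.0]]

som = train_som(mol_a + mol_b, rows=3, cols=3)
desc_a = molmap(som, mol_a)   # 9 values, one per neuron of the 3 x 3 map
desc_b = molmap(som, mol_b)
```

Both molecules end up with a descriptor of the same length (one entry per neuron) regardless of how many bonds they contain, which is what allows such descriptors to be passed to a classifier such as a random forest. Changing the grid size changes the descriptor size, which corresponds to the "MOLMAP descriptors of different sizes" compared in the study.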