MolDiA: A Novel Molecular Diversity Analysis Tool. 1. Principles and Architecture

publication · 7 years ago
by Ana G. Maldonado, Jean-Pierre Doucet, Michel Petitjean, Bo-Tao Fan (Université Paris Diderot (Paris 7))
We introduce the principles and architecture of MolDiA (Molecular Diversity Analysis), a user-friendly software tool for comparing diverse molecular data sets through an XML-structured database of predefined fragments. The MolDiA descriptors are complex fingerprint-like structures that encode not only structural information but also physicochemical property data. The system architecture supports customizable weights on molecular descriptors and a choice of similarity/diversity measures for analyzing the given data sets. Intermolecular comparisons using Ullmann's algorithm were optimized by the use of fuzzy logic, generic atoms, and a whole system of chemical graph analysis. We have found that customizing the similarity/diversity computation with structural and/or property weights and choosing the level of fuzziness of the molecular comparison allows the user to adapt the tool to particular needs and increases the range of possible MolDiA applications. The implementation of XML Web technologies has proven to improve and ease the extraction, processing, and analysis of chemical information.
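
The idea of weighting structural and property contributions to a similarity score can be illustrated with a minimal sketch. This is not MolDiA's actual computation, which the abstract does not specify: the Tanimoto coefficient, the range-scaled property comparison, and every name below (`tanimoto`, `property_similarity`, `combined_similarity`, `w_struct`, `w_prop`, the fragment labels) are assumptions chosen only for illustration.

```python
from typing import Dict, Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of fragment labels."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    return len(a & b) / len(a | b)


def property_similarity(
    p: Dict[str, float], q: Dict[str, float], ranges: Dict[str, float]
) -> float:
    """Mean of (1 - |p - q| / range) over the properties both molecules share.

    `ranges` holds an assumed positive scale per property, so that
    differences in different units become comparable.
    """
    keys = sorted(set(p) & set(q))
    if not keys:
        return 0.0
    total = 0.0
    for k in keys:
        scaled_diff = min(abs(p[k] - q[k]) / ranges[k], 1.0)
        total += 1.0 - scaled_diff
    return total / len(keys)


def combined_similarity(
    frag_a: Set[str],
    frag_b: Set[str],
    prop_a: Dict[str, float],
    prop_b: Dict[str, float],
    ranges: Dict[str, float],
    w_struct: float = 0.5,
    w_prop: float = 0.5,
) -> float:
    """Weighted average of structural and property similarity.

    The two weights are hypothetical user-chosen knobs; they are
    normalized so the result stays in [0, 1].
    """
    total_weight = w_struct + w_prop
    if w_struct < 0 or w_prop < 0 or total_weight <= 0:
        raise ValueError("weights must be non-negative and not both zero")
    s = tanimoto(frag_a, frag_b)
    p = property_similarity(prop_a, prop_b, ranges)
    return (w_struct * s + w_prop * p) / total_weight


if __name__ == "__main__":
    # Hypothetical fragment labels and property values for two molecules.
    mol_a = {"benzene", "hydroxyl", "carbonyl"}
    mol_b = {"benzene", "hydroxyl", "amine"}
    props_a = {"logP": 2.0}
    props_b = {"logP": 3.0}
    scales = {"logP": 10.0}
    print(combined_similarity(mol_a, mol_b, props_a, props_b, scales))
    print(combined_similarity(mol_a, mol_b, props_a, props_b, scales,
                              w_struct=1.0, w_prop=0.0))
```

Setting `w_prop` to zero reduces the score to a purely structural comparison, and setting `w_struct` to zero reduces it to a purely property-based one. This is one simple way a user-adjustable weighting scheme might behave.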
