An overview of the major methods for calculating the lipophilicity of chemical compounds, expressed as the logarithm of the 1-octanol/water partition coefficient, log P, is provided. Growing interest in the field is reflected in the increasing number of methodological articles devoted to predicting this important property. Substructure approaches, e.g., fragmental or atom contribution methods, generate a large number of descriptors and preferably use linear methods to predict this partition coefficient. Nonlinear methods, such as neural networks, tend to use more complex representations of molecules, such as topological indices or quantum-chemical parameters. A theoretical analysis of the thermodynamic properties that determine this coefficient explains the success of the frequently reported empirical correlations of log P values with the molar volume, surface area, or related properties of compounds. Several popular programs for predicting the lipophilicity of chemicals are benchmarked on eight neutral and twelve ionized (at the pH of measurement) sets of compounds collected from the recent chemical and medicinal literature. All methods demonstrated reasonable accuracy for the neutral series but not for the ionized series (log D). The self-learning feature of some programs makes it possible to build a local model with fairly good prediction accuracy. Thus, local rather than global models may provide the accuracy needed for practical applications. Since the development of global models is difficult, an important feature of prediction programs should be their ability to estimate correctly the accuracy of their predictions, i.e., their applicability domain.
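As a rough illustration of the substructure approach mentioned above, a fragmental or atom contribution method can be viewed as a linear sum of fragment counts multiplied by per-fragment contributions. The sketch below is a minimal example under stated assumptions: the function name `additive_log_p`, the fragment labels, and all numeric values are hypothetical placeholders, not parameters of any published method or of the programs benchmarked here.

```python
from collections import Counter


def additive_log_p(fragment_counts, contributions, intercept=0.0):
    """Linear (additive) estimate: intercept + sum(count_i * contribution_i)."""
    missing = [f for f in fragment_counts if f not in contributions]
    if missing:
        # A fragment with no fitted contribution is outside what this
        # simple model can describe, so refuse to return a value.
        raise KeyError(f"no contribution for fragments: {missing}")
    return intercept + sum(n * contributions[f] for f, n in fragment_counts.items())


# Placeholder values for illustration only; NOT fitted or published parameters.
toy_contributions = {"CH3": 0.5, "CH2": 0.5, "OH": -1.0}

# Hypothetical fragment decomposition of a small molecule.
toy_fragments = Counter({"CH3": 1, "CH2": 1, "OH": 1})

estimate = additive_log_p(toy_fragments, toy_contributions)
```

In a real method the contributions would be fitted to measured log P values, typically by linear regression. Raising an error for an unknown fragment is one very crude way to signal that a molecule may fall outside the model's applicability domain.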