Iterative fragment selection: A group contribution approach to predicting fish biotransformation half-lives
Regulators need to evaluate thousands of chemicals for potential hazard and risk, often with limited available information. An automated method is presented for developing and evaluating Quantitative Structure–Activity Relationships (QSARs) for a range of chemical properties, suitable for screening-level chemical assessments. The method integrates descriptor generation, data set splitting, cross-validation, and model selection in a single algorithm. The resulting QSARs are two-dimensional (2D) fragment-based group contribution models. The method requires no prior expert knowledge to select the 2D fragments associated with the chemical property of interest. It also reports the domain of applicability (structural similarity to the training set) and estimates the uncertainty in the QSAR predictions. As a demonstration, the method is applied to generate novel QSARs for fish primary biotransformation half-lives (HLN). Results from the new HLN QSARs are compared with those of another 2D fragment-based HLN QSAR developed with expert judgment, and the predictive power of the models is found to be similar. The relative merits and limitations of each method are investigated, and the new QSAR is found to make comparable predictions with significantly fewer fragments. A coefficient of determination (R²) of 0.789 and a root mean squared error (RMSE) of 0.526 were obtained for the training data set; an R² of 0.748 and an RMSE of 0.584 were obtained for the validation data set, along with a concordance correlation coefficient (CCC) of 0.857, indicating good predictive power.
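For readers unfamiliar with the three fit statistics quoted above, the following is a minimal sketch of how R², RMSE, and CCC are commonly computed from observed and predicted values. It assumes the usual textbook definitions (R² as one minus the ratio of residual to total sum of squares, and Lin's concordance correlation coefficient with population variances); the abstract does not state which exact formulas were used, so these may differ from the authors' calculations. The function names and the sample values are hypothetical and are not taken from the HLN data set.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n
    # Agreement is penalized both for scatter and for a shift in the means.
    return 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)


# Hypothetical values, for illustration only.
example_observed = [0.2, 1.1, -0.5, 0.8, 1.6]
example_predicted = [0.4, 0.9, -0.2, 1.0, 1.3]

print("R2  :", r_squared(example_observed, example_predicted))
print("RMSE:", rmse(example_observed, example_predicted))
print("CCC :", ccc(example_observed, example_predicted))
```

Unlike R², the CCC also penalizes systematic offsets between predictions and observations, which is one reason it is often reported alongside R² and RMSE for external validation sets.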