Drug-likeness and increased hydrophobicity of commercially available compound libraries for drug screening

Johannes Zuegg, Matthew A Cooper
Current Topics in Medicinal Chemistry 2012, 12 (14): 1500-13
Most drug discovery programs today begin with the selection of 'hit' molecules from assays against large compound screening libraries. The chemical space in which these hits reside has implications for their biological activity in vivo and their likelihood of progression to a drug candidate. We have created a database of commercially available screening compounds and natural products in order to analyse the drug- and lead-likeness of commercial screening compounds and compare them with i) orally administered drugs, ii) non-orally administered drugs, and iii) compounds with significant biological activity but unspecified or not yet determined route of administration from the public databases DrugBank and ChEMBL. The data set contained 15.5 million entries from 102 vendors, which resulted in just over 8 million unique chemical structures. We review these data for current drug- and lead-likeness, then utilise substructure-based filters for promiscuity and unwanted groups, and finally compare chemical properties for structures within the different sub-sets. While the majority of the commercial compounds satisfy various drug-likeness rules, they show a larger molecular weight and higher hydrophobicity compared to orally available drugs, with generally higher aromaticity and lower solubility. This 'right shift' of chemical properties has also been found in the majority of the compounds with significant biological activity in ChEMBL, reflecting a common trend in current drug discovery towards larger, more hydrophobic compounds and fewer drug-like compounds. In particular, successful drugs were found to possess much lower median logD values than those found for compound collections. In addition, commercial compounds show quite a narrow distribution in molecular weight, with a median absolute deviation of only 78 Da around a median of 387 Da.
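As an illustration of the spread statistic quoted above, the following is a minimal sketch of a median absolute deviation calculation. It assumes the unscaled definition (the median of absolute deviations from the median); the abstract does not state which convention the authors used, and the molecular weights below are hypothetical values chosen for the example, not data from the paper.

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the unscaled median absolute deviation of a sequence of numbers."""
    center = median(values)
    # Deviation of each value from the median, ignoring sign.
    deviations = [abs(v - center) for v in values]
    return median(deviations)


# Hypothetical molecular weights in Da (illustrative only, not from the paper).
weights = [250.0, 310.0, 387.0, 450.0, 520.0]

center = median(weights)
mad = median_absolute_deviation(weights)
print(f"median = {center} Da, MAD = {mad} Da")
```

Unlike the standard deviation, this measure is not dominated by a few very heavy or very light outliers, which is one reason it suits large, skewed compound collections.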
For high-throughput screening, a highly stringent combination of several lead-likeness and substructure filters against unwanted groups could be applied, resulting in 2 million lead-like structures. For fragment-based screening approaches, the rule of three (Ro3) would select around 400,000 structures.
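A rule-of-three selection of this kind can be sketched as a simple property filter. The sketch below assumes the commonly cited Ro3 thresholds (molecular weight of at most 300 Da, cLogP of at most 3, and no more than 3 hydrogen-bond donors and 3 hydrogen-bond acceptors); the exact cut-offs and descriptor software used by the authors are not given in the abstract. The `Descriptors` class, the `passes_ro3` function, and the three example entries are hypothetical, and the descriptor values are assumed to have been computed beforehand by a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    name: str
    mol_weight: float  # molecular weight in Da
    clogp: float       # calculated logP
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def passes_ro3(d, max_mw=300.0, max_clogp=3.0, max_hbd=3, max_hba=3):
    """Return True if the compound meets all four assumed Ro3 thresholds."""
    return (
        d.mol_weight <= max_mw
        and d.clogp <= max_clogp
        and d.h_donors <= max_hbd
        and d.h_acceptors <= max_hba
    )


# Illustrative entries only; these are not compounds from the paper's data set.
library = [
    Descriptors("fragment_a", 152.1, 1.2, 1, 2),
    Descriptors("fragment_b", 298.4, 2.9, 3, 3),
    Descriptors("compound_c", 387.0, 4.1, 2, 5),
]

fragments = [d.name for d in library if passes_ro3(d)]
print(fragments)
```

Keeping the thresholds as keyword arguments makes it easy to tighten or relax individual criteria, or to extend the check with further criteria such as rotatable bond count or polar surface area.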
