Lead Generation
Can computer-based methods generate
leads? The answer is yes – but the computer needs some
information as input to generate meaningful output. If multiple
series of active compounds are available, then derived models
describing common features can be used to identify novel
chemistries that possess these features. Derived models can be
in the form of 3D pharmacophores – arrangements of functional
groups in space that are thought to give rise to activity – or
more abstracted patterns of topological and chemical properties
as they are distributed throughout the molecule. Pharmacophores
can be derived and understood by human beings. They can be used
to identify novel compounds in large libraries, real or virtual,
of available compounds. Topological and chemical property
patterns are derived by learning methods. They similarly can be
used to identify novel compounds in large libraries. The more
general the pharmacophore or pattern, the more likely it is to
retrieve candidates that are dissimilar to the compounds that
were used to derive it. Biopredict has developed learning
methods of this kind that can select among and use
thousands of molecular descriptors in generating a model
for activity (see Technologies).
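As a rough illustration of how a 3D pharmacophore can be used to search a library, the sketch below treats a pharmacophore as a small set of typed points and accepts a conformer if some subset of its features has the same types and reproduces the inter-point distances within a tolerance. The function name, feature labels, coordinates, compound names and tolerance are all hypothetical and chosen only for the example; this is not a description of Biopredict's software.

```python
from itertools import permutations
from math import dist

# Hypothetical representation: a pharmacophore (or a conformer) is a list of
# (feature_type, (x, y, z)) tuples, with coordinates in angstroms.

def matches_pharmacophore(pharmacophore, conformer_features, tolerance=1.0):
    """Return True if some assignment of conformer features to pharmacophore
    points has matching feature types and reproduces every pairwise
    distance to within `tolerance`."""
    n = len(pharmacophore)
    for candidate in permutations(conformer_features, n):
        # Feature types must agree point by point.
        if any(c[0] != p[0] for c, p in zip(candidate, pharmacophore)):
            continue
        # Every inter-point distance must agree within the tolerance.
        if all(
            abs(dist(candidate[i][1], candidate[j][1])
                - dist(pharmacophore[i][1], pharmacophore[j][1])) <= tolerance
            for i in range(n) for j in range(i + 1, n)
        ):
            return True
    return False

# Illustrative three-point query and a tiny made-up library.
query = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("aromatic", (0, 4, 0))]
library = {
    "cmpd_A": [("aromatic", (1, 1, 5)), ("donor", (1, 1, 1)),
               ("acceptor", (4, 1, 1)), ("hydrophobe", (9, 9, 9))],
    "cmpd_B": [("donor", (0, 0, 0)), ("acceptor", (8, 0, 0)),
               ("aromatic", (0, 4, 0))],
    "cmpd_C": [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0))],
}
hits = [name for name, feats in library.items()
        if matches_pharmacophore(query, feats)]
```

Loosening the tolerance makes the query more general, so it retrieves more, and less similar, compounds; in this toy data a very loose tolerance would also accept cmpd_B. The exhaustive search over permutations is only practical for the handful of points typical of a pharmacophore query.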
A different lead-generation method
that has come into widespread use is that of docking screens. A
docking screen is a structure-based method that can only be used
if an experimentally determined structure is available for the
target protein or a homolog. When only homologs are available,
the method requires as a preliminary step that a homology model
be built for the target protein. Docking screens can be
performed effectively using homology models when the overall
sequence identity between the target and the nearest homolog of
known structure is on the order of 35% or higher (lower in
families where many structures are available).
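The decision described above can be sketched as a simple check on sequence identity. The code below computes percent identity from a pairwise alignment that is assumed to exist already (gaps written as "-") and applies the roughly 35% guideline as a soft cutoff. The function names, the example sequences and the choice to count identity only over columns where both sequences have a residue are assumptions made for illustration; other definitions of identity use a different denominator and give somewhat different values.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_target, aligned_template)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def docking_route(has_target_structure, best_template_identity, threshold=35.0):
    """Pick a route for a docking screen; `threshold` is a guideline, not a hard rule."""
    if has_target_structure:
        return "dock against the experimental structure"
    if best_template_identity is not None and best_template_identity >= threshold:
        return "build a homology model, then dock"
    return "homology model likely too unreliable for docking"

# Made-up aligned fragments, for illustration only.
identity = percent_identity("MKT-AYIAKQR", "MKTLAYLA-QR")
route = docking_route(has_target_structure=False, best_template_identity=identity)
```

In a family with many known structures, the threshold argument could reasonably be lowered, in line with the parenthetical remark above.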
If the 100 highest-scoring compounds
from a 100,000-compound virtual screen are tested immediately,
the hit rate is typically on the order of a few percent. There
is, however, a way to derive more useful information from the
virtual screen and to drive the effective hit rate higher. To
do this, we cluster multiple conformations of docked compounds
that have reasonable energies, based on their interactions with
the target active site. This is done by constructing a
description of these interactions called a “footprint”.
Footprints can then be compared with those of compounds known
to be active against the target class: where there is significant
footprint overlap, the cluster has an increased likelihood of
being active. Selected clusters are examined and used to
derive pharmacophores. These pharmacophores are then used to
identify additional compounds from our corporate library of
purchasable compounds or from other sources for testing. The
philosophy behind this strategy is similar to that used for the
interpretation of high throughput screens: each docked compound
is treated as a separate experiment. When multiple compounds
vote for a particular mode of binding, that mode of binding
has increased credibility.
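The footprint idea can be illustrated with a small sketch. Here a footprint is assumed to be a set of residue-level contacts, docked poses are grouped by footprint similarity, and each cluster is summarized by how many distinct compounds "vote" for it and by how well it overlaps a footprint from a known active. The Tanimoto measure, the greedy clustering scheme, the 0.5 threshold, and all residue labels and compound names are assumptions made for this example; the text does not specify how footprints are actually encoded or clustered.

```python
def tanimoto(a, b):
    """Set overlap: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_footprint(poses, threshold=0.5):
    """Greedy leader clustering: a pose joins the first cluster whose leader
    footprint is at least `threshold` similar, else it starts a new cluster."""
    clusters = []
    for pose in poses:
        for cluster in clusters:
            if tanimoto(pose["footprint"], cluster[0]["footprint"]) >= threshold:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters

def summarize(cluster, reference_footprints):
    """Count distinct compounds in the cluster and score the leader footprint
    against footprints of known actives."""
    votes = len({pose["compound"] for pose in cluster})
    overlap = max((tanimoto(cluster[0]["footprint"], ref)
                   for ref in reference_footprints), default=0.0)
    return {"votes": votes, "overlap": overlap}

# Hypothetical docked poses; contact labels are invented for illustration.
poses = [
    {"compound": "c1", "footprint": frozenset({"ASP86:hbond", "LEU83:hbond", "PHE80:hydrophobic"})},
    {"compound": "c2", "footprint": frozenset({"ASP86:hbond", "LEU83:hbond", "PHE80:hydrophobic", "LYS33:hbond"})},
    {"compound": "c3", "footprint": frozenset({"LEU83:hbond", "PHE80:hydrophobic"})},
    {"compound": "c4", "footprint": frozenset({"GLU12:ionic", "GLY13:hbond"})},
    # A second conformation of c1 that binds in a different mode.
    {"compound": "c1", "footprint": frozenset({"GLU12:ionic", "THR14:hbond", "GLY13:hbond"})},
]
known_actives = [frozenset({"LEU83:hbond", "PHE80:hydrophobic", "LYS33:hbond"})]

clusters = cluster_by_footprint(poses)
summaries = [summarize(c, known_actives) for c in clusters]
# Rank clusters by overlap with known actives first, then by number of votes.
ranked = sorted(summaries, key=lambda s: (s["overlap"], s["votes"]), reverse=True)
```

Counting distinct compounds rather than poses keeps several conformations of one molecule from inflating a cluster's vote, which matches the idea of treating each docked compound as a separate experiment.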