NOTE
this blog post can be run as a jupyter notebook. I re-ordered the cells to make it easier to read; to re-produce all the plots see instructions at the end of the post.
Background¶
"Diverse optimization" has been a popular topic in machine learning conferences
for a few years now, particularly in the "AI for drug discovery" sub-field.
In this context, the goal of "optimization" algorithms is to suggest
promising drug candidates,
where "promising" means maximizing one (or more) objective functions.
An example of an objective function could be a docking score
(an approximate simulation of the interactions between a protein and a molecule).
"Diverse" optimization further requires that an algorithm produce multiple
distinct candidate solutions.
This is typically desired when the objective functions don't fully capture
everything we want (for example, a drug candidate also having low toxicity).
The hope is that a diverse set of candidates will have a higher chance of
one useful candidate compared to a non-diverse sets.
Read more…