Research

Current work (as of 2026)

I am working on algorithms to decide which molecules to test in drug discovery campaigns. Some specific problems I am working on (or have worked on recently) are:

  • Applying Bayesian optimization to drug screening (and making it work in practice)
  • Molecular property prediction with quantified uncertainty
  • Retrosynthetic planning (how to synthesize novel molecules)
  • Meaningful evaluation of ML algorithms in chemistry (average loss on a test set can be misleading)
Disclaimer

I have signed a confidentiality agreement as part of my current position and therefore will not be able to disclose everything I am working on.

General research interests

I am broadly interested in how machine learning can enhance scientific discovery (for positive applications). Because of issues with robustness, generalization, and small dataset sizes, I generally do not believe that replacing existing systems with deep neural networks is the best strategy for most practical problems (at least with current technology). Instead, I think the way forward is to identify small parts of larger systems in science where machine learning could have an advantage over existing approaches, then develop methods that are well suited to these niches. Currently, the niche I am focusing on is early-stage drug candidate generation.

  • Sequential decision making (e.g. Bayesian optimization)
  • Gaussian processes and other kernel methods
  • Learning on small datasets
  • How can large language models help discovery? Until recently I thought they had limited potential for this, but I have started to change my mind.
  • Model robustness and reliability
  • Uncertainty quantification
  • Evaluation of machine learning algorithms in a way that reflects their practical usage
  • AI safety/alignment

Publications

For a complete list see my Google Scholar page. For some of my papers I have a dedicated webpage with more details (and retrospective thoughts). Click any paper title below for more detail.

Title | Year | Venue | Category
Hash Collisions in Molecular Fingerprints: Effects on Property Prediction and Bayesian Optimization | 2025 | arXiv preprint | preprint
Basic Bayesian Optimization is Underrated for Molecule Design | 2024 | ICML 2024 AI for Science Workshop | workshop
Retro-fallback: retrosynthetic planning in an uncertain world | 2024 | The Twelfth International Conference on Learning Representations | paper
Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction | 2023 | The Eleventh International Conference on Learning Representations | paper
Genetic algorithms are strong baselines for molecule generation | 2023 | arXiv preprint | preprint
Re-evaluating Retrosynthesis Algorithms with Syntheseus | 2023 | arXiv preprint | preprint
Tanimoto Random Features for Scalable Molecular Machine Learning | 2023 | Advances in Neural Information Processing Systems | paper
DOCKSTRING: easy molecular docking yields better benchmarks for ligand design | 2022 | Journal of Chemical Information and Modeling | paper
Sample-Efficient Optimization in the Latent Space of Deep Generative Models via Weighted Retraining | 2020 | Advances in Neural Information Processing Systems | paper
A study of EHVI vs fixed scalarization for molecule design | 2025 | arXiv preprint | preprint
Batched Bayesian Optimization by Maximizing the Probability of Including the Optimum | 2025 | Journal of Chemical Information and Modeling | paper
Stochastic Gradient Descent for Gaussian Processes Done Right | 2024 | The Twelfth International Conference on Learning Representations | paper
Retrosynthetic Planning with Dual Value Networks | 2023 | Proceedings of the 40th International Conference on Machine Learning | paper
GAUCHE: A Library for Gaussian Processes in Chemistry | 2023 | Advances in Neural Information Processing Systems | paper
Nonequilibrium sensing of volatile compounds using active and passive analyte delivery | 2023 | Proceedings of the National Academy of Sciences | paper
Petroleomic analysis of the treatment of naphthenic organics in oil sands process-affected water with buoyant photocatalysts | 2018 | Water Research | paper

Retrospective thoughts on my papers

On each paper’s page I have tried to provide an honest evaluation of the paper. I added this because:

  1. It might be hard to tell at a glance whether something in an academic paper will work in real life.
  2. Because updating scientific papers is relatively uncommon, there isn’t really a proper place for an author to state how their work looks in hindsight.

If you would like me to write about a specific paper that is missing, or to update my thoughts on an existing one, please send me an email and I will do so.