Thoughts on Google Vizier

Vizier, described in a recent paper from Google, is a black-box optimization algorithm deployed for "numerous research and production systems at Google". Allegedly, this one algorithm works well in a wide range of settings (something the "no free lunch" theorem suggests might not be possible). In this post I share my thoughts on which key design decisions likely make this algorithm work well.

Read more…

Being 'data-driven' does not mean that you should use bad data.

Relying on data rather than intuition to make decisions is usually a good thing, but it is not always better. When one needs to make a decision about something for which there is no good data, it might be better to rely on intuition than on the best proxy available. Here are some examples where I think an intuition-based approach can be better than a data-driven approach (but still worse than a data-driven approach with good data):

Read more…

Problems with the top-k diversity metric for diverse optimization

NOTE: this blog post can be run as a Jupyter notebook. I re-ordered the cells to make it easier to read; to reproduce all the plots, see the instructions at the end of the post.

Background

"Diverse optimization" has been a popular topic at machine learning conferences for a few years now, particularly in the "AI for drug discovery" sub-field. In this context, the goal of "optimization" algorithms is to suggest promising drug candidates, where "promising" means maximizing one or more objective functions. An example of an objective function could be a docking score (an approximate simulation of the interactions between a protein and a molecule). "Diverse" optimization further requires that an algorithm produce multiple distinct candidate solutions. This is typically desired when the objective functions don't fully capture everything we want (for example, a drug candidate also having low toxicity). The hope is that a diverse set of candidates will have a higher chance of containing at least one useful candidate than a non-diverse set.

Read more…

How I chose a static site generator

Recently, I wanted to update my website to look a bit more polished (and to support additional features such as automatically generating pages for my publications). In the end I decided to switch completely from building my website with Jekyll to building it with Nikola. This post explains my thought process (in case anybody else is considering a similar switch).

Read more…

Reasons to have a website

I created this website because I thought (and continue to think) that having a website can benefit one's career. Essentially, a professional website serves as an accessible source of information about oneself for prospective employers, coworkers, and employees. Unless you put something horrible on your website, the effect should at worst be neutral, so there is essentially no downside to having one.

In the remainder of the post I will lay out a more detailed case for having a website and address some potential hesitations people might have about creating one.

Read more…

A Quick Tutorial on Bash Quotes

Today I learned way more about quoting in bash than I ever thought I needed to know. I thought I would highlight the interesting use case that I discovered, which requires some special trickery to write a script that executes arbitrary commands. First, let's quickly review some facts about bash quotes.
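As a rough illustration of the kind of facts in question, here is a minimal sketch of standard bash quoting behaviour. The variable and function names (`name`, `spaced`, `count_words`) are hypothetical and chosen only for this example; they are not taken from the full post.

```bash
#!/usr/bin/env bash
# Hypothetical variable, used only for illustration.
name="world"

# Single quotes keep every character literal: no expansion happens.
single='Hello $name'

# Double quotes still expand variables (and command substitutions).
double="Hello $name"

# An unquoted expansion is split into separate words on whitespace,
# while a quoted expansion is passed along as one word.
spaced="a b  c"
count_words() { echo "$#"; }   # hypothetical helper: prints its argument count
unquoted_count=$(count_words $spaced)
quoted_count=$(count_words "$spaced")

printf '%s\n' "$single" "$double" "$unquoted_count" "$quoted_count"
```

With the default `IFS`, the script prints `Hello $name`, `Hello world`, `3`, and `1`. The difference between the last two numbers is why scripts that forward arguments to other commands usually quote their expansions.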

Read more…

How to Keep a Communal Fridge Clean

Last month, my class decided that we should get a fridge for the class study room. This brought up an important question: how would the fridge be cleaned? I thought this was an interesting problem and deserved some discussion, both from a practical and a theoretical standpoint.

Read more…

Language Travel Logs: Japanese 2018

One dream I have had ever since I started learning languages is to go to another country and use its language to communicate. This August I had my first opportunity to do that during a two-week trip to Japan. In this post, I will outline the preparation I did before going, describe where I was able to use Japanese while I was there, and evaluate my success.

Read more…