Simplifying Assumptions

Hyperparameter search results

by Tim Cooijmans on 2017-05-01

I performed a hyperparameter search over models trained with the squared EMD loss and the contiguish masking I’ve been talking about.
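As a quick refresher on the loss: for 1-D distributions, the EMD reduces to the L1 distance between the CDFs, and the squared variant penalizes squared CDF differences instead. Here is a minimal numpy sketch of that formulation; it may differ in detail from the exact variant I trained with:

```python
import numpy as np

def squared_emd_loss(p, q):
    """Squared EMD between batches of 1-D distributions.

    p, q: arrays of shape (batch, bins) whose rows sum to 1.
    """
    # For 1-D distributions with equal total mass, EMD is the L1
    # distance between the CDFs; the squared variant sums squared
    # CDF differences instead.
    cdf_diff = np.cumsum(p, axis=-1) - np.cumsum(q, axis=-1)
    return np.sum(cdf_diff ** 2, axis=-1).mean()
```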

I used to do grid searches, but I've been experimenting with random search recently. The rationale is that usually only a few hyperparameters matter: a grid spends most of its budget revisiting the same handful of values of each hyperparameter, whereas random search tries a fresh value of every hyperparameter on each run, so varying all of them at once explores the important dimensions more efficiently.
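As a rough illustration, here is a minimal sampler sketch; the hyperparameter names and ranges below are hypothetical, not my actual search space:

```python
import random

def sample_hyperparameters(rng):
    # Illustrative search space; the names and ranges are made up.
    return dict(
        learning_rate=10 ** rng.uniform(-5, -2),   # log-uniform
        dropout=rng.uniform(0.0, 0.5),             # uniform
        batch_size=rng.choice([16, 32, 64, 128]),  # categorical
    )

rng = random.Random(0)
trials = [sample_hyperparameters(rng) for _ in range(50)]
```

I'm still working out my workflow for digesting the results, but for now I like scatterplots: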

Each subplot corresponds to a hyperparameter. Within each subplot, the horizontal axis ranges over hyperparameter values and the vertical axis over loss values (on a log scale to get more resolution at the bottom). There are three kinds of results in each plot. Blue dots are reliable results from runs that didn’t crash or end prematurely. Orange dots show results from unfinished runs. Red lines show runs that didn’t even manage to start.
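Concretely, the plots can be produced along these lines. This is a sketch with synthetic placeholder runs; the data and result format are made up, just to exercise the plotting code:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.RandomState(0)
names = ["learning_rate", "dropout", "batch_size"]

# Synthetic placeholder runs: (hyperparameters, loss, status).
runs = []
for _ in range(60):
    hp = dict(learning_rate=10 ** rng.uniform(-5, -2),
              dropout=rng.uniform(0.0, 0.5),
              batch_size=rng.choice([16, 32, 64, 128]))
    status = rng.choice(["finished", "unfinished", "crashed"],
                        p=[0.6, 0.25, 0.15])
    loss = None if status == "crashed" else rng.lognormal(0, 1)
    runs.append((hp, loss, status))

fig, axes = plt.subplots(1, len(names), figsize=(4 * len(names), 3),
                         sharey=True)
for ax, name in zip(axes, names):
    for hp, loss, status in runs:
        if status == "crashed":
            # Crashed runs have no loss to plot; mark their
            # hyperparameter value with a vertical line instead.
            ax.axvline(hp[name], color="tab:red", alpha=0.5)
        else:
            color = "tab:blue" if status == "finished" else "tab:orange"
            ax.plot(hp[name], loss, "o", color=color)
    ax.set_xlabel(name)
    ax.set_yscale("log")
axes[0].set_ylabel("loss")
plt.tight_layout()
plt.show()
```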

The red runs are evenly distributed over the hyperparameter space, which is good: if they were bunched up in a particular region, that region would be underrepresented among the successful runs. If I silently ignored the crashed runs, I would never notice such a bias.

Overall, I don't see any slam-dunk direction to move in to improve the loss. Three runs are ahead of the pack, but they look like outliers to me.

Legend: blue dots mark completed runs, orange dots mark unfinished runs, and red lines mark runs that failed to start.