Image source: https://pxhere.com/en/photo/712481

The Ax API (https://ax.dev/) is a nice alternative to Ray, which I used for Bayesian optimization in my last article.

We reuse the same MNIST classification network example and see how the usage differs slightly. A quick comparison of the results is given at the end of this article to show the effectiveness of using Bayesian Optimization (BO) for hyperparameter search. The code used in this article is written as a Jupyter notebook, adapted from this post and this Ax tutorial.

Let’s begin.

First, let’s load all the necessary modules.
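A minimal sketch of what that looks like, assuming PyTorch, torchvision, and Ax (installed via pip install ax-platform) are available:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# The Service API entry point for Ax
from ax.service.ax_client import AxClient
```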

The main part of the classification network is kept unchanged from my last article, except for the train_mnist function (renamed to evaluate_mnist here, to show that its result is used for evaluation):
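A minimal sketch of what evaluate_mnist could look like, building on the imports above; the small network, single training epoch, and batch sizes here are placeholder assumptions, not the exact setup from the last article:

```python
class Net(nn.Module):
    """Placeholder MNIST classifier standing in for the network from the last article."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten 28x28 images
        x = F.relu(self.fc1(x))
        return self.fc2(x)


def evaluate_mnist(parameters):
    """Train with the sampled hyperparameters and return the test accuracy."""
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
    )
    train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
    test_set = datasets.MNIST("./data", train=False, download=True, transform=transform)
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=1000)

    model = Net()
    optimizer = optim.SGD(
        model.parameters(),
        lr=parameters["lr"],
        momentum=parameters["momentum"],
    )

    # One epoch keeps each trial cheap; more epochs give a less noisy signal.
    model.train()
    for data, target in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(data), target)
        loss.backward()
        optimizer.step()

    # Evaluate: the returned accuracy is the objective Ax will maximize.
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            pred = model(data).argmax(dim=1)
            correct += (pred == target).sum().item()
    return correct / len(test_set)
```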

Then we show how to use the AxClient class:

  1. create the experiment with its parameters and the corresponding objective function (set minimize=False, since we want to maximize the classification accuracy)
  2. run 50 trials (if you want parallel evaluation, you need to configure it yourself when initializing the client); see the sketch after this list
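Putting the two steps together, a sketch of the driver loop; the experiment name is hypothetical, the search space follows the Ax MNIST tutorial, and objective_name/minimize reflect the Service API as in that tutorial:

```python
ax_client = AxClient()

ax_client.create_experiment(
    name="mnist_hparam_search",  # hypothetical experiment name
    parameters=[
        {"name": "lr", "type": "range", "bounds": [1e-6, 0.4], "log_scale": True},
        {"name": "momentum", "type": "range", "bounds": [0.0, 1.0]},
    ],
    objective_name="accuracy",
    minimize=False,  # we want to maximize classification accuracy
)

# Run 50 trials sequentially: ask Ax for the next candidate,
# evaluate it, and report the accuracy back.
for _ in range(50):
    parameters, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(
        trial_index=trial_index, raw_data=evaluate_mnist(parameters)
    )

best_parameters, metrics = ax_client.get_best_parameters()
print(best_parameters)
```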

The best parameters will be printed as shown below:

{'lr': 0.007584330389670517, 'momentum': 0.9}

Let’s compare the result above to our result from using Ray:

We can see there’s a slight difference between the results, but the values look pretty close. One reason might be that BO involves randomness, so you may want to fix the seed at initialization for a reproducible output (see the snippet below). Another reason might be that we need more trials to obtain a more precise optimum.
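For example, assuming the version of Ax in use exposes the random_seed argument on AxClient:

```python
# Fixing the seed makes the (quasi-)random initial trials, and hence the
# whole search, reproducible across runs.
ax_client = AxClient(random_seed=42)
```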

Overall, Ax is a neat alternative to Ray that is easy to work with. Hope this is useful to you!