Notes on AWS SageMaker hyperparameter tuning jobs

by Nikolay Donets – Feb. 25, 2019 · 3 min read

Hyperparameter optimisation is the process of tuning the hyperparameters of a machine learning model to get the best performance on a given dataset. There are several common approaches. Grid search specifies a grid of hyperparameter values and trains and evaluates a model for every combination. Random search samples hyperparameters from specified distributions and evaluates the model for each sampled set. Evolutionary optimisation takes a population-based approach: it maintains a population of candidate solutions and evolves them through mutation and crossover to produce new, potentially better ones. Gradient-based optimisation uses the gradient of the performance metric with respect to the hyperparameters to move towards better solutions, which can make the search more efficient and reduce the number of trials. Ultimately, the goal is the same: find the set of hyperparameters that gives the best performance for a given model and dataset.
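As a minimal illustration of random search, here is a sketch using scikit-learn's RandomizedSearchCV. The classifier, the parameter distributions, and the synthetic dataset are assumptions chosen for the example, not anything specific to SageMaker.

```python
# Minimal random-search sketch; estimator, distributions, and data are illustrative only.
from scipy.stats import loguniform, randint
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)

param_distributions = {
    "learning_rate": loguniform(1e-3, 1e-1),  # continuous, sampled on a log scale
    "n_estimators": randint(50, 300),         # integer range
    "max_depth": randint(2, 6),
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=20,           # number of sampled hyperparameter sets
    cv=3,                # 3-fold cross-validation per candidate
    scoring="accuracy",
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```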

Amazon SageMaker is a fully managed machine learning platform with built-in tools for hyperparameter optimisation. You specify a range of values for each hyperparameter, and SageMaker trains and evaluates your model with different combinations of hyperparameters to find the one that yields the best performance. The tuning functionality also makes it easy to scale the search out by running several training jobs in parallel, which speeds up the optimisation process.

A hyperparameter tuning job in AWS SageMaker is essentially a composition of n training jobs. For each of them, a supervisor passes parameters drawn from a predefined range (for integer and continuous values) or set (for categorical values) and controls execution. When a custom algorithm is used, the parameters are written to /opt/ml/input/config/hyperparameters.json. Every value in that file is a string, even if it looks like an integer or a float, so it is necessary to parse and validate hyperparameters.json explicitly, otherwise the training job can fail with obscure errors right after it starts.
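A hedged sketch of reading and validating that file inside a custom training container is below. The parameter names (learning_rate, max_depth, optimizer) are illustrative assumptions, not part of SageMaker itself; only the file path comes from the platform.

```python
import json

HYPERPARAMETERS_PATH = "/opt/ml/input/config/hyperparameters.json"

def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    with open(path) as f:
        raw = json.load(f)  # every value arrives as a string, e.g. {"max_depth": "5"}

    try:
        return {
            "learning_rate": float(raw["learning_rate"]),  # assumed parameter names
            "max_depth": int(raw["max_depth"]),
            "optimizer": raw["optimizer"],  # categorical values stay as strings
        }
    except (KeyError, ValueError) as exc:
        # Fail fast with a readable message instead of an obscure error mid-training.
        raise SystemExit(f"Invalid hyperparameters: {exc}")

if __name__ == "__main__":
    print(load_hyperparameters())
```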

Two important parameters of a tuning job are the number of concurrent training jobs and the total number of training jobs. For each run, the supervisor chooses hyperparameters based on the information gathered so far, and to run a successful search it needs enough observations to build a probability density function over the hyperparameters. A good rule of thumb is to set the number of concurrent training jobs to N and the total number of training jobs to at least 3N; otherwise the result of the tuning job may not be fine-tuned enough to produce the best possible outcome. A minimal sketch of such a configuration follows.
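This sketch assumes the SageMaker Python SDK (v2 names); the image URI, IAM role, metric regex, parameter names, and S3 paths are placeholders, not values from this post.

```python
from sagemaker.estimator import Estimator
from sagemaker.tuner import (
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
    IntegerParameter,
)

# Placeholder estimator for a custom training image.
estimator = Estimator(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/my-training-image:latest",
    role="arn:aws:iam::<account>:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-4, 1e-1, scaling_type="Logarithmic"),
    "max_depth": IntegerParameter(2, 10),
    "optimizer": CategoricalParameter(["sgd", "adam"]),
}

N = 4  # concurrent training jobs
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",
    metric_definitions=[{"Name": "validation:accuracy",
                         "Regex": "validation accuracy: ([0-9\\.]+)"}],
    hyperparameter_ranges=hyperparameter_ranges,
    max_parallel_jobs=N,
    max_jobs=3 * N,  # at least 3x the parallelism, as suggested above
)

tuner.fit({"train": "s3://my-bucket/train/",
           "validation": "s3://my-bucket/validation/"})
```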

Although hyperparameter tuning might not improve the performance of the algorithm, as the documentation notes, it is worth running it systematically. In my experience, tuning can give up to a 2% improvement in the optimised metric. That is a tiny change, but for highly optimised algorithms the improvement is notable.