Nielsen: RMNIST with annealing and ensembling.

(Short notes.)

Lately there’s been growing interest in automating the search over deep neural network hyperparameters (a hyperparameter being e.g. the network structure).

Nielsen’s fun idea:

  1. Search over the network hyperparameters with an algorithm called simulated annealing.
  2. For each hyperparameter setting, create, train, and evaluate an ensemble of several networks.
  3. Include L2 regularization on the weights (‘weight decay’) in the cost function.
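The steps above can be sketched roughly as follows. This is not Nielsen’s code; the search space, the neighbour move, and the scoring function are all hypothetical stand-ins (the real score would come from training an ensemble of networks, with the L2 term inside each network’s cost; here a deterministic toy surrogate keeps the sketch runnable).

```python
import math
import random

# Hypothetical search space: hidden-layer size and L2 coefficient
# stand in for whatever hyperparameters the real search tunes.
SPACE = {"hidden": [16, 32, 64, 128], "l2": [1e-5, 1e-4, 1e-3]}

def neighbour(params):
    """Perturb one randomly chosen hyperparameter to an adjacent value."""
    key = random.choice(list(SPACE))
    values = SPACE[key]
    i = values.index(params[key])
    j = min(max(i + random.choice([-1, 1]), 0), len(values) - 1)
    out = dict(params)
    out[key] = values[j]
    return out

def ensemble_score(params):
    """Placeholder: the real version would train an ensemble of networks
    with these hyperparameters and average their validation accuracies.
    A deterministic toy surrogate stands in so the sketch runs."""
    base = 1.0 - abs(math.log10(params["l2"]) + 4) * 0.05
    base += 0.02 * math.log2(params["hidden"] / 16)
    return base

def anneal(steps=200, t0=1.0, cooling=0.98):
    params = {"hidden": 16, "l2": 1e-5}
    score = ensemble_score(params)
    best, best_score = params, score
    t = t0
    for _ in range(steps):
        cand = neighbour(params)
        cand_score = ensemble_score(cand)
        delta = cand_score - score
        # Annealing acceptance rule: always accept improvements,
        # accept worse moves with probability exp(delta / t).
        if delta > 0 or random.random() < math.exp(delta / t):
            params, score = cand, cand_score
        if score > best_score:
            best, best_score = params, score
        t *= cooling  # cool the temperature each step
    return best, best_score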

The reason I thought spotting SA was fun: ahem.

Things I want to look into: Reduced MNIST.
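As I understand it, Reduced MNIST (RMNIST/n) is MNIST cut down to n training examples per digit class. A minimal sketch of building such a subset, assuming the images and labels are already loaded as NumPy arrays (the function name is mine):

```python
import numpy as np

def reduce_mnist(images, labels, n_per_class, seed=0):
    """Build a reduced training set with n_per_class examples per digit,
    in the spirit of the RMNIST/n datasets."""
    rng = np.random.default_rng(seed)
    keep = []
    for digit in range(10):
        idx = np.flatnonzero(labels == digit)
        keep.extend(rng.choice(idx, size=n_per_class, replace=False))
    keep = np.sort(np.array(keep))  # preserve original ordering
    return images[keep], labels[keep]
```

So `reduce_mnist(train_x, train_y, 10)` would give a 100-example training set (RMNIST/10).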