A simple breakdown of ways to influence the output of an ML training process

The three things that influence the output of an ML training process are:

  • Model priors (architecture, initialization)
  • Training task (loss function, training data)
  • Solution search technique (optimization/parameter search method)

In more detail, before an ML training process, you’ll need to decide on the following:

  • Priors
    • What architecture are you using
      • How do the parameters relate to each other
      • How many parameters will there be
    • What values will the parameters be initialized to
  • Training task
    • The loss function(s) you’ll be using in training
    • What data will you be training on
      • Data type
      • How will it be fed into the model
        • Batching
        • How many times will the model see data points (epochs)
        • The order in which the data is shown
  • Solution search technique
    • How will you update the model parameters after seeing the output and loss value on some input data
      • What optimization algorithm you’ll use
        • What the algorithm’s hyperparameters will be (e.g. the learning rate)
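
As a rough illustration of where each of these decisions shows up in code, here is a minimal sketch of a toy training run in plain Python. Everything in it is an assumption made for illustration: the function name `train`, the one-dimensional linear model, the synthetic data drawn from y = 2x + 1, and the default hyperparameter values are not taken from any particular library or experiment.

```python
import random


def train(seed=0, init_scale=0.1, lr=0.1, epochs=50, batch_size=4, shuffle=True):
    """Toy training run; every argument is one of the up-front decisions."""
    rng = random.Random(seed)

    # Priors: architecture (a 1-D linear model with two parameters)
    # and initialization (small random weight, zero bias).
    w = rng.gauss(0.0, init_scale)
    b = 0.0

    # Training task: the data (hypothetical noise-free samples of y = 2x + 1)
    # and the loss function (mean squared error).
    data = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-10, 11)]

    def mse(w, b, batch):
        return sum((w * x + b - y) ** 2 for x, y in batch) / len(batch)

    # Solution search: mini-batch gradient descent with learning rate lr.
    for _ in range(epochs):                      # how many times data is seen
        order = data[:]
        if shuffle:                              # ordering of the data
            rng.shuffle(order)
        for i in range(0, len(order), batch_size):   # batching
            batch = order[i:i + batch_size]
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b

    return w, b, mse(w, b, data)


w, b, loss = train()
```

Each keyword argument maps onto one bullet above, so changing a single argument while holding the rest fixed changes exactly one decision.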

After an ML training process, you can observe the following:

  • Scores on your performance metric(s)
    • Score on the training data and any held-out validation/test data
    • Scores on those datasets over the course of the training process
  • Parameter values
    • In the final model
    • Over the course of the training process
  • Outputs of your model given an arbitrary input chosen from any dataset of a valid input type
  • Intermediate states of the model’s computation given an arbitrary input chosen from any dataset of a valid input type
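
The observables above can be sketched for a toy model as follows. This is only an illustration under stated assumptions: the helper names (`forward`, `mse`, `train_and_record`), the linear model, and the synthetic train/validation split are all hypothetical, and the "intermediate state" here is simply the product computed before the bias is added.

```python
def forward(params, x):
    """Return the model output and the intermediate value computed on the way."""
    w, b = params
    hidden = w * x                      # intermediate state of the computation
    return hidden + b, {"hidden": hidden}


def mse(params, data):
    """Performance metric: mean squared error over a dataset."""
    return sum((forward(params, x)[0] - y) ** 2 for x, y in data) / len(data)


def train_and_record(epochs=100, lr=0.1):
    # Hypothetical noise-free data from y = 2x + 1, split into train and validation.
    train = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-10, 11, 2)]
    val = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-9, 10, 2)]

    params = (0.0, 0.0)
    history = []
    for epoch in range(epochs):
        w, b = params
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        params = (w - lr * grad_w, b - lr * grad_b)
        # Scores and parameter values over the course of training.
        history.append({
            "epoch": epoch,
            "train_loss": mse(params, train),
            "val_loss": mse(params, val),
            "params": params,
        })
    return params, history


final_params, history = train_and_record()

# Output and intermediate state for an arbitrary valid input.
output, trace = forward(final_params, 1.5)
```

Here `history` holds the scores and parameter values during training, `final_params` holds the final parameter values, and `forward` exposes both the output and the intermediate state for any input of the right type.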

An ML experiment should involve varying something from the first list and measuring the impact on things in the second list.
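
A minimal sketch of such an experiment, assuming a toy linear model and synthetic data: it varies one item from the first list (the learning rate, part of the solution search technique), holds everything else fixed, and measures one item from the second list (the final training loss). The function name `final_loss` and the chosen learning rates are hypothetical.

```python
def final_loss(lr, epochs=30):
    """Train a 1-D linear model with full-batch gradient descent; return the final MSE."""
    # Held fixed across runs: data, loss, architecture, initialization, epoch count.
    data = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-10, 11, 2)]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# The only thing varied between runs is the learning rate.
results = {lr: final_loss(lr) for lr in (0.01, 0.1, 0.5)}

for lr, loss in results.items():
    print(f"lr={lr}: final training loss = {loss:.6f}")
```

Because only one input differs between runs, any difference in the measured loss can be attributed to that input, at least within this toy setup.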

However, the information you get after an ML training process is not the same as what we ultimately care about: the effects of a model on the world and the people around us. Figuring out how to map those effects onto the measurable outputs of an ML training process is a significant problem.
