
MLP can be used for both regression and classification. For both tasks, we first need to initialize the MLP model by specifying its parameters. For training, the following things need to be specified:

  • The model topology (the layer size array): the number of neurons in each layer (besides the bias neuron), whether the current layer is the final layer, and the type of squashing function.
  • The learning rate: specifies how aggressively the model learns from the training instances. A large value can accelerate the learning process but decrease the chance of model convergence. Recommended range: (0, 0.5].
  • The momentum weight: similar to the learning rate, a large momentum weight can accelerate the learning process but decrease the chance of model convergence. Recommended range: (0, 0.5].
  • The regularization weight: a large value can decrease the variance of the model but increase the bias at the same time. As this parameter is sensitive, it is better to set it to a very small value, say 0.001.
  • The squashing function: the activation function used by the MLP; the candidates are sigmoid and tanh (defined after this list).
  • The cost function: evaluates the error made during training; the candidates are squared error and cross entropy (logistic), also defined below.
  • The model path: the location where the trained model will be stored.
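For reference, the standard definitions of the candidate squashing and cost functions, together with the weight update rule that the learning rate, momentum weight, and regularization weight enter into, are sketched below in LaTeX. These are textbook formulas, not an excerpt from the Hama source:

No Format
  % squashing (activation) functions
  \sigma(x) = \frac{1}{1 + e^{-x}}                    % sigmoid, output in (0, 1)
  \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}    % tanh, output in (-1, 1)

  % cost functions for target y and prediction \hat{y}
  J_{sq} = \tfrac{1}{2} (y - \hat{y})^{2}                   % squared error
  J_{ce} = -\,y \log \hat{y} - (1 - y) \log (1 - \hat{y})   % cross entropy (logistic)

  % weight update with learning rate \eta, momentum \alpha, regularization \lambda
  \Delta w_{t} = -\eta \nabla J(w_{t}) + \alpha \Delta w_{t-1} - \eta \lambda w_{t}
  w_{t+1} = w_{t} + \Delta w_{t}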

The following sample code shows how to initialize the model and start training.

No Format
  SmallLayeredNeuralNetwork ann = new SmallLayeredNeuralNetwork();

  ann.setLearningRate(0.1); // set the learning rate
  ann.setMomemtumWeight(0.1); // set the momentum weight

  // initialize the topology of the model; a three-layer model is created in this example
  ann.addLayer(featureDimension, false, FunctionFactory.createDoubleFunction("Sigmoid"));
  ann.addLayer(featureDimension, false, FunctionFactory.createDoubleFunction("Sigmoid"));
  ann.addLayer(labelDimension, true, FunctionFactory.createDoubleFunction("Sigmoid"));

  // set the cost function to evaluate the error
  ann.setCostFunction(FunctionFactory.createDoubleDoubleFunction("CrossEntropy")); 
  String trainedModelPath = ...;
  ann.setModelPath(trainedModelPath); // set the path to store the trained model

  // add training parameters
  Map<String, String> trainingParameters = new HashMap<String, String>();
  trainingParameters.put("tasks", "5"); // the number of concurrent tasks
  trainingParameters.put("training.max.iterations", "" + iteration); // the number of maximum iterations
  trainingParameters.put("training.batch.size", "300");  // the number of training instances read per update
  ann.train(new Path(trainingDataPath), trainingParameters);
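
Once train(...) returns, the trained model is persisted at the model path set above. The following is a minimal sketch of loading it back for prediction; it assumes that the one-argument SmallLayeredNeuralNetwork(String modelPath) constructor, the getOutput(DoubleVector) method, and the DenseDoubleVector class behave as in the Hama ML API:

No Format
  // load the model persisted at trainedModelPath by the training job above
  SmallLayeredNeuralNetwork trainedModel = new SmallLayeredNeuralNetwork(trainedModelPath);

  // build an input vector whose dimension equals featureDimension
  double[] features = new double[featureDimension];
  // ... fill features with the instance to evaluate ...
  DoubleVector input = new DenseDoubleVector(features);

  // run a feed-forward pass; the result holds the output-layer activations
  DoubleVector output = trainedModel.getOutput(input);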

The parameters that can be passed in the trainingParameters map above are listed as follows (all of them are optional):

  • training.max.iterations: the maximum number of iterations (a.k.a. epochs) for training.
  • training.batch.size: since mini-batch updates are used for training, this parameter specifies how many training instances are consumed per update.
  • convergence.check.interval: if this parameter is set, the model is checked for convergence whenever the iteration count is a multiple of this value; if the convergence condition is satisfied, training terminates immediately (see the example after this list).
  • tasks: the number of concurrent tasks.
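
For example, to check for convergence every 50 iterations instead of always running training.max.iterations to the end, the trainingParameters map from the sample above could be extended as follows before calling train() (the interval value 50 is only illustrative):

No Format
  // check the convergence condition every 50 iterations;
  // training stops early once it is satisfied
  trainingParameters.put("convergence.check.interval", "50");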

Two-class learning problem

...