...
MLP can be used for both regression and classification. For both tasks, we first need to initialize the MLP model by specifying its parameters. The parameters are listed as follows:
Parameter | Description |
model path | The path specifying the location where the model is stored. |
learning rate | Controls how aggressively the model learns. A large learning rate can accelerate training, but may also cause oscillation. Typically in range (0, 1). |
regularization | Controls the complexity of the model. A large value can decrease the variance of the model but increase the bias at the same time. Typically a very small value, e.g. 0.001. |
momentum | Controls the speed of training. A large momentum can accelerate training, but may also mislead the model update. Typically in range [0.5, 1). |
squashing function | The activation function used by the MLP. Candidate squashing functions: sigmoid, tanh. |
cost function | Evaluates the error made during training. Candidate cost functions: squared error, cross entropy (logistic). |
layer size array | The number of neurons in each layer, from the input layer to the output layer. |
Train the model
For training, the following things need to be specified:
- The model topology: the number of neurons (besides the bias neuron) in each layer, whether the current layer is the final layer, and the type of squashing function.
- The learning rate: specifies how aggressively the model learns from the training instances. A large value can accelerate the learning process but decreases the chance of convergence. Recommended range: (0, 0.5].
- The momentum weight: similar to the learning rate, a large momentum weight can accelerate the learning process but decreases the chance of convergence. Recommended range: (0, 0.5].
- The regularization weight: a large value can decrease the variance of the model but increases the bias at the same time. As this parameter is sensitive, it is best set to a very small value, say, 0.001.
The following sample code shows how to initialize a model:
String modelPath = "/tmp/xorModel-training-by-xor.data";
double learningRate = 0.6;
double regularization = 0.02;
double momentum = 0.3;
String squashingFunctionName = "Tanh";
String costFunctionName = "SquaredError";
int[] layerSizeArray = new int[] { 2, 5, 1 };
SmallMultiLayerPerceptron mlp = new SmallMultiLayerPerceptron(learningRate,
    regularization, momentum, squashingFunctionName, costFunctionName, layerSizeArray);
The following sample code shows how to train a model:
SmallLayeredNeuralNetwork ann = new SmallLayeredNeuralNetwork();
ann.setLearningRate(0.1); // set the learning rate
ann.setMomemtumWeight(0.1); // set the momentum weight
// initialize the topology of the model, a three-layer model is created in this example
ann.addLayer(featureDimension, false, FunctionFactory.createDoubleFunction("Sigmoid"));
ann.addLayer(featureDimension, false, FunctionFactory.createDoubleFunction("Sigmoid"));
ann.addLayer(labelDimension, true, FunctionFactory.createDoubleFunction("Sigmoid"));
// set the cost function to evaluate the error
ann.setCostFunction(FunctionFactory.createDoubleDoubleFunction("CrossEntropy"));
String trainedModelPath = ...;
ann.setModelPath(trainedModelPath); // set the path to store the trained model
// add training parameters
Map<String, String> trainingParameters = new HashMap<String, String>();
trainingParameters.put("tasks", "5"); // the number of concurrent tasks
trainingParameters.put("training.max.iterations", "" + iteration); // the number of maximum iterations
trainingParameters.put("training.batch.size", "300"); // the number of training instances read per update
ann.train(new Path(trainingDataPath), trainingParameters);
The parameters related to training are listed as follows (all of these parameters are optional):
Parameter | Description |
training.max.iterations | The maximum number of iterations (a.k.a. epoch) for training. |
training.batch.size | As mini-batch update is leveraged for training, this parameter specifies how many training instances are used in one batch. |
convergence.check.interval | If this parameter is set, convergence is checked whenever the iteration number is a multiple of this value. If the convergence condition is satisfied, training terminates immediately. |
tasks | The number of concurrent tasks. |
Use the trained model
Once the model is trained and stored, it can be reused later.
String modelPath = ...; // the location of the existing model
DoubleVector features = ...; // the features of an instance
SmallLayeredNeuralNetwork ann = new SmallLayeredNeuralNetwork(modelPath);
DoubleVector labels = ann.getOutput(features); // the label evaluated by the model
Two class learning problem
In machine learning, two-class (binary) learning is a kind of supervised learning problem. Given the instances, the goal of the classifier is to classify them into one of two classes.
Example: XOR problem
To be added...
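While the Hama-based XOR example is still to be added, the underlying technique can be illustrated with a self-contained sketch. The class below (all names hypothetical, not Hama's API) trains a tiny 2-4-1 sigmoid MLP on the XOR instances with plain stochastic backpropagation and squared error, mirroring the topology and cost function used in the initialization example above.

```java
import java.util.Random;

// Minimal 2-4-1 sigmoid MLP trained on XOR with plain stochastic
// backpropagation. Illustrative sketch only; this is not Hama's API.
public class XorSketch {
    static final int H = 4;                 // number of hidden neurons
    final double[][] w1 = new double[H][3]; // hidden weights; last column is the bias
    final double[] w2 = new double[H + 1];  // output weights; last entry is the bias

    XorSketch(long seed) {
        Random r = new Random(seed);
        for (double[] row : w1)
            for (int j = 0; j < 3; j++) row[j] = r.nextDouble() - 0.5;
        for (int j = 0; j <= H; j++) w2[j] = r.nextDouble() - 0.5;
    }

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // forward pass; fills `hidden` with the hidden-layer activations
    double forward(double[] x, double[] hidden) {
        for (int i = 0; i < H; i++)
            hidden[i] = sigmoid(w1[i][0] * x[0] + w1[i][1] * x[1] + w1[i][2]);
        double out = w2[H];
        for (int i = 0; i < H; i++) out += w2[i] * hidden[i];
        return sigmoid(out);
    }

    // one epoch of stochastic backprop; returns the mean squared error
    double trainEpoch(double[][] xs, double[] ys, double lr) {
        double err = 0;
        double[] hidden = new double[H];
        for (int n = 0; n < xs.length; n++) {
            double y = forward(xs[n], hidden);
            double delta = (y - ys[n]) * y * (1 - y); // output-layer delta
            err += (y - ys[n]) * (y - ys[n]);
            for (int i = 0; i < H; i++) {
                // hidden-layer delta uses the pre-update output weight
                double dh = delta * w2[i] * hidden[i] * (1 - hidden[i]);
                w2[i] -= lr * delta * hidden[i];
                w1[i][0] -= lr * dh * xs[n][0];
                w1[i][1] -= lr * dh * xs[n][1];
                w1[i][2] -= lr * dh;
            }
            w2[H] -= lr * delta;
        }
        return err / xs.length;
    }

    public static void main(String[] args) {
        double[][] xs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[] ys = { 0, 1, 1, 0 };
        XorSketch net = new XorSketch(42);
        double e = 0;
        for (int epoch = 0; epoch < 10000; epoch++)
            e = net.trainEpoch(xs, ys, 0.5);
        System.out.println("final mean squared error: " + e);
    }
}
```

The same problem is what the distributed Hama trainer solves; here the whole loop runs in a single process for clarity.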
Multi class learning problem
In machine learning, multiclass or multinomial classification is the problem of classifying instances into more than two classes.
Example:
To be added...
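Although the example itself is still to be added, it may help to note how multiclass problems are usually posed for an MLP: each class index is encoded as a one-hot target vector (so the output layer has one neuron per class), and the predicted class is recovered from the network output by taking the argmax. The sketch below uses hypothetical helper names and is not part of Hama.

```java
// One-hot encoding/decoding helpers for multiclass MLP training.
// Hypothetical helper names; not part of Hama's API.
public class OneHotSketch {
    // encode class index k out of n classes as a one-hot target vector
    static double[] encode(int k, int n) {
        double[] v = new double[n];
        v[k] = 1.0;
        return v;
    }

    // decode a network output vector back to a class index via argmax
    static int decode(double[] out) {
        int best = 0;
        for (int i = 1; i < out.length; i++)
            if (out[i] > out[best]) best = i;
        return best;
    }

    public static void main(String[] args) {
        double[] target = encode(2, 4);        // class 2 of 4 -> [0, 0, 1, 0]
        System.out.println(decode(target));    // prints 2
        System.out.println(decode(new double[] {0.1, 0.7, 0.2})); // prints 1
    }
}
```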
Regression problem
From a machine learning perspective, a regression problem can be considered as a classification problem where the class label is a continuous value.
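One practical consequence of treating regression as classification with a continuous label: a sigmoid output layer produces values in (0, 1), so a continuous target is typically rescaled into that range before training and mapped back when reading predictions. A minimal sketch with hypothetical helper names (not part of Hama):

```java
// Min-max scaling of a continuous regression target so that a sigmoid
// output layer can fit it. Hypothetical helper names; not Hama's API.
public class TargetScalingSketch {
    // map a continuous target into [0, 1]
    static double scale(double y, double min, double max) {
        return (y - min) / (max - min);
    }

    // map a network output back to the original target range
    static double unscale(double s, double min, double max) {
        return min + s * (max - min);
    }

    public static void main(String[] args) {
        double scaled = scale(75.0, 0.0, 300.0);
        System.out.println(scaled);                       // prints 0.25
        System.out.println(unscale(scaled, 0.0, 300.0));  // prints 75.0
    }
}
```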
Example: Predict the sunspot activity
...