Hello,
I am working on a university project in which a very large neural network has to be trained (84000 input and output units in total). It may be that only someone who knows artificial neural networks can help, but perhaps the error can be worked out from the code alone. Unfortunately my Java is not good enough to track it down myself. Anyway...
I have the source code of a network that currently looks like this: 7 input units, 2 hidden layers with 10 units each, and 1 output unit. In the source code this is implemented as follows:
Code:
// create a multilayer perceptron with four layers:
// one input layer with seven units; two hidden layers
// each with ten units, using tanh function;
// one output layer with one unit using tanh function.
// except for noofneurons, all parameters for the input layer
// are ineffectual.
int[] noofneurons = {7,10,10,1};
double[] learnratecoeff = {1, 1, 1, 1};
char[] axonfamily = {'t', 't', 't', 't'};
double[] momentumrate = {0, .6, .5, .4};
double[] flatness = {1, 1.2, 1.1, 1};
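For orientation, the size of such a fully connected net follows directly from noofneurons: the neuron count is the sum of the layer sizes, and the synapse count is the sum of the products of adjacent layer sizes. This is the same arithmetic the NeuralNet constructor uses to dimension its arrays. A minimal sketch of that calculation follows; LayerCounts and its methods are hypothetical names, not part of the posted sources.

```java
// Hypothetical helper, not part of the posted sources: counts the neurons
// and synapses of a fully connected multilayer perceptron.
class LayerCounts {
    // Sum of all layer sizes.
    static long countNeurons(int[] noofneurons) {
        long total = 0;
        for (int n : noofneurons) {
            total += n;
        }
        return total;
    }

    // Every neuron of layer i is connected to every neuron of layer i+1.
    static long countSynapses(int[] noofneurons) {
        long total = 0;
        for (int i = 0; i < noofneurons.length - 1; i++) {
            total += (long) noofneurons[i] * noofneurons[i + 1];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] noofneurons = {7, 10, 10, 1};
        System.out.println("neurons:  " + countNeurons(noofneurons));
        System.out.println("synapses: " + countSynapses(noofneurons));
    }
}
```

For {7,10,10,1} this gives 28 neurons and 180 synapses. Because the synapse count grows with the product of adjacent layer sizes, it rises quickly once layers reach tens of thousands of units.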
The lower lines specify the learning rate, momentum and so on. Changing the number of hidden units is no problem; the network keeps working. But it fails as soon as I change the number of input or output units, say to 7 input and 2 output units. For that configuration I supply the following input:
------------
41.5;-0.5;5.017;1.6615;10.6945;10.0855;5.5695;2.234;-0.768
------------
The first 7 values are the inputs, and "2.234;-0.768" are applied to the outputs.
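To make that row layout concrete, here is a minimal sketch of splitting one semicolon-separated row into input and target values. RowSplitter is a hypothetical name; unlike the posted PatternSet, this sketch rejects a row with too few columns with an explicit message instead of running past the end of the column array.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the posted sources.
class RowSplitter {
    // Returns {input, target}; the first noofinputs columns are inputs,
    // the next nooftargets columns are targets.
    static double[][] split(String line, int noofinputs, int nooftargets) {
        String[] column = line.split(";");
        int needed = noofinputs + nooftargets;
        if (column.length < needed) {
            throw new IllegalArgumentException(
                "expected " + needed + " columns but found " + column.length);
        }
        double[] input = new double[noofinputs];
        double[] target = new double[nooftargets];
        for (int i = 0; i < noofinputs; i++) {
            input[i] = Double.parseDouble(column[i].trim());
        }
        for (int i = 0; i < nooftargets; i++) {
            target[i] = Double.parseDouble(column[noofinputs + i].trim());
        }
        return new double[][] {input, target};
    }

    public static void main(String[] args) {
        String row = "41.5;-0.5;5.017;1.6615;10.6945;10.0855;5.5695;2.234;-0.768";
        double[][] parts = split(row, 7, 2);
        System.out.println("input:  " + Arrays.toString(parts[0]));
        System.out.println("target: " + Arrays.toString(parts[1]));
    }
}
```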
In any case, I get the following error message:
Code:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 49305
at PatternSet.<init>(PatternSet.java:33)
at GoogleNet.main(GoogleNet.java:54)
With a different configuration of the input or output units I have also gotten this error message:
Code:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 7
at NeuralNet.FeedForward(NeuralNet.java:202)
at NeuralNet.Output(NeuralNet.java:215)
at NeuralNet.CrossValErrorRatio(NeuralNet.java:232)
at GoogleNet.main(GoogleNet.java:59)
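Both traces are ArrayIndexOutOfBoundsExceptions, meaning an array was indexed at or past its length. One candidate cause, offered as an assumption rather than a diagnosis, is a mismatch between the layer sizes in noofneurons and the noofinputs / nooftargets arguments passed to PatternSet. In the GoogleNet listing the net is created with {7,2,2} while each PatternSet is created with 7 inputs and 1 target. A minimal guard along the following lines could report such a mismatch before training starts; DimensionCheck and describeMismatch are hypothetical names.

```java
// Hypothetical guard, not part of the posted sources.
class DimensionCheck {
    // Returns null when the net and the pattern layout agree,
    // otherwise a short description of the first mismatch found.
    static String describeMismatch(int[] noofneurons, int noofinputs, int nooftargets) {
        int inputunits = noofneurons[0];
        int outputunits = noofneurons[noofneurons.length - 1];
        if (inputunits != noofinputs) {
            return "net has " + inputunits + " input units, patterns have "
                + noofinputs + " inputs";
        }
        if (outputunits != nooftargets) {
            return "net has " + outputunits + " output units, patterns have "
                + nooftargets + " targets";
        }
        return null;
    }

    public static void main(String[] args) {
        int[] noofneurons = {7, 2, 2};
        // the combination used in the GoogleNet listing
        System.out.println(describeMismatch(noofneurons, 7, 1));
        // a consistent combination
        System.out.println(describeMismatch(noofneurons, 7, 2));
    }
}
```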
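Since the three files "GoogleNet.java", "PatternSet.java" and "NeuralNet.java" appear to be the ones causing problems, I have attached them below.
GoogleNet.java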
Code:
import java.io.*;
public class GoogleNet {
    public static void main(String args[]) {
        int No = 1; // training cycle counter
        Randomizer randomizer = new Randomizer();
        // create a multilayer perceptron with three layers:
        // one input layer with seven units; one hidden layer
        // with two units, using tanh function;
        // one output layer with two units using tanh function.
        // except for noofneurons, all parameters for the input layer
        // are ineffectual.
        int[] noofneurons = {7,2,2};
        double[] learnratecoeff = {1, 1, 1};
        char[] axonfamily = {'t', 't', 't'};
        double[] momentumrate = {0, .6, .4};
        double[] flatness = {1, 1.2, 1};
        System.out.println("Creating the net");
        NeuralNet mynet = new NeuralNet(noofneurons, learnratecoeff, axonfamily, momentumrate, flatness, randomizer);
        // Save the configuration to a file
        System.out.println("Saving the configuration");
        try{mynet.SaveConfig("example2.nnc");}catch(IOException e){}
        // create three pattern sets with 7 input and 1 output values.
        System.out.println("Loading patterns");
        // first create a pattern set for training
        PatternSet trainingpatterns = new PatternSet("example2_training.csv", 7, 1, 1, 0, 0, randomizer);
        // then create a pattern set for cross validation
        PatternSet crossvalpatterns = new PatternSet("example2_crossval.csv", 7, 1, 0, 1, 0, randomizer);
        // and then create a pattern set for testing
        PatternSet testpatterns = new PatternSet("example2_test.csv", 7, 1, 0, 0, 1, randomizer);
        // show the error ratio before training
        System.out.println("Error ratio before training: " + mynet.CrossValErrorRatio(crossvalpatterns) );
        // train the net using mini batch training
        System.out.println("Beginning mini batch training");
        double temp_err;
        temp_err = mynet.CrossValErrorRatio(crossvalpatterns);
        while (temp_err > .02) {
            System.out.println("Cycles: " + No + " Training the net. Error ratio: " + temp_err );
            mynet.MinibatchTrainPatterns(trainingpatterns.trainingpatterns, .1, 20);
            temp_err = mynet.CrossValErrorRatio(crossvalpatterns);
            No++;
        }
        /*
        // or you can use incremental training:
        System.out.println("Beginning incremental training");
        double temp_err;
        temp_err = mynet.CrossValErrorRatio(crossvalpatterns);
        while (temp_err > .02) {
            System.out.println("Training the net. Error ratio: " + temp_err );
            mynet.IncrementalTrainPatterns(trainingpatterns.trainingpatterns, .01);
            temp_err = mynet.CrossValErrorRatio(crossvalpatterns);
        }
        */
        // finally, check the error using test data
        System.out.println("Error ratio of the test data: " + mynet.TestErrorRatio(testpatterns) );
        System.out.println("Training is over");
        // now that the training is over, save the weights of the net.
        System.out.println("Saving the weights\n");
        try{mynet.SaveWeights("example2.nnw");}catch(IOException e){}
        // clean up the objects
        trainingpatterns = null;
        crossvalpatterns = null;
        testpatterns = null;
        mynet = null;
        randomizer = null;
        // ------------------------------------------------------------
        // now recreate the net using previously saved data and
        // test it.
        // recreate the net
        randomizer = new Randomizer();
        System.out.println("Recreating the net");
        mynet = new NeuralNet("example2.nnc", randomizer);
        mynet.LoadWeights("example2.nnw");
        // and test it
        double[] inputs = {-19.5,0.5,-0.683,0.4615,-2.0055,-0.8145,-2.0305};
        System.out.println("Feeding the net using data taken from a female crab");
        double[] outputs = mynet.Output(inputs);
        if ( outputs[0] < 0 ) {
            System.out.println("Result: Male");
        }
        else {
            System.out.println("Result: Female");
        }
        mynet = null;
    }
}
PatternSet.java:
Code:
class PatternSet {
    Pattern[] patterns; // whole pattern set including all patterns
    Pattern[] trainingpatterns; // patterns to be used during training
    Pattern[] crossvalpatterns; // patterns to be used during cross validation
    Pattern[] testpatterns; // patterns to be used during testing
    double[] crossvaldeviations;
    double[] testdeviations;
    private Randomizer randomizer;
    // constructor.
    // the three ratio parameters should sum to 1
    public PatternSet (String sourceFile, int noofinputs, int nooftargets, double ratiotraining, double ratiocrossval, double ratiotest, Randomizer randomizer) {
        // load patterns from the file
        // 1st determine how many patterns there are
        LineReader linereader = new LineReader(sourceFile);
        int counter = 0;
        double temp_double;
        while (linereader.NextLineSplitted()){
            try {
                // count numeric data only
                for ( int i = 0; i < (noofinputs + nooftargets); i++) {
                    temp_double = Double.parseDouble(linereader.column[i]);
                }
                counter++;
            }
            catch (NumberFormatException e) {}
        }
        linereader = null;
        patterns = new Pattern[counter];
        // then read each pattern and place it into the array
        double[] temp_in = new double[noofinputs];
        double[] temp_tar = new double[nooftargets];
        linereader = new LineReader(sourceFile);
        counter = 0;
        while (linereader.NextLineSplitted()){
            try { // use numeric data only
                for (int i = 0; i < noofinputs; i++) {
                    temp_in[i] = Double.parseDouble(linereader.column[i]);
                }
                for (int i = noofinputs; i < noofinputs+nooftargets; i++) {
                    temp_tar[i-noofinputs] = Double.parseDouble(linereader.column[i]);
                }
                patterns[counter++] = new Pattern(temp_in, temp_tar);
            }
            catch (NumberFormatException e) {}
        }
        linereader = null;
        // now determine the no. of training, cross val. and test patterns
        trainingpatterns = new Pattern[(int)(Math.round(patterns.length * ratiotraining))];
        crossvalpatterns = new Pattern[(int)(Math.round(patterns.length * ratiocrossval))];
        testpatterns = new Pattern[patterns.length - trainingpatterns.length - crossvalpatterns.length];
        int patterntoselect;
        int patternsnotselected = patterns.length;
        // first create training patterns
        for ( int i = 0; i < trainingpatterns.length; i++ ) {
            patterntoselect = randomizer.random.nextInt(patternsnotselected);
            counter = 0;
            for ( int j = 0; j < patterns.length; j++ ) {
                if ( !patterns[j].selected ) {
                    if ( counter == patterntoselect ) {
                        trainingpatterns[i] = patterns[j];
                        patterns[j].selected = true;
                        patternsnotselected--;
                        break;
                    }
                    counter++;
                }
            }
        }
        // then create cross validation patterns
        for ( int i = 0; i < crossvalpatterns.length; i++ ) {
            patterntoselect = randomizer.random.nextInt(patternsnotselected);
            counter = 0;
            for ( int j = 0; j < patterns.length; j++ ) {
                if ( !patterns[j].selected ) {
                    if ( counter == patterntoselect ) {
                        crossvalpatterns[i] = patterns[j];
                        patterns[j].selected = true;
                        patternsnotselected--;
                        break;
                    }
                    counter++;
                }
            }
        }
        // and then create test patterns
        for ( int i = 0; i < testpatterns.length; i++ ) {
            patterntoselect = randomizer.random.nextInt(patternsnotselected);
            counter = 0;
            for ( int j = 0; j < patterns.length; j++ ) {
                if ( !patterns[j].selected ) {
                    if ( counter == patterntoselect ) {
                        testpatterns[i] = patterns[j];
                        patterns[j].selected = true;
                        patternsnotselected--;
                        break;
                    }
                    counter++;
                }
            }
        }
        // and now switch all patterns as !selected, so that they are ready for training
        for ( int i = 0; i < patterns.length; i++ ) {
            patterns[i].selected = false;
        }
        // calculate average deviations for cross val data as well as test data
        double[] averages = new double[nooftargets];
        crossvaldeviations = new double[nooftargets];
        testdeviations = new double[nooftargets];
        for (int i = 0; i < nooftargets; i++) {
            // first calculate crossval deviations
            averages[i] = 0;
            for (int j = 0; j < crossvalpatterns.length; j++) {
                averages[i] += crossvalpatterns[j].target[i];
            }
            averages[i] /= crossvalpatterns.length;
            // now calculate deviations
            crossvaldeviations[i] = 0;
            for (int j = 0; j < crossvalpatterns.length; j++) {
                crossvaldeviations[i] += Math.abs(crossvalpatterns[j].target[i] - averages[i]);
            }
            crossvaldeviations[i] = crossvaldeviations[i] * 2 / crossvalpatterns.length;
            // then calculate test deviations
            averages[i] = 0;
            for (int j = 0; j < testpatterns.length; j++) {
                averages[i] += testpatterns[j].target[i];
            }
            averages[i] /= testpatterns.length;
            // now calculate deviations
            testdeviations[i] = 0;
            for (int j = 0; j < testpatterns.length; j++) {
                testdeviations[i] += Math.abs(testpatterns[j].target[i] - averages[i]);
            }
            testdeviations[i] = testdeviations[i] * 2 / testpatterns.length;
        }
    }
}
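The deviation values computed at the end of the constructor are twice the mean absolute deviation of each target column around its mean. The error ratio methods in NeuralNet later divide the average absolute error by this value. A minimal standalone sketch of that one calculation follows; Deviation and doubledMeanAbsDeviation are hypothetical names.

```java
// Hypothetical helper, not part of the posted sources.
class Deviation {
    // Twice the mean absolute deviation of the values around their mean.
    // For an empty array both divisions are 0.0 / 0, so the result is NaN.
    static double doubledMeanAbsDeviation(double[] values) {
        double average = 0;
        for (double v : values) {
            average += v;
        }
        average /= values.length;
        double deviation = 0;
        for (double v : values) {
            deviation += Math.abs(v - average);
        }
        return deviation * 2 / values.length;
    }

    public static void main(String[] args) {
        System.out.println(doubledMeanAbsDeviation(new double[] {1, 3}));
        System.out.println(doubledMeanAbsDeviation(new double[] {}));
    }
}
```

The empty case matters here because each PatternSet in GoogleNet puts all of its patterns into a single subset. The deviations of the two empty subsets therefore come out as NaN rather than raising an exception.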
and finally:
NeuralNet.java
Code:
import java.io.*;
public class NeuralNet {
Neuron[] neurons;
Synapse[] synapses;
int nolayers; // no of layers, inc. input and output layers
Layer[] layers;
private Randomizer randomizer;
// constructor
// opens the configuration file and creates a net according to it.
public NeuralNet (String path, Randomizer randomizer) {
this.randomizer = randomizer;
LineReader linereader = new LineReader(path);
while (linereader.NextLineSplitted()){
// if it declares # of objects, dimension the appropriate array
if (linereader.column[0].compareTo("#neurons") == 0) { neurons = new Neuron[Integer.parseInt(linereader.column[1])]; }
if (linereader.column[0].compareTo("#synapses") == 0) { synapses = new Synapse[Integer.parseInt(linereader.column[1])]; }
// if it represents an input neuron, create a neuron object
if (linereader.column[0].compareTo("i") == 0) { neurons[Integer.parseInt(linereader.column[1])] = new Neuron(Integer.parseInt(linereader.column[1])); }
// if it represents a neuron, create a neuron object
if (linereader.column[0].compareTo("n") == 0) { neurons[Integer.parseInt(linereader.column[1])] = new Neuron( Integer.parseInt(linereader.column[1]), Integer.parseInt(linereader.column[2]), Double.parseDouble(linereader.column[3]), linereader.column[4].charAt(0), Double.parseDouble(linereader.column[5]), Double.parseDouble(linereader.column[6]), randomizer ); }
// if it represents a synapse, create a synapse object
if (linereader.column[0].compareTo("s") == 0) { synapses[Integer.parseInt(linereader.column[1])] =
new Synapse(
neurons[Integer.parseInt(linereader.column[2])],
neurons[Integer.parseInt(linereader.column[3])],
randomizer
); }
}
linereader = null;
// first find out how many layers there are
int temp_maxlayer = 0;
for (int i = 0; i < neurons.length; i++) {
if (neurons[i].layer > temp_maxlayer) {temp_maxlayer = neurons[i].layer;}
}
nolayers = temp_maxlayer+1;
// then create layer objects
layers = new Layer[nolayers];
for (int i = 0; i < nolayers; i++) {layers[i] = new Layer(i);}
NeuronsInOut();
}
// another constructor. creates a MULTILAYER PERCEPTRON with
// given no. of layers, no of neurons, learning rates, momentum parameters,
// axonfamilies, flatness. except for noofneurons, all parameters are
// ineffectual for the first layer.
public NeuralNet (int[] noofneurons, double[] learningratecoefficient, char[] axonfamily, double[] momentumrate, double[] axonfuncflatness, Randomizer randomizer) {
this.randomizer = randomizer;
int temp_nooflayers = noofneurons.length;
nolayers = noofneurons.length;
// determine the no of neurons and create the array
int temp_noofneurons = 0;
for ( int i = 0; i < temp_nooflayers; i++ ) {
temp_noofneurons += noofneurons[i];
}
neurons = new Neuron[temp_noofneurons];
// determine the no of synapses and create the array
int temp_noofsynapses = 0;
for ( int i = 0; i < temp_nooflayers-1; i++ ) {
temp_noofsynapses += noofneurons[i] * noofneurons[i+1];
}
synapses = new Synapse[temp_noofsynapses];
// instantiate neurons:
int temp_neuronidcounter = 0;
// first instantiate input neurons
for ( int i = 0; i < noofneurons[0]; i++ ) {
neurons[temp_neuronidcounter] = new Neuron(temp_neuronidcounter);
temp_neuronidcounter++;
}
// then instantiate hidden and output neurons
for ( int i = 1; i < temp_nooflayers; i++ ) {
for ( int j = 0; j < noofneurons[i]; j++ ) {
neurons[temp_neuronidcounter] = new Neuron(temp_neuronidcounter, i, axonfuncflatness[i], axonfamily[i], momentumrate[i], learningratecoefficient[i], randomizer);
temp_neuronidcounter++;
}
}
// then create layer objects
layers = new Layer[temp_nooflayers];
for (int i = 0; i < temp_nooflayers; i++) {layers[i] = new Layer(i);}
// instantiate synapses
int temp_synapseidcounter = 0;
for ( int i = 0; i < temp_nooflayers-1; i++) {
for ( int j = 0; j < layers[i].neurons.length; j++ ) {
for ( int k = 0; k < layers[i+1].neurons.length; k++ ) {
synapses[temp_synapseidcounter++] = new Synapse(layers[i].neurons[j], layers[i+1].neurons[k], randomizer);
}
}
}
NeuronsInOut();
}
// This method is used by constructors only.
// It determines the incoming and outgoing neurons / synapses for each neuron
// and set them in the neuron. This information is to be used later during feed forward and back propagation.
private void NeuronsInOut () {
// and then create neuronsin, neuronsout, synapsesin, synapsesout arrays in the neuron objects
// in order to determine relationships between neurons
Neuron[] temp_neuronsin;
Neuron[] temp_neuronsout;
Synapse[] temp_synapsesin;
Synapse[] temp_synapsesout;
int incounter; int outcounter;
for (int i = 0; i < neurons.length; i++) {
// first determine the dimension of the arrays
temp_neuronsin = null;
temp_neuronsout = null;
incounter = 0; outcounter = 0;
for (int j = 0; j < synapses.length; j++) {
if (synapses[j].sourceunit == neurons[i]) {outcounter++;}
if (synapses[j].targetunit == neurons[i]) {incounter++;}
}
temp_neuronsin = new Neuron[incounter];
temp_synapsesin = new Synapse[incounter];
temp_neuronsout = new Neuron[outcounter];
temp_synapsesout = new Synapse[outcounter];
// then fill each array
incounter = 0; outcounter = 0;
for (int j = 0; j < synapses.length; j++) {
if (synapses[j].sourceunit == neurons[i]) {
temp_neuronsout[outcounter] = synapses[j].targetunit;
temp_synapsesout[outcounter++] = synapses[j];
}
if (synapses[j].targetunit == neurons[i]) {
temp_neuronsin[incounter] = synapses[j].sourceunit;
temp_synapsesin[incounter++] = synapses[j];
}
}
// set them in the neuron
neurons[i].InsOuts(temp_neuronsin, temp_neuronsout, temp_synapsesin, temp_synapsesout);
}
}
// saves the configuration of the net to a file
public void SaveConfig (String path) throws IOException {
File outputFile = new File(path);
FileWriter out = new FileWriter(outputFile);
out.write("// Input units:\n");
// no of neurons
out.write("#neurons;"+neurons.length+"\n");
out.write("// type;ID;layer;flatness;axonfamily;momentum;learningrate\n");
// neurons
for (int i = 0; i < neurons.length; i++) {
if (neurons[i].layer == 0) {
out.write("i;"+i+";0\n");
}
else {
out.write("n;"+i+";"+neurons[i].layer+";"+neurons[i].axonfuncflatness+";"+neurons[i].axonfamily+";"+neurons[i].momentumrate+";"+neurons[i].learningratecoefficient+"\n");
}
}
// synapses
out.write("#synapses;"+synapses.length+"\n");
out.write("// type; ID; sourceunit; targetunit\n");
for (int i = 0; i < synapses.length; i++) {
out.write("s;"+i+";"+synapses[i].sourceunit.id+";"+synapses[i].targetunit.id+"\n");
}
out.close();
}
// loads weights of the net from a file
public void LoadWeights (String path) {
LineReader linereader = new LineReader(path);
while (linereader.NextLineSplitted()) {
// if it's a synapse weight
if (linereader.column[0].compareTo("w") == 0) { synapses[Integer.parseInt(linereader.column[1])].weight = Double.parseDouble(linereader.column[2]); }
// if it's a neuron threshold
if (linereader.column[0].compareTo("t") == 0) { neurons[Integer.parseInt(linereader.column[1])].threshold = Double.parseDouble(linereader.column[2]); }
}
linereader = null;
}
// saves weights to a file
public void SaveWeights (String path) throws IOException {
File outputFile = new File(path);
FileWriter out = new FileWriter(outputFile);
// first write weight of each synapse
for (int i = 0; i < synapses.length; i++) {
out.write("w; "+i+"; "+synapses[i].weight+"\n");
}
out.write("\n");
// then threshold of each neuron
for (int i = 0; i < neurons.length; i++) {
out.write("t; "+i+"; "+neurons[i].threshold+"\n");
}
out.close();
}
// feeds the network forward and updates all the neurons.
public void FeedForward (double[] inputs) {
// feed input values
for (int i = 0; i < layers[0].neurons.length; i++) {
layers[0].neurons[i].output = inputs[i];
}
// begin from the first layer and propagate through layers.
for (int i = 1; i < nolayers; i++) {
// update the output of each neuron in this layer
for (int j = 0; j < layers[i].neurons.length; j++) {
layers[i].neurons[j].UpdateOutput();
}
}
}
// takes an array of input values, put them to the input neurons, feeds the net forward and returns the outputs of the output layer
public double[] Output (double[] inputs) {
FeedForward(inputs);
double[] tempoutputs = new double[layers[nolayers-1].neurons.length];
for(int i = 0; i < layers[nolayers-1].neurons.length; i++) {
tempoutputs[i] = layers[nolayers-1].neurons[i].output;
}
return tempoutputs;
}
// calculates a std error for this net using given cross validation patterns
public double CrossValErrorRatio (PatternSet patternset) {
int noofoutputunits = layers[nolayers-1].neurons.length;
double[] abserrors = new double[noofoutputunits];
for ( int i = 0; i < noofoutputunits; i++ ) { abserrors[i] = 0; }
// calculate avg error for each neuron
double errorratio = 0;
double[] temp_output = new double[noofoutputunits];
for (int j = 0; j < patternset.crossvalpatterns.length; j++) {
temp_output = Output(patternset.crossvalpatterns[j].input);
for (int i = 0; i < noofoutputunits; i++) {
abserrors[i] += Math.abs( temp_output[i] - patternset.crossvalpatterns[j].target[i] );
}
}
for (int i = 0; i < noofoutputunits; i++) {
abserrors[i] /= patternset.crossvalpatterns.length;
errorratio += ( abserrors[i] / patternset.crossvaldeviations[i] );
}
errorratio /= noofoutputunits;
return errorratio;
}
// calculates a std error for this net using given test patterns
public double TestErrorRatio (PatternSet patternset) {
int noofoutputunits = layers[nolayers-1].neurons.length;
double[] abserrors = new double[noofoutputunits];
for ( int i = 0; i < noofoutputunits; i++ ) { abserrors[i] = 0; }
// calculate avg error for each neuron
double errorratio = 0;
double[] temp_output = new double[noofoutputunits];
for (int j = 0; j < patternset.testpatterns.length; j++) {
temp_output = Output(patternset.testpatterns[j].input);
for (int i = 0; i < noofoutputunits; i++) {
abserrors[i] += Math.abs( temp_output[i] - patternset.testpatterns[j].target[i] );
}
}
for (int i = 0; i < noofoutputunits; i++) {
abserrors[i] /= patternset.testpatterns.length;
errorratio += ( abserrors[i] / patternset.testdeviations[i] );
}
errorratio /= noofoutputunits;
return errorratio;
}
// Training methods ------------------------------------------------
// takes all patterns one by one (with random order) and trains the net
// using each one.
public void IncrementalTrainPatterns(Pattern[] patterns, double rate) {
int patternsnottrained = patterns.length; // no of patterns used
int patterntotrain;
int indexofpatterntotrain = -1;
int counter;
// turn all "selected" flags off
for (int i = 0; i < patterns.length; i++) {
patterns[i].selected = false;
}
for (int i = 0; i < patterns.length; i++) {
patterntotrain = randomizer.random.nextInt(patternsnottrained);
// find the index of the pattern to train
counter = 0;
for (int j = 0; j < patterns.length; j++) {
if (!patterns[j].selected) {
if (counter != patterntotrain) {
counter++;
}
else if (counter == patterntotrain) {
indexofpatterntotrain = j;
break;
}
}
}
// train the net using the selected pattern
IncrementalTrain(rate, patterns[indexofpatterntotrain]);
patterns[indexofpatterntotrain].selected = true;
patternsnottrained--;
}
// turn all "selected" flags off again
for (int i = 0; i < patterns.length; i++) {
patterns[i].selected = false;
}
}
// trains the net incrementally.
public void IncrementalTrain(double rate, Pattern pattern) {
// feed fw
FeedForward(pattern.input);
// train the output layer first
for (int j = 0; j < layers[nolayers-1].neurons.length; j++) {
layers[nolayers-1].neurons[j].OutputIncrementalTrain(rate, pattern.target[j]);
}
// train hidden layers
for (int i = nolayers-2; i > 0; i--) {
for (int j = 0; j < layers[i].neurons.length; j++) {
layers[i].neurons[j].HiddenIncrementalTrain(rate);
}
}
}
// selects patterns (quantity: nopatterns) randomly and trains the net using those patterns.
// repeats this until all patterns in the pattern array have been used for training
public void MinibatchTrainPatterns(Pattern[] patterns, double rate, int nopatterns) {
int patternsnottrained = patterns.length; // no of patterns used
if (nopatterns > patterns.length) {nopatterns = patterns.length;}
if (nopatterns < 1) {nopatterns = 1;}
int patterntotrain;
int noofpatternsselected;
Pattern[] patternsselected;
int indexofpatterntotrain = -1;
int[] indexesofpatternstotrain = new int[nopatterns];
int counter;
// turn all "selected" flags off
for (int i = 0; i < patterns.length; i++) {
patterns[i].selected = false;
}
while ( patternsnottrained > 0 ) {
// choose patterns to be used for training and put them in the temp. pattern array
noofpatternsselected = 0;
while ( noofpatternsselected < nopatterns && patternsnottrained > 0 ) {
patterntotrain = randomizer.random.nextInt(patternsnottrained);
patternsnottrained--;
// find the index of the pattern to be used
counter = 0;
for (int i = 0; i < patterns.length; i++) {
if (!patterns[i].selected) {
if (counter != patterntotrain) {
counter++;
}
else if (counter == patterntotrain) {
indexofpatterntotrain = i;
break;
}
}
}
noofpatternsselected++;
indexesofpatternstotrain[noofpatternsselected-1] = indexofpatterntotrain;
patterns[indexofpatterntotrain].selected = true;
}
// train the net using the temp. pattern array
patternsselected = null;
patternsselected = new Pattern[noofpatternsselected];
for (int i = 0; i < noofpatternsselected; i++) {
patternsselected[i] = patterns[indexesofpatternstotrain[i]];
}
BatchTrainPatterns( patternsselected, rate);
}
// turn all "selected" flags off again
for (int i = 0; i < patterns.length; i++) {
patterns[i].selected = false;
}
}
// trains the net using batch training
// takes a number of patterns
public void BatchTrainPatterns(Pattern[] patterns, double rate) {
for (int i = 0; i < patterns.length; i++) {
BatchTrain(rate, patterns[i]);
}
// update weights using cumulative values obtained during batch training
for ( int i = 0; i < neurons.length; i++ ) {
neurons[i].BatchUpdateWeights(patterns.length);
}
}
// trains the net using batch training
// takes only one pattern
public void BatchTrain(double rate, Pattern pattern) {
// feed fw
FeedForward(pattern.input);
// train the output layer first
for (int j = 0; j < layers[nolayers-1].neurons.length; j++) {
layers[nolayers-1].neurons[j].OutputBatchTrain(rate, pattern.target[j]);
}
// train hidden layers
for (int i = nolayers-2; i > 0; i--) {
for (int j = 0; j < layers[i].neurons.length; j++) {
layers[i].neurons[j].HiddenBatchTrain(rate);
}
}
}
// -----------------------------------------------------------------
// represents an array of neurons belonging to the same layer.
class Layer {
Neuron[] neurons;
// constructs a layer object
public Layer(int layerno) {
int counter = 0;
// see how many neurons there are in this layer
for (int i = 0; i < NeuralNet.this.neurons.length; i++) {
if (NeuralNet.this.neurons[i].layer == layerno) {counter++;}
}
// create an array of neurons
this.neurons = new Neuron[counter];
// place neurons
counter = 0;
for (int i = 0; i < NeuralNet.this.neurons.length; i++) {
if (NeuralNet.this.neurons[i].layer == layerno) {
this.neurons[counter++] = NeuralNet.this.neurons[i];
}
}
}
}
}
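MinibatchTrainPatterns picks each pattern by drawing a random rank and then scanning the selected flags, so the work per pass grows roughly with the square of the number of patterns. As a swapped-in alternative, not what the posted code does, the same "every pattern exactly once, in random order, in groups of nopatterns" behaviour can be sketched by shuffling the indexes once. MiniBatches and shuffledBatches are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the posted sources.
class MiniBatches {
    // Splits the indexes 0..nopatterns-1 into randomly ordered batches.
    // Every index appears in exactly one batch; the last batch may be shorter.
    static List<int[]> shuffledBatches(int nopatterns, int batchsize, Random random) {
        if (batchsize < 1) {
            batchsize = 1;
        }
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < nopatterns; i++) {
            indexes.add(i);
        }
        Collections.shuffle(indexes, random);
        List<int[]> batches = new ArrayList<>();
        for (int start = 0; start < nopatterns; start += batchsize) {
            int end = Math.min(start + batchsize, nopatterns);
            int[] batch = new int[end - start];
            for (int i = start; i < end; i++) {
                batch[i - start] = indexes.get(i);
            }
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<int[]> batches = shuffledBatches(45, 20, new Random());
        for (int[] batch : batches) {
            System.out.println("batch of " + batch.length + " patterns");
        }
    }
}
```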
I would be extremely grateful for any help...
Best regards,
Stephan
I forgot something...
If anyone wants to run the network themselves for testing, here are the remaining files:
LineReader.java
Code:
import java.io.*;
// this class reads one line at a time and returns it as a string
public class LineReader
{
    String[] column;
    FileInputStream fis;
    BufferedInputStream bis;
    // constructor
    public LineReader (String path) {
        try {
            fis = new FileInputStream(path);
            bis = new BufferedInputStream(fis);
        }
        catch (IOException e) {}
        /*
        this.inputFile = new File(path);
        try {in = new FileReader(inputFile);} catch (IOException e) {}
        */
    }
    // read the next line, split it and return the parts as a string array
    public boolean NextLineSplitted () {
        column = null;
        column = NextLine().split(";");
        // compare string contents with equals(), not references with !=
        if (!column[0].equals("#EOF#")) {
            for (int i = 0; i < column.length; i++) {
                column[i] = column[i].trim();
            }
            return true;
        }
        else{return false;}
    }
    // read the next line, return the line as string
    public String NextLine() {
        int i;
        char[] temp_array = new char[50000]; // a line may hold at most 50000 characters
        char[] temp_array2;
        boolean last_line;
        int counter;
        String temp_line = "";
        do {
            temp_array2 = null;
            counter = 0;
            last_line = true;
            // read a line
            try {
                while ( (i = bis.read()) != -1 ) {
                    last_line = false;
                    if (i == 13 || i == 10) {
                        break;
                    }
                    else {
                        temp_array[counter++] = (char)i;
                    }
                }
            }
            catch (IOException e) {}
            // put the array into a string
            if (last_line) {
                temp_line = "#EOF#";
            }
            else if (counter != 0) {
                temp_array2 = new char[counter];
                boolean all_spaces = true;
                for (int j = 0; j < counter; j++) {
                    if (temp_array[j] != ' ') {all_spaces = false;}
                    temp_array2[j] = temp_array[j];
                }
                if (all_spaces) {temp_line = "";}
                else {temp_line = new String(temp_array2);}
                // skip comment lines starting with //
                if (temp_line.length() >= 2 && temp_line.charAt(0) == '/' && temp_line.charAt(1) == '/') {
                    temp_line = "";
                }
            }
            else {
                temp_line = "";
            }
        } while (temp_line.isEmpty()); // skip empty and comment lines
        return temp_line.trim();
    }
}
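NextLine copies each line into a fixed char[50000] buffer, so a row longer than 50000 characters would run past the end of that array. As a sketch of an alternative, assuming it is acceptable to replace the reader, java.io.BufferedReader.readLine returns lines of any length. RowReader below is a hypothetical class; it skips blank lines and lines starting with //, and returns the trimmed columns. It reads from any Reader, so the example needs no file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical alternative to LineReader, not part of the posted sources.
class RowReader {
    private final BufferedReader reader;

    RowReader(Reader source) {
        reader = new BufferedReader(source);
    }

    // Returns the trimmed columns of the next data row, or null at end of input.
    String[] nextRow() {
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("//")) {
                    continue; // skip blank lines and comments
                }
                String[] column = line.split(";");
                for (int i = 0; i < column.length; i++) {
                    column[i] = column[i].trim();
                }
                return column;
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RowReader rows = new RowReader(new StringReader("// comment\n\n1; 2;3\n"));
        String[] column;
        while ((column = rows.nextRow()) != null) {
            System.out.println(String.join("|", column));
        }
    }
}
```

For a file, the same class could be constructed with a java.io.FileReader in place of the StringReader.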
Neuron.java
Code:
public class Neuron {
public int id; // to be used in saveconfig method
public double threshold;
private double prevthreshold;
public int layer;
public double output;
public char axonfamily; // logistic? hyperbolic tangent? linear?
protected double momentumrate;
protected double axonfuncflatness; // if the axon func. is a curve like sigmoid, this indicates the flatness parameter.
protected double learningratecoefficient; // i.e.: if the learning rate is .1 and this value is 1.5, actual learning rate will be .1 * 1.5 = .15
public Neuron[] neuronsout; // array of neurons which take this neuron's output. To be used during back propagation
public Neuron[] neuronsin; // array of neurons from which this neuron takes outputs. To be used during feedforward
public Synapse[] synapsesout; // array of synapses which take this neuron's output. To be used during back propagation
public Synapse[] synapsesin; // array of synapses from which this neuron takes outputs. To be used during feedforward
protected double error; // to be used during bp.
protected double cumulthresholddiff; // cumulate changes in threshold here during batch training
// constructor for input neurons
public Neuron (int id) {
this.id = id;
this.layer = 0;
}
// another constructor
public Neuron (int id, int layer, double axonfuncflatness, char axonfamily, double momentumrate, double learningratecoefficient, Randomizer randomizer) {
output = 0;
this.axonfamily = axonfamily;
threshold = randomizer.Uniform(-1,1);
prevthreshold = threshold;
this.id = id;
this.layer = layer;
this.momentumrate = momentumrate;
this.axonfuncflatness = axonfuncflatness;
this.learningratecoefficient = learningratecoefficient;
cumulthresholddiff = 0;
}
// this method constructs neuronin and neuronout arrays in order to determine the relationships of this neuron with others.
// should be called during the construction of the net
public void InsOuts (Neuron[] neuronsin, Neuron[] neuronsout, Synapse[] synapsesin, Synapse[] synapsesout) {
this.neuronsin = neuronsin;
this.neuronsout = neuronsout;
this.synapsesin = synapsesin;
this.synapsesout = synapsesout;
}
// updates the output and the activation according to the inputs
public void UpdateOutput () {
// first sum inputs and find the activation
double activation = 0;
for (int i = 0; i < neuronsin.length; i++) {
activation += neuronsin[i].output * synapsesin[i].weight;
}
activation += -1 * threshold;
// calculate the output using the activation function of this neuron
switch (axonfamily) {
case 'g': // logistic
output = 1 / ( 1 + Math.exp( - activation / axonfuncflatness ) );
break;
case 't': // tanh-shaped sigmoid; this expression equals tanh( activation / (2 * axonfuncflatness) )
output = ( 2 / ( 1 + Math.exp( - activation / axonfuncflatness ) ) ) - 1;
/* // alternative form; note that it computes tanh( activation / axonfuncflatness ),
// which is steeper than the expression above
double temp = Math.exp( 2 * ( activation / axonfuncflatness ) ) ;
output = ( temp - 1 ) / ( temp + 1 );
*/
break;
case 'l': // linear
output = activation;
break;
}
}
// Incremental train ------------------------------------------
// trains the output neurons using incremental training
public void OutputIncrementalTrain (double rate, double target) {
this.error = (target - output) * Derivative();
IncrementalUpdateWeights(rate);
}
// trains the hidden neurons using incremental training
public void HiddenIncrementalTrain (double rate) {
// first compute the error
double temp_diff = 0;
for (int i = 0; i < neuronsout.length; i++) {
temp_diff += neuronsout[i].error * synapsesout[i].prevweight;
}
error = temp_diff * Derivative();
IncrementalUpdateWeights(rate);
}
// updates weights according to the error
private void IncrementalUpdateWeights (double rate) {
double temp_weight;
for (int i = 0; i < synapsesin.length; i++) {
temp_weight = synapsesin[i].weight;
synapsesin[i].weight += (rate * learningratecoefficient * error * neuronsin[i].output) + ( momentumrate * ( synapsesin[i].weight - synapsesin[i].prevweight ) );
synapsesin[i].prevweight = temp_weight;
if (synapsesin[i].cumulweightdiff != 0) {synapsesin[i].cumulweightdiff = 0;}
}
temp_weight = threshold;
threshold += ( rate * learningratecoefficient * error * -1 ) + ( momentumrate * ( threshold - prevthreshold ) );
prevthreshold = temp_weight;
if (cumulthresholddiff != 0) {cumulthresholddiff = 0;}
}
// Batch train ------------------------------------------------
// trains the output neurons using batch training
public void OutputBatchTrain (double rate, double target) {
this.error = (target - output) * Derivative();
BatchCumulateWeights(rate);
}
// trains the hidden neurons using batch training
public void HiddenBatchTrain (double rate) {
// first compute the error
double temp_diff = 0;
for (int i = 0; i < neuronsout.length; i++) {
temp_diff += neuronsout[i].error * synapsesout[i].weight;
}
error = temp_diff * Derivative();
BatchCumulateWeights(rate);
}
// accumulates the weight changes according to the error
private void BatchCumulateWeights (double rate) {
for (int i = 0; i < synapsesin.length; i++) {
synapsesin[i].cumulweightdiff += rate * learningratecoefficient * error * neuronsin[i].output;
}
cumulthresholddiff += rate * learningratecoefficient * error * -1;
}
// updates weights according to the accumulated weight changes
public void BatchUpdateWeights (int noofepochs) {
double temp_weight;
for (int i = 0; i < synapsesin.length; i++) {
temp_weight = synapsesin[i].weight;
synapsesin[i].weight += ( synapsesin[i].cumulweightdiff / noofepochs ) + ( momentumrate * ( synapsesin[i].weight - synapsesin[i].prevweight ) );
synapsesin[i].prevweight = temp_weight;
synapsesin[i].cumulweightdiff = 0;
}
temp_weight = threshold;
threshold += ( cumulthresholddiff / noofepochs ) + ( momentumrate * ( threshold - prevthreshold ) );
prevthreshold = temp_weight;
cumulthresholddiff = 0;
}
// ------------------------------------------------------------
// returns the value of the derivative of the activation function for the last activation value
public double Derivative () {
double temp_derivative;
switch (axonfamily) {
case 'g': // logistic
temp_derivative = ( output * ( 1 - output ) ) / axonfuncflatness; break;
case 't': // hyperbolic tangent
temp_derivative = ( 1 - Math.pow( output , 2 ) ) / ( 2 * axonfuncflatness ); break;
// temp_derivative = Math.pow( ( 2 / ( Math.exp(activation / axonfuncflatness ) + Math.exp( - activation / axonfuncflatness ) ) ) ,2 ) / axonfuncflatness; break;
case 'l': // linear
temp_derivative = 1; break;
default: temp_derivative = 0; break;
}
return temp_derivative;
}
}
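A quick sanity check for the 't' case above, in case someone wants to verify the activation and its derivative separately from the rest of the net. This is only a sketch: the class ActivationCheck and its method names are made up for illustration and are not part of the posted files. It copies the two formulas from UpdateOutput() and Derivative() and compares them with Math.tanh and with a numeric derivative.

```java
// Sketch only: ActivationCheck and its method names are hypothetical, not part of the posted files.
final class ActivationCheck {
    // Same formula as case 't' in Neuron.UpdateOutput()
    static double output(double activation, double flatness) {
        return (2 / (1 + Math.exp(-activation / flatness))) - 1;
    }

    // Same formula as case 't' in Neuron.Derivative()
    static double derivative(double output, double flatness) {
        return (1 - output * output) / (2 * flatness);
    }

    // Central finite difference of output() with respect to the activation
    static double numericDerivative(double activation, double flatness) {
        double h = 1e-6;
        return (output(activation + h, flatness) - output(activation - h, flatness)) / (2 * h);
    }

    public static void main(String[] args) {
        double flatness = 1.5; // arbitrary value chosen for the demonstration
        for (double a = -2.0; a <= 2.0; a += 1.0) {
            double out = output(a, flatness);
            System.out.printf("a=%.1f out=%.6f tanh=%.6f d=%.6f numeric=%.6f%n",
                    a, out, Math.tanh(a / (2 * flatness)),
                    derivative(out, flatness), numericDerivative(a, flatness));
        }
    }
}
```

If the two pairs of columns agree, the active formula is tanh( activation / ( 2 * axonfuncflatness ) ) and the derivative used during training belongs to it.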
Code:
class Pattern {
public double[] input;
public double[] target;
public boolean selected;
// constructor
public Pattern (double[] temp_inputs, double[] temp_targets) {
input = new double[temp_inputs.length];
target = new double[temp_targets.length];
for ( int i = 0; i < temp_inputs.length; i++) {
input[i] = temp_inputs[i];
}
for ( int i = 0; i < temp_targets.length; i++) {
target[i] = temp_targets[i];
}
selected = false;
}
}
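Since Pattern just copies whatever arrays it is given, a mismatch between the number of values per line and the number of input and output units only shows up later. The following is a minimal sketch of how one line could be split and checked before a Pattern is built. PatternLineSplitter and split() are hypothetical names for illustration; this is not the code from PatternSet.java.

```java
import java.util.Arrays;

// Sketch only: PatternLineSplitter and split() are hypothetical names, not taken from PatternSet.java.
final class PatternLineSplitter {
    // Splits one semicolon-separated line into {inputs, targets} and
    // fails early if the value count does not match the unit counts of the net.
    static double[][] split(String line, int noOfInputs, int noOfTargets) {
        String[] fields = line.trim().split(";");
        if (fields.length != noOfInputs + noOfTargets) {
            throw new IllegalArgumentException("expected " + (noOfInputs + noOfTargets)
                    + " values but found " + fields.length);
        }
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        double[] inputs = Arrays.copyOfRange(values, 0, noOfInputs);
        double[] targets = Arrays.copyOfRange(values, noOfInputs, values.length);
        return new double[][] { inputs, targets };
    }

    public static void main(String[] args) {
        // The example line from the post: 7 inputs followed by 2 targets
        String line = "41.5;-0.5;5.017;1.6615;10.6945;10.0855;5.5695;2.234;-0.768";
        double[][] parts = split(line, 7, 2);
        System.out.println(Arrays.toString(parts[0]));
        System.out.println(Arrays.toString(parts[1]));
    }
}
```

The two arrays could then be handed to new Pattern(parts[0], parts[1]). A line with the wrong number of values produces a clear message instead of an index error somewhere inside the net.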
Code:
import java.util.Random;
public class Randomizer {
public Random random;
// constructor using system clock
public Randomizer () {
random = new Random();
}
// constructor using seed
public Randomizer ( long seed ) {
random = new Random(seed);
}
// Method to generate a uniformly distributed value in the range [min, max)
public double Uniform (double min, double max) { // throws ...
return ( random.nextDouble() * (max - min) ) + min;
}
}
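For debugging it may help that the seeded constructor makes the initial weights and thresholds repeatable. A small sketch of that behaviour, assuming nothing beyond java.util.Random; UniformRangeCheck is a hypothetical helper that repeats the formula of Uniform, not one of the posted files.

```java
import java.util.Random;

// Sketch only: UniformRangeCheck is a hypothetical helper, not one of the posted files.
final class UniformRangeCheck {
    // Same formula as Randomizer.Uniform
    static double uniform(Random random, double min, double max) {
        return (random.nextDouble() * (max - min)) + min;
    }

    // Draws n values from a generator created with the given seed
    static double[] draw(long seed, int n, double min, double max) {
        Random random = new Random(seed);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = uniform(random, min, max);
        }
        return values;
    }

    public static void main(String[] args) {
        double[] first = draw(42L, 5, -1, 1);
        double[] second = draw(42L, 5, -1, 1);
        for (int i = 0; i < first.length; i++) {
            // Each line shows the value and whether the second run produced the same one
            System.out.println(first[i] + " " + (first[i] == second[i]));
        }
    }
}
```

With a fixed seed, two runs start from the same random values, so an error that depends on the initial weights can be reproduced.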
Code:
public class Synapse
{
public double weight;
public double prevweight; // to be used during bp
public double cumulweightdiff; // cumulate changes in weight here during batch training
public Neuron sourceunit;
public Neuron targetunit;
// constructor
public Synapse (Neuron sourceunit, Neuron targetunit, Randomizer randomizer)
{
this.sourceunit = sourceunit;
this.targetunit = targetunit;
cumulweightdiff = 0;
weight = randomizer.Uniform(-1,1);
prevweight = weight;
}
}
That's quite a lot, I know... :-(