
Having problems with neural network training in C#

Started by November 19, 2010 08:59 AM
22 comments, last by sjaakiejj 14 years, 2 months ago


This is the function to calculate the 'error' for the output layer:
ErrorOut = Output * (Activation_Potential)*(Target - Output)

Though the more correct wording would be delta.

This is the function for all the other layers
Error = Output * ActivationPotential * SUM(Weights * ErrorOut)
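Written out as code, those two formulas might look something like this. This is a minimal sketch under the assumption of a sigmoid activation, so that Output * (1 - Output) is the derivative; the helper names are mine for illustration, not from the thread:

```csharp
using System;

class DeltaSketch
{
    // Delta for an output neuron:
    // ErrorOut = Output * (1 - Output) * (Target - Output)
    public static float OutputDelta(float output, float target)
    {
        return output * (1 - output) * (target - output);
    }

    // Delta for a hidden neuron: the derivative times the weighted
    // sum of the deltas from the layer above.
    // Error = Output * (1 - Output) * SUM(Weights * ErrorOut)
    public static float HiddenDelta(float output, float[] weightsToNext, float[] nextDeltas)
    {
        float sum = 0;
        for (int i = 0; i < weightsToNext.Length; i++)
            sum += weightsToNext[i] * nextDeltas[i];
        return output * (1 - output) * sum;
    }

    static void Main()
    {
        // 0.8 * (1 - 0.8) * (1.0 - 0.8) = 0.032
        Console.WriteLine(OutputDelta(0.8f, 1.0f));
        Console.WriteLine(HiddenDelta(0.5f, new float[] { 1f }, new float[] { 0.032f }));
    }
}
```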


This is the one you are using.
    //calculate errors
    for (int c = 0; c < Y.Length / 2; c++)
    {
        errors[c] = TS[a, 1, c] - Y[c, 0];
    }
    //last level (Y)
    for (int c = 0; c < Y.Length / 2; c++)
    {
        Y[c, 1] = (float)(errors[c] * B * (1 - Math.Pow(Y[c, 0], 2)));
    }
    //then change the values
    for (int d = 0; d < Y.Length / 2; d++)
    {
        //error*Beta*(1-Y^2)*sum
        H3[c, 1] = (float)(B * (1 - Math.Pow(Y[d, 0], 2)) * sum);
    }


Now, to be honest with you, this part of the code doesn't make any sense. You're using an array of inputs into the first layer to calculate your delta? Then you use Beta to calculate Y, and then use Beta again to calculate the error? If you look at the equations, the activation levels from the previous layer are used instead of TS, and Beta isn't used in the calculation of the delta at all.

In pseudo code the Neural Network would probably look something like this:

//The following code belongs to a loop which goes through every sample in the traindata.
for each node in InputLayer
  activationLevel[Node] = neuron(Weights(Node), TD(Node))
end
for each node in otherLayers
  activationLevel[Node] = neuron(Weights(Node), activationLevel[prevNodes])
end
//Output Layer only has one delta
deltas[outputLayer] = activationLevel[outNode] * (activationPotential) * (labels - activationLevel[outNode])
for each node in lastHiddenLayer
  deltas[node] = activationLevel[node] * activationPotential * WeightsToOutput * deltas[outputLayer]
end
//The rest is pretty similar, you can work that out by yourself
//After you get the deltas, you update the weights. Then you iterate through this loop
//again for the next sample.


Since I only made that pseudo code quickly, there may be some mistakes in it, but you get the general idea. Hope this helps.
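In C#, one full training step along the lines of that pseudo code could be sketched like this. This is a minimal sketch under the assumptions of a single hidden layer, a sigmoid activation (so the derivative is out * (1 - out)), and made-up array names, not the poster's actual variables:

```csharp
using System;

class BackPropSketch
{
    public static float Sigmoid(float x) => 1f / (1f + (float)Math.Exp(-x));

    // One training step for a 2-layer net (inputs -> hidden -> outputs).
    // wHidden[j, i]: weight from input i to hidden neuron j.
    // wOut[k, j]: weight from hidden neuron j to output neuron k.
    public static void TrainStep(float[] input, float[] target,
                                 float[,] wHidden, float[,] wOut, float learningRate)
    {
        int nHidden = wHidden.GetLength(0);
        int nOut = wOut.GetLength(0);

        // Forward pass: activation levels for hidden and output layers.
        var hidden = new float[nHidden];
        for (int j = 0; j < nHidden; j++)
        {
            float sum = 0;
            for (int i = 0; i < input.Length; i++)
                sum += wHidden[j, i] * input[i];
            hidden[j] = Sigmoid(sum);
        }
        var output = new float[nOut];
        for (int k = 0; k < nOut; k++)
        {
            float sum = 0;
            for (int j = 0; j < nHidden; j++)
                sum += wOut[k, j] * hidden[j];
            output[k] = Sigmoid(sum);
        }

        // Output deltas: out * (1 - out) * (target - out).
        var deltaOut = new float[nOut];
        for (int k = 0; k < nOut; k++)
            deltaOut[k] = output[k] * (1 - output[k]) * (target[k] - output[k]);

        // Hidden deltas: h * (1 - h) * SUM(weights to output * deltaOut).
        var deltaHidden = new float[nHidden];
        for (int j = 0; j < nHidden; j++)
        {
            float sum = 0;
            for (int k = 0; k < nOut; k++)
                sum += wOut[k, j] * deltaOut[k];
            deltaHidden[j] = hidden[j] * (1 - hidden[j]) * sum;
        }

        // Weight updates: each weight moves by rate * delta * incoming activation.
        for (int k = 0; k < nOut; k++)
            for (int j = 0; j < nHidden; j++)
                wOut[k, j] += learningRate * deltaOut[k] * hidden[j];
        for (int j = 0; j < nHidden; j++)
            for (int i = 0; i < input.Length; i++)
                wHidden[j, i] += learningRate * deltaHidden[j] * input[i];
    }
}
```

Calling TrainStep once per sample, looping over the whole training set repeatedly, matches the "update the weights, then iterate for the next sample" step in the pseudo code.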
Okay, in fact I had noticed some strange code myself, for example:

Y[c, 1] = (float)(errors[c] * B * (1 - Math.Pow(Y[c, 0], 2)));

I really don't know where I read the (1 - Y^2); it's complete nonsense!

So, I suppose that the update code for now is 'correct'.
I think I should probably make a pseudocode for everything and THEN transpose it to C# code.
Thanks, I'll see what I made out of this and I'll post my results.

Well, let's see if I understood, following your explanation I modified my code in this way:

//########### BACK-PROPAGATION ###########
//calculate errors using optimal output
for (int c = 0; c < Y.Length / 2; c++)
{
    errors[c] = TS[a, 1, c] - Y[c, 0];
}
//last level (Y)
for (int c = 0; c < Y.Length / 2; c++)
{
    //Y[c, 1] = sigma(Y[c, 1]) * errors[c];
    Y[c, 1] = Y[c, 0] * B * errors[c];
}
//level H3
for (int c = 0; c < H3.Length / 2; c++)
{
    float sum = 0;
    //first sum all the values
    for (int d = 0; d < Y.Length / 2; d++)
    {
        sum += W4[(d * (H3.Length / 2)) + c] * Y[d, 1];
    }
    //then change the values
    for (int d = 0; d < Y.Length / 2; d++)
    {
        H3[c, 1] = H1[c, 0] * B * sum;
    }
}


But I have a huge question (my knowledge is stalling here): what do you mean by (Activation_Potential) in these two equations?

ErrorOut = Output * (Activation_Potential)*(Target - Output)
Error = Output * ActivationPotential * SUM(Weights * ErrorOut)

I think you don't mean Beta (as I left it written in my code), but I don't know what else it could be. Thresholds? Or activation functions like BTanh(value)?
Sorry for all the trouble (and sorry for my English).
What can you suggest?
I took Activation Potential from your notes, but it means this:
Activation_Potential = 1 - activationLevel

But yeah I'd definitely suggest working something like this out in Pseudo code first, and then implement it in C#. Neural Networks aren't exactly easy to implement, considering that the one you're trying to implement is actually the simplest form of it.
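For what it's worth, that Output * (1 - activationLevel) product is exactly what falls out of the sigmoid: its derivative is s(x) * (1 - s(x)), which is why the delta formulas start with the activation level times one minus the activation level. A quick numerical check (hypothetical helper code, not from the thread), comparing the analytic derivative against a finite-difference estimate:

```csharp
using System;

class SigmoidCheck
{
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    static void Main()
    {
        // The analytic derivative s(x) * (1 - s(x)) should match a
        // central finite-difference estimate of d/dx sigmoid(x).
        double x = 0.7, eps = 1e-5;
        double s = Sigmoid(x);
        double analytic = s * (1 - s);
        double numeric = (Sigmoid(x + eps) - Sigmoid(x - eps)) / (2 * eps);
        Console.WriteLine(analytic);
        Console.WriteLine(numeric);
    }
}
```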
Ok, sorry to keep bothering you xD
Could you explain your pseudo code to me? I didn't understand the neuron(,) part of it.
Hey I found a way to work on it!!
I've posted my code down here. It's a bit more complex, but it can be modified: I made it so you can build the neural network with the number of layers and neurons you want using AddLayer(), then you add all the weights with ComputeWeights().

class NNOperator
{
    public NNOperator(int inputs, int outputs)
    {
        layers = new List<float[,]>();
        weights = new List<float[]>();
        layers.Add(new float[inputs, 2]);
        layers.Add(new float[outputs, 2]);
    }

    public void AddLayer(int neurons)
    {
        weights.Clear();
        layers.Insert(1, new float[neurons, 2]);
    }

    public void ComputeWeights()
    {
        weights.Clear();
        for (int a = 0; a < layers.Count() - 1; a++)
        {
            weights.Add(new float[layers[a].GetLength(0) * layers[a + 1].GetLength(0)]);
        }
    }

    public List<float[,]> layers;
    public List<float[]> weights;

    public float[] Update(float[] inputs)
    {
        float[] output = new float[layers.Last().GetLength(0)];
        //Copy input values
        for (int a = 0; a < layers[0].GetLength(0); a++)
        {
            layers[0][a, 0] = inputs[a];
        }
        //Cycle through layers
        for (int layer = 0; layer < layers.Count() - 1; layer++)
        {
            //Cycle through neurons of the next layer
            for (int next = 0; next < layers[layer + 1].GetLength(0); next++)
            {
                //Reset neuron temporary value
                layers[layer + 1][next, 0] = 0;
                //Cycle through neurons of the current layer
                for (int prev = 0; prev < layers[layer].GetLength(0); prev++)
                {
                    //H2[a, 0] += H1[b, 0] * W2[(a * H1.Length / 2) + b];
                    layers[layer + 1][next, 0] += layers[layer][prev, 0] * weights[layer][(next * layers[layer].GetLength(0)) + prev];
                }
                //If the temporary value doesn't exceed threshold, reset it
                if (layers[layer + 1][next, 0] < layers[layer + 1][next, 1])
                {
                    layers[layer + 1][next, 0] = 0;
                }
            }
        }
        //Transfer values to output
        for (int a = 0; a < output.Length; a++)
        {
            output[a] = layers.Last()[a, 0];
        }
        return output;
    }
}


For testing the update method I built a neural network like the XOR neural network you see in this page: http://www.heatonresearch.com/course/intro-neural-nets-cs/1
And I set it up with this code:

NNOperator nn = new NNOperator(2, 1);
nn.AddLayer(2);
nn.ComputeWeights();
//let's set it up as a XOR (perfect xor)
nn.weights[0][0] = 1;
nn.weights[0][1] = 1;
nn.weights[0][2] = 1;
nn.weights[0][3] = 1;
nn.layers[1][0, 1] = 1.5f;
nn.layers[1][1, 1] = 0.5f;
nn.weights[1][0] = -1;
nn.weights[1][1] = 1;
nn.layers[2][0, 1] = 0.5f;


And I've tested it by feeding it the input pairs 0,0 0,1 1,0 and 1,1; the result I obtained is a perfect XOR!
A B Output
0 0 0
0 1 1
1 0 1
1 1 0
So the update works perfectly!
Now I only have to work out backpropagation.
What do you think?
Are you doing back propagation because you want to learn how it works? Or would you be happy with any method of neural network learning?

If you would be just as happy with another method, you might check out using genetic algorithms to train neural nets.

It isn't appropriate for all situations but it may be useful for yours, depending on what you are trying to accomplish.

Essentially you have "genes" that define a network (i.e. neuron weights and thresholds, possibly even network topology), and you generate a bunch of these with a random set of genes (or guided randomness, if you wish).

Then you evaluate the performance of each of these networks, take some percentage of the winning networks and a smaller percentage of the non-winning networks, and "mate" them all with some rate of mutation, coming up with the next generation of neural networks.

Rinse and repeat til you have suitable network(s).

Might be OT but wanted to toss it out there in case you weren't aware of the technique.
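The select-mate-mutate loop described above might be sketched like this. It's a minimal sketch with a stand-in fitness function (matching a fixed target vector) in place of a real network evaluation; all names and parameters are made up for illustration:

```csharp
using System;
using System.Linq;

class GaSketch
{
    static Random rng = new Random(42);

    // Stand-in fitness: how close a 2-gene individual gets to a target
    // pair of values (this is where "evaluate the network" would go).
    public static float Fitness(float[] genes)
    {
        float[] target = { 0.3f, -0.7f };
        float error = 0;
        for (int i = 0; i < genes.Length; i++)
            error += Math.Abs(genes[i] - target[i]);
        return -error; // higher is better
    }

    // Crossover: each gene comes from one parent; mutation nudges it.
    static float[] Mate(float[] a, float[] b, float mutationRate)
    {
        var child = new float[a.Length];
        for (int i = 0; i < a.Length; i++)
        {
            child[i] = rng.NextDouble() < 0.5 ? a[i] : b[i];
            if (rng.NextDouble() < mutationRate)
                child[i] += (float)(rng.NextDouble() - 0.5);
        }
        return child;
    }

    public static float Run()
    {
        int popSize = 30;
        // Random initial population of gene vectors in [-1, 1].
        float[][] pop = Enumerable.Range(0, popSize)
            .Select(_ => new[] { (float)(rng.NextDouble() * 2 - 1),
                                 (float)(rng.NextDouble() * 2 - 1) })
            .ToArray();

        for (int gen = 0; gen < 200; gen++)
        {
            // Keep the best half, refill with their offspring.
            float[][] winners = pop.OrderByDescending(Fitness)
                                   .Take(popSize / 2).ToArray();
            pop = winners.Concat(
                    Enumerable.Range(0, popSize / 2)
                        .Select(_ => Mate(winners[rng.Next(winners.Length)],
                                          winners[rng.Next(winners.Length)], 0.1f)))
                .ToArray();
        }
        return pop.Max(Fitness);
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Because the winners are carried over intact each generation, the best fitness never decreases; swapping in "run the network on the training set and score it" as the fitness function gives the training method described above.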
neuron(,) would be the sigma or sigmoid function that you use to calculate your activation level.
Actually, as you can read in my previous reply (not this one), I made a XOR neural network by building the network and setting all the values by hand so that it works as a perfect XOR. What I'd like to learn is how to implement back-propagation training so that, for example, I can create a network with random values and, by training it, see that it ends up working as a XOR.
(Maybe I'll be able to use this class for my games in future)
I gave you all the information you need to implement a Back-Prop neural network, but you'll have to understand the Maths behind it first. Read the link that I posted a few replies back, that one should be clear enough.
Well, that link doesn't seem to work... okay, thank you for your help. I'll try to work it out with the information you gave me.

This topic is closed to new replies.
