Evaluating a Recurrent Neural Network
Let me first apologize for any errors in my terminology; I've only been researching neural nets for two days.
I'm not 100% sure of my understanding of how to evaluate the output of a recurrent neural network (with loop-backs and such).
Assume I have the following neural network:
http://smg.photobucket.com/albums/v643/Kalagaraz/?action=view&current=Ann.jpg
How to evaluate the output of neuron 3 is what confuses me. For simplification, assume output == input (no activation function). The formula in this case would be
3.output = 1.output * b.weight + 3.output * c.weight
in which case I have 3.output on both sides of the equation, so how do I evaluate it?
My ASSUMPTION is that 3.output on the right-hand side would be the output from the previous evaluation. If that is the case, then for the very first evaluation would I just assume it to be 0?
Now, if I'm correct in my previous assumptions, I assume (yeah, I assume a lot) that this is a way of giving the neural network a memory of sorts?
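(To sanity-check that assumption with made-up numbers: say 1.output = 0.5, b.weight = 0.8, and c.weight = 0.2. First evaluation: 3.output = 0.5 * 0.8 + 0 * 0.2 = 0.4. Second evaluation: 3.output = 0.5 * 0.8 + 0.4 * 0.2 = 0.48, so the previous step's value does carry forward into the next one.)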
Anyway, thanks for any help. Trying to implement NEAT while learning from white papers and such isn't very easy for me :).
Pic link doesn't work for me.
Do you absolutely have to use recurrent ANNs? I'd recommend just doing feedforward networks, which keeps things much simpler. When you add new links in NEAT, just make sure they don't go backward.
As for evaluating recurrent networks, it works just like activating a feedforward network:
1) load inputs
2) propagate data across all connections
3) apply activation functions to all nodes
4) read outputs
Depending on how you write your activation function, it should not matter whether the ANN is recurrent.
Well, when I did a feedforward network I could just go through each layer, calculate the outputs, and use those as the inputs for the layer above it. With the recurrent network it's a bit different, though.
In the picture, the main thing I'm concerned about is that node3 loops back on itself, which has me confused about how to evaluate it when I don't know all its inputs, since node3 hasn't been evaluated yet :).
Maybe this picture will work:
http://img177.imageshack.us/my.php?image=annbm8.jpg
Quote: Original post by Kalagaraz
Well, when I did a feedforward network I could just go through each layer, calculate the outputs, and use those as the inputs for the layer above it. With the recurrent network it's a bit different, though.
In the picture, the main thing I'm concerned about is that node3 loops back on itself, which has me confused about how to evaluate it when I don't know all its inputs, since node3 hasn't been evaluated yet :).
maybe this picture will work:
http://img177.imageshack.us/my.php?image=annbm8.jpg
Hi,
Basically, you just use the output of node3 from the previous iteration. In this case there was no previous iteration, so node3 has no previous output; when you calculate the input of node3 you get: other inputs + previous iteration's output of node3 = other inputs + 0.
Not sure if I explained that well enough, but hopefully you get the idea.
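If it helps, here's a minimal compilable sketch of that rule (the weights and values are made up to match the picture: node1 feeds node3 via b, and c is node3's self-loop):

#include <cstdio>

int main()
{
    // made-up weights: b = node1 -> node3, c = node3 -> node3 (self-loop)
    const float b_weight = 0.8f;
    const float c_weight = 0.2f;
    const float node1_output = 0.5f;   // output == input, no activation

    float node3_prev = 0.0f;           // no previous iteration yet -> 0
    for (int step = 1; step <= 3; ++step)
    {
        // the right-hand side only ever reads the PREVIOUS iteration's value
        float node3_output = node1_output * b_weight + node3_prev * c_weight;
        printf("iteration %d: node3.output = %f\n", step, node3_output);
        node3_prev = node3_output;     // becomes "previous" for the next step
    }
    return 0;
}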
Here's the activation function from my own ANN implementation; maybe it will help. If you do your activate() this way, it doesn't matter whether your ANN is feedforward or recurrent. I tried to make it as simple as possible.
Note:
connection variables: from, to, weight
node variables: data, accumulator (holds temp data during activation)
Connections: vector of all connections
Nodes: vector of all nodes in this order - inputs, outputs, inner
//---------------------------------------------------------------------
// activate() - single step CPPN activation
//  - nodes order: |--inputs--|--outputs--|-----inner-------|
//---------------------------------------------------------------------
void CPPN::activate()
{
    int i, from, to;

    // reset node accumulators
    for (i = 0; i < nodes.size(); i++)
    {
        nodes[i].accumulator = 0.0f;
    }

    // propagate data across connections, applying weights
    for (i = 0; i < connections.size(); i++)
    {
        to   = connections[i].to;
        from = connections[i].from;
        nodes[to].accumulator += nodes[from].data * connections[i].weight;
    }

    // apply activation function to each node
    // and save the result to data
    for (i = 0; i < nodes.size(); i++)
    {
        nodes[i].data = (*nodes[i].function_ptr)(nodes[i].accumulator);
    }
}
I used a function pointer since I have 10+ different activation functions. Just replace that last line with your activation function().
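In case a complete, compilable version is useful, here's a stripped-down sketch of the same three passes (stand-in types, a tanh activation, and made-up weights on my part, not the actual class from the post), driving the four steps listed earlier in a loop:

#include <cstdio>
#include <cmath>
#include <vector>

struct Node       { float data; float accumulator; };
struct Connection { int from; int to; float weight; };

std::vector<Node> nodes;             // order: |--inputs--|--outputs--|--inner--|
std::vector<Connection> connections;

void activate()
{
    // reset node accumulators
    for (size_t i = 0; i < nodes.size(); i++)
        nodes[i].accumulator = 0.0f;

    // propagate across connections; this reads data (the values from the
    // previous call), so recurrent links "just work"
    for (size_t i = 0; i < connections.size(); i++)
        nodes[connections[i].to].accumulator +=
            nodes[connections[i].from].data * connections[i].weight;

    // apply the activation function and store the result
    for (size_t i = 0; i < nodes.size(); i++)
        nodes[i].data = std::tanh(nodes[i].accumulator);
}

int main()
{
    Node blank = { 0.0f, 0.0f };
    nodes.assign(2, blank);                      // node 0 = input, node 1 = output

    Connection in_to_out = { 0, 1, 0.8f };       // input -> output
    Connection self_loop = { 1, 1, 0.2f };       // output -> output (recurrent)
    connections.push_back(in_to_out);
    connections.push_back(self_loop);

    for (int t = 0; t < 3; t++)
    {
        nodes[0].data = 0.5f;                    // 1) load inputs
        activate();                              // 2) propagate  3) activate
        printf("t=%d output=%f\n", t, nodes[1].data);  // 4) read outputs
    }
    return 0;
}

The key point is that the propagation pass only ever reads data, which still holds the values from the previous call, so a self-loop never needs the node's "current" value.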
In general a recurrent neural network is an implementation of the vector differential equation
dx(t) / dt = Cx(t) + Wf(x(t)) + Du(t)
where C, W, and D are appropriately dimensioned matrices, x is the vector state of the network nodes, and u(t) is the external input to the system.
Quote: Original post by streklin
Basically, you just use the output of node3 from the previous iteration.
This is only true if you use a discrete-time approximation for your network, which is equivalent to approximating the above differential equation with a difference equation and using an Euler update scheme. Discrete-time recurrent networks certainly work, and you may well get good results with one, but that depends entirely on the problem you are applying it to. A popular example of a discrete-time network is the Kohonen net.
If you are just starting out with recurrent networks, I would suggest playing with simple discrete-time networks and then moving to continuous-time nets if the problem domain warrants it. There is a wealth of information online. If you need specific references for particular information, just holler.
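To make the link explicit (my own sketch of the Euler step, with time step dt), discretizing the equation above gives
x(t + dt) = x(t) + dt * [ C x(t) + W f(x(t)) + D u(t) ]
and with dt = 1 and C = -I this collapses to x(t + 1) = W f(x(t)) + D u(t): each node's new value is computed entirely from the previous iteration's outputs, which is exactly the node3 rule discussed above.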
Cheers,
Timkin