Article: Price forecasting with neural networks - page 14

 
Neutron, the smoothing effect from increasing the number of inputs is natural. But the lag can be caused by many factors: both a lack of those inputs and possible imperfections in the network, perhaps undertraining. It is another matter if the lag remains even when all of the above criteria check out.
 
Neutron:

Here are the results of testing the predictive ability of the two NS.


The figure shows the original time series in red, the 1-bar-ahead prediction of the linear single-layer network in blue, and that of the nonlinear two-layer network in green. The immersion depth (the number of past samples fed to the inputs) is the same in both cases. It can be seen that, for this artificial case, there is a noticeable lag of the predicted data on the trend section of the series. I wonder whether my more experienced colleagues observe this effect, and if so, what it could be related to?

Your network is behaving strangely somehow - it receives the same set of input data on the trend section yet gives different forecasts (the chart shows a broken line where it should be straight). In this regard, some questions:

1) how many neurons are there in the intermediate network layers?

2) how many inputs?

3) what is fed to the inputs?


Regarding the lag in predictions on the trend: that is exactly how it is supposed to work. You fed the network about 30 training vectors in which consecutive values rising by 0.1 produced a further rise of 0.1, and 1 vector in which the same consecutive rising values produced a fall of 3. You trained the network on contradictory data, so it averaged those 31 contradictory targets. (Roughly speaking, with a squared-error cost the best single response to the same input pattern is the mean of the targets, (30*0.1 - 3)/31 = 0, which is why the forecast sags behind the trend.)


I also built a 2-layer NS, with 3 inputs and 3 intermediate neurons. The inputs were increment values. For comparison, I trained this network once on all the data, and a second time with outliers excluded from training - i.e. all training vectors containing too-large input values, that is, that very abrupt collapse of 3. The difference is evident:
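A minimal sketch of this kind of data preparation, assuming increment inputs with the next increment as the target; the window length, the outlier threshold, and the decision to screen the target value as well as the inputs are my own illustrative choices, not ds2's actual settings:

```python
import numpy as np

def make_increment_vectors(prices, n_inputs=3, outlier_threshold=None):
    """Build (inputs, target) pairs from price increments.

    Each input vector holds n_inputs consecutive increments and the target
    is the next increment.  If outlier_threshold is given, any vector
    containing an increment with |value| above the threshold is dropped.
    """
    increments = np.diff(np.asarray(prices, dtype=float))
    xs, ys = [], []
    for i in range(len(increments) - n_inputs):
        x = increments[i:i + n_inputs]
        y = increments[i + n_inputs]
        if outlier_threshold is not None and np.max(np.abs(np.append(x, y))) > outlier_threshold:
            continue  # skip the vectors that contain the abrupt collapse
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# A slow rise of 0.1 per bar with one abrupt drop of 3, as in the discussion
prices = np.cumsum([0.1] * 30 + [-3.0] + [0.1] * 10)
X_all, y_all = make_increment_vectors(prices)                             # trained on everything
X_clean, y_clean = make_increment_vectors(prices, outlier_threshold=1.0)  # outliers removed
```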

 
rip:
slava1:
Well, one can argue for a long time about who understands what and who doesn't. The conversation was about data preparation. I take it that nobody here wants to discuss this question. A pity.

If you want to give a lecture on how to prepare data for NS training, I don't think this forum is the place for it; not many people here would be interested.

Well, you're wrong! I think it would be interesting to read about the experience of practitioners and how they do it.

 

ds2, and everyone who responded, thank you very much for your attention and sensible advice - it really helped. The thing is, to speed up forecasting I had limited the number of training cycles (Epochs, I believe they are called) in my network to 100. Of course that was not enough, so the network was not learning properly. After increasing the number of epochs to 1000, everything worked fine - tick to tick (well, almost).

I have a two-layer network with nonlinearity in the form of a hyperbolic tangent, with 2 neurons in the hidden layer and d*2+2 synapses, where d is the dimension of the NS input. I have d=2 for the case shown in the figure, and the number of training vectors is 6. With the number of vectors in the training sample equal to the number of synapses, I make sure the network does not overtrain and does not try to smooth the forecast, which allows price jumps to be tracked more effectively. For training I also fed a sequence of increment values to the input. I'm not yet acquainted with the fine art of preparing the input data (I mean your "excluded outliers from training the second time"), but I hope to master this art very soon.
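For reference, here is one possible reading of that architecture in code; the weight initialization is illustrative, and bias terms are left out so that the parameter count matches d*2+2 (Neutron mentions a constant-excitation synapse later, so his exact bookkeeping may differ):

```python
import numpy as np

def forward(x, W1, w2):
    """Two-layer net: d inputs -> 2 tanh hidden neurons -> 1 linear output."""
    hidden = np.tanh(W1 @ x)    # W1 has shape (2, d): 2*d synapses
    return w2 @ hidden          # w2 has shape (2,):   2 more synapses

d = 2                           # input dimension, as in the case above
n_synapses = 2 * d + 2          # = 6, matching the 6 training vectors
W1 = 0.1 * np.random.randn(2, d)
w2 = 0.1 * np.random.randn(2)
print(n_synapses, forward(np.array([0.1, 0.1]), W1, w2))
```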

Here's a very interesting result:


It is a one-step-ahead prediction, with retraining at each step, by a single-layer nonlinear NS with ONE input, and a training sample of 2 vectors with one element each.

I'm blown away....

It turns out that Neural Networks and Artificial Intelligence are not just simple, but very simple!
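A guess at what such a scheme could look like in code, assuming the single neuron is trained by plain gradient descent on the two most recent (previous increment, next increment) pairs; the learning rate and epoch count are my own illustrative choices:

```python
import numpy as np

def walk_forward(increments, epochs=200, lr=0.5):
    """One-step-ahead forecasts from a single tanh neuron (one input, one
    weight, one bias), refitted at every step on the two most recent
    (previous increment -> next increment) pairs."""
    preds = []
    for t in range(3, len(increments)):
        X = np.array([increments[t - 3], increments[t - 2]])  # 2 training inputs
        y = np.array([increments[t - 2], increments[t - 1]])  # their targets
        w, b = 0.1, 0.0
        for _ in range(epochs):               # plain gradient descent on MSE
            out = np.tanh(w * X + b)
            err = out - y
            grad = err * (1.0 - out ** 2)     # derivative through tanh
            w -= lr * np.mean(grad * X)
            b -= lr * np.mean(grad)
        preds.append(np.tanh(w * increments[t - 1] + b))  # forecast for step t
    return np.array(preds)
```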

 

To improve the sample, try extending the training series, say, by this principle:

take two adjacent values of the series, x1 and x2, and insert x1,2 = (x1 + x2)/2 between them.


This method works well for time series with high correlation between neighboring values.

A stream of quotes is exactly such a case.
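A straightforward sketch of that midpoint augmentation:

```python
import numpy as np

def insert_midpoints(series):
    """Double the density of a series by inserting (x1 + x2) / 2 between
    every pair of adjacent values x1, x2."""
    series = np.asarray(series, dtype=float)
    mids = (series[:-1] + series[1:]) / 2.0
    out = np.empty(2 * len(series) - 1)
    out[0::2] = series     # original values keep the even positions
    out[1::2] = mids       # midpoints go in between
    return out

print(insert_midpoints([1.0, 2.0, 4.0]))   # [1.  1.5  2.  3.  4.]
```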

 


Can I ask you to post the raw data file from which the training vectors were generated?

 

What format should the file be in so that you can read it? The thing is, I write my own NS from scratch and use the data in whatever format is convenient for me.


To improve the sample, try extending the training series, say, by this principle:

take two adjacent values of the series, x1 and x2, and insert x1,2 = (x1 + x2)/2 between them.

This method works well for time series with high correlation between neighboring values.

A stream of quotes is exactly such a case.

This works for positively correlated series. The price time series has a significant negative correlation between neighboring samples, so this method will not be correct.
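One quick way to check this objection on your own data is to estimate the lag-1 autocorrelation of the increments; a small sketch (the sign and magnitude will of course depend on the instrument and timeframe):

```python
import numpy as np

def lag1_autocorrelation(prices):
    """Lag-1 autocorrelation of price increments.  If this comes out
    negative, the midpoint interpolation above would impose a positive
    neighbour correlation that the raw series does not actually have."""
    r = np.diff(np.asarray(prices, dtype=float))
    r = r - r.mean()
    return float(np.dot(r[:-1], r[1:]) / np.dot(r, r))
```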

One more thing. It would be good if the esteemed forum members would post substantiated arguments in favor of the greater predictive ability of a 2-layer NS compared with a single-layer one, other conditions being equal.

As an option: below is a zip archive with the test vector shown in the figure. Cut it up as you see fit.

Files:
rnd.zip  1 kb
 
Neutron:

I'm blown away...

It turns out that Neural Networks and Artificial Intelligence are not just simple, but very simple!


I haven't tried to apply NS in practice yet (although the latest idea seems to be pulling me in that direction), but looking at it from a human perspective, the test vector consists of very simple dependencies (two or three parameters or so) and should be very easy to approximate piecewise. I suspect that is exactly what the NS does. The fast readjustment may be a consequence of the NS's simplicity, i.e. in this case a short memory is a blessing.

 

In general, Candid, I agree with you, but I want to dig deeper... For example, how justified is complicating the NS (adding hidden layers)?

And actually, I'm stunned! The point is that if you represent the nonlinearity of the NS in a certain form, you can obtain an exact analytical solution for the weights. This, in turn, means that it becomes possible to abandon the backpropagation method for network training and get the most accurate possible result in a single step, without any 1000 epochs of training!!!
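Neutron does not spell out the form he has in mind, but one common way such a closed-form solution arises is when the output nonlinearity is invertible: for a single-layer net y = th(w*x), mapping the targets through artanh turns training into an ordinary linear system. A sketch of that idea, as my own illustration rather than his actual derivation:

```python
import numpy as np

def solve_weights(X, y):
    """Closed-form weights for a single-layer net  y = tanh(X @ w).

    Mapping the targets through artanh turns the fit into a linear
    least-squares problem, so no iterative training (no backpropagation,
    no epochs) is needed.  Targets must lie strictly inside (-1, 1).
    """
    z = np.arctanh(np.clip(y, -0.999999, 0.999999))  # invert the nonlinearity
    w, *_ = np.linalg.lstsq(X, z, rcond=None)
    return w

# Illustration: X holds one training vector per row; a column of ones plays
# the role of the constant-excitation ("bias") synapse mentioned below.
X = np.array([[0.1, 1.0], [0.2, 1.0], [0.3, 1.0]])
y = np.tanh(X @ np.array([0.5, -0.1]))   # synthetic targets
print(solve_weights(X, y))               # recovers approximately [0.5, -0.1]
```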


P.S. I forgot to warn you that in each neuron I use one synapse with a constant level of excitation (a bias).


Addendum.

I kept at it and kept at it, and here it is ;-)

This is the "FINE" analytical solution for the scales of a single layer nonlinear NS:

It made me smile.

But the whole calculation takes one millisecond.
 

Explain this to me as a novice "neuroscientist"... I understand that the network in question is a multilayer perceptron.

What is the reason for choosing this type of network - why not a Hopfield or Kohonen network, or something else?