How to form the input values for the NS correctly. - page 24

 
Reshetov wrote >>

And finally, for those nerds who think that the interpolation capabilities of an NS are necessary for trading, I can offer a concrete counterargument. Just take any redrawing indicator or oscillator and you will get an amazing interpolation of history without any neural networks or tricky architectures. Of course, traders shun redrawing indicators, because what is suitable for interpolation or approximation is not suitable for extrapolation under non-stationarity.

This is nonsense... What does a redrawing indicator have to do with interpolation and future forecasting???

 
Reshetov wrote >>

And finally, for those nerds who think that the interpolation capabilities of an NS are necessary for trading, I can offer a concrete counterargument. Just take any redrawing indicator or oscillator and you will get an amazing interpolation of history without any neural networks or tricky architectures. Of course, traders shun redrawing indicators, because what is suitable for interpolation or approximation is not suitable for extrapolation under non-stationarity.

You just don't quite understand what is being approximated. There is an input vector X of dimension N and an output vector Y of dimension M. The NS establishes a relation between them, i.e. it approximates the dependence Y = F(X). Y can be anything, even redrawn three times over; the NS does not care, it solves precisely the problem of approximating F(X) on the training sample.
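To make that point concrete, here is a minimal sketch (my own illustration, not code from the thread): the network simply fits a mapping Y = F(X) on the training sample, and it does not matter what Y represents. The data, dimensions and MLPRegressor settings are arbitrary assumptions made for the demonstration.

```python
# A minimal sketch (illustration only): the NS just fits a mapping Y = F(X)
# on the training sample; what Y actually represents is irrelevant to it.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

N = 5                                    # input dimension (arbitrary)
X_train = rng.normal(size=(500, N))
# Y can be anything at all -- here just some nonlinear function of X plus noise
Y_train = np.sin(X_train[:, 0]) + 0.5 * X_train[:, 1] ** 2 + 0.1 * rng.normal(size=500)

net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
net.fit(X_train, Y_train)                # approximates F(X) on the training sample only

print("fit on the training sample, R^2:", net.score(X_train, Y_train))
```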

 

REDRAWING IS THE OPIUM OF THE PEOPLE!!! ))))

 
Mathemat wrote >>
I would even strengthen this advice: divide by 10. For some reason the thread about stochastic resonance comes to mind. Training the net to the end can drive the target function into a deep minimum, i.e. into a stable state. Stable states are not typical of financial markets at all. They are quasi-stable, i.e. ready to turn into a catastrophe (a trend) at any moment under the influence of even slight "noise". But this is just philosophical musing...

In my opinion, there is a misunderstanding here of the nature of the NS states that can be described as "over-trained" and "under-trained". These terms describe the relationship between the length of the training sample, the number of free parameters (synapses) of a given NS, and the magnitude of the generalization error on the test set. If the sample length is comparable to the number of adjustable weights (in the limit, less than or equal to it), then on the training sample we will get an exact match of the NS response to the input vectors, but on the test sample we will get total nonsense! This is an example of an over-trained network. If the training sample is too long (how long is a separate question), we will get a poor match on the training sample (in the limit, we will only recover the sample mean). On the test sample we will get the same thing - the mean.
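As a rough illustration of that ratio (my own sketch under arbitrary assumptions, not anything posted in the thread): the same small net can be trained once on a sample about as long as its number of weights and once on a sample roughly ten times longer, and the training/test fit compared.

```python
# Sketch: the same small net trained on a sample about as long as its number
# of weights, and on one roughly ten times longer (all numbers are arbitrary).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

def make_data(n):
    X = rng.uniform(-1, 1, size=(n, 4))
    y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.2 * rng.normal(size=n)
    return X, y

X_test, y_test = make_data(2000)
n_weights = (4 + 1) * 8 + (8 + 1) * 1            # synapses + biases of a 4-8-1 net = 49

for n_train in (n_weights, 10 * n_weights):      # "comparable" vs ~10x the weight count
    X_tr, y_tr = make_data(n_train)
    net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
    net.fit(X_tr, y_tr)
    print(n_train, "examples -> train R^2:", round(net.score(X_tr, y_tr), 2),
          " test R^2:", round(net.score(X_test, y_test), 2))
```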

As you can see, the number of training epochs does not enter into this at all. Moreover, to reach the global minimum (to train the NS), out of all possible solutions of the overdetermined system of nonlinear equations (which is what the NS is solving) we need to choose the one that gives the lowest cumulative error (i.e. comes closest to satisfying ALL the equations of the system at once). This condition is, of course, satisfied by the solution (the found synapse weights) that tends to the limiting one, obtained as the number of training epochs tends to infinity.

Therefore one should not confuse over-training or under-training of the NS with the number of training epochs - the latter should always be reasonably large (the exact number is determined experimentally).

I have come across a discussion of the "early stopping problem" in the literature, but my impression is that the authors do not quite understand the nature of what they are writing about. Indeed, if we take a situation where the training sample is shorter than the optimal length, then during training there comes a point when the error on the test set first decreases and then, as the number of training epochs grows further, starts to rise again... But that is another story, comrades!
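That picture can be sketched roughly as follows (again my own illustration with made-up data and a deliberately short training sample): the test error is tracked epoch by epoch and typically passes through a minimum before growing again.

```python
# Sketch: with a deliberately short training sample, the test error usually
# passes through a minimum and then grows again as epochs accumulate.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X_tr = rng.uniform(-1, 1, size=(60, 4))          # short training sample
y_tr = np.sin(3 * X_tr[:, 0]) + 0.3 * rng.normal(size=60)
X_te = rng.uniform(-1, 1, size=(1000, 4))
y_te = np.sin(3 * X_te[:, 0]) + 0.3 * rng.normal(size=1000)

net = MLPRegressor(hidden_layer_sizes=(20,), learning_rate_init=0.01, random_state=0)
test_err = []
for epoch in range(500):
    net.partial_fit(X_tr, y_tr)                  # one gradient pass = one "epoch" here
    test_err.append(np.mean((net.predict(X_te) - y_te) ** 2))

best = int(np.argmin(test_err))
print("test error is lowest at epoch", best,
      "and is higher at the final epoch:", test_err[best] < test_err[-1])
```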

 

I'll retire now that enough NN grandees have gathered here. My opinion carries little weight, as I am an amateur in neural networks.

I didn't even mention the ratio of inputs to degrees of freedom, assuming it is at least 10, as recommended by theory. I was only talking about the moment when the target function on the validation set passes through its minimum. It seems to be described quite clearly by Shumsky, if I am not mistaken.

 
Mathemat wrote >>

I'll retire now that enough NN grandees have gathered here. My opinion carries little weight, as I am an amateur in neural networks.

I didn't even mention the ratio of inputs to degrees of freedom, assuming it is at least 10, as recommended by theory. I was only talking about the moment when the target function on the validation set passes through its minimum. It seems to be described quite vividly by Shumsky too, if I'm not mistaken.

Grandees in maths wouldn't hurt either, so please don't be put off :) . I think others will join in the request.

 
Mathemat wrote >>
I would even strengthen this advice: divide by 10. For some reason the thread about stochastic resonance comes to mind. Training the net to the end can drive the target function into a deep minimum, i.e. into a stable state. Stable states are not typical of financial markets at all. They are quasi-stable, i.e. ready to turn into a catastrophe (a trend) at any moment under the influence of even slight "noise". But this is just philosophical musing...

Well, I mean the same thing. It's just that the term "stationary" was used instead of the popular term "steady state". Both terms imply that the statistical (fitted) data are close to the probabilistic ones. But anyone who has dealt with financial instruments knows very well that statistics are not applicable to them because of non-stationarity.


Purely empirically, I find that the net needs to be left under-trained by about a third. Again, though, it depends on how adequate the inputs are. It may well be that others, empirically, only need to under-train by 10%.

 
Reshetov wrote >>

Well, I mean the same thing. It's just that the term "stationary" was used instead of the popular term "steady state". Both terms imply that the statistical (fitted) data are close to the probabilistic ones. But anyone who has dealt with financial instruments knows very well that statistics are not applicable to them because of non-stationarity.


Purely empirically, I find that the net needs to be left under-trained by about a third. Again, though, it depends on how adequate the inputs are. It may well be that others, empirically, only need to under-train by 10%.


According to Haykin, a discrepancy between the results of full training on the training sample and on the test sample can arise only if the number of patterns is not large enough.

If there are enough patterns, full training gives better results on the test sample than stopping early at the point mentioned above.

From my experience I tend to believe these results.



About the linear neural network - if you can get positive results from it with sufficient reliability, there can only be one conclusion - you don't need a neural network.

 
TheXpert wrote >>

This is nonsense... What does a redrawing indicator have to do with interpolation and forecasting the future?

Dear sir, where did I claim that interpolation has anything to do with the future? Go see an oculist and read the posts carefully instead of throwing expressions around. I have said, and repeat for the particularly gifted, that it is extrapolation that is needed for the future.


My post was in response to rip's post :


------------------ Quote ------------------------


rip 14.07.2008 00:01
Reshetov wrote >>

Right. The architecture, with proper inputs, is no longer a problem. You could say: Inputs are everything, architecture is nothing.


Here, the gentlemen picked decent inputs and got proper results with the MTS "Sombo":

I agree with you to some extent. But network architecture plays a big role... for example, RBF networks are much better at solving some interpolation problems.
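As a side note on that last remark, here is a minimal sketch of the RBF idea (my own illustration with made-up data, not anything posted in the thread): Gaussian basis functions centred on the training points reproduce the training sample exactly, i.e. they interpolate it.

```python
# Sketch of the RBF idea: Gaussian basis functions centred on the training
# points interpolate the training sample exactly (the data here is made up).
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 15)                   # training points, used as centres
y = np.sin(x) + 0.1 * rng.normal(size=x.size)

width = 1.0
Phi = np.exp(-((x[:, None] - x[None, :]) / width) ** 2)   # RBF design matrix
w = np.linalg.solve(Phi, y)                               # weights for exact interpolation

def rbf_net(x_new):
    x_new = np.atleast_1d(x_new)
    return np.exp(-((x_new[:, None] - x[None, :]) / width) ** 2) @ w

print("reproduces the training sample:", np.allclose(rbf_net(x), y))
```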

 
TheXpert wrote >>

According to Haykin, a discrepancy in the results of full training on a training and a test sample can only occur if the number of patterns is not large enough.

Mr. Nerd, normal people have their own brains and experience, while nerds quote other nerds because they have no brains of their own and never will.


Haykin most likely trained his networks in a stationary environment, hence his conclusions. In a non-stationary environment the network may not learn at all if it is given too many patterns, because in trading, for example, the same pattern points to buy today and to sell the next time. After all, any entry signal has some probability of being false.