Discussion of article "Neural networks made easy (Part 6): Experimenting with the neural network learning rate"
I was just passing by. Aren't you confused by the fact that the error grows during training? It should be the other way round.
Hi Dmitriy,
I really like this series as a learning tool for me for neural networks. I use MT4, including finding an implementation of SymbolInfo for it. I guess that is where the problem is, as the EA runs but does nothing during learning. Would you have any idea what would be needed for it to run in MT4? Thanks!
Good afternoon!
Can you tell me, do you train the neural network only on the closing price? Or do you also use the trading volume on the given timeframe?
In the described example, the neural network receives the open, close, high and low prices, the volume, the time, and the readings of 4 indicators. The process of passing the initial data to the neural network is described in the linked article.
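For readers who want to see roughly what such an input vector looks like, here is a minimal MQL5 sketch to place inside an EA's event handler; the array layout, bar index and indicator choice (RSI via rsi_handle) are assumptions for illustration, not the article's exact code.

// Illustrative only: packing one bar's data into the network input array
int    rsi_handle = iRSI(_Symbol, PERIOD_CURRENT, 14, PRICE_CLOSE);  // one of the 4 indicators (assumed)
double inputs[];
MqlRates rates[];
if(CopyRates(_Symbol, PERIOD_CURRENT, 1, 1, rates) == 1)
  {
   double rsi[1];
   CopyBuffer(rsi_handle, 0, 1, 1, rsi);
   ArrayResize(inputs, 7);
   inputs[0] = rates[0].open;
   inputs[1] = rates[0].close;
   inputs[2] = rates[0].high;
   inputs[3] = rates[0].low;
   inputs[4] = (double)rates[0].tick_volume;
   inputs[5] = (double)rates[0].time;
   inputs[6] = rsi[0];
   // ...the other indicator readings would be appended the same way
  }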

For anyone coming after me: note the first example Fractal_OCL1.mql won't compile
You need to replace the learning-rate define with a variable:
//#define lr 0.1
double eta=0.1;
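If it helps to see where that variable ends up, the learning rate is applied in the weight update during backpropagation. A minimal sketch of one gradient-descent step (the series' actual update is more involved; this is simplified for illustration):

// Simplified sketch: one gradient-descent step using the learning rate eta
void UpdateWeights(double &weights[], const double &gradient[], const double eta)
  {
   for(int i = 0; i < ArraySize(weights); i++)
      weights[i] -= eta * gradient[i];   // move each weight against its error gradient
  }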
The main problem is not in selecting the training coefficient; after all, TensorFlow has a function that gradually reduces it during training to a specified value, selecting the optimal one. The problem is that the neural network does not find stable patterns; it has nothing to hold on to. I have used models ranging from fully connected layers to the newfangled ResNet and Attention. The result does not exceed 60%, and only on a narrow section of data; in general, everything slips to 50/50. With neural networks we need to think about what could be analysed in the first place. Plain arrays of prices and volumes, in any combination, do not give results.
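For what it's worth, a gradually decreasing rate can also be reproduced by hand in an MQL5 training loop; a minimal sketch with arbitrary assumed values (this is not TensorFlow's exact schedule, nor code from the article):

// Illustrative exponential learning-rate decay; the numbers are arbitrary assumptions
double eta     = 0.1;     // initial learning rate
double eta_min = 0.001;   // floor the rate decays towards
double decay   = 0.95;    // multiplicative decay per epoch
int    epochs  = 100;
for(int epoch = 0; epoch < epochs; epoch++)
  {
   // ...run one training epoch with the current eta...
   eta = MathMax(eta_min, eta * decay);  // shrink the rate, but never below eta_min
  }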
Try to analyse the correlation between the initial data and the target result.
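One simple way to run that check in MQL5 is a Pearson correlation between each candidate input series and the target series; a minimal sketch (the function name and signature are illustrative, not from the article):

// Illustrative Pearson correlation between an input feature series and the target series
double Correlation(const double &x[], const double &y[])
  {
   int    n = MathMin(ArraySize(x), ArraySize(y));
   double sx = 0, sy = 0, sxy = 0, sxx = 0, syy = 0;
   for(int i = 0; i < n; i++)
     {
      sx  += x[i];
      sy  += y[i];
      sxy += x[i] * y[i];
      sxx += x[i] * x[i];
      syy += y[i] * y[i];
     }
   double num = n * sxy - sx * sy;
   double den = MathSqrt((n * sxx - sx * sx) * (n * syy - sy * sy));
   return(den > 0 ? num / den : 0.0);
  }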
"...in the absence of a fractal in the reference value, when training the network, I specified 0.5 instead of 1."
Why exactly 0.5, where did this figure come from?
During training, the model learns the probability distribution of each of the 3 events. Since the probability of the absence of a fractal is much higher than the probability of its appearance, we artificially underestimate it. We specify 0.5 because at this value the maximum probabilities of the events come to approximately the same level, and then they can be compared.
I agree that this approach is very controversial and is dictated by observations from the training sample.

New article Neural networks made easy (Part 6): Experimenting with the neural network learning rate has been published:
We have previously considered various types of neural networks along with their implementations. In all cases, the neural networks were trained using the gradient descent method, for which we need to choose a learning rate. In this article, I want to show the importance of a correctly selected rate and its impact on neural network training, using examples.
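For context, the learning rate is simply the step size in the standard gradient descent update (general background, not a quote from the article):

w_new = w_old - eta * dE/dw

If eta is too large, the weights overshoot the minimum and the error may even grow; if it is too small, training becomes very slow. This is the trade-off the article's experiments explore.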
The third experiment is a slight deviation from the main topic of the article. Its idea came about during the first two experiments, so I decided to share it with you. While observing the neural network training, I noticed that the probability of the absence of a fractal fluctuates around 60-70% and rarely falls below 50%. The probability of the emergence of a fractal, whether buy or sell, is around 20-30%. This is quite natural, as there are far fewer fractals on the chart than there are candlesticks inside trends. Thus, our neural network is overtrained, and we obtain the above results: almost 100% of fractals are missed, and only rare ones can be caught.
To solve this problem, I decided to slightly compensate for the unevenness of the sample: for the absence of a fractal in the reference value, I specified 0.5 instead of 1 when training the network.
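As an illustration of that compensation, the reference (target) values for a bar could be filled roughly like this; the 3-output layout ([no fractal, buy fractal, sell fractal]) and the helper function are assumptions for demonstration, not the article's exact code:

// Illustrative target encoding: the over-represented "no fractal" class is capped at 0.5 instead of 1
void FillTarget(double &target[], const bool buy_fractal, const bool sell_fractal)
  {
   ArrayResize(target, 3);
   ArrayInitialize(target, 0.0);
   if(buy_fractal)
      target[1] = 1.0;           // buy fractal present
   else if(sell_fractal)
      target[2] = 1.0;           // sell fractal present
   else
      target[0] = 0.5;           // no fractal: reduced reference value to balance the classes
  }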
This step produced a good effect. The Expert Advisor running with a learning rate of 0.01 and a weight matrix obtained from the previous experiments shows the error stabilizing at about 0.34 after 5 training epochs. The share of missed fractals decreased to 51%, and the percentage of hits increased to 9.88%. You can see from the chart that the EA generates signals in groups and thus highlights certain zones. Obviously, the idea requires additional development and testing, but the results suggest that this approach is quite promising.
Author: Dmitriy Gizlyk