Question on neural network programming - page 6

 
Vinin:


I made an ordinary grid with 256 inputs, one hidden layer of 256 neurons, and an output layer of one neuron. And I trained it all perfectly well in MT4.

1. With the tester's genetic algorithm or with the network's internal learning algorithm?

 
Reshetov:
With the tester's genetic algorithm or with the network's internal learning algorithm?

It was all done through a script. The internal algorithm.
 
Although there were variants with optimisation too. I made a counter and saved the best sets of values in global variables, overwriting them whenever the result improved. The fit came out perfect.
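What Vinin describes could be sketched roughly like this in MQL4 (an assumption only, not his code: the variable name NN_BestResult is made up, the fitness is taken as the final balance of the pass, and it presumes the tester shares the terminal's global variables, as MT4 does):

// Sketch: at the end of each test pass, keep the best balance reached
// in a terminal global variable, overwriting it only on improvement.
void OnDeinit(const int reason)
  {
   string key    = "NN_BestResult";          // hypothetical variable name
   double result = AccountBalance();         // fitness of the finished pass

   if(!GlobalVariableCheck(key) || GlobalVariableGet(key) < result)
      GlobalVariableSet(key, result);        // new best: remember it
      // the current weights could be stored the same way, under their own keys
  }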
 
The point is that all my weights and thresholds are picked by the tester's GA - see the input parameters in the report, all except the last three. Trainability is high: optimization over 10,000 bars takes no more than 5 minutes. Besides, I don't need 256 inputs, since three are quite enough - even with them the grid overfits badly.
 
Reshetov:


I don't need 256 inputs, since three are quite enough - even with them the grid overfits badly.

This is already outside the scope of the topic. It's not worth discussing. To each his own.
 
Vinin:

This is already outside the scope of the topic. It's not worth discussing. To each his own.


The topic stalled long ago, it won't do any harm...

Reshetov:


Besides, I don't need 256 inputs, since three are quite enough - even with them the grid overfits badly.


Are you sure that it overfits? Maybe it simply lacks "generalization and computational power" because of a weak architecture?

An EA on a crossover of two moving averages can also be fitted in the optimiser, but it will only work on a forward test if you are very lucky. And that is clearly not a matter of overfitting...

Actually, overfitting is easy to fight: increase the training sample until you get a stable plus on the control (OOS) section. The NS cannot memorize the whole sample and is simply forced to generalize. Some drop in the system's performance is inevitable in that case, but that depends more on the inputs, for example.

 
Figar0:


The topic stalled long ago, it won't do any harm...

But it is better not to argue with the moderators. At least IMHO we are not breaking the rules, and if anything happens we have the right to appeal to the administration.

Figar0:


Are you sure that it overfits? Maybe it simply lacks "generalization and computational power" because of a weak architecture?

Yes, I'm sure.

It's easy to check. The point is that if the optimization results are too fat, the forwards for them are unsuccessful. But you can sort those results by their metrics and look for more down-to-earth values, where the forwards are already successful. More precisely, successful forwards cluster among optimization results that are neither too fat nor too lousy.

The simplest example: take a decent interval of history, remove the take profit and stop loss, and set up the TS to read the net's output on every bar and trade on it, i.e. if the grid's reading is against the direction of the open position, we flip it. The grid fits itself so that it opens only a few deals and all of them are winners (there is no profit factor, since there are no losses). I never saw this with a single-layer perceptron or even with standard nets: there the grid is often wrong, so even without stops and takes it makes a decent number of trades over a long history.
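A hedged sketch of that "read the net every bar and flip" scheme (not the actual EA: GetNetSignal() is a placeholder for the network output, and the lot size and magic number are arbitrary):

extern double Lots  = 1.0;
extern int    Magic = 888;

// Placeholder for the network output: > 0 means buy, < 0 means sell.
double GetNetSignal() { return(0.0); }

int CurrentDirection()                       // +1 long, -1 short, 0 flat
  {
   for(int i = OrdersTotal() - 1; i >= 0; i--)
      if(OrderSelect(i, SELECT_BY_POS) && OrderSymbol() == Symbol() && OrderMagicNumber() == Magic)
         return(OrderType() == OP_BUY ? 1 : -1);   // market orders only in this sketch
   return(0);
  }

void CloseAll()
  {
   for(int i = OrdersTotal() - 1; i >= 0; i--)
      if(OrderSelect(i, SELECT_BY_POS) && OrderSymbol() == Symbol() && OrderMagicNumber() == Magic)
         OrderClose(OrderTicket(), OrderLots(),
                    OrderType() == OP_BUY ? Bid : Ask, 3);
  }

void OnTick()
  {
   static datetime lastBar = 0;
   if(Time[0] == lastBar) return;            // act once per bar, at bar open
   lastBar = Time[0];

   int want = GetNetSignal() > 0 ? 1 : -1;   // desired direction from the net
   int have = CurrentDirection();

   if(want == have) return;                  // signal agrees with the open position
   if(have != 0) CloseAll();                 // signal is against it: flip
   if(want > 0)  OrderSend(Symbol(), OP_BUY,  Lots, Ask, 3, 0, 0, "", Magic);
   else          OrderSend(Symbol(), OP_SELL, Lots, Bid, 3, 0, 0, "", Magic);
  }

No stops or takes appear here on purpose, to match the test described above: the position is held until the net's reading turns against it.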

I suppose this is because I modified the first layer, and now it somehow manages to feed the hidden layer's inputs with strictly linearly separable data. Though who knows, since my design is even simpler than Rosenblatt's? Most likely the reason is that the whole network is tuned by the GA over all input parameters in one pass, and the GA ploughs towards extrema like a tank - not head-on, but multi-factor data is child's play for that kind of optimization, as long as there is a distinct extremum or several of them. On the other hand, thanks to the primitiveness of the first layer it is tuned very quickly and quite adequately.

IMHO, modern neural networks have gone too far in complicating the input layers, and as a result they do not work very adequately. In a multilayer network the first layer is the most important one, since the final result depends on what it passes to the hidden layer. After all, a normal grid is usually three-layered, and the hidden and output layers are already a primitive thing - a linear plane in multidimensional space.

Another trick is dynamic normalization of the input data. If you use static normalization, any change in volatility (and it will certainly change on the forward) affects the result. I settled this question in favour of the dynamic kind.
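As a guess at what such dynamic normalization might look like (an assumption, not his code): scale each raw input by the price range of the last N bars, so the value stays roughly in [-1, 1] whatever the current volatility is.

// Sketch: normalize a raw price difference by the range of the last N bars.
double NormalizedInput(double rawValue, int lookback = 100)
  {
   int    hiBar = iHighest(NULL, 0, MODE_HIGH, lookback, 1);
   int    loBar = iLowest (NULL, 0, MODE_LOW,  lookback, 1);
   double range = High[hiBar] - Low[loBar];          // recent volatility window

   if(range <= 0) return(0);                         // degenerate history
   double x = rawValue / range;                      // rescale dynamically

   // clamp, in case the raw value exceeds the recent range
   if(x >  1) x =  1;
   if(x < -1) x = -1;
   return(x);
  }

With a static scheme the divisor would be a fixed constant chosen on the training section, which is exactly what breaks when volatility changes on the forward.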

 
Reshetov:

....

And the inputs are the ones from the previous page? Things are shaky there with the MA period... I can't even imagine what kind of wizardry could be done with an NS to get results on such primitive inputs. And is it really true that wherever you take the training sample, it goes uphill on OOS? Is the result really that stable? What about other pairs and instruments? EURUSD is the easiest one to predict.

Reshetov:

I had to fiddle with the trading strategy so as not to let the grid curve-fit.

I don't understand this either. The NS should give the signals, and steering the NS by means of the TS at the training stage seems illogical to me personally. Here the approach is somehow the other way round... In which direction did you tweak the TS?

Reshetov:

Another trick is dynamic normalization of the input data. If you use static normalization, any change in volatility (and it will certainly change on the forward) affects the result. I settled this question in favour of the dynamic kind.

I practice it too - sometimes it works out, sometimes not so well. It also depends on the inputs, but yours will definitely benefit from it. Of course, dynamic normalisation makes the input more informative but harder to learn from, and since your training is "easy", it stands to reason that it works out to a positive result.

 
Figar0:

And the inputs are the ones from the previous page? Things are shaky there with the MA period...

My MA period is a constant. It used to be an input parameter, but I collected various successful forward tests and came to the conclusion that they all dance around this very constant.

Figar0:


I can't even imagine what kind of wizardry could be done with an NS to get results on such primitive inputs...

...

I don't understand this either. The NS should give the signals, and steering the NS by means of the TS at the training stage seems illogical to me personally. Here the approach is somehow the other way round... In which direction did you tweak the TS?

All this stuff was not put together in one day, but over a long time and bit by bit. Some things were gradually added, some were polished, and some were removed from the algorithm.

The work went in three directions:

1. Minimizing the code. Michelangelo's method: take a block of stone, cut away everything unnecessary and get a sculpture (later this method was attributed to Occam and his razor).

2. Getting around the limitations of the tester's optimization (a sketch of the idea follows this list).

3. Coarsening the TS to such a degree of roughness that the curve-fitting goes away.
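One way to work around the optimizer's limits, sketched under the assumption that the x1..x3 parameters in the report further down are coarse integers in the 0..200 range (the names are taken from that report; the mapping itself is my guess, not his code):

// Sketch: expose weights to the tester GA as coarse integers (0..200)
// and map them to [-1, 1] inside the EA. This keeps the number of
// combinations small enough for the genetic optimizer to accept.
extern int x1 = 100;     // optimized over 0..200 in steps of 1
extern int x2 = 100;
extern int x3 = 100;

double Weight(int x)
  {
   return((x - 100) / 100.0);   // 0 -> -1.0, 100 -> 0.0, 200 -> +1.0
  }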

Figar0:

And is it really true that wherever you take the training sample, it goes uphill on OOS? Is the result really that stable? What about other pairs and instruments? EURUSD is the easiest one to predict.

The results on OOS vary; there are losing ones too. It would be surprising if, after a deliberate fit, the grid did not lose on the spread. I didn't say they were all profitable; I said I picked the "tastiest" one (there may be even tastier ones, as I haven't looked through them all - there are too many).

The only difference from other strategies: successful forwards are easy to find manually (they cluster when the optimization results are sorted), the choice is quite broad, and the slices over individual input parameters are decent, i.e. extrema with gentle slopes.

Among other pairs I tested a bit on gold and GBPUSD - about the same.

 
Reshetov:

The pessimism comes from the limitations of the strategy tester: if the ranges of the input values are large, or the number of value combinations exceeds the limit, the optimiser refuses to start. So there are limits after all.

Today I finally finished building a neural network written entirely in MQL4, with a 3:3:1 architecture (three input neurons, three hidden, one output). All layers are tuned by the tester's GA. The trouble is that the first layer alone needs at least 12 input parameters, with values at least from -1 to 1 in steps of 1 (as in Rosenblatt's perceptron), and the optimizer cannot handle that many. I had to wriggle out of it and simplify the first layer.
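For reference, a textbook 3:3:1 pass might look like the sketch below. This is not his simplified first layer, just an illustration of where the 12 first-layer parameters come from (9 weights plus 3 thresholds), with another 4 for the output neuron:

// Sketch of a full 3:3:1 pass: 3 inputs, 3 hidden neurons, 1 output.
// w1[][] and t1[] are the 9 weights and 3 thresholds of the first layer;
// w2[] and t2 belong to the output neuron.
double Net331(const double &in[], const double &w1[][3], const double &t1[],
              const double &w2[], double t2)
  {
   double hidden[3];
   for(int j = 0; j < 3; j++)
     {
      double s = -t1[j];
      for(int i = 0; i < 3; i++)
         s += w1[j][i] * in[i];
      hidden[j] = s > 0 ? 1 : -1;          // hard threshold, Rosenblatt-style
     }

   double out = -t2;
   for(int j = 0; j < 3; j++)
      out += w2[j] * hidden[j];
   return(out);                            // sign decides buy or sell
  }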

Compared to someone else's net, a self-made one has the advantage that it can be upgraded. For example, besides the non-standard first layer, I added dynamic normalization of the input data.

Signals at the inputs are quite primitive:

Despite the primitiveness mentioned above, the grid turns out to be very trainable, i.e. the weights and thresholds are easily picked so that the test results come out without a single losing trade (so there is no profit factor). But after such a fit, the forward test immediately starts losing on the spread. I had to fiddle with the trading strategy so as not to let the grid curve-fit.

It was worth the effort, although it turned my brain inside out:

Here are the test results. Deals 1 to 273 are the optimization section, after that comes the forward test.

And here's the forward test:

And here is the forward test report:

Strategy Tester Report
RNN
Alpari-Demo (Build 409)

Symbol                       EURUSD (Euro vs US Dollar)
Period                       1 Hour (H1) 2011.10.24 00:00 - 2012.01.13 23:59 (2011.10.24 - 2012.01.14)
Model                        By open prices (only for Expert Advisors with explicit bar opening control)
Parameters                   t1=54; t2=4; t3=48; x1=194; x2=128; x3=68; y1=1; y2=1; y3=-1; t4=136; sl=900; lots=1; mn=888;

Bars in history              2431
Modelled ticks               3862
Simulation quality           n/a
Chart mismatch errors        0

Initial deposit              10000.00
Total net profit             14713.00
Gross profit                 40711.60
Gross loss                   -25998.60
Profit factor                1.57
Expected payoff              88.10

Absolute drawdown            2721.60
Maximal drawdown             4800.00 (39.74%)
Relative drawdown            39.74% (4800.00)

Total trades                 167
Short positions (won %)      101 (67.33%)
Long positions (won %)       66 (92.42%)
Profit trades (% of total)   129 (77.25%)
Loss trades (% of total)     38 (22.75%)

Largest profit trade         900.00
Largest loss trade           -907.20
Average profit trade         315.59
Average loss trade           -684.17

Maximum consecutive wins (profit in money)    13 (2557.00)
Maximum consecutive losses (loss in money)    4 (-3605.40)
Maximal consecutive profit (count of wins)    3511.60 (11)
Maximal consecutive loss (count of losses)    -3605.40 (4)
Average consecutive wins                      4
Average consecutive losses                    1

The most interesting thing is that even on the chart you can see that the optimization section is worse than the forward one. That rarely happens. Admittedly, I picked this forward as the best out of many others, i.e. the other forwards have results much worse than the optimization section; this one is simply the best of them.


Thanks for the tip! You and Vinin are an authority for me. It's been a long time and you are still working on this topic. I have collected your works from the Internet. The most interesting thing is that both of you are right, and you don't realize that you are moving in parallel towards the same goal. I took something from one and something from the other, and now I'm waiting for the result. You are creating a new direction, but a very difficult one!!!