Machine learning in trading: theory, models, practice and algo-trading - page 487

 
Ivan Negreshniy:

In theory, random forests should have little error, because when they are built all variables are used in the decision trees, and there is no limit on memory use as in neural networks, where the number of neurons is restricted. There you can only use separate operations to "blur" the result, such as limiting the depth, pruning the trees, or bagging. I don't know whether there is pruning in the MQ implementation of alglib, but there is bagging.

If you make this variable (the sampling fraction r) smaller than 1, the error should increase.


I did that, but the error still showed the average, as I described above... now it's normal

2017.09.27 18:34:34.076 RF sample (EURUSD,H1)   Info=1  Error=0.2893400000000008
2017.09.27 18:34:34.077 RF sample (EURUSD,H1)   Тест 1 >> 1*6=6 // 7*2=14 // 1*5=5 // 5*5=25 // 3*8=24 // 1*8=8 // 9*6=55 // 8*8=64 // 2*2=4 // 9*4=37 // 
2017.09.27 18:34:34.077 RF sample (EURUSD,H1)   Тест 2 >> 4.3*8.7=36.34(37.41) // 2.0*6.3=12.18(12.60) // 7.1*5.7=42.39(40.47) // 2.1*1.6=3.96(3.36) // 8.9*2.5=26.57(22.25) // 

By the way, even decreasing r by 0.1 increases the error a lot. Above is r = 0.9, below is 0.8

2017.09.27 18:36:11.298 RF sample (EURUSD,H1)   Info=1  Error=0.5431000000000188
2017.09.27 18:36:11.299 RF sample (EURUSD,H1)   Тест 1 >> 3*7=21 // 6*1=7 // 8*3=24 // 2*1=2 // 4*5=20 // 7*5=34 // 7*7=49 // 1*9=10 // 6*9=55 // 7*7=49 // 
2017.09.27 18:36:11.300 RF sample (EURUSD,H1)   Тест 2 >> 6.0*6.3=37.00(37.80) // 2.7*8.4=23.85(22.68) // 5.6*6.2=36.71(34.72) // 7.3*6.6=48.66(48.18) // 7.4*2.8=20.74(20.72) // 

At r = 0.66 (as in the classical version of RF)

2017.09.27 18:37:44.484 RF sample (EURUSD,H1)   Info=1  Error=0.7935200000000080
2017.09.27 18:37:44.485 RF sample (EURUSD,H1)   Тест 1 >> 2*1=3 // 6*1=7 // 2*6=13 // 5*9=45 // 7*8=57 // 2*6=13 // 7*5=35 // 3*3=9 // 8*4=33 // 6*1=7 // 
2017.09.27 18:37:44.485 RF sample (EURUSD,H1)   Тест 2 >> 4.1*9.9=40.11(40.59) // 7.6*3.2=24.40(24.32) // 6.8*8.3=55.62(56.44) // 1.9*5.6=11.64(10.64) // 9.3*7.8=71.33(72.54) // 

And the results show that it already handles the multiplication table badly
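
(For reference, the pattern in these logs is easy to reproduce outside of MQL. The sketch below is not the alglib/MQL code from this thread; it uses scikit-learn, where bootstrap=False roughly stands in for r = 1, i.e. every tree sees the whole set, and max_samples stands in for smaller r. All names and values here are illustrative assumptions.)

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy "multiplication table" data: features (a, b), target a*b.
rng = np.random.default_rng(0)
a = rng.uniform(1, 10, 5000)
b = rng.uniform(1, 10, 5000)
X = np.column_stack([a, b])
y = a * b

# bootstrap=False: every tree is built on the full training set (roughly r = 1);
# bootstrap=True with max_samples < 1: every tree sees only a fraction of it.
settings = [("full set (~r=1)", dict(bootstrap=False)),
            ("~r=0.9",          dict(bootstrap=True, max_samples=0.9)),
            ("~r=0.66",         dict(bootstrap=True, max_samples=0.66))]

for name, kw in settings:
    rf = RandomForestRegressor(n_estimators=100, random_state=0, **kw)
    rf.fit(X, y)
    train_mae = np.mean(np.abs(rf.predict(X) - y))   # in-sample error, like the logs
    print(f"{name:16s} train MAE={train_mae:.3f}  "
          f"4.3*8.7 -> {rf.predict([[4.3, 8.7]])[0]:.2f}")

The in-sample error collapses toward zero when every tree sees the full set and grows as the per-tree sample shrinks, the same direction as the Error values above.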

 
Ivan Negreshniy:

In theory, random forests should have little error, because when they are built all variables are used in the decision trees, and there is no limit on memory use as in neural networks, where the number of neurons is restricted. There you can only use separate operations to "blur" the result, such as limiting the depth, pruning the trees, or bagging. I don't know whether there is pruning in the MQ implementation of alglib, but there is bagging.

If you make this variable (the sampling fraction r) smaller than 1, the error should increase.

For the error to be as small as @Maxim Dmitrievsky's
And also a very small error: 2017.09.27 16:26:12.267  RF sample (EURUSD,H1)   Info=1  Error=0.0000000000000020
you would have to make no more than one wrong deal per 5,000,000,000,000,000,000, which is impossible on any instrument.

Sincerely.
 
Andrey Kisselyov:
For the error to be as small as @Maxim Dmitrievsky's
It is impossible to make only 1 wrong deal on any instrument.

Sincerely.

What do deals have to do with it? I'm telling you that every decision tree practically memorizes all the patterns, and on the training set there may be no error at all with 100% sampling, i.e. R = 1.

Yes, it's overfitting, but that's how the algorithm works, and that's why all sorts of tricks are used in random forests.
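
(As an aside, those "tricks", depth limits, pruning, bagging, are easy to see on a single tree. A minimal sketch, again with scikit-learn rather than alglib; max_depth and ccp_alpha are that library's knobs and are used here purely for illustration.)

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Same toy multiplication-table data as before.
rng = np.random.default_rng(1)
a = rng.uniform(1, 10, 5000)
b = rng.uniform(1, 10, 5000)
X = np.column_stack([a, b])
y = a * b
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

trees = [("unconstrained",         DecisionTreeRegressor(random_state=0)),
         ("max_depth=6",           DecisionTreeRegressor(max_depth=6, random_state=0)),
         ("pruned, ccp_alpha=0.5", DecisionTreeRegressor(ccp_alpha=0.5, random_state=0))]

for name, tree in trees:
    tree.fit(X_tr, y_tr)
    # Train score shows memorization, test score shows generalization.
    print(f"{name:22s} train R^2={tree.score(X_tr, y_tr):.3f}  "
          f"test R^2={tree.score(X_te, y_te):.3f}")

On a noiseless toy target the differences are modest; the point is only that these are the knobs that trade memorization for generality on noisy data.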

 
Ivan Negreshniy:

What do deals have to do with it? I'm telling you that every decision tree practically memorizes all the patterns, and on the training set there may be no error at all with 100% sampling, i.e. R = 1.


For that you need to look at the out-of-bag error to evaluate the model, but then you set r = 0.66 at most, yes
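
(For readers unfamiliar with the term: "out of bag" means scoring each tree on the training rows it did not draw, which gives a test-like estimate without a separate validation set. A minimal sketch of the idea using scikit-learn's oob_score; this is an illustration, not the alglib mechanism discussed here.)

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
a = rng.uniform(1, 10, 5000)
b = rng.uniform(1, 10, 5000)
X = np.column_stack([a, b])
y = a * b

rf = RandomForestRegressor(n_estimators=200, bootstrap=True, oob_score=True,
                           max_samples=0.66, random_state=0)
rf.fit(X, y)
print("in-sample R^2:", rf.score(X, y))   # optimistic: the trees have seen these rows
print("OOB R^2      :", rf.oob_score_)    # estimated on rows each tree did not draw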

 
Ivan Negreshniy:

What do deals have to do with it? I'm telling you that every decision tree practically memorizes all the patterns, and on the training set there may be no error at all with 100% sampling, i.e. R = 1.

I didn't dig into how the forest works, but from your words I understand that every tree memorizes a pattern, which may subsequently never repeat. In that case (since there is no repetition) we cannot say what the probability is of it playing out in profit, so as an axiom we take that probability as 1 instead of taking it as 0.5, since it is essentially unknown. From this it follows that the forest is almost never wrong (from your words).

Respectfully.
 
Maxim Dmitrievsky:

For that you need to look at the out-of-bag error to evaluate the model, but then you set r = 0.66 at most, yes

You probably have to tune it, but bagging alone is not a very strong prediction technique - IMHO
 
Ivan Negreshniy:
You probably have to tune it, but bagging alone is not a very strong prediction technique - IMHO

Well, so far it does the job. :) Later, if I hook up a normal deep learning lib, I'll take a look at it

but the speed!

 
Maxim Dmitrievsky:

I did that, but the error still showed the average, as described above... now it's normal

By the way, even decreasing r by 0.1 increases the error a lot. Above is r = 0.9, below is 0.8

At r = 0.66 (as in the classical version of RF)

And the results show that it already handles the multiplication table very badly.

When I increased the signal threshold, the NS (neural network) compensated for it by requiring more input data; as a consequence the error decreased, but there were also fewer entry options.

Respectfully.
 
Andrey Kisselyov:
I didn't dig into how the forest works, but from your words I understand that every tree memorizes a pattern, which may subsequently never repeat. In that case (since there is no repetition) we cannot say what the probability is of it playing out in profit, so as an axiom we take that probability as 1 instead of taking it as 0.5, since it is essentially unknown. From this it follows that the forest is almost never wrong (from your words).

Respectfully.
R = 1 means that every tree memorizes the entire training set of patterns, while 0.66 means only 66%; moreover, every tree selects patterns with replacement, i.e. the same patterns can be repeated by many trees in the forest.
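
(The "with replacement" detail is also where a figure close to 0.66 shows up: drawing N rows with replacement covers on average 1 - 1/e, about 63% unique rows. A quick numeric check, purely illustrative:)

import numpy as np

rng = np.random.default_rng(0)
N = 100_000
idx = rng.integers(0, N, size=N)      # draw N row indices with replacement
print(np.unique(idx).size / N)        # ~0.632, i.e. 1 - 1/e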
 
Andrey Kisselyov:
When I increased the signal threshold, the NS (neural network) compensated for it by requiring more input data; as a consequence the error decreased, but there were also fewer entry options.

Sincerely.

Well, then there is the question of the right features and targets, although it would seem nothing could be simpler than the multiplication table, yet the error is not small.