Machine learning in trading: theory, models, practice and algo-trading - page 11

 
I'll wait for SanSanych to clarify how he wants to do it.

Once again, I think predicting trades with an exit by a simple condition may not be optimal.

What if we train a second machine that learns when to close trades that are already open? Let me explain. Suppose a trade should close after one hour, because that is the horizon on which we trained the machine that opens trades.

For each open trade there will be 60 records (rows) with features, one per minute, starting at minute 31 and ending at minute 90. About half of these records will be labeled 1, meaning a good exit somewhere around the 60th minute.

This is what I call a detailed explanation of how the problem should be solved.
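A minimal sketch of how such exit records could be built for one trade. The labeling rule here is only an assumption made for illustration (a minute counts as a good exit when its floating profit is at least the median over the 31-90 minute window, which labels roughly half of the rows 1). The function name, the field names and the toy profit curve are all hypothetical.

```python
from statistics import median

def build_exit_records(trade_id, pnl_by_minute, first_minute=31, last_minute=90):
    """Return one record per minute of the exit window for a single open trade.

    pnl_by_minute maps minutes since entry -> floating profit of the trade.
    Labeling rule (an assumption, not taken from the post): label 1 when the
    profit at that minute is at least the median profit over the window.
    """
    minutes = range(first_minute, last_minute + 1)
    threshold = median(pnl_by_minute[m] for m in minutes)
    return [
        {
            "trade_id": trade_id,
            "minute": m,
            "pnl": pnl_by_minute[m],
            "label": 1 if pnl_by_minute[m] >= threshold else 0,
        }
        for m in minutes
    ]

# Toy trade: floating profit peaks at minute 60 and decays on both sides.
toy_pnl = {m: 10.0 - abs(m - 60) * 0.3 for m in range(1, 121)}
records = build_exit_records("trade-1", toy_pnl)

print(len(records))                       # number of rows for this trade
print(sum(r["label"] for r in records))   # rows labeled as a good exit
```

In a real data set each record would also carry the feature values observed at that minute; they are left out here to keep the sketch short.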

What do you think?
 
Alexey Burnakov:

The idea is interesting. I also have Expert Advisors running. Maybe I will think about how to update them. But it is not clear to me what exactly needs to be improved. What do I have to teach my programmer?

The Expert Advisor has a rigid logic of opening and closing positions. The decision in machine learning is made somewhat differently.

That is, it is not quite clear what exactly you are going to do.

Please note this in my post above:

I take the general direction from the higher-timeframe bar. But if you look closely at the timing, there is an enormous lag, especially when measured in lower-timeframe bars. If the direction comes from D1 and I trade on M5, I am effectively using data from almost the day before yesterday. Even a one-step-ahead prediction for D1 with a 30% error radically improved the Expert Advisor's profitability and, most importantly, increased my confidence that it will not fail.

In my case, I am using lagged indicator data for the corresponding predictions in R.
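To put a number on that lag, here is a rough sketch. It assumes D1 bars open at 00:00 and ignores weekends and broker time offsets; the function names and the example timestamp are hypothetical.

```python
from datetime import datetime, timedelta

def last_closed_d1_open(now):
    """Open time of the most recent fully closed D1 bar (bars assumed to start at 00:00)."""
    today_open = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today_open - timedelta(days=1)

def information_age_in_m5_bars(now):
    """Number of M5 bars elapsed since the open of the last closed D1 bar."""
    age = now - last_closed_d1_open(now)
    return int(age.total_seconds() // 300)   # 300 seconds per M5 bar

now = datetime(2016, 6, 15, 18, 0)           # hypothetical moment while trading on M5
print(last_closed_d1_open(now))              # 2016-06-14 00:00:00
print(information_age_in_m5_bars(now))       # 504
```

So by the evening, the oldest prices inside the last closed daily bar are about 500 M5 bars old, and any indicator that averages several D1 bars lags even more.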

 
SanSanych Fomenko:

Note this from my post above:

I take the general direction from the high bar. But when you look closely at the time there is a monstrous lag in terms of lower bars especially. So if it is D1 and I trade on M5, it turns out that I take almost the day before yesterday's data for direction. Even predicting one step forward for D1 with 30% error has radically improved the Expert Advisor's profitability, and most importantly, increased confidence that it will not fail.

In my particular case I'm going the way of using lagging information from indicators on corresponding predictions from R.

Okay, the idea is clear.
 

I have some really big news.

While I was on a business trip, I spent the evenings running the learning machine on my data and trying different combinations of approaches. And all of a sudden it looks like the stone flower has finally come out.

I took my data from the link above and ran the training over and over, trying everything already described in my blog. But I added a couple more tricks. For example, I selected the training parameters that gave more profit on cross-validation, rather than a better abstract guessing accuracy.

For this, of course, I had to write my own fitness function.
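The author's own fitness function is not shown, so the following is only a sketch of the general idea: score a parameter set by the mean profit per trade on the validation folds, net of spread, instead of by accuracy. All names and the toy numbers are hypothetical; the 0.00020 spread is the value mentioned in this post.

```python
def accuracy(predictions, returns):
    """Share of trades whose predicted direction matches the realized move."""
    hits = sum(1 for p, r in zip(predictions, returns) if (p > 0) == (r > 0))
    return hits / len(returns)

def mean_profit_per_trade(predictions, returns, spread=0.00020):
    """Average result of trading every signal (long if p > 0, else short), minus spread."""
    results = [(r if p > 0 else -r) - spread for p, r in zip(predictions, returns)]
    return sum(results) / len(results)

def cv_fitness(folds, spread=0.00020):
    """Fitness of one parameter set: mean profit per trade averaged over validation folds.

    folds is a list of (predictions, realized_returns) pairs, one per fold.
    """
    scores = [mean_profit_per_trade(p, r, spread) for p, r in folds]
    return sum(scores) / len(scores)

# Two hypothetical parameter sets scored on the same realized price changes.
returns = [0.0010, -0.0001, -0.0001, 0.0001]
preds_a = [1, 1, 1, 1]       # right on the big move and one small one
preds_b = [-1, -1, -1, 1]    # right on three small moves, wrong on the big one

print(accuracy(preds_a, returns), cv_fitness([(preds_a, returns)]))
print(accuracy(preds_b, returns), cv_fitness([(preds_b, returns)]))
```

In this toy case the second set has the higher accuracy but the lower profit, which is exactly why the choice of selection criterion matters.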

The chart shows pairs of expectation values on training and validation for different prediction horizons and training parameters. I also used the idea of a "gray area", that is, a range of predictions in which no trade is opened.
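A gray area can be sketched like this, assuming the model outputs a predicted price change and the no-trade band is symmetric around zero (both are assumptions, the post does not say). The function name and the threshold value are hypothetical.

```python
def signal(prediction, gray_area):
    """Map a predicted price change to a trade direction, or 0 inside the gray area."""
    if prediction > gray_area:
        return 1     # buy
    if prediction < -gray_area:
        return -1    # sell
    return 0         # prediction too weak: stay out of the market

predictions = [0.0012, 0.0002, -0.0001, -0.0009]
signals = [signal(p, gray_area=0.0005) for p in predictions]
print(signals)  # [1, 0, 0, -1]
```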

As you can see, I got quite sane MO (mathematical expectation) values on training, and correlated values on validation! Note that a spread of 0.00020 (twenty pipettes) was included in the simulation.

I also calculated the total trading result net of the spread. The values are also very nice. The maximum profit is reached at a certain balance between the number of trades and the MO per trade, all on a planning horizon of 12 hours. The chart shows a sweep over all planning horizons, all gray area values, and the best training parameters of the model. See below:
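The trade-off between the number of trades and the MO per trade can be illustrated with a small sweep. This is a toy sketch with made-up predictions and returns, not the author's simulation; the function names are hypothetical, and total result is simply the number of trades times the mean result per trade.

```python
def simulate(predictions, returns, gray_area, spread=0.00020):
    """Return (number of trades, mean result per trade, total result) for one gray area."""
    results = []
    for p, r in zip(predictions, returns):
        if abs(p) <= gray_area:
            continue                          # inside the gray area: no trade
        direction = 1 if p > 0 else -1
        results.append(direction * r - spread)
    n = len(results)
    total = sum(results)
    return n, (total / n if n else 0.0), total

def best_gray_area(predictions, returns, candidates, spread=0.00020):
    """Pick the gray area value with the largest total result."""
    return max(candidates, key=lambda g: simulate(predictions, returns, g, spread)[2])

predictions = [0.0030, 0.0012, 0.0006, -0.0004, -0.0020]
returns = [0.0025, 0.0009, -0.0003, 0.0002, -0.0015]
candidates = (0.0, 0.0005, 0.0015)

for g in candidates:
    print(g, simulate(predictions, returns, g))
print("best:", best_gray_area(predictions, returns, candidates))
```

With these numbers the widest gray area gives the best mean per trade but too few trades, no gray area gives the most trades but a weaker mean, and the middle value gives the largest total.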


To spread this knowledge, I am attaching a file with all the results, including even the training parameters of the forest, but without the inputs. The inputs were selected from my data, and which ones they are will stay a little secret. I'll think about whether to post the full code of the experiment. Now I want to integrate this machine with MT, because the result already looks quite workable to me.

Alexey

 
Alexey Burnakov:

I have some really big news.

While I was on a business trip, I spent the evenings running the learning machine on my data and trying different combinations of approaches. And all of a sudden it looks like the stone flower has finally come out.

I took my data from the link above and ran the training over and over, trying everything already described in my blog. But I added a couple more tricks. For example, I selected the training parameters that gave more profit on cross-validation, rather than a better abstract guessing accuracy.

For this, of course, I had to write my own fitness function.

The chart shows pairs of expectation values on training and validation for different prediction horizons and training parameters. I also used the idea of a "gray area", that is, a range of predictions in which no trade is opened.

As you can see, I got quite sane MO (mathematical expectation) values on training, and correlated values on validation! Note that a spread of 0.00020 (twenty pipettes) was included in the simulation.

I also calculated the total trading result net of the spread. The values are also very nice. The maximum profit is reached at a certain balance between the number of trades and the MO per trade, all on a planning horizon of 12 hours. The chart shows a sweep over all planning horizons, all gray area values, and the best training parameters of the model. See below:


To spread this knowledge, I am attaching a file with all the results, including even the training parameters of the forest, but without the inputs. The inputs were selected from my data, and which ones they are will stay a little secret. I'll think about whether to post the full code of the experiment. Now I want to integrate this machine with MT, because the result already looks quite workable to me.

Alexey

Total profit in pips is not a meaningful indicator on its own. What is indicative is the quality factor: the ratio of that total profit to the number of history bars over which it was obtained. I have been saying this for a long time, and it is the only measure I use for optimizing and evaluating models.
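As a quick illustration of that ratio (the function name and all numbers below are made up):

```python
def profit_per_bar(total_profit_pips, n_history_bars):
    """Quality factor: total profit divided by the number of history bars it was earned on."""
    if n_history_bars <= 0:
        raise ValueError("n_history_bars must be positive")
    return total_profit_pips / n_history_bars

# Hypothetical models: A earns more in total, B earns more per bar of history.
quality_a = profit_per_bar(1200, 10000)
quality_b = profit_per_bar(900, 3000)
print(quality_a, quality_b)
```

By total profit A looks better, by profit per bar B does, which is the point of normalizing by the length of the history.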

Hard work gives results.

Good luck

 
Vladimir Perervenko:

Total profit in pips is not a meaningful indicator on its own. What is indicative is the quality factor: the ratio of that total profit to the number of history bars over which it was obtained. I have been saying this for a long time, and it is the only measure I use for optimizing and evaluating models.

Hard work gets results.

Good luck

It seems to me that one does not exclude the other.

Alexey has overcome the obvious weakness of all accuracy-based estimates of classification predictions: correctly predicting a 1-pip bar and correctly predicting a 10-pip bar are clearly worth very different amounts.

Congratulations, Alexey!

 
Vladimir Perervenko:

Total profit in pips is not a meaningful indicator on its own. What is indicative is the quality factor: the ratio of that total profit to the number of history bars over which it was obtained. I have been saying this for a long time, and it is the only measure I use for optimizing and evaluating models.

Hard work gets results.

Good luck

Let me explain.

The forecast is for 12 hours ahead (724 minutes, to be exact).

The number of trades is more than 5000 in each of the validation samples. The result is averaged over 49 validation samples. Trades within each sample are spaced about 12 hours apart, so there is no explicit dependence between them and their results do not overlap. I will now post more charts of the trading simulation. The results are too good. But so far everything is confirmed.
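How the spacing was enforced is not described, so this is just one possible sketch: keep a signal only if the previous kept trade has already reached its 724-minute horizon. The function name and the signal times (in minutes) are hypothetical.

```python
def non_overlapping_entries(signal_minutes, horizon=724):
    """Keep only signals that arrive after the previously kept trade has closed."""
    kept = []
    next_free = None                 # earliest minute at which a new trade may open
    for t in sorted(signal_minutes):
        if next_free is None or t >= next_free:
            kept.append(t)
            next_free = t + horizon
    return kept

signals = [0, 100, 724, 800, 1500, 2172]
print(non_overlapping_entries(signals))  # [0, 724, 1500]
```

With entries filtered this way, no two trade outcomes share any part of the same price path.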

 

Final data on my experiment:

Simulation of trading on signals from a trained machine on 49 validation samples (in pips):

Distribution of trade results in pips for 49 samples:

And hypothesis tests of whether the mean (median) MO of each validation sample differs significantly from zero:

| Validation sample | Shapiro test normality p-value | Test p-value for difference from zero | Mean | Median | Mean dist. upper 99% tail | Mean dist. lower 99% tail |
|---|---|---|---|---|---|---|
| 1 | 1.11E-65 | 0 | 0.000139 | 0.000095 | 0.000146 | 0.000133 |
| 2 | 8.55E-64 | 0 | 0.000139 | 0.000096 | 0.000145 | 0.000133 |
| 3 | 8.24E-63 | 0 | 0.000137 | 0.000096 | 0.000143 | 0.000131 |
| 4 | 3.31E-66 | 0 | 0.000139 | 0.000095 | 0.000146 | 0.000133 |
| 5 | 4.64E-66 | 0 | 0.000142 | 0.000097 | 0.000149 | 0.000136 |
| 6 | 7.08E-63 | 0 | 0.000141 | 0.000097 | 0.000147 | 0.000135 |
| 7 | 8.72E-65 | 0 | 0.000135 | 0.000096 | 0.000141 | 0.000129 |
| 8 | 4.52E-65 | 0 | 0.000139 | 0.000096 | 0.000145 | 0.000132 |
| 9 | 4.31E-64 | 0 | 0.000143 | 0.000102 | 0.000149 | 0.000137 |
| 10 | 4.53E-66 | 0 | 0.000141 | 0.000099 | 0.000147 | 0.000134 |
| 11 | 8.97E-67 | 0 | 0.000143 | 0.000098 | 0.000149 | 0.000136 |
| 12 | 2.21E-63 | 0 | 0.000139 | 0.000102 | 0.000145 | 0.000133 |
| 13 | 1.16E-63 | 0 | 0.000142 | 0.000099 | 0.000148 | 0.000135 |
| 14 | 7.82E-64 | 0 | 0.000138 | 0.000097 | 0.000144 | 0.000132 |
| 15 | 1.41E-65 | 0 | 0.000146 | 0.000103 | 0.000152 | 0.000140 |
| 16 | 8.17E-63 | 0 | 0.000135 | 0.000097 | 0.000140 | 0.000129 |
| 17 | 6.54E-65 | 0 | 0.000143 | 0.000099 | 0.000149 | 0.000136 |
| 18 | 6.70E-66 | 0 | 0.000138 | 0.000096 | 0.000144 | 0.000132 |
| 19 | 1.86E-65 | 0 | 0.000143 | 0.000099 | 0.000149 | 0.000136 |
| 20 | 1.79E-66 | 0 | 0.000142 | 0.000098 | 0.000148 | 0.000135 |
| 21 | 2.37E-62 | 0 | 0.000136 | 0.000099 | 0.000142 | 0.000131 |
| 22 | 5.51E-65 | 0 | 0.000141 | 0.000100 | 0.000147 | 0.000135 |
| 23 | 7.15E-67 | 0 | 0.000142 | 0.000097 | 0.000149 | 0.000136 |
| 24 | 1.06E-65 | 0 | 0.000144 | 0.000102 | 0.000150 | 0.000137 |
| 25 | 4.01E-65 | 0 | 0.000147 | 0.000101 | 0.000153 | 0.000140 |
| 26 | 2.33E-64 | 0 | 0.000141 | 0.000098 | 0.000147 | 0.000135 |
| 27 | 7.85E-65 | 0 | 0.000141 | 0.000100 | 0.000147 | 0.000134 |
| 28 | 2.07E-64 | 0 | 0.000141 | 0.000098 | 0.000147 | 0.000134 |
| 29 | 2.01E-63 | 0 | 0.000140 | 0.000098 | 0.000146 | 0.000134 |
| 30 | 2.77E-64 | 0 | 0.000139 | 0.000098 | 0.000145 | 0.000133 |
| 31 | 1.43E-66 | 0 | 0.000145 | 0.000098 | 0.000151 | 0.000138 |
| 32 | 1.08E-65 | 0 | 0.000141 | 0.000098 | 0.000147 | 0.000134 |
| 33 | 3.47E-62 | 0 | 0.000136 | 0.000099 | 0.000141 | 0.000130 |
| 34 | 6.04E-67 | 0 | 0.000140 | 0.000096 | 0.000147 | 0.000134 |
| 35 | 2.32E-65 | 0 | 0.000145 | 0.000100 | 0.000152 | 0.000139 |
| 36 | 6.39E-65 | 0 | 0.000143 | 0.000098 | 0.000149 | 0.000137 |
| 37 | 1.10E-61 | 0 | 0.000141 | 0.000103 | 0.000147 | 0.000135 |
| 38 | 6.74E-63 | 0 | 0.000142 | 0.000100 | 0.000148 | 0.000136 |
| 39 | 2.54E-64 | 0 | 0.000141 | 0.000098 | 0.000147 | 0.000135 |
| 40 | 2.45E-64 | 0 | 0.000139 | 0.000098 | 0.000145 | 0.000133 |
| 41 | 6.25E-66 | 0 | 0.000141 | 0.000099 | 0.000148 | 0.000135 |
| 42 | 3.99E-66 | 0 | 0.000141 | 0.000097 | 0.000147 | 0.000135 |
| 43 | 1.35E-66 | 0 | 0.000142 | 0.000098 | 0.000148 | 0.000135 |
| 44 | 1.01E-63 | 0 | 0.000134 | 0.000097 | 0.000140 | 0.000128 |
| 45 | 1.56E-64 | 0 | 0.000139 | 0.000097 | 0.000145 | 0.000133 |
| 46 | 3.11E-66 | 0 | 0.000145 | 0.000103 | 0.000152 | 0.000139 |
| 47 | 6.11E-66 | 0 | 0.000138 | 0.000099 | 0.000144 | 0.000131 |
| 48 | 2.99E-66 | 0 | 0.000146 | 0.000101 | 0.000152 | 0.000139 |
| 49 | 1.84E-63 | 0 | 0.000138 | 0.000098 | 0.000144 | 0.000131 |

The distributions within the samples are not normal. The Wilcoxon test shows that the MO differs significantly from zero.
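For readers who want to see what such a test does, here is a bare-bones sketch of a two-sided Wilcoxon signed-rank test against zero. It uses the large-sample normal approximation with no tie or continuity corrections, so it is only an illustration and not the routine used to produce the table (a statistics package would normally be used). The toy trade results are made up.

```python
import math

def wilcoxon_signed_rank(values):
    """Two-sided Wilcoxon signed-rank test of symmetry around zero.

    Large-sample normal approximation, no tie or continuity correction.
    Returns (w_plus, z, p_value).
    """
    nonzero = [v for v in values if v != 0]      # zeros carry no sign information
    n = len(nonzero)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:                                 # average the ranks of tied |values|
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(rank for rank, v in zip(ranks, ordered) if v > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail of the standard normal
    return w_plus, z, p_value

# Made-up, skewed per-trade results with a positive shift.
toy_results = [0.0005, -0.0002, 0.0009, 0.0001, -0.0001,
               0.0012, 0.0003, -0.0004, 0.0007, 0.0006] * 20
w_plus, z, p_value = wilcoxon_signed_rank(toy_results)
print(w_plus, z, p_value)
```

The test works on ranks of the absolute values rather than on the values themselves, which is why it does not need the normality that the Shapiro test rejects here.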

By the way, there is an explanation for why the growth of the cumulative pips curve changes over the last three fifths of each validation sample.

I have about the same number of observations for the 5 majors within the samples, and they are in the following order:

dat_eurusd 
dat_audusd 
dat_gbpusd 
dat_usdcad 
dat_usdchf

I am sure that for the last three pairs both the volatility and the spread are higher (we should take 25-30 pipettes, not 20). That is why the gross result and the percentage of correctly guessed directions are better for them. But introducing the increased spread will not reduce the statistics to zero anyway. This is just to understand what is going on.
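A rough check of that claim, using the mean of validation sample 1 from the table (0.000139 with a 0.00020 spread already included). The sketch charges the larger spread on every trade of every pair, which overstates the extra cost since it should only apply to three of the five pairs. The function name is hypothetical.

```python
def mean_after_spread_change(mean_result, simulated_spread, actual_spread):
    """Mean per-trade result if every trade paid actual_spread instead of simulated_spread."""
    return mean_result - (actual_spread - simulated_spread)

mean_sample_1 = 0.000139                     # mean of validation sample 1 in the table
adjusted = {
    spread: mean_after_spread_change(mean_sample_1, 0.00020, spread)
    for spread in (0.00025, 0.00030)
}
for spread, value in adjusted.items():
    print(spread, round(value, 6))
```

Even with 0.00030 charged across the board the mean stays above zero, although it shrinks to well under a third of its original value.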

FAQ.

 
SanSanych Fomenko:

It seems to me that one does not exclude the other.

Alexey has overcome the obvious weakness of all accuracy-based estimates of classification predictions: correctly predicting a 1-pip bar and correctly predicting a 10-pip bar are clearly worth very different amounts.

Congratulations, Alexey!

Thank you, SanSanych. Everything is working. I will try to run it on MT4 for starters.
 

Warning: I found a bug in the code that had been giving me those great results. All my optimization results are withdrawn pending a detailed analysis!

I built an Expert Advisor in MT4 linked to R and saw discrepancies in the results. So I did a detailed code review and found a crude error. I have not found anything profitable yet. The experiment goes on.