Machine learning in trading: theory, models, practice and algo-trading - page 570

 
Maxim Dmitrievsky:
  • Random Forest: Gini Importance or Mean Decrease in Impurity (MDI)[2]
  • Random Forest: Permutation Importance or Mean Decrease in Accuracy (MDA)[2]
  • Random Forest: Boruta [3]

https://medium.com/@ceshine/feature-importance-measures-for-tree-models-part-i-47f187c1a2c3
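
For readers unfamiliar with MDI, here is a minimal sketch of the quantity it is built from: the Gini impurity decrease of a single split. In a forest, MDI sums such decreases over every node that splits on a feature (weighted by the share of samples reaching the node) and averages over trees. The toy data, the thresholds and the function names below are made up for illustration and are not taken from the linked article or from any library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(values, labels, threshold):
    """Gini decrease from splitting the samples on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted

# Toy data (made up): feature_a separates the classes, feature_b does not.
labels    = [0, 0, 0, 0, 1, 1, 1, 1]
feature_a = [1, 2, 3, 4, 5, 6, 7, 8]
feature_b = [1, 5, 2, 6, 1, 5, 2, 6]

dec_a = impurity_decrease(feature_a, labels, threshold=4)
dec_b = impurity_decrease(feature_b, labels, threshold=3)
print("decrease for feature_a:", dec_a)
print("decrease for feature_b:", dec_b)
```

A feature that gives clean splits accumulates a large total decrease; one that leaves the class mix unchanged accumulates almost none.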

It would be worth trying to add at least one of these methods to the ALGLIB forests; then everything, data retrieval for example, could be done automatically in MT5 without R


The best predictor selection methods in caret: gafs - genetic algorithm selection of predictors; rfe - recursive feature elimination (the fastest); safs - simulated annealing selection of predictors (the most effective).


If we are talking about machine learning, we should use caret - it is a wrapper that covers the whole cycle: data mining, modeling, evaluation.



PS.

All you clinging to the subdeluge, so you will be: "here's to finish writing...".

 
SanSanych Fomenko:



No, I mean the selection methods built into RF, not the other ones. Or are those for RF too? As I understand it, Gini is the most popular.

Yes, MDI is used in R

 

About predictor selection...
Initially I added day of the week to the model. I looked at the trades of the best variant, and it turned out that over 40 days (8 weeks) of training it had learned to buy on Thursdays, and in the 10-day (2-week) test it bought on almost every Thursday bar and was profitable. On other days it either did not trade or made only isolated trades.
Conclusion - I have to remove day of the week to make trading more balanced. I have started a test run and will see what comes out.
But I removed it purely on common sense; the automated selection considered this predictor very important.

So it is a bit risky to rely on automation... Although it may be that day of the week is simply the only thing I could see and understand by eye, and I would not notice anything subtler
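
A rough illustration of why 8 weeks is a thin basis for a day-of-week rule. This is a sketch under loud assumptions that are not in the post: each weekday in each week is treated as an independent coin flip (good for buying with probability 0.5), and "looks special" is arbitrarily set at 7 or more good weeks out of 8. Real bars are not independent, so treat the numbers only as an order of magnitude.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

weeks = 8        # training window from the post: 40 days = 8 weeks
weekdays = 5     # trading days per week
wins_needed = 7  # hypothetical bar for "this weekday looks special"

# Chance that one particular weekday clears the bar by luck alone.
p_one_day = prob_at_least(wins_needed, weeks)
# Chance that at least one of the five weekdays does.
p_any_day = 1 - (1 - p_one_day) ** weekdays

print(f"one given weekday: {p_one_day:.3f}")
print(f"at least one of {weekdays} weekdays: {p_any_day:.3f}")
```

Under these assumptions a single given weekday clears the bar about 3.5% of the time, but at least one of the five does so about 16% of the time, so a "Thursday effect" found on 8 weeks is not strong evidence by itself.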

 
elibrarius:

About predictor selection...
Initially I added day of the week to the model. I looked at the trades of the best variant, and it turned out that over 40 days (8 weeks) of training it had learned to buy on Thursdays, and in the 10-day (2-week) test it bought on almost every Thursday bar and was profitable. On other days it either did not trade or made only isolated trades.
Conclusion - I have to remove day of the week to make trading more balanced. I have started a test run and will see what comes out.
But I removed it purely on common sense; the automated selection considered this predictor very important.

So it is a bit risky to rely on automation...


I'm just curious to understand how it works, so that I don't have to do anything myself :) If a variable is not important, just delete it, so as not to add needless dimensionality

Feature importance is not about whether the model will work on forward data; it is about how well the features describe the target at the moment

On the other hand, for an NN or a deep NN feature selection is not very important at all: the network just assigns smaller weights, and extra features have almost no effect. Additional tuning is certainly good, but that is data satanism for statisticians; it is not suitable for forex, and there it would give a 5-7% gain in quality, which is nothing

In itself, predictor selection in forex is almost a waste of time: the importance will change from set to set. IMHO

 
Maxim Dmitrievsky:

On the other hand, for an NN or a deep NN feature selection is not very important at all: the network just assigns smaller weights, and extra features have almost no effect. Additional tuning is certainly good, but that is data satanism for statisticians; it is not suitable for forex

In my example Thursday turned out to be a very good day for buying (over the 50-day interval), and apparently the NN assigned the highest weight to day of the week. The NN did its job, but I think buying on every bar on Thursdays is a mistake. After all, everything can change; we need to look for deeper patterns.

Maybe train on a year... But that will take 6-7 times longer. Then maybe day of the week will drop in importance.

 
elibrarius:

In my example Thursday turned out to be a very good day for buying (over the 50-day interval), and apparently the NN assigned the highest weight to day of the week. The NN did its job, but I think buying on every bar on Thursdays is a mistake. After all, everything can change; we need to look for deeper patterns.


It can change, yes :) You need either a larger sample or a system that retrains and evaluates itself... I am doing the latter. The final push, so to speak :)

 

I remember the JPY chart looking as if it had been chopped up with an axe. So how do you trade that? Those swings of more than 1000 pips should be structured (by levels of volatility gaps), and the probability of their recurrence should be calculated. If the probability is high, the Expert Advisor sleeps on this chart. Or you can catch the rise in gap probability and, knowing which way the gap will go, try to catch it with pending orders. It is also important to study what happens before the gap: it is clearly man-made, so before the gap they cheat in order to concentrate orders in the direction opposite to the coming gap. Added.


And I agree about the set. But the set is different for each timeframe: for minutes it is a day, for hours it is a month, etc. Or do you call a set a profitable pattern of quote behavior that occurs more often? Then yes - a set would be measured by the time it takes this "pattern" to form

And those values of quote history, or your predictors (as I understand it, they are repeating patterns of quote behavior on the chart), that are closer to the current date should carry more weight. That is, the 1-year history is of interest only for finding these predictors, the repeating "profitable" patterns of quote behavior, collecting them into an array and weighting each by how often it is detected in the quote history. A second weight should raise a predictor's status depending on its proximity to the latest events on the chart.

It's as if, like Mendeleev, I saw it all in a dream and woke up to tell it. I don't fully understand it myself))) Good luck, waiting for the results.

100% per annum and stable trading. Nothing else is needed. Don't overdo it.
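
One possible reading of the two weights, as a sketch only. The post does not say how the weights are computed or combined; the product of the two, the exponential decay, the 30-day half-life and every name below are assumptions made for illustration.

```python
def pattern_score(occurrence_ages_days, history_days, half_life_days=30.0):
    """Score one pattern from the ages (days before now) of its occurrences.

    frequency weight: occurrences per day of searched history
    recency weight:   mean of 0.5 ** (age / half_life), so recent hits count more
    The product of the two is an assumed way of combining them.
    """
    if not occurrence_ages_days:
        return 0.0
    count = len(occurrence_ages_days)
    frequency = count / history_days
    recency = sum(0.5 ** (age / half_life_days) for age in occurrence_ages_days) / count
    return frequency * recency

# Two hypothetical patterns, each detected 4 times in a 365-day history.
old_pattern = pattern_score([300, 310, 320, 330], history_days=365)
fresh_pattern = pattern_score([5, 10, 15, 20], history_days=365)

print("old pattern score:  ", old_pattern)
print("fresh pattern score:", fresh_pattern)
```

Both patterns have the same frequency weight, so the one detected recently ends up with the higher score.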


Predictor

11.1.5. LINEAR PREDICTOR AND LINK FUNCTION
  • lib.alnam.ru
the linear predictor changes by units. (This may or may not be a realistic interpretation. In the timber volume example, if the radius of the tree trunk increases, its height probably increases as well.) The expected value of Y is related to the linear predictor through the link function. In some cases natural... are known
 
geratdc:

100% per annum is not enough; you need that per month :)

 
elibrarius:

About predictor selection...
Initially I added day of the week to the model. I looked at the trades of the best variant, and it turned out that over 40 days (8 weeks) of training it had learned to buy on Thursdays, and in the 10-day (2-week) test it bought on almost every Thursday bar and was profitable. On other days it either did not trade or made only isolated trades.
Conclusion - I have to remove day of the week to make trading more balanced. I have started a test run and will see what comes out.
But I removed it purely on common sense; the automated selection considered this predictor very important.

So it is a bit risky to rely on automation... Although it may be that day of the week is simply the only thing I could see and understand by eye, and I would not notice anything subtler

But no... After removing day of the week, everything stayed the same. There are 5 daily bars as context; maybe the information comes through them... I will have to try without them
 
elibrarius:
But no... After removing day of the week, everything stayed the same. There are 5 daily bars as context; maybe the information comes through them... I will have to try without them

So it has little effect... How would you know without importances? ) Only by shuffling the values of each predictor in turn, retraining, and watching how the total error changes

Only it may come as a surprise when randomly shuffled data suddenly turns out to be very important :) Plus it takes a long time if there are many features. So without feature importances it is a Sisyphean task.
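
The shuffle-retrain-compare procedure described above could be sketched roughly like this. Everything here is an assumption made for illustration: the synthetic two-feature data, the toy nearest-centroid classifier standing in for the real model, and all function names. Averaging over several shuffles is one way to soften the "shuffled data suddenly looks important" surprise; note that the usual MDA variant shuffles only the evaluation data and does not retrain.

```python
import random

def fit_centroids(rows, labels):
    """Train a nearest-centroid classifier: per-class mean of every column."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, row):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])))

def error_rate(rows, labels, n_train):
    """Train on the first n_train rows; return the error rate on the rest."""
    model = fit_centroids(rows[:n_train], labels[:n_train])
    test = list(zip(rows[n_train:], labels[n_train:]))
    return sum(predict(model, r) != y for r, y in test) / len(test)

def shuffle_importance(rows, labels, n_train, n_repeats=5, seed=0):
    """Mean increase in test error after shuffling each column and retraining."""
    rng = random.Random(seed)
    baseline = error_rate(rows, labels, n_train)
    importances = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between column j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increase += error_rate(shuffled, labels, n_train) - baseline
        importances.append(increase / n_repeats)
    return baseline, importances

# Synthetic data: column 0 carries the signal, column 1 is pure noise.
data_rng = random.Random(42)
labels = [data_rng.randint(0, 1) for _ in range(400)]
rows = [[(2 * y - 1) + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1)] for y in labels]

baseline, importances = shuffle_importance(rows, labels, n_train=300)
print("baseline error:", baseline)
print("error increase per column:", importances)
```

The cost is one retraining per column per repeat, which is the "long time if there are many features" part.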