Machine learning in trading: theory, models, practice and algo-trading - page 3117

 
mytarmailS #:

I don't know.

And what good is this filtering by the second model?

It's better on the new data.

 
Maxim Dmitrievsky #:

on the new data, it's better

if we set probability thresholds on the original single model, like

> 0.7 buy

< 0.3 sell

then it will also become better on both test and train, and there will naturally be fewer trades...

does the second model really give something? I'm curious...

Were there any tests, comparisons?
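For reference, thresholding a single model's predicted probability into buy / sell / stay-out zones could look like the sketch below. It assumes a fitted scikit-learn-style binary classifier whose predict_proba returns the probability of the "buy" class; the names are illustrative, not taken from anyone's actual code.

```python
import numpy as np

def threshold_signals(proba_buy, upper=0.7, lower=0.3):
    """Map predicted 'buy' probabilities to +1 (buy), -1 (sell) or 0 (stay out)."""
    proba_buy = np.asarray(proba_buy)
    signals = np.zeros(proba_buy.shape, dtype=int)
    signals[proba_buy > upper] = 1    # confident buy
    signals[proba_buy < lower] = -1   # confident sell
    return signals                    # everything in between stays flat

# Usage with a hypothetical fitted classifier:
# proba_buy = model.predict_proba(X_test)[:, 1]
# signals = threshold_signals(proba_buy)
```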

 
Maxim Dmitrievsky #:

Suppose there are two models: one predicts the direction of the trade, and a meta model predicts the probability of winning (to trade or not to trade):

Let's call the first model the main model, which divides the feature space into buy/sell with a black line. And the second is a meta model that divides the total feature space into trade/don't trade (red line).

Now imagine another variant, where there are two meta models, and each of them separately divides the feature space of its own class (BUY or SELL) into trade/don't trade (two red lines).

A purely theoretical question "to think about": is the second variant better, and if so, why? Please comment.

A request, perhaps even to Alexei Nikolaev: how can one measure the effect of such an "intervention"? After all, we get two probability distributions from the two meta models, which can be compared/evaluated/examined from different angles.
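A minimal sketch of the two variants being compared, assuming hypothetical fitted scikit-learn-style classifiers: main_model for the buy/sell direction, and either a single meta_model or a pair meta_buy/meta_sell for trade/don't trade. This is only a sketch of the logic of the question, not the author's actual implementation.

```python
import numpy as np

def variant_one(main_model, meta_model, X):
    """One meta model filters the whole feature space (single red line)."""
    direction = np.where(main_model.predict_proba(X)[:, 1] > 0.5, 1, -1)  # buy = +1, sell = -1
    allow = meta_model.predict_proba(X)[:, 1] > 0.5                       # trade / don't trade
    return direction * allow                                              # 0 = stay out

def variant_two(main_model, meta_buy, meta_sell, X):
    """Two meta models, each filtering its own direction class (two red lines)."""
    direction = np.where(main_model.predict_proba(X)[:, 1] > 0.5, 1, -1)
    allow_buy = meta_buy.predict_proba(X)[:, 1] > 0.5
    allow_sell = meta_sell.predict_proba(X)[:, 1] > 0.5
    allow = np.where(direction == 1, allow_buy, allow_sell)
    return direction * allow
```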

This is an ambiguous statement of the problem.

It turns out that we trust the second, probabilistic model more than the first one, and use the second model as a filter for the first.

Or we treat the situation as an "AND" operation, i.e. intersection of results.
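As an illustration of the "AND" reading, a trade is taken only where the direction model is confident and the second, probabilistic model also permits it (the numbers below are made up):

```python
import numpy as np

# hypothetical predicted probabilities from the two models
proba_buy  = np.array([0.80, 0.55, 0.20, 0.45])   # first model: P(buy)
meta_proba = np.array([0.70, 0.90, 0.60, 0.40])   # second model: P(trade worth taking)

direction  = np.where(proba_buy > 0.5, 1, -1)     # first model's direction
trade_mask = meta_proba > 0.5                     # "AND": the filter must also agree

final_signal = direction * trade_mask             # 0 means stay out of the market
print(final_signal)                               # [ 1  1 -1  0]
```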


It's a dead end, been there, done that.


I have not come across models that give a direction, because even when they appear to give a direction, it is really the result of regularising the probability of the direction. That is why the standard approach in R called an "ensemble of models" is suggested, where the results of two or many first-level models are used as predictors in some second-level classification algorithm. By the way, if you are so fond of categorical variables, they can also be fed into the input of a classifier. If the results of the models can be ranked by confidence level, that can be adjusted with weights. That is, the second level is a classifier that uses the classification results of the first-level models as predictors. This approach is very interesting for unbalanced classes obtained by some regularisation other than 0.5, e.g. we split the classifier's result, as a probability, by quantiles at 0.4 and 0.6. The middle stays out of the market.
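A minimal stacking sketch along these lines, assuming scikit-learn rather than the R packages implied in the post; only the 0.4/0.6 cut-offs come from the text, everything else (models, synthetic data) is illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Stand-in data: in practice X would be price-derived features, y a buy/sell label.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# First level: out-of-fold class probabilities of two base models become new predictors.
base_models = [RandomForestClassifier(n_estimators=200, random_state=0),
               GradientBoostingClassifier(random_state=0)]
level1 = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1] for m in base_models
])

# Second level: a classifier that uses the first-level results as predictors.
meta = LogisticRegression().fit(level1, y)
proba = meta.predict_proba(level1)[:, 1]

# Split the resulting probability at 0.4 / 0.6: below 0.4 sell, above 0.6 buy,
# the middle stays out of the market.
signal = np.where(proba > 0.6, 1, np.where(proba < 0.4, -1, 0))
```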

 
mytarmailS #:

if we set probability thresholds on the original single model, like

> 0.7 buy

< 0.3 sell

then it will also become better on both test and train, and there will naturally be fewer trades...

Does the second model really give something? I wonder...

Have there been any tests, comparisons?

Imagine that you have trained the first model through cross-validation and fed all its wrong predictions into the second model as "don't trade". You already have statistical evidence that the first model is more often wrong in certain places and consistently trades well in others. That can then be filtered by the second model. It is much harder to do this with a single model. There are other variants of such tuning as well.
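A sketch of that scheme under stated assumptions (scikit-learn, synthetic data in place of real price features, a binary buy/sell target): the meta target is simply "was the first model's out-of-fold prediction correct".

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict

# Stand-in for price features and buy/sell labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# First model: direction, evaluated out-of-fold via cross-validation.
main_model = GradientBoostingClassifier(random_state=0)
oof_pred = cross_val_predict(main_model, X, y, cv=5)

# Meta target: 1 where the first model was right (trade), 0 where it was wrong (don't trade).
meta_target = (oof_pred == y).astype(int)

# Second model learns where in feature space the first one tends to fail.
meta_model = GradientBoostingClassifier(random_state=0).fit(X, meta_target)

# In use: trade only where the meta model expects the first model to be right.
main_model.fit(X, y)
direction = np.where(main_model.predict_proba(X)[:, 1] > 0.5, 1, -1)
allow = meta_model.predict_proba(X)[:, 1] > 0.5
signal = direction * allow   # 0 = no trade
```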
 
Maxim Dmitrievsky #:
Imagine that you have trained the first model through cross-validation and fed all its incorrect predictions into the second model as "don't trade". You already have statistical evidence that the first model is more often wrong in certain places, which can then be filtered out by the second model. It is much harder to do this with a single model. There are other variants of this kind of tuning.

Well, that sounds reasonable.

 
mytarmailS #:

Well, that sounds reasonable.

Even if the second model is also wrong, it still corrects the errors of the first one to some extent in this case, yes, something like that. In causal inference there is a more rigorous justification of these approaches; I'd say it is proven perfectly rigorously.

https://en.wikipedia.org/wiki/Frisch%E2%80%93Waugh%E2%80%93Lovell_theorem
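For reference, the theorem itself can be stated briefly in standard OLS notation (this is the textbook statement, not tied to the trading setup):

```latex
% Frisch–Waugh–Lovell: partialling out one block of regressors.
% Model: y = X_1 \beta_1 + X_2 \beta_2 + u,
% with M_1 = I - X_1 (X_1^\top X_1)^{-1} X_1^\top the residual-maker for X_1.
% The OLS estimate of \beta_2 in the full model equals the one obtained by
% regressing the residualised y on the residualised X_2:
\hat{\beta}_2 = \bigl( X_2^\top M_1 X_2 \bigr)^{-1} X_2^\top M_1 \, y
```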

 
Forester #:
I haven't tried it. Just intuition) But as Marx said, practice is the criterion of truth. If it works for you in practice, good).

I'm trying to switch to the second variant; it's a work in progress.

 
СанСаныч Фоменко #:

An ambiguous statement of the problem.

It turns out that we trust the second, probabilistic model more than the first one, and the second model is used as a filter for the first.

Or we interpret the situation as an "AND" operation, i.e. intersection of results.


A dead-end path; we have been through it.


I have not come across models that give a direction, because even when they appear to give a direction, it is really the result of regularising the probability of the direction. That is why the standard approach in R called an "ensemble of models" is suggested, where the results of two or many first-level models are used as predictors in some second-level classification algorithm. By the way, if one is so fond of categorical variables, they can also be fed into the input of a classifier. If the results of the models can be ranked by confidence level, that can be adjusted with weights. That is, the second level is a classifier that uses the classification results of the first-level models as predictors. This approach is very interesting for unbalanced classes obtained by some regularisation other than 0.5, e.g. we split the classifier's result, as a probability, by quantiles at 0.4 and 0.6. The middle stays out of the market.

An ensemble is close in meaning but far from this in implementation. The proposed approach can be used in different ways to obtain different results; it is very flexible.

I also tried ensembles; they didn't work.

 
Maxim Dmitrievsky #:
Imagine that you have trained the first model through cross-validation and fed all its incorrect predictions into the second model as "don't trade". You already have statistical evidence that the first model is more often wrong in certain places, which can then be filtered out by the second model. It is much harder to do this with a single model. There are still other variants of such tuning.

The idea of error filtering is not clear to me at all.

It turns out that if the model predicts 50/50, then by throwing out the bad 50 the rest will predict 100%? That is just overfitting and nothing else.


Classification error arises because the same values of the predictors predict correctly in some cases and incorrectly in others. This problem can only be tackled at the stage of filtering predictors by the "strength of the relationship" between predictor and target variable, and it cannot be removed completely; God willing, by filtering predictors you can reduce the classification error by some 10 per cent.

 
СанСаныч Фоменко #:

To me, the idea of filtering out errors is completely incomprehensible.

It turns out that if the model predicts 50/50, then by throwing out the bad 50 the rest will predict 100%? That is just overfitting and nothing else.


Classification error arises because the same values of the predictors predict correctly in some cases and incorrectly in others. This problem can only be tackled at the stage of filtering predictors by the "strength of the relationship" between predictor and target variable, and it cannot be removed completely; God willing, by filtering predictors you can reduce the classification error by some 10 per cent.

Your philosophy has been clear for a long time, but where are the results? ) What are they, show me.

I got an improvement on OOS and was pleased; I will keep improving until the approach exhausts itself.