Machine learning in trading: theory, models, practice and algo-trading - page 2799

 
СанСаныч Фоменко #:

The estimate itself is a relative thing.

I'll repeat the pictures.

It's bad, it's hopeless.


This one is better; if there are several like it, we can talk about a 30% prediction error.


And the rubbish must be removed, because on the training set a feature can lie in favour of the rubbish: it is easier to find a value on it that leads to the optimum.

The more mutual information there is in the class partitioning, the less the distributions overlap, which is logical.

The distributions will still drift on new data.

I wouldn't rely heavily on any such manipulation; it's just an idea to try.
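A minimal sketch of what measuring "mutual information in the class partitioning" can look like, assuming scikit-learn's mutual_info_classif as the estimator; the two synthetic features (one whose class-conditional distributions barely overlap, one that is pure noise) are made up for illustration and are not from any dataset in this thread.

# Compare mutual information for a well-separated feature vs a noise feature.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 5000
y = rng.integers(0, 2, size=n)
good = y + rng.normal(0.0, 0.3, size=n)   # class-conditional distributions barely overlap
noise = rng.normal(0.0, 1.0, size=n)      # same distribution for both classes
X = np.column_stack([good, noise])

mi = mutual_info_classif(X, y, random_state=0)
print(dict(zip(["good", "noise"], mi.round(3))))
# Expected: the separated feature gets a much higher MI score than the noise one.

The caveat in the post still applies: a high MI score on the training period says nothing about whether the class-conditional distributions stay separated on new data.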
 
elibrarius #:

Boosting searches for the best splits across all columns and all examples, i.e. it uses the best features.
A random forest takes half of the features and half of the examples (the share is configurable) for each tree and then averages over 20-100 trees. If only 5 features out of 200 are informative, some trees will contain no informative features at all (on average 2.5 informative features per tree), so we end up averaging a few informative trees with noise trees, and the result will also be very noisy.
A random forest works well when there are many informative features (as in classical examples / ML problems).

Boosting will find and use the most informative features, since it checks them all. So, by its own logic, boosting selects the best features by itself. But boosting has its own problems too.

That's logical.
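For readers who want to see the averaging effect described above, here is a rough sketch, assuming scikit-learn models as stand-ins; note that RandomForestClassifier subsamples features per split rather than per tree, so this only approximates the "half of the features per tree" setup, and the dataset with 5 informative features out of 200 is synthetic.

# Few informative features among many noise features: boosting vs random forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=200, n_informative=5,
                           n_redundant=0, random_state=1)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=1)

rf = RandomForestClassifier(n_estimators=100, max_features=0.5,
                            random_state=1).fit(Xtr, ytr)   # half the features per split
gb = GradientBoostingClassifier(n_estimators=100, random_state=1).fit(Xtr, ytr)

print("random forest:", accuracy_score(yte, rf.predict(Xte)))
print("boosting:     ", accuracy_score(yte, gb.predict(Xte)))

Depending on the run, boosting tends to come out ahead here, which is the point being made; on real market data both face the drift problem discussed further in the thread.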
 
Maxim Dmitrievsky #:
The more mutual information there is in the class partitioning, the less the distributions overlap, which is logical.

The distributions will still drift on new data.

I wouldn't rely heavily on any such manipulation; it's just an idea to try.

You haven't noticed the variability of the SD.

 
СанСаныч Фоменко #:

I can't agree about boosting.

Boosting will find the features that have a strong association (predictive power), I believe that. Everything is fine if the magnitude of the relationship is constant. But by giving up the estimation of the feature itself, in boosting we cannot track the variability of that association, and by my data the SD of the association estimate can range from 10% to 120% (on my features). What will boosting give us then? After all, we need to weed out the features whose association is too variable.

All ML models look for patterns. Boosting automatically selects the best features on the train set.

If there is variability (as in market data), then we have to do something extra. I have experimented with walk-forward. But it only shows the result; it does not affect the selection of features. And nothing can predict which features will work in the future if there are no patterns or if they change. The only hope is that they do not change instantly, so a pattern will keep working for some time.
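A minimal sketch of the variability point, assuming Pearson correlation in rolling windows as the association measure; the window size and the synthetic "regime change" halfway through are illustrative assumptions, not anyone's actual data.

# How much does a simple feature-target association drift over time?
import numpy as np

rng = np.random.default_rng(2)
n, window = 6000, 500
x = rng.normal(size=n)
beta = np.where(np.arange(n) < n // 2, 0.8, 0.2)   # toy regime change halfway through
y = beta * x + rng.normal(size=n)

corrs = np.array([np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
                  for i in range(0, n - window, window)])
print("per-window correlation:", corrs.round(2))
print("mean %.2f, sd %.2f" % (corrs.mean(), corrs.std()))
# A large sd relative to the mean means the association estimate itself is unstable,
# which is exactly what a single train-set importance score cannot show.

A walk-forward loop would wrap the same idea around model training: fit on one window, test on the next, and watch how the result degrades as the association drifts.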

 
Aleksey Vyazmikin #:

I created a thread with a sample that proves otherwise: boosting is not omnipotent, especially out of the box.

I think it's not a boosting problem, but a data variability problem. I will try to train on your data.
 
elibrarius #:
I think it's not a boosting problem, but a data variability problem. I will try to train on your data.

Of course, it's not the algorithm per se, but the data.

Give it a try and see what you come up with!

The sample is relatively unique in that it is hard to train on it in a way that works outside of training.

I am still experimenting with it.

 
Aleksey Vyazmikin #:

The sample is relatively unique in that it is hard to train on it in a way that works outside of training.

What is unique about that? Models trained on market data usually don't work outside of training anyway. I asked you a couple of questions there.

 
elibrarius #:

What is unique about that? Models trained on market data usually don't work outside of training anyway. I asked you a couple of questions there.

Well, it's not that they don't work; they usually do work, just not very well.

The peculiarity here is that the CatBoost model prefers to assign all examples a probability below 0.5, so it never classifies the target as "1", and even the range between 0 and 0.5 is not well distributed; there are screenshots of the model in the thread.

 
Aleksey Vyazmikin #:

The peculiarity here is that the CatBoost model prefers to assign all examples a probability below 0.5, so it never classifies the target as "1", and even the range between 0 and 0.5 is not well distributed; there are screenshots of the model in the thread.

The peculiarity seems to be a strong class imbalance: if out of 100 examples there are 5 labels of one class and 95 of the other, how can the model give the first class a probability above 0.5? That is not a question for the model, it is a question for the author of the dataset...

 
mytarmailS #:

The peculiarity seems to be a strong class imbalance: if out of 100 examples there are 5 labels of one class and 95 of the other, how can the model give the first class a probability above 0.5? That is not a question for the model, it is a question for the author of the dataset...

The first class is over 30% of the sample. And yes, it can; I don't see the problem. It is enough to find one rule/leaf that predicts "1" more often than "0", even if it fires rarely.

Besides, nothing prevents changing the dataset by balancing the classes.
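A small sketch of the balancing idea, using scikit-learn's class_weight="balanced" as a stand-in (CatBoost has similar knobs, e.g. class_weights or scale_pos_weight); the roughly 70/30 split, the model and the alternative threshold of 0.35 are arbitrary illustrations, not the dataset from the thread.

# Class weights (or a lower threshold) push more predictions above the 0.5 cut-off.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, weights=[0.7, 0.3],
                           random_state=3)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=3)

plain = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(Xtr, ytr)

p_plain = plain.predict_proba(Xte)[:, 1]
p_weighted = weighted.predict_proba(Xte)[:, 1]
print("share of p > 0.5, unweighted:", round((p_plain > 0.5).mean(), 3))
print("share of p > 0.5, balanced:  ", round((p_weighted > 0.5).mean(), 3))
# Moving the decision threshold does a similar job without retraining:
print("share of p > 0.35, unweighted:", round((p_plain > 0.35).mean(), 3))

Whether rebalancing actually helps out of sample is a separate question; it only changes where the probabilities sit relative to 0.5, not how much information the features carry.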