Machine learning in trading: theory, models, practice and algo-trading - page 1255

 
Maxim Dmitrievsky:

If the market is more or less stable, trending or something like that, then it will keep working for a while, at least for me... the patterns stay the same, so why not?

I have simplified the learning process to a single button press and I don't need any predictors )) I've built such a fun machine that I might sell it as an exhibit of human madness.

Well, I haven't messed with predictors from the start either. But I haven't even tried the one-button approach. With one button, nothing works for me except the same optimization, just from a different angle. How you get around that with one button is a mystery to me).

 
Yuriy Asaulenko:

Well, I haven't messed with predictors from the start either. But I haven't even tried the one-button approach. With one button, nothing works for me except the same optimization, just from a different angle. How you get around that with one button is a mystery to me).

I use Monte Carlo and look for the best error on the test sample, that's all.

Optimizing the optimizer, aha.
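A minimal sketch of how such a Monte Carlo search might look (my own reading of the one-liner above; test_error is a toy stand-in, not anyone's actual model): draw random parameter sets and keep the one with the lowest error on the test sample.

import numpy as np

rng = np.random.default_rng(0)

def test_error(period, threshold):
    # Toy stand-in for "train the model / run the TS and measure the test error".
    return (period - 50) ** 2 / 1e4 + abs(threshold - 0.3)

best_params, best_err = None, float("inf")
for _ in range(1000):                        # Monte Carlo draws over the parameter space
    period = int(rng.integers(5, 200))
    threshold = float(rng.uniform(0.0, 1.0))
    err = test_error(period, threshold)
    if err < best_err:
        best_params, best_err = (period, threshold), err

print(best_params, best_err)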
 
Maxim Dmitrievsky:

I use Monte Carlo and look for the best error on the test sample, that's all.

You can't do without Monte Carlo.) But the "best error" part is not so simple. Optimality is a multifactorial and ambiguous thing, and what exactly counts as the optimum is far from obvious.

 
Yuriy Asaulenko:

You can't do without Monte Carlo.) But the "best error" part is not so simple. Optimality is a multifactorial and ambiguous thing, and what exactly counts as the optimum is far from obvious.

I see. I take any periodic function where the optimum is obvious, and the system will make money on it practically forever.)

There is no global optimum in the market, only local ones.

 
 

I have been reading up on decision-tree theory.
I am thinking about pruning.

One can use the following simple rule: build the full tree, then cut off (or replace with a subtree) those branches whose removal does not increase the error.

Maybe it would be easier and faster not to split a leaf at all during tree construction if no split can be found that reduces the error by at least some threshold, say 0.1-0.5%?
The result should be roughly the same, but obtained faster.

Or could it happen that after a couple of splits that each improve the model by only 0.0001%, there is one that improves it by 1-5% at once?
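For what it's worth, a minimal sketch of the two options being discussed, using scikit-learn purely as an illustration (the post itself does not name a library): min_impurity_decrease stops splitting early when the gain is below a threshold, while ccp_alpha grows the full tree and prunes it afterwards.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Pre-pruning: refuse a split unless impurity drops by at least the threshold.
pre = DecisionTreeClassifier(min_impurity_decrease=0.001, random_state=0).fit(X_tr, y_tr)

# Post-pruning: grow the full tree, then apply cost-complexity pruning.
post = DecisionTreeClassifier(ccp_alpha=0.001, random_state=0).fit(X_tr, y_tr)

print("pre-pruned leaves: ", pre.get_n_leaves(),  " test acc:", pre.score(X_te, y_te))
print("post-pruned leaves:", post.get_n_leaves(), " test acc:", post.score(X_te, y_te))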

 
Maxim Dmitrievsky:

How much faster could it be? It's very fast as it is.

And at this rate you're going to end up building your own boosting.

Bayesian methods are slow and not meant for large samples, but they work differently and don't overfit out of the box. Each model has its own peculiarities. I'm really enjoying the Bayesian approach now; it's a powerful way to optimize a trading system without overfitting.

A nice prospect: you can update them rather than retrain them from scratch.
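A minimal sketch of the "update instead of retrain" idea (my own illustration, not the model discussed above): with a conjugate Beta-Bernoulli prior, each new batch of trade outcomes simply updates the posterior counts.

class BetaBernoulli:
    """Conjugate model for a win/loss probability."""
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta        # prior pseudo-counts

    def update(self, wins, losses):
        # The posterior after this batch becomes the prior for the next one.
        self.alpha += wins
        self.beta += losses

    def mean_win_rate(self):
        return self.alpha / (self.alpha + self.beta)

model = BetaBernoulli()
model.update(wins=12, losses=8)     # first batch of outcomes
model.update(wins=3, losses=7)      # later batch: update only, no refit from scratch
print(model.mean_win_rate())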

If I feed it a million rows with 200-1000 predictors, it will probably take a long time...
The problem with pruning is that you have to build the tree to the end first and only then prune it.
Whereas with stopping branching at a minimum error improvement, I think there would be significant savings with a similar result. In xgboost this parameter is called gamma, and there is no pruning. Apparently the developers also decided that these things are interchangeable.
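A minimal sketch of that gamma parameter, assuming the Python xgboost package (the data here are random placeholders): gamma, also exposed as min_split_loss, is the minimum loss reduction a split must achieve to be made.

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((10_000, 200))                               # placeholder predictors
y = (X[:, 0] + rng.normal(scale=0.1, size=10_000) > 0.5).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    "eta": 0.1,
    "gamma": 0.5,        # minimum loss reduction required to make a further split
}
booster = xgb.train(params, dtrain, num_boost_round=100)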
 
elibrarius:
Well, if you feed it a million minute-bar rows with 200-1000 predictors, it will probably take a long time...
And with pruning you have to build the tree to the end first and only then prune it.
Whereas with stopping branching at a minimum error improvement, I think there will be significant savings with a similar result. In xgboost this parameter is called gamma, and there seems to be no pruning. Apparently the developers also decided that these things are interchangeable.

Well, they know better how to do it; whole teams of specialists worked on boosting and tested it.

CatBoost seems to work fine and fast; the trees there are shallow by design.

Millions of data points on forex... I doubt that's necessary.
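A minimal sketch with the catboost package (random placeholder data; the depth value is just illustrative): CatBoost builds symmetric (oblivious) trees whose depth is controlled directly, so they stay shallow by construction.

import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.random((5_000, 50))                 # placeholder predictors
y = (X[:, 0] > 0.5).astype(int)

model = CatBoostClassifier(iterations=200, depth=6, verbose=False)
model.fit(X, y)
print(model.predict_proba(X[:5]))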

 
By the way, I came up with a situation where the first split almost does not improve the error, while the second improves it by 100%.

Take 4 quadrants with 10 points each. A single split on the x-axis or the y-axis will hardly improve the error at all; it stays at about 50%. Say the first split runs down the middle vertically. The second split, down the middle horizontally, then leads to a very strong improvement (from 50% error to zero).
But this is an artificially constructed situation; it does not happen in real data.
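A minimal sketch of that 4-quadrant example (my own construction of the data): with only the vertical split each half is still a 50/50 mix, so the majority-vote error stays at 50%; adding the horizontal split makes every region pure and the error drops to zero.

import numpy as np

rng = np.random.default_rng(0)
# 4 quadrants, 10 points each; diagonally opposite quadrants share a class.
quads = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)]       # (x-offset, y-offset, class)
X = np.vstack([rng.uniform(0, 1, (10, 2)) + [ox, oy] for ox, oy, _ in quads])
y = np.repeat([c for _, _, c in quads], 10)

def majority_error(groups):
    # Error if each group of points simply predicts its majority class.
    wrong = sum(min((y[g] == 0).sum(), (y[g] == 1).sum()) for g in groups)
    return wrong / len(y)

left, right = X[:, 0] < 1, X[:, 0] >= 1
low, high = X[:, 1] < 1, X[:, 1] >= 1

print(majority_error([left, right]))                        # vertical split only: 0.5
print(majority_error([left & low, left & high,
                      right & low, right & high]))          # both splits: 0.0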
 
Sample sizes are never large. If N is too small to get a sufficiently-precise estimate, you need to get more data (or make more assumptions). But once N is "large enough," you can start subdividing the data to learn more (for example, in a public opinion poll, once you have a good estimate for the entire country, you can estimate among men and women, northerners and southerners, different age groups, etc.). N is never enough because if it was "enough" you'd already be on to the next problem for which you need more data.