Machine learning in trading: theory, models, practice and algo-trading - page 1710

 
Maxim Dmitrievsky:

Max! Remind me again what these models are called...

1) Model 1 is trained.

2) Model 2 is trained on model 1's predictions on the test data, and so on...

Stacking?
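For reference, a minimal sketch of stacking with scikit-learn and synthetic data (the estimators and fold count here are arbitrary choices, not anything from the thread): model 2 is fit on model 1's out-of-fold predictions, which is the usual way to get "predictions on the test data" without leaking training labels.

```python
# Minimal stacking sketch, assuming scikit-learn and toy data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=891, random_state=0)

base = RandomForestClassifier(n_estimators=100, random_state=0)

# Out-of-fold probabilities: each prediction comes from a fold the
# base model did not train on ("test data of model 1").
oof = cross_val_predict(base, X, y, cv=5, method="predict_proba")

meta = LogisticRegression()
meta.fit(oof, y)   # model 2 learns from model 1's predictions

base.fit(X, y)     # refit model 1 on all data for inference
print(meta.predict_proba(base.predict_proba(X))[:3])
```

scikit-learn's StackingClassifier wraps this same out-of-fold pattern in a single estimator.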

 
Aleksey Vyazmikin:

Yes, strange results. Don't they take the probability from the test sample that was involved in training? There seems to be an error somewhere, though.

And how many rows (target values) are there in the sample in total?
There is no test sample.
There are 891 rows in total in the dataset.

I think one of the formulas (rms, rmse, cls or something like that) is used there. The main thing is that the results coincide at 0%, 50%, and 100%, and in between the curves diverge. The split into classes is usually done at 50%, and at that point it coincides with the plain probability. So I decided to leave the question unresolved.
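One thing that can be checked directly, assuming the catboost Python package is in play (the thread does not say which interface was used): for a binary Logloss model the probability output is exactly the sigmoid of the raw leaf sum, so the probability and raw-score scales meet in the middle and pinch together at the extremes, which may be the coincidence described above. A hedged sketch:

```python
# Hedged sketch, assuming the catboost Python package and toy data.
import numpy as np
from catboost import CatBoostClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=891, random_state=0)
model = CatBoostClassifier(iterations=200, verbose=False)
model.fit(X, y)

raw = model.predict(X, prediction_type="RawFormulaVal")
prob = model.predict_proba(X)[:, 1]

# For the binary Logloss objective: probability = 1 / (1 + exp(-raw))
assert np.allclose(prob, 1.0 / (1.0 + np.exp(-raw)))
```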
 
 
elibrarius:
There is no test sample.
There are 891 rows in total in the dataset.

I think one of the formulas (rms, rmse, cls or something like that) is used there. The main thing is that the results coincide at 0%, 50%, and 100%, and in between the curves diverge. The split into classes is usually done at 50%, and at that point it coincides with the plain probability. So I decided to leave the question unresolved.

Yeah, you have to dig through the code to understand the depth of the idea. But it is interesting how they assign weights to the leaves, taking the already existing ones into account.

 
mytarmailS:

Can I ask you a question?

Why CatBoost? What does it have that its analogues don't?

I am interested in it for these reasons:

1) Support: a lot of information and feedback from the developers.

2) Fast training: I want to use all the processor cores.

3) Flexible settings for model building and overfitting control, although there is still a lot of room for improvement (see the sketch after this list for points 2 and 3).

4) The possibility to use the binary symmetric models after training in MQL5, though that part is not my own development.
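A minimal sketch of the CatBoost knobs behind points 2 and 3, assuming the catboost Python package; all values here are arbitrary illustrations, not recommendations:

```python
# Hedged sketch: multi-core training (thread_count) and the
# overfitting detector, with synthetic stand-in data.
from catboost import CatBoostClassifier, Pool
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

model = CatBoostClassifier(
    iterations=1000,
    depth=6,              # symmetric (oblivious) trees
    learning_rate=0.03,
    l2_leaf_reg=3.0,
    thread_count=-1,      # use all CPU cores (point 2)
    od_type="Iter",       # overfitting detector (point 3)
    od_wait=50,           # stop after 50 rounds without improvement
    use_best_model=True,
    verbose=100,
)
model.fit(Pool(X_tr, y_tr), eval_set=Pool(X_va, y_va))
```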

 
Aleksey Vyazmikin:

Thanks

 

Maybe someone will find this interesting:

There is a new book on time series forecasting in R, including examples of Bitcoin forecasting.

https://ranalytics.github.io/tsa-with-r/

 
Aleksey Vyazmikin:

Yeah, you have to dig through the code to understand the depth of the idea. But it is interesting how they assign weights to the leaves, taking the already existing ones into account.

By definition:
The idea of gradient boosting is to build an ensemble of elementary models that sequentially refine each other. The n-th elementary model is trained on the "errors" of the ensemble of the previous n-1 models, and the models' answers are combined with weights. "Errors" is in quotes here because each successive model in fact approximates the antigradient of the loss function, which is not necessarily equal to the difference between the actual and predicted values (i.e., the error in the literal sense).
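A bare-bones illustration of that definition, assuming squared-error loss (where the antigradient happens to equal the plain residual) and scikit-learn stumps as the elementary models; this is generic gradient boosting, not CatBoost's actual code:

```python
# Minimal gradient boosting sketch on synthetic data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 500)

lr = 0.1                        # weight of each model's answer
F = np.full_like(y, y.mean())   # model 0: a constant
ensemble = []

for n in range(100):
    antigrad = y - F            # -dL/dF for L = (y - F)^2 / 2
    tree = DecisionTreeRegressor(max_depth=2).fit(X, antigrad)
    F += lr * tree.predict(X)   # the n-th model refines the first n-1
    ensemble.append(tree)

print("MSE:", np.mean((y - F) ** 2))
```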

It seems that the weights are determined as usual, by probability.
But the split is apparently not simply the locally best one; it is the one that improves the overall result. This is just a guess, though. It is impossible to read through the code, there are kilometers of listings there. It's not the 4,000 lines of ALGLIB.
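For reference, the standard second-order leaf weight from the XGBoost paper, which is the usual formal answer to "weights that take the existing ensemble into account"; CatBoost's scheme is related but not identical, so treat this as a reference point rather than its exact formula:

```latex
% g_i and h_i are the first and second derivatives of the loss at the
% current ensemble prediction, so each new leaf value depends on the
% already existing trees; \lambda is L2 regularization
% (cf. CatBoost's l2_leaf_reg).
w_j = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```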

mytarmailS:

Why CatBoost? What does it have that its analogues don't?

I agree with Aleksey. I have some experience with XGBoost, so it will be possible to compare them in practice.
 
elibrarius:

I was just asking. I see how you are struggling with those trees from CatBoost; there are problems with exporting them, workarounds...

I have gotten into the subject of "rule induction", and I see that R has many packages for generating rules or rule ensembles...


1) rules are easy to export, a single line each;

2) rules are easy for a human to read;

3) there are many types of rule generation, from trivial to genetic;

4) the prediction quality is on a par with everything else.


So I think maybe you shouldn't bother with this CatBoost and should pick up something more pleasant instead, or something like that.
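The packages mentioned are R ones; as a rough Python stand-in for the readability point, a shallow decision tree can be dumped as human-readable if/else rules with scikit-learn's export_text (this illustrates the idea, not any particular rule-induction package):

```python
# Print a fitted tree as readable rules; each root-to-leaf path
# below is one rule.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

print(export_text(tree, feature_names=list(data.feature_names)))
```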

 
mytarmailS:

Max! Remind me again what these models are called...

1) Model 1 is trained.

2) Model 2 is trained on model 1's predictions on the test data, and so on...

Stacking?

Meta-labeling, de Prado.
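A minimal sketch of meta-labeling in the spirit of de Prado's Advances in Financial Machine Learning, with synthetic data and arbitrary models: the primary model picks the side, and a secondary model learns whether the primary call was correct, which is then used to filter or size the signals.

```python
# Hedged meta-labeling sketch; everything here is illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

primary = LogisticRegression().fit(X_tr, y_tr)

# Meta-label on held-out data: 1 if the primary call was correct.
side = primary.predict(X_te)
meta_y = (side == y_te).astype(int)

# The meta-model sees the features plus the primary model's side.
meta_X = np.column_stack([X_te, side])
meta = RandomForestClassifier(n_estimators=200, random_state=0)
meta.fit(meta_X, meta_y)

# Act on the primary signal only when the meta-model is confident
# it will be correct (the threshold is arbitrary here).
conf = meta.predict_proba(meta_X)[:, 1]
take = conf > 0.6
print(f"signals taken: {take.mean():.0%}")
```

In practice the meta-model would be trained and evaluated on separate folds; this sketch reuses one split for brevity.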