Machine learning in trading: theory, models, practice and algo-trading - page 3480

 
СанСаныч Фоменко #:

That's exactly what I'm clarifying.

For you, who write in R, this is natural, but it doesn't even occur to others that training is carried out on randomly selected rows.

Hmm, maybe it's contagious....

Training is done according to the algorithm - there are variants with random subsampling on each iteration, so to speak, and there are variants without it.

However, I was originally clarifying fxsaber's terminology, which comes from algo-trading in the broad sense; therefore clarifications from both of you are out of place.
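To illustrate the subsampling point: in CatBoost, for example, per-iteration row subsampling can be switched on or off via the bootstrap settings (a minimal sketch; the parameter values are purely illustrative):

```python
# Minimal sketch: the same CatBoost model with and without per-iteration
# row subsampling. Parameter values here are illustrative only.
from catboost import CatBoostClassifier

# Variant with random subsampling of rows on every boosting iteration.
with_subsampling = CatBoostClassifier(
    iterations=500,
    bootstrap_type="Bernoulli",  # draw a random subset of rows each iteration
    subsample=0.66,              # fraction of rows used per iteration
    random_seed=42,
    verbose=False,
)

# Variant without row subsampling - every iteration sees all rows.
without_subsampling = CatBoostClassifier(
    iterations=500,
    bootstrap_type="No",
    random_seed=42,
    verbose=False,
)
```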

 
Aleksey Vyazmikin #:

Hmm, maybe it's contagious.....

Training is done according to the algorithm - there are variants with random subsampling on each iteration, so to speak, and there are variants without it.

However, I was originally clarifying fxsaber's terminology, which comes from algo-trading in the broad sense; therefore clarifications from both of you are out of place.

It is inappropriate to have a lot of terminologies for one concept (everyone has his own).

There are: train, validation, test.

fxsaber uses train and validation, but his validation is called OOS and effectively means test.

Well, go ahead, dig into it yourself if you like to talk about nothing....
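For reference, a minimal sketch of those three samples, assuming a time-ordered pandas DataFrame df (the name and the split fractions are illustrative):

```python
# Minimal sketch of the train / validation / test split referred to above.
# `df` and the split fractions are illustrative; the split is sequential
# because shuffling rows is usually undesirable for market data.
import pandas as pd

def three_way_split(df: pd.DataFrame, train_frac: float = 0.6, valid_frac: float = 0.2):
    n = len(df)
    i_train = int(n * train_frac)
    i_valid = int(n * (train_frac + valid_frac))
    train = df.iloc[:i_train]           # the model is fitted here
    valid = df.iloc[i_train:i_valid]    # tuning / early stopping ("OOS" in fxsaber's terms)
    test  = df.iloc[i_valid:]           # touched once, for the final estimate
    return train, valid, test
```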

 
mytarmailS #:

fxsaber uses train and validation, but his validation is called OOS and effectively means test.

Did he write that he is doing validation on the second segment? He is simply evaluating the results there. And that is exactly what I am asking about - what the chances are there.

Besides, it is not hard for me to switch to other terms, if I am familiar with them, for the sake of a dialogue with the other person, rather than pretending that I don't understand what he is talking about.

Or do you think I should write "I'm sorry, but you have two samples, not three - so you're doing it wrong - forget about coming here"?

 
Even before our era, Aristotle said: "...To have more than one meaning is to have no meaning at all; if words have no (definite) meanings, then all possibility of reasoning with each other, and indeed with oneself, is lost, for it is impossible to think anything if one does not think one thing each time."
 
mytarmailS #:
Even before our era, Aristotle said: "...To have more than one meaning is to have no meaning at all; if words have no (definite) meanings, then all possibility of reasoning with each other, and indeed with oneself, is lost, for it is impossible to think anything if one does not think one thing each time."

I don't even know if there were already synonyms in his native language back then.....

 
A question has been bothering me for a long time: what happens if you change the numeric values of the predictors - say, swap 1 and 10 - will that change the result of training, with the same CatBoost?
 
Aleksey Vyazmikin #:
...will that change the result of training, with the same CatBoost?
If they are categorical, it will not change - with randomisation switched off (a fixed seed, though there may be something else random in CatBoost...).
If they are numeric and there are values from 2 to 9 in between, then after sorting the old 1 will end up after 9 and the old 10 before 2. The evaluation of the splits will become different.
 
Forester #:
If they are categorical, it will not change - with randomisation switched off (a fixed seed, though there may be something else random in CatBoost...).
If they are numeric and there are values from 2 to 9 in between, then after sorting the old 1 will end up after 9 and the old 10 before 2. The evaluation of the splits will become different.

Suppose the algorithm doesn't work with categorical predictors. One hypothesis was that if the algorithm is strong, it should find splits similar to those it found before the permutation. In fact, training turns out quite different. This means that merely transforming the scale so that the order of the values changes will change the training result.
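A sketch of that check on synthetic data (the dataset and parameters here are made up for illustration, not the actual data from the experiment):

```python
# Sketch of the permutation experiment: swap the values 1 and 10 in one numeric
# feature and retrain with a fixed seed. Synthetic data, illustrative only.
import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.integers(1, 11, size=(5000, 5)).astype(float)        # integer values 1..10
y = (X[:, 0] + rng.normal(0, 2, size=5000) > 5).astype(int)  # target tied to feature 0

def train_logloss(features, target):
    model = CatBoostClassifier(iterations=200, random_seed=42, verbose=False)
    model.fit(features, target)
    return model.get_best_score()["learn"]["Logloss"]

# Swap 1 <-> 10 in the first feature only: the scale changes, the order breaks.
X_swapped = X.copy()
ones = X_swapped[:, 0] == 1
tens = X_swapped[:, 0] == 10
X_swapped[ones, 0] = 10
X_swapped[tens, 0] = 1

print("original:", train_logloss(X, y))
print("swapped :", train_logloss(X_swapped, y))   # different split grid -> different trees
```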

With that in mind, I ordered the values by their probability shift after quantisation. Training became about 7 times faster - only 60 trees instead of 400 - but the financial result on the other two samples became much worse. It turns out that, because of the chaos in the class-membership probability distribution, the randomisation makes learning slightly better.
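Roughly, ordering the values by probability shift after quantisation could look something like this (one possible reading of the procedure, with illustrative names, not the exact code that was used):

```python
# One possible reading of "ordering quantised values by probability shift":
# bin a numeric feature, then relabel the bins so that the new integer codes
# follow the class-1 rate observed inside each bin.
import numpy as np
import pandas as pd

def reorder_by_target_rate(x: np.ndarray, y: np.ndarray, n_bins: int = 16) -> np.ndarray:
    bins = pd.qcut(x, q=n_bins, labels=False, duplicates="drop")  # quantisation
    rate = pd.Series(y).groupby(bins).mean()                      # P(class = 1) per bin
    order = rate.sort_values().index                              # bins sorted by that rate
    remap = {old: new for new, old in enumerate(order)}           # monotone new codes
    return np.array([remap[b] for b in bins])
```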

 

If you look at the average logloss of 100 models:

train was 0.518 became 0.448

test was 0.543 became 0.555

exam was 0.560 became 0.570

I.e. for the 2nd and 3rd samples the result is comparable, but the first sample has faster learning/generalisation after transformation.

 
Aleksey Vyazmikin #:

If you look at the average of 100 models:

train was 0.518 became 0.448

test was 0.543 became 0.555

exam was 0.560 became 0.570

I.e. the result is comparable for the 2nd and 3rd samples, but the first sample has faster learning/generalisation after transformation.

0.57 / 0.448 = 1.2723, i.e. a 27% difference. Such a model can be discarded.
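As a sketch, that discard rule applied to the figures above (the 25% threshold is an assumption taken from the 27% remark, not a fixed rule):

```python
# Sketch of the discard rule: compare exam-sample logloss with train logloss
# and reject the model if the gap is too large. The 25% threshold is an
# assumption for illustration, not a fixed rule from the thread.
def overfit_ratio(train_logloss: float, exam_logloss: float) -> float:
    return exam_logloss / train_logloss

ratio = overfit_ratio(0.448, 0.570)
print(f"ratio = {ratio:.4f}")       # 1.2723 -> about 27% worse on exam
if ratio > 1.25:
    print("model can be discarded")
```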

Reason: