Machine learning in trading: theory, models, practice and algo-trading - page 2944

 
Stanislav Korotky #:

Please explain how the following formula in the tree-based classification algorithm is derived (a link to a PDF would do):


In all the material I could find on the Internet, the formula just magically appears "out of thin air".

If the sum is over classes, the denominator is the Gini index, i.e. the node impurity. The smaller it is, the better. The numerator is the number of rows in the leaf.

The larger the criterion, the better: the classes are separated more cleanly, but without splitting the leaves too finely.

The Gini index seems to be chosen because it is considered more sensitive than the classification error rate.
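A minimal sketch of what "more sensitive" can mean here, using a common textbook-style toy example rather than anything from the thread: two candidate splits of an 800-row node with 400 rows per class. The function names `gini`, `misclassification` and `weighted` are made up for this illustration.

```python
def gini(counts):
    """Gini impurity of a node given per-class row counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def misclassification(counts):
    """Classification error rate of a node that predicts its majority class."""
    n = sum(counts)
    return 1.0 - max(counts) / n

def weighted(measure, leaves):
    """Average of a node measure over leaves, weighted by leaf size."""
    total = sum(sum(leaf) for leaf in leaves)
    return sum(sum(leaf) / total * measure(leaf) for leaf in leaves)

# Two hypothetical splits of a parent node with class counts (400, 400).
split_a = [(300, 100), (100, 300)]  # both leaves are mixed
split_b = [(200, 400), (200, 0)]    # second leaf is pure

for name, split in (("A", split_a), ("B", split_b)):
    print(name,
          "gini =", round(weighted(gini, split), 4),
          "error =", round(weighted(misclassification, split), 4))
```

Both splits have the same weighted error rate (0.25), so the error rate cannot tell them apart, while the weighted Gini index is lower for split B (about 0.333 versus 0.375), which produces a pure leaf.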

 
Aleksey Nikolayev #:

If the sum is over classes, the denominator is the Gini index, i.e. the node impurity. The smaller it is, the better. The numerator is the number of rows in the leaf.

The larger the criterion, the better: the classes are separated more cleanly, but without splitting the leaves too finely.

The Gini index seems to be chosen because it is considered more sensitive than the classification error rate.

No, the sum is over the records that fall into the node. The question is not about the measure of informativeness. It is about passing "residuals" from tree to tree: there is a constant conversion from probability to logit and back.
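The formula from the question is not reproduced in the thread. Assuming it is the usual leaf value for gradient boosting with log-loss (a ratio of two sums over the records in the node), one way to see where it comes from is a single Newton step on the loss written in terms of log-odds. The symbols F_i, p_i, g_i, h_i and gamma are notation chosen for this sketch, not taken from the thread.

```latex
% F_i: current log-odds prediction for record i,
% p_i = 1 / (1 + e^{-F_i}), y_i \in \{0, 1\}
\begin{align*}
L(y_i, F_i) &= -\bigl[y_i \ln p_i + (1 - y_i)\ln(1 - p_i)\bigr]
             = -y_i F_i + \ln\bigl(1 + e^{F_i}\bigr) \\
g_i &= \frac{\partial L}{\partial F_i} = p_i - y_i, \qquad
h_i = \frac{\partial^2 L}{\partial F_i^2} = p_i (1 - p_i) \\
\sum_{i \in \text{leaf}} L(y_i, F_i + \gamma)
  &\approx \sum_{i \in \text{leaf}}
     \Bigl[L(y_i, F_i) + g_i \gamma + \tfrac{1}{2} h_i \gamma^2\Bigr] \\
\gamma &= -\frac{\sum_{i \in \text{leaf}} g_i}{\sum_{i \in \text{leaf}} h_i}
        = \frac{\sum_{i \in \text{leaf}} (y_i - p_i)}
               {\sum_{i \in \text{leaf}} p_i (1 - p_i)}
\end{align*}
```

Under that assumption the numerator is the sum of the "residuals" y_i - p_i and the denominator is the sum of p_i (1 - p_i), both taken over the records in the leaf; the resulting gamma lives on the log-odds scale, not the probability scale.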

 
Stanislav Korotky #:

No, the sum is over the records that fall into the node. The question is not about the measure of informativeness. It is about passing "residuals" from tree to tree: there is a constant conversion from probability to logit and back.

But how can a frequency be computed for a single record at all? For a class it is clear how.

 
Stanislav Korotky #:

No, the sum is over the records that fall into the node. The question is not about the measure of informativeness. It is about passing "residuals" from tree to tree: there is a constant conversion from probability to logit and back.

Or is this about classification by logistic regression? Either way, a formula taken out of context is not enough; the whole text is needed.

 
Aleksey Nikolayev #:

Or is this about classification by logistic regression? Either way, a formula taken out of context is not enough; the whole text is needed.

The logit function in the sense of ln(odds). It is needed to map the range of probability values [0,1] onto the interval from minus to plus infinity; otherwise gradient-based training is not possible.
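A small sketch of that mapping, assuming the standard definitions logit(p) = ln(p / (1 - p)) and its inverse, the sigmoid. The function names are chosen for this illustration.

```python
import math

def logit(p):
    """ln(odds): maps a probability strictly between 0 and 1 to a real number."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps a real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Round trip: probability -> log-odds -> probability.
for p in (0.01, 0.5, 0.7, 0.99):
    z = logit(p)
    print(f"p={p:.2f} -> logit={z:+.3f} -> back to p={sigmoid(z):.2f}")
```

The endpoints p = 0 and p = 1 correspond to minus and plus infinity, so this plain version raises an error for them; practical implementations usually clip p slightly away from 0 and 1.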

For example, here is the text - https://medium.com/swlh/gradient-boosting-trees-for-classification-a-beginners-guide-596b594a14ea

And here is the video - https://www.youtube.com/watch?v=hjxgoUJ_va8.

PS. IMHO, both the article and the video contain errors.
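The "probability to logit and back" bookkeeping can be sketched for a single leaf as follows. This assumes binary classification with log-loss and the usual leaf value (sum of residuals divided by the sum of p * (1 - p)); it is not taken from the linked article or video, and `leaf_value`, `boosting_step` and the learning rate of 0.1 are hypothetical choices for this illustration.

```python
import math

def sigmoid(z):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def leaf_value(y, p):
    """Leaf output on the log-odds scale: sum of residuals over sum of p*(1-p)."""
    numerator = sum(yi - pi for yi, pi in zip(y, p))
    denominator = sum(pi * (1.0 - pi) for pi in p)
    return numerator / denominator

def boosting_step(y, log_odds, learning_rate=0.1):
    """One update for the records that fall into a single leaf."""
    p = [sigmoid(f) for f in log_odds]           # log-odds -> probability
    gamma = leaf_value(y, p)                     # computed from the residuals y - p
    new_log_odds = [f + learning_rate * gamma for f in log_odds]
    new_p = [sigmoid(f) for f in new_log_odds]   # back to probability
    return gamma, new_log_odds, new_p

# Three records in one leaf, all starting from log-odds 0 (probability 0.5).
y = [1, 1, 0]
gamma, log_odds, probs = boosting_step(y, [0.0, 0.0, 0.0])
print("gamma =", gamma)
print("updated probabilities =", probs)
```

The residuals are computed on the probability scale, while the leaf value is added on the log-odds scale, which is why every round converts back and forth.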
 
Aleksey Nikolayev #:

If the sum is over classes, the denominator is the Gini index, i.e. the node impurity. The smaller it is, the better. The numerator is the number of rows in the leaf.

The larger the criterion, the better: the classes are separated more cleanly, but without splitting the leaves too finely.

The Gini index seems to be chosen because it is considered more sensitive than the classification error rate.

Oh!
Finally someone knows about the Gini index... Back in '18 I was looking for code for it: https://www.mql5.com/ru/blogs/post/723619
 
Stanislav Korotky #:

The logit function in the sense of ln(odds). It is needed to map the range of probability values [0,1] onto the interval from minus to plus infinity; otherwise gradient-based training is not possible.

Yes, it is used in logistic regression, where you look for the probability of belonging to a class (or rather the logit of that probability).

It seems the author wants to explain the internals of boosting in a popular way, but picked too complicated a version of the problem. The article mixes logistic regression, trees and boosting, none of which is easy to understand on its own. The essence of boosting cannot be laid out logically without functional analysis. To understand the essence of logistic regression you need probability theory (the binomial distribution, probably).

 
Forester #:
Oh!
Finally someone knows about the Gini index... Back in '18 I was looking for code for it: https://www.mql5.com/ru/blogs/post/723619

There is also the Gini coefficient. It is used in ML too, but that is a different thing.)

 
Stanislav Korotky #:

Please explain how the following formula in the tree-based classification algorithm with boosting is derived (a link to a PDF would do):


In all the material I could find on the Internet, the formula just magically appears "out of thin air".

Where did you get the formula from? Judging by the "out of thin air", it is the usual home-brew amateurism, most likely Soviet.

You need to use professional maths, for which there are well-established algorithms.

R has a huge number of tree-based models, and what sets R, a professional language, apart from very many others is the mandatory references to the authors of the algorithm and the corresponding publication. At a quick glance, I cannot recall any more or less complex function from R packages that lacks such references.


Forget about everything but R. Today it is the only professional environment for statistical calculations.

 
I love R, for me it's the best language in the world, but Sanych's constant advertising of it in every single post really makes me sick.