Machine learning in trading: theory, models, practice and algo-trading - page 2944
Please explain how the following formula is derived in the tree-based classification algorithm (a link to a PDF is fine):
In all the materials I could find on the Internet, the formula is just magically pulled out of thin air.
If you sum over the classes, the denominator is the Gini index, i.e. node purity. The smaller it is, the better. The numerator is the number of rows in the leaf.
The bigger the criterion, the better: the classes are separated more cleanly, but without excessive splitting into tiny leaves.
The Gini index seems to be chosen because it is considered more sensitive than the classification error rate.
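The node purity and split criterion described above can be sketched as follows (a minimal Python illustration, not taken from any of the linked materials; the function names are my own):

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 1 - sum(p_k^2) over class frequencies."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gain(left, right):
    """Weighted impurity decrease used to rank candidate splits:
    parent impurity minus the size-weighted impurity of the children."""
    n = len(left) + len(right)
    parent = gini(left + right)
    child = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return parent - child

# A pure node has impurity 0; a 50/50 binary node has impurity 0.5.
print(gini([1, 1, 1, 1]))          # 0.0
print(gini([0, 0, 1, 1]))          # 0.5
# A perfect split of a 50/50 node recovers the full 0.5 of impurity.
print(split_gain([0, 0], [1, 1]))  # 0.5
```

The "without excessive splitting" part is handled in real libraries by extra constraints (minimum leaf size, maximum depth), not by the impurity measure itself.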
No, you sum over the records that land in the node. The question is not about the informativeness measure; it is about transferring the "residuals" between trees - there is a constant recalculation from probability to logit and back again.
And how can a frequency be computed for an individual record at all? For a class it is clear how.
Or is this about classification by logistic regression? Either way, a formula plucked out of context is not enough; you need the whole text.
The logit function in the sense of ln(odds). It is needed to map the probability range [0, 1] onto the whole real line, from minus to plus infinity - otherwise you cannot train by gradient descent.
For example, here is the text - https://medium.com/swlh/gradient-boosting-trees-for-classification-a-beginners-guide-596b594a14ea
And here is the video - https://www.youtube.com/watch?v=hjxgoUJ_va8.
PS. IMHO, both of them contain errors in the material.
Finally someone knows about the Gini index... I wrote code for it back in 2018: https://www.mql5.com/ru/blogs/post/723619
Yes, it is used in logistic regression, where you are looking for the probability (via the logit function) of belonging to a class.
It seems the author wants to present the internals of boosting in an accessible way, but has picked too complicated a variant of the problem. He mixes logistic regression, trees and boosting, none of which is easy to understand on its own. The essence of boosting cannot be stated rigorously without functional analysis, and to understand the essence of logistic regression you need probability theory (the binomial distribution, probably).
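For reference, here is a minimal sketch of the probability-to-logit round trip in one boosting step for log-loss, in the spirit of the linked Medium article. The leaf-output formula sum(residuals) / sum(p * (1 - p)) is the standard one for this loss; the toy data, learning rate and hypothetical leaf assignment are my own:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy binary targets; the initial prediction is the log of the overall odds.
y = [1, 1, 1, 0]
p0 = sum(y) / len(y)                     # 0.75
f = [math.log(p0 / (1 - p0))] * len(y)   # initial log-odds for every row

# One boosting step: residuals are computed in probability space,
# but the model itself is updated in log-odds space.
probs = [sigmoid(fi) for fi in f]
residuals = [yi - pi for yi, pi in zip(y, probs)]

def leaf_value(idx):
    """Output of a leaf holding rows `idx` under log-loss:
    sum(residuals) / sum(p * (1 - p))."""
    num = sum(residuals[i] for i in idx)
    den = sum(probs[i] * (1 - probs[i]) for i in idx)
    return num / den

# Suppose a stump puts rows 0-2 in one leaf and row 3 in another:
lr = 0.1
for leaf in ([0, 1, 2], [3]):
    v = leaf_value(leaf)
    for i in leaf:
        f[i] += lr * v   # update log-odds; convert back via sigmoid next round
```

After the step, sigmoid(f[i]) has moved toward each row's target, which is exactly the "recalculation from probability to logit and back" mentioned above.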
Oh!
There is also the Gini coefficient. It is also used in machine learning, but that is a different thing.)
Please explain how the following formula is derived in the algorithm of classification on trees with boosting (a link to a PDF is fine):
In all the materials I could find on the Internet, the formula is just magically pulled out of thin air.
Where did you get the formula from? Judging by the "out of thin air" wording, it is the usual amateurish source, most likely Soviet-era.
You need to use professional mathematics, for which there are well-established algorithms.
R has a huge number of tree-based models, and what distinguishes professional R from very many other languages is the obligatory references to the authors of an algorithm and the corresponding publication. Off the top of my head, I cannot recall a single more or less complex function in the R packages that lacks such references.
Forget about everything but R. Today it is the only professional environment for statistical calculations.