Forest is better suited for time series analysis, but will it handle 100,000 samples? - General

Forester 2018.12.26 06:22 #12351

Dear foresters. Is it necessary to do class balancing for trees and forests (equalize the number of examples of different classes)?

Дмитрий 2018.12.26 06:33 #12352

elibrarius:
Dear Foresters. Is it necessary to do a balancing of classes for trees and forests (equalize the number of examples of different classes)?

No

Forester 2018.12.26 07:37 #12353

Dimitri:

No

I'm reading here: Flach P. - Machine Learning. The Science and Art of Building Algorithms that Extract Knowledge from Data - 2015

There are several pages devoted to this topic there. Here's the bottom line:

The passage marked as point 1 says that balancing is useful.

But there is also point 2, from which we can conclude that with a large sample, once there are enough examples of the small class, the sample becomes representative of that class. Then balancing is not necessary.
How many examples can be considered representative for a time series?

And then there's point 3. But it is hard to know whether the particular tree implementation in the program you choose applies such a correction.
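
For illustration only, here is a minimal Python sketch of one common form such a correction can take: per-class weights inversely proportional to class frequency. The function name `balanced_class_weights` and the 8:2 labels are my own assumptions, not taken from the book, and whether a given tree or forest implementation accepts such weights has to be checked in its documentation.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes get a weight above 1, frequent classes a weight below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy sample: 8 examples of class 0 and 2 examples of class 1.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)
```

With weights like these, each class contributes the same total weight to the training criterion, which is an alternative to physically duplicating or discarding examples.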

Дмитрий 2018.12.26 07:47 #12354

elibrarius:

I'm reading here: Flach P. - Machine Learning. The Science and Art of Building Algorithms that Extract Knowledge from Data - 2015

There are several pages devoted to this topic there. Here's the bottom line:

The passage marked as point 1 says that balancing is useful.

But there is also point 2, from which we can conclude that with a large sample, once there are enough examples of the small class, the sample becomes representative of that class. Then balancing is not necessary.

And then there's point 3. But it is hard to know whether the particular tree implementation in the program you choose applies such a correction.

In my opinion, the author is restating the law of large numbers for ML.

Clearly, if you have 10 observations in the first class and 6 in the second, adding 4 to the second will change the model (not necessarily improve it), but the sample will still not be representative.

Forester 2018.12.26 07:52 #12355

Dimitri:

In my opinion, the author is restating the law of large numbers for ML.

Clearly, if you have 10 observations in the first class and 6 in the second, adding 4 to the second will change the model (not necessarily improve it), but the sample will still not be representative.

No, not large numbers: he explained it on a small sample of 10 examples, 8:2 vs. 6:4. But we have a lot of data.

How many examples can be considered representative for a time series? I usually use no fewer than 10,000, and the small class should have at least 1,000.
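
As a rough sketch, the rule of thumb above can be written as a check in Python. The thresholds 10,000 and 1,000 are the ones from this post; the function name `looks_representative` is made up for the example, and passing the check does not by itself prove that the sample is representative.

```python
from collections import Counter


def looks_representative(labels, min_total=10_000, min_minority=1_000):
    """Rule-of-thumb check: enough examples overall and in the smallest class."""
    counts = Counter(labels)
    if not counts:
        return False  # an empty sample is never representative
    return len(labels) >= min_total and min(counts.values()) >= min_minority


# Hypothetical samples: 9000:1000 passes, 9500:500 fails on the small class.
ok_sample = [0] * 9000 + [1] * 1000
small_minority = [0] * 9500 + [1] * 500
print(looks_representative(ok_sample), looks_representative(small_minority))
```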

Дмитрий 2018.12.26 07:56 #12356

elibrarius:
Yes, he was just looking at a small example of 10 observations: 8:2 vs. 6:4. But we have a lot of data.

How many examples can be considered representative for a time series?

No idea. I took the maximum available, but I was working with daily data for trees and forests, at least 2 years of it.

Ask A_K - he determined the optimum through Chebyshev's inequality (if I remember correctly), but that only works for continuous variables.

Try starting from the number of variables: at least 100 observations for each.

In general, if you are trying to find a "perpetual" pattern, the more data the better. If the "pattern" is floating, you have to look for the optimal window.
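
I don't know exactly how A_K did it, so the following is only a sketch of the textbook Chebyshev bound for a sample mean, plus the "100 per variable" rule from above. Assumptions: independent observations with finite variance `sigma**2` (which a time series generally violates), and all function and parameter names are hypothetical. From P(|mean - mu| >= eps) <= sigma^2 / (n * eps^2), requiring the right-hand side to be at most `delta` gives n >= sigma^2 / (eps^2 * delta).

```python
import math


def chebyshev_min_samples(sigma, eps, delta):
    """Smallest n with sigma^2 / (n * eps^2) <= delta (Chebyshev bound for the mean)."""
    return math.ceil(sigma**2 / (eps**2 * delta))


def min_samples_per_variable(n_variables, per_variable=100):
    """Rule of thumb from the post: at least 100 observations per variable."""
    return n_variables * per_variable


# Hypothetical inputs: standard deviation 2.0, tolerance 0.5, failure probability 0.25.
print(chebyshev_min_samples(2.0, 0.5, 0.25))
# Hypothetical model with 20 input variables.
print(min_samples_per_variable(20))
```

The Chebyshev bound is very conservative, so it is better treated as an upper estimate of the required sample size than as an optimum.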

Forester 2018.12.26 08:00 #12357

elibrarius:
No, not large numbers: he explained it on a small sample of 10 examples, 8:2 vs. 6:4. But we have a lot of data.

How many examples can be considered representative for a time series? I usually use no fewer than 10,000, and the small class should have at least 1,000.

Although even if we add examples by the thousands, the model may still change.

And maybe that's right. The market, as they say, changes, so let the model change too.

Дмитрий 2018.12.26 08:02 #12358

elibrarius:
Although even if we add examples by the thousands, the model may still change.

And what do you use the forest for?

Forester 2018.12.26 08:04 #12359

Dimitri:

And what do you use the forest for?

For time series analysis, in order to make money.
I don't use it yet, but I'm getting ready to. For now I'm reading the theory to understand its pros and cons. I am not satisfied with the results, so I decided to work with forests. It seems to me they are better suited for time series.

Дмитрий 2018.12.26 08:07 #12360

elibrarius:
For time series analysis, in order to make money.
I don't use it yet, but I'm getting ready to. For now I'm reading the theory to understand its pros and cons. I am not satisfied with the results, so I decided to work with forests. It seems to me they are better suited for time series.

Two years ago I wrote to Maximka here that a neural network is a toy like a nuclear bomb. If ANY other model gives at least satisfactory results, using neural networks is not recommended: they find things that do not exist, and there is nothing you can do about it.

Trees are a good thing, but it's better to use forests.

Machine learning in trading: theory, models, practice and algo-trading - page 1236