Machine learning in trading: theory, models, practice and algo-trading - page 2031

 
mytarmailS:

We could try to express the target in a more complex way, as 4 parameters at once.


Suppose we decide to buy...

and then the network doesn't just tell us to buy or sell;

it tells us:

at what price to buy, at what price to close, in how much time to buy, and in how much time to close.

You can also add a stop loss.

Such precise and far-reaching forecasts seem to me hard to learn, though...

As for take-profits, I think it should be classification with different variants of profit-fixing points, and the model should choose the most probable and most profitable one. It looks like ZZ (ZigZag), but the model should work on every bar starting from some point, and that point will not appear on every price trajectory after a position is opened.

A good entry point is one where the minimum loss can be expected, i.e. it is important to know right away the exit point suitable for setting the SL. If the SL is tied to some level indicator, entry points are fairly easy to find and filter out: they are all alike, so training should go better.

So the question is how to find such points...
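To make the four-parameter target discussed above concrete, here is a minimal sketch of how such labels could be built from a price series. The labeling rule (future minimum as entry, the subsequent maximum as exit) and all names are illustrative assumptions, not anyone's actual pipeline:

```python
import numpy as np
import pandas as pd

def make_labels(close: pd.Series, horizon: int = 50) -> pd.DataFrame:
    """For each bar, label: (entry price, exit price, bars to entry, bars to exit).

    Illustrative rule: enter at the cheapest price within the look-ahead
    window, exit at the best price seen after that entry.
    """
    rows = []
    for i in range(len(close) - horizon):
        window = close.iloc[i + 1:i + 1 + horizon].values
        t_entry = int(window.argmin())                    # bars until cheapest price
        t_exit = t_entry + int(window[t_entry:].argmax()) # bars until best exit
        rows.append((window[t_entry],   # at what price to buy
                     window[t_exit],    # at what price to close
                     t_entry + 1,       # in how much time to buy
                     t_exit + 1))       # in how much time to close
    cols = ['entry_price', 'exit_price', 'bars_to_entry', 'bars_to_exit']
    return pd.DataFrame(rows, columns=cols)

close = pd.Series(np.cumsum(np.random.randn(300)) + 100.0)  # synthetic prices
print(make_labels(close).head())
```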

 
Aleksey Vyazmikin:

I wish you success :)

Do you need regression for this? I don't have much experience with such models.

I am familiar with the concept - there are people who do this - the question is how to create the strategies in the engine itself...

Then should the target be made for classification? I'll leave the first part of the table, the one about entry, SL and TP, plus the last column, +-1, as the trade result. Probably I shouldn't feed in information about the exit - the model could peek.

What engine are we talking about? A home-grown brute-force or genetic one, for starters.

 
Rorschach:

Then should the target be made for classification? I'll leave the first part of the table, the one about entry, SL and TP, plus the last column, +-1, as the trade result. Probably I shouldn't feed in information about the exit - the model could peek.

What engine are we talking about? A home-grown brute-force or genetic one, for starters.

Regression can be done, a model can be built - for now this is only research, as I understand it. But quality assessment is more complicated there: you would need to assess the deviation from the plan, and I don't know whether the deviation can be assessed as a vector right away or only by its absolute value - I haven't looked into it.

As for the engine - one that will sensibly take the right data so as not to generate obviously meaningless trading conditions, i.e. the strategy-generation process itself; after that we can think about genetics or something else to train the model further.
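A minimal sketch of the two ways of assessing the deviation from the plan mentioned above - as a signed vector versus by absolute value - assuming both the planned and the realized trade are given as (entry price, exit price, bars to entry, bars to exit); all numbers are made up:

```python
import numpy as np

# Hypothetical planned vs. realized trade:
# (entry_price, exit_price, bars_to_entry, bars_to_exit)
plan = np.array([1.1050, 1.1120, 3.0, 15.0])
fact = np.array([1.1056, 1.1101, 4.0, 12.0])

deviation_vector = fact - plan               # signed, per-component deviation
deviation_modulo = np.abs(fact - plan)       # deviation by absolute value
total_error = np.linalg.norm(fact - plan)    # collapsed into one scalar (L2 norm)

print(deviation_vector, deviation_modulo, total_error)
```

Note that prices and times live in different units, so the components would have to be rescaled to comparable scales before collapsing them into a single score.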

 
Aleksey Vyazmikin:

Regression can be done, a model can be built - for now this is only research, as I understand it. But quality assessment is more complicated there: you would need to assess the deviation from the plan, and I don't know whether the deviation can be assessed as a vector right away or only by its absolute value - I haven't looked into it.

As for the engine - one that will sensibly take the right data so as not to generate obviously meaningless trading conditions, i.e. the strategy-generation process itself; after that we can think about genetics or something else to train the model further.

Actually, it would be interesting to look at the clustering: how things get grouped, whether any logic emerges.

We can start with martingale, anti-martingale and position reversal. And then if-else: if a trade closed at a loss, the next one is opened with a doubled lot, or simply in the opposite direction, or both. It's hard to think of anything more complicated from scratch.
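A minimal sketch of that if-else logic, with all names and values purely illustrative:

```python
# Martingale / reversal rules described above: after a losing trade,
# double the lot, flip the direction, or both. Purely illustrative.

def next_trade(prev_direction: int, prev_pnl: float, prev_lot: float,
               double_lot: bool = True, reverse: bool = True):
    """Derive the next trade's direction and lot from the previous outcome."""
    direction, lot = prev_direction, prev_lot
    if prev_pnl < 0:                        # previous trade closed at a loss
        if double_lot:
            lot = prev_lot * 2              # martingale: double the lot
        if reverse:
            direction = -prev_direction     # reversal: flip long <-> short
    # an anti-martingale would instead double the lot after a winning trade
    return direction, lot

# A losing long of 0.1 lots -> next trade: short, 0.2 lots.
print(next_trade(prev_direction=+1, prev_pnl=-12.5, prev_lot=0.1))
```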

 
Rorschach:

Actually, it would be interesting to look at the clustering: how things get grouped, whether any logic emerges.

We can start with martingale, anti-martingale and position reversal. And then if-else: if a trade closed at a loss, the next one is opened with a doubled lot, or simply in the opposite direction, or both. It's hard to think of anything more complicated from scratch.

I can provide resources; I can't do more than that yet.

 
Aleksey Vyazmikin:

I can provide resources; I can't do more than that yet.

Does CatBoost have feature_importances and a way to look at clusters, like in forests?

Will your machine digest a 14 × 180,000,000 table?

 
Rorschach:

Does CatBoost have feature_importances and a way to look at clusters, like in forests?

Will your machine digest a 14 × 180,000,000 table?

"Feature_importances" is the importance of features, what does that have to do with clusters? Or am I missing something. There is such a possibility, but I don't use it much, because this importance is essentially counted by tree tops, which doesn't fit my concept.

I trained models on tables of 6 gigabytes. And it consumed no more than 2 gigabytes of memory, as I remember it now.
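For reference, a minimal sketch of pulling those importances out of the CatBoost Python API, on synthetic data (the data shape and parameters are illustrative):

```python
import numpy as np
from catboost import CatBoostClassifier

X = np.random.rand(1000, 14)                  # synthetic 14-feature table
y = (X[:, 0] + X[:, 1] > 1).astype(int)

model = CatBoostClassifier(iterations=100, depth=4, verbose=False)
model.fit(X, y)

# The default importance type (PredictionValuesChange) is indeed derived
# from the trees' split statistics, as noted above.
print(model.get_feature_importance())
```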

 
Aleksey Vyazmikin:

"Feature_importances" is the importance of features, what does it have to do with clusters? Or I don't know something. There is such a possibility, but I do not use it much, because this importance is essentially counted by tree tops, which does not fit my concept.

I trained models on tables of 6 gigabytes. And the memory consumption there was no more than 2 gigabytes, as I remember now.

For a forest you can look at both the importances and the clusters. In CatBoost it is probably plot_tree.
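plot_tree does indeed exist in the CatBoost Python API; it renders a single tree via graphviz. A minimal sketch on synthetic data (the graphviz package and binaries are assumed to be installed):

```python
import numpy as np
from catboost import CatBoostClassifier

X = np.random.rand(500, 6)
y = (X[:, 0] > 0.5).astype(int)
model = CatBoostClassifier(iterations=10, depth=3, verbose=False).fit(X, y)

graph = model.plot_tree(tree_idx=0)   # returns a graphviz.Digraph
graph.render('tree0', format='png')   # writes tree0.png to disk
```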

I will prepare the data and post it.

I made a test version with 6 columns; it came to 11 GB. Notepad++ couldn't open it - it says the file is too big. DB Browser for SQLite has been hanging for 20 minutes.
 
Rorschach:

For a forest you can look at both the importances and the clusters. In CatBoost it is probably plot_tree.

I will prepare the data and post it.

I made a test version with 6 columns; it came to 11 GB. Notepad++ couldn't open it - it says the file is too big. DB Browser for SQLite has been hanging for 20 minutes.
The built-in viewer in Total Commander handles big files that make Notepad++ hang.
 
Aleksey Vyazmikin:

"Feature_importances" is the importance of features, what does it have to do with clusters? Or I don't know something. There is such a possibility, but I do not use it much, because this importance is essentially counted by tree tops, which does not fit my concept.

I trained models on tables of 6 gigabytes. And it consumed no more than 2 gigabytes of memory, as I remember now.

I wonder how they train trees without loading all the data into memory. If the table is 6 gigabytes, then about 6 gigabytes of memory should have been used: the tree needs to sort each column in its entirety. If you don't hold everything in memory but read the data from disk every time, it will be too slow.
The only option I see is keeping the data in memory as float instead of double, but that reduces precision. For us, with 5-digit quotes, it may not matter much, but CatBoost is general-purpose software; I would think physics and mathematics problems should be solved in double precision.
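A likely explanation for the low memory use (an editorial addition, not from the thread): gradient-boosting libraries such as CatBoost quantize every feature into at most a few hundred borders before training, so each value is stored as a 1-byte bin index rather than an 8-byte double, and the sorting happens once, during quantization. Rough arithmetic with illustrative figures:

```python
# ~6 GB of doubles vs. the same table quantized to 1-byte bin indices.
rows, cols = 100_000_000, 8

raw_bytes = rows * cols * 8        # float64: ~6.4 GB
quantized_bytes = rows * cols * 1  # <=255 borders fit in one byte: ~0.8 GB

print(f"raw: {raw_bytes / 1e9:.1f} GB, quantized: {quantized_bytes / 1e9:.1f} GB")

# In CatBoost the number of borders is a real training parameter, e.g.:
#   CatBoostClassifier(border_count=254)
```

That would be consistent with a 6 GB table training in under 2 GB of RAM.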
