Machine learning in trading: theory, models, practice and algo-trading - page 1193

 
Maxim Dmitrievsky:

on mql5 too... But here there is a good tester and a base :)

That's the problem: I'm juggling everything myself, and I'm tired of reading it all whenever I do find the time.

I need to define the goals (thinking out loud): if the goal is a product for the Market, then alas, everything has to be in MQL; if the goal is personal use or distribution outside this forum, then the task comes down to being able to get a .dll and link it to MT.

 
Igor Makanu:

That's the problem: I'm juggling everything myself, and I'm tired of reading it all whenever I do find the time.

I need to define the goals (thinking out loud): if the goal is a product for the Market, then alas, everything has to be in MQL; if the goal is personal use or distribution outside this forum, then the task comes down to being able to get a .dll and link it to MT.

The goal is an awesome ML bot, the rest is bullshit. If you write for the Market, promotion matters more there than trading performance, plus clueless buyers will blow your brains out (I have experience of that). You could take almost any indicator or Expert Advisor from the codebase and sell it in the Market... or put up 200 of them like Gribachev, a new one every day, but that is not the way of the samurai.

Better to put your wife or a hired slave at the computer to deal with the buyers than to waste your own time on it :)
 
Maxim Dmitrievsky:

The goal is an awesome ML bot, the rest is bullshit. If you write for the Market, promotion matters more there than trading performance, plus clueless buyers will blow your brains out (I have experience of that). You could take almost any indicator or Expert Advisor from the codebase and sell it in the Market... or put up 200 of them like Gribachev, a new one every day, but that is not the way of the samurai.

Better to put your wife or a hired slave at the computer to deal with the buyers than to waste your own time on it :)

I expected as much, so I don't see the point of "sweating" to put something worthwhile in the Market: I won't be able to support the product, because that takes a lot of time, and storing junk there in the hope of finding people willing to hand over $30 is something my conscience won't allow )))

PS: A grid of orders on a simple indicator... it will always work (one way or the other, up or down) and is always in demand among users ))))

 
Igor Makanu:

I expected as much, so I don't see the point of "sweating" to put something worthwhile in the Market: I won't be able to support the product, because that takes a lot of time, and storing junk there in the hope of finding people willing to hand over $30 is something my conscience won't allow )))

PS: A grid of orders on a simple indicator... it will always work (one way or the other, up or down) and is always in demand among users ))))

Martingales, grids, yes... all that kind of stuff, it's eternal :)

 

Here is an idea I came up with: detecting overtraining by means of ML. I keep digging into CatBoost; there you can get the prediction as a probability. I split the probabilities into groups from 0 to 9 for ease of perception and further analysis, and looked at the distributions, standard deviation, kurtosis and skewness, including a breakdown by target and the distribution of errors and correct answers in each group. Now I am going to pull out various standard indicators for evaluating the model, such as AUC, F1 and others; there you can also see the dynamics of learning, but so far I do not know how best to describe it.

The two models in the graph show the group distribution multiplied by the share of correct classifications in each group. The blue model is better on the exam sample.


What other predictors can you think of to evaluate the model?
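A minimal Python sketch of this kind of per-group breakdown, assuming `y_true` (0/1 targets) and `proba` (predicted probabilities) arrays; the names and thresholds are illustrative, not the author's actual code:

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.metrics import roc_auc_score, f1_score

def group_report(y_true, proba, n_groups=10):
    """Split predicted probabilities into groups 0..9 and report
    the share of samples and the share of correct answers per group."""
    groups = np.clip((proba * n_groups).astype(int), 0, n_groups - 1)
    for g in range(n_groups):
        mask = groups == g
        if not mask.any():
            print(f"group {g}: empty")
            continue
        pred = (proba[mask] >= 0.5).astype(int)       # implied class for the group
        acc = (pred == y_true[mask]).mean()           # correct answers in the group
        print(f"group {g}: share={mask.mean():.3f}  correct={acc:.3f}")
    # shape of the probability distribution itself
    print("std dev =", proba.std(), "skewness =", skew(proba), "kurtosis =", kurtosis(proba))
    # standard evaluation metrics mentioned above
    print("AUC =", roc_auc_score(y_true, proba), "F1 =", f1_score(y_true, proba >= 0.5))
```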

 
Aleksey Vyazmikin:

Here is an idea I came up with: detecting overtraining by means of ML. I keep digging into CatBoost; there you can get the prediction as a probability. I split the probabilities into groups from 0 to 9 for ease of perception and further analysis, and looked at the distributions, standard deviation, kurtosis and skewness, including a breakdown by target and the distribution of errors and correct answers in each group. Now I am going to pull out various standard indicators for evaluating the model, such as AUC, F1 and others; there you can also see the dynamics of learning, but so far I do not know how best to describe it.

The two models in the graph show the group distribution multiplied by the share of correct classifications in each group. The blue model is better on the exam sample.


What other predictors can you think of to evaluate the model?

Cool, that's actually what everyone does.

To evaluate a model you use metrics, not predictors; the standard ones are usually sufficient, but you can make up your own.

Usually, the larger the error in the forest, the smaller the spread (dispersion) of the values, i.e. just white noise around 0.5; in this sense the blue line is worse than the red one.
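
This spread claim can be checked directly; a minimal sketch (`proba` is an assumed array of predicted probabilities, not from the thread):

```python
import numpy as np

def probability_spread(proba):
    """Spread of predicted probabilities around 0.5.
    A model whose outputs are essentially white noise around 0.5
    shows a small standard deviation and a small mean deviation."""
    return proba.std(), np.mean(np.abs(proba - 0.5))
```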
 
Maxim Dmitrievsky:

Cool, that's actually what everyone does.

To evaluate a model you use metrics, not predictors; the standard ones are usually sufficient, but you can make up your own.

Usually, the larger the error in the forest, the smaller the spread (dispersion) of the values, i.e. just white noise around 0.5; in this sense the blue line is worse than the red one.

Ha, so the point is to find an evaluation criterion, not just to evaluate with different calculation formulas! All these formula-based approaches evaluate the model statically, but say nothing about its ability to keep working in the future, and that is exactly what I want to get at; that is why I generate predictors, so that ML can find a pattern in a set of different indicators.

About the spread, that is a very strange statement; perhaps it only takes into account the mere presence of values without considering their class and the percentage of correct answers. On the graph, from 0 to 5 on the x-axis is the product of the cluster of zeros and their correct classification, and from 5 onward the same product for the ones.

Here is a graph of the same models, but showing the distribution of target "1".

As you can see, the red model has a smaller percentage of its distribution shifted beyond 5, which means those "ones" have no chance of correct classification, and the share that does have a chance is smaller than for the blue model: 23% versus 28% respectively.

And here is how the classification accuracy changes.

Of course, such a flattened model can also be used, but then the classification threshold has to be shifted from 0.5 to, say, 0.7; very little material is left to work with after that, although on the other hand such flattened models can be combined...
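
A minimal sketch of this threshold shift from 0.5 to 0.7 and of measuring how much material survives the cut (names `proba` and `threshold` are illustrative assumptions):

```python
import numpy as np

def apply_threshold(proba, threshold=0.7):
    """Take class 1 only above `threshold` and class 0 only below
    1 - threshold; everything in between is skipped as uncertain.
    Returns the predicted classes and the share of samples kept."""
    take_one = proba >= threshold
    take_zero = proba <= 1.0 - threshold
    kept = take_one | take_zero
    classes = take_one[kept].astype(int)   # 1 where confidently "one", 0 where confidently "zero"
    return classes, kept.mean()
```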

 
Aleksey Vyazmikin:

Ha, so the point is to find an evaluation criterion, not just to evaluate with different calculation formulas! All these formula-based approaches evaluate the model statically, but say nothing about its ability to keep working in the future, and that is exactly what I want to get at; that is why I generate predictors, so that ML can find a pattern in a set of different indicators.

About the spread, that is a very strange statement; perhaps it only takes into account the mere presence of values without considering their class and the percentage of correct answers. On the graph, from 0 to 5 on the x-axis is the product of the cluster of zeros and their correct classification, and from 5 onward the same product for the ones.

Here is a graph of the same models, but showing the distribution of target "1".

As you can see, the red model has a smaller percentage of its distribution shifted beyond 5, which means those "ones" have no chance of correct classification, and the share that does have a chance is smaller than for the blue model: 23% versus 28% respectively.

And here is how the classification accuracy changes.

Of course, such a flattened model can also be used, but then the classification threshold has to be shifted from 0.5 to, say, 0.7; very little material is left to work with after that, although on the other hand such flattened models can be combined...

The fact that it is shifted simply speaks in favor of one of the classes; it may be due to a trending market, i.e. the training sample (roughly speaking).

And if you take the blue one, you get a sharp drop in the probabilities: where ideally the probability of a signal should be 1, yours is 0.6-0.7 at most, i.e. both classes revolve around 0.5 with small deviations towards one class or the other; in effect there is noise rather than signals, or the model is heavily regularized.

The ability to keep working is judged by the errors on a test sample... if you manage to get close to the errors on the training set, the model is good, as a rule.
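
A hedged sketch of that check: compare the classification error on the training and test samples and look at the gap (the arrays are assumed to exist; names are illustrative):

```python
import numpy as np

def error_gap(y_train, proba_train, y_test, proba_test, threshold=0.5):
    """Classification error on train and test plus the gap between them;
    a small gap usually means the model has not simply memorised the training set."""
    err_train = np.mean((proba_train >= threshold).astype(int) != y_train)
    err_test = np.mean((proba_test >= threshold).astype(int) != y_test)
    return err_train, err_test, err_test - err_train
```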

 
Maxim Dmitrievsky:

The fact that it is shifted simply speaks in favor of one of the classes; it may be due to a trending market, i.e. the training sample (roughly speaking).

We are comparing the models under the same conditions; here are the same models on other data: the target "ones" that fell into classification 1 are 35% versus 39%.

Classification accuracy:

And since all the values cluster closer to the center, this is the product we get.

Maxim Dmitrievsky:

And if you take the blue one, you get a sharp drop in the probabilities: where ideally the probability of a signal should be 1, yours is 0.6-0.7 at most, i.e. both classes revolve around 0.5 with small deviations towards one class or the other; in effect there is noise rather than signals, or the model is heavily regularized.

The ability to keep working is judged by the errors on a test sample... if you manage to get close to the errors on the training set, the model is good, as a rule.

Why should this probability be "1"? That is more like overconfidence; on the contrary, I think a correct (ideal) model should have two humps, between 0.1 and 0.3 and between 0.7 and 0.9, because that would indicate stability and adequacy, but in practice I have not observed such models yet.

About the estimated metrics converging, yes, I agree; I will look at the delta and take some more measurements of the dynamics - in CatBoost you can see how the indicators change as trees are added to the model.
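
In CatBoost this per-tree dynamic can be pulled out with eval_metrics; a sketch assuming an already fitted CatBoostClassifier `model` and a test set `X_test`, `y_test` (all names are assumptions):

```python
from catboost import Pool

# model is assumed to be an already fitted CatBoostClassifier
test_pool = Pool(X_test, y_test)
history = model.eval_metrics(test_pool, metrics=["AUC", "F1"], eval_period=10)
# history["AUC"][i] is the metric value after roughly (i + 1) * 10 trees
for auc, f1 in zip(history["AUC"], history["F1"]):
    print(f"AUC={auc:.3f}  F1={f1:.3f}")
```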
 
Aleksey Vyazmikin:

We are comparing the models under the same conditions; here are the same models on other data: the target "ones" that fell into classification 1 are 35% versus 39%.

Classification accuracy:

And since all the values cluster closer to the center, this is the product we get.

Why should this probability be "1"? That is more like overconfidence; on the contrary, I think a correct (ideal) model should have two humps, between 0.1 and 0.3 and between 0.7 and 0.9, because that would indicate stability and adequacy, but in practice I have not observed such models yet.

About the estimated metrics converging, yes, I agree; I will look at the delta and take some more measurements of the dynamics - in CatBoost you can see how the indicators change as trees are added to the model.

The higher the probability of the event, the more accurate the signal; that more or less follows from the definition :) Two humps will not appear on noisy data, if only because there will be transitional states, but the model should at least properly capture the extreme values, otherwise it is never confident about its entries at all.
