Machine learning in trading: theory, models, practice and algo-trading - page 2744

 
Valeriy Yastremskiy #:

After Sanych's explanations I've rather stopped understanding what "significant predictors" means in the end. By his explanation, they occur frequently and their magnitude correlates with the result. But those are apparently properties of the series as a whole, over the entire training period. I can't match that to any model of the series. Put very simply, these are predictors that work always, or at least most often. In general it's clear that using settings which work most of the time will give a better result than settings which work only on a certain segment...

So I still don't have a picture of what, in the end, is being searched for, and why.

And there's a problem with terminology, the main root of misunderstanding and of one's own mistakes. I wrote down how I understood his message, and showed him where the weak points are.
 
Valeriy Yastremskiy #:

Fine, let me be the class dunce, blame everything on me; you cool down and stick more to the substance, as that would be better, with arguments, and if with jokes, then without childish teasing ))))

No... you're good.)

 
Valeriy Yastremskiy #:

Fine, let me be the class dunce, blame everything on me; you cool down and stick more to the substance, as that would be better, with arguments, and if with jokes, then without childish teasing ))))

I don't know what words to use to get him off my back 😀

Sanych and Perervenko drop in, throw a bone and leave. When you start discussing the facts, it turns out that there is nothing there.
 
Maxim Dmitrievsky #:
And there's a problem with terminology, the main root of misunderstanding and of one's own mistakes. I wrote down how I understood his message, and showed him where the weak points are.

No, his explanation left me stumped; your understanding seems to be in line with Sanych's, but it has completely stopped correlating with mine.

If, of course, we take the same time interval (a day, or a week), that is, not the whole series but certain segments that are alike, and train on those, the picture is more or less clear. But how do we find those alike segments? The easiest way is by time, from 17:00 to 18:00 for example... and in principle that should work. But in Sanych's case the whole series seems to be used unchanged, and that's what is unclear.

 
Maxim Dmitrievsky #:
I don't know what words to use to get him off my back 😀

Sanych and Perervenko drop in, throw a bone and leave. When you start discussing the facts, it turns out that there is nothing there.

Do not answer.

Well, apparently they have other projects, and this is like a small hobby for them; it may well be that in this hobby the results aren't great.

 
Valeriy Yastremskiy #:

No, his explanation left me stumped; your understanding seems to be in line with Sanych's, but it has completely stopped correlating with mine.

If, of course, we take the same time interval (a day, or a week), that is, not the whole series but certain segments that are alike, and train on those, the picture is more or less clear. But how do we find those alike segments? The easiest way is by time, from 17:00 to 18:00 for example... and in principle that should work. But in Sanych's case the whole series seems to be used unchanged, and that's what is unclear.

He takes the features and the targets, measures correlation or entropy between them in a sliding window with a given step. He looks at the spread, the mean and other statistics, and throws out the bad features.

Then, during training, he substitutes different features into the model, i.e. they change over time. I don't know by what principle they are substituted; probably based on the results of splitting the history into regimes.
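A minimal sketch of that screening step in Python, assuming numeric features and a numeric target; the window size, step, threshold and the stability rule are my own illustrative choices, not the actual settings being discussed:

```python
import numpy as np
import pandas as pd

def screen_features(X: pd.DataFrame, y: pd.Series,
                    window: int = 200, step: int = 50,
                    min_mean_corr: float = 0.1) -> list:
    """Correlate each feature with the target in a sliding window,
    then keep only features whose windowed correlation is stable."""
    keep = []
    for col in X.columns:
        corrs = []
        for start in range(0, len(y) - window + 1, step):
            w = slice(start, start + window)
            corrs.append(np.corrcoef(X[col].iloc[w], y.iloc[w])[0, 1])
        corrs = np.array(corrs)
        # a feature survives if its windowed correlation is, on average,
        # noticeably non-zero and does not swing wildly between windows
        if abs(corrs.mean()) > min_mean_corr and corrs.std() < abs(corrs.mean()):
            keep.append(col)
    return keep

rng = np.random.default_rng(0)
n = 1000
y = pd.Series(rng.normal(size=n))
X = pd.DataFrame({
    "good": y * 0.8 + rng.normal(scale=0.5, size=n),  # persistently related
    "noise": rng.normal(size=n),                       # unrelated
})
print(screen_features(X, y))
```

Entropy, variance or any other windowed statistic can be swapped in for the correlation; the structure (slide, measure, aggregate, filter) stays the same.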

 
Maxim Dmitrievsky #:

He takes the features and the targets, measures correlation or entropy between them in a sliding window with a given step. He looks at the spread, the mean and other statistics, and throws out the bad features.

Then, during training, he substitutes different features into the model, i.e. they change over time. I don't know by what principle they are substituted; probably based on the results of splitting the history into regimes.

If the features have time tied in alongside the other features, it's more understandable. But he said nothing about the types of features )))))

 
Valeriy Yastremskiy #:

If the features have time tied in alongside the other features, it's more understandable. But he said nothing about the types of features )))))

Well, some 180 features pulled out of thin air, probably based on increments. So why guess?
 
Maxim Dmitrievsky #:
And if you read carefully, you can see the catch in point 2, i.e. the initial fit to history. That's why his training error drops.


And what are the statistical tests on the regression coefficients for? Or testing hypotheses about equality of means and variances? (If PCA additionally shows that the 1st PC explains an acceptable share of the variance [the residual variance is very small], then accept it and check whether the significance of the regression coefficients is confirmed)...
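A rough sketch of that check: PCA via SVD for the explained-variance share of the 1st component, then a t-test on the OLS slope against that component. The data, names and thresholds here are made up for illustration, not taken from the post:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 500
# two collinear features driven by one latent factor, plus noise
latent = rng.normal(size=n)
X = np.column_stack([latent + rng.normal(scale=0.1, size=n),
                     latent + rng.normal(scale=0.1, size=n)])
y = 2.0 * latent + rng.normal(scale=0.5, size=n)

# PCA: share of variance explained by the 1st principal component
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / (s**2).sum()
print(f"1st PC explains {explained[0]:.1%} of the variance")

# OLS of y on the 1st PC, then a t-test on the slope coefficient
pc1 = Xc @ Vt[0]
A = np.column_stack([np.ones(n), pc1])
beta, res, *_ = np.linalg.lstsq(A, y, rcond=None)
dof = n - 2
sigma2 = res[0] / dof                       # residual variance
se = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[1, 1])
t = beta[1] / se
p = 2 * stats.t.sf(abs(t), dof)
print(f"slope t = {t:.2f}, p = {p:.2g}")    # small p: the coefficient is significant
```

If the 1st PC carries most of the variance and its regression coefficient is significant, the reduced model is accepted, which matches the logic described above.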

Ideally, it's clear that to get 100% certainty we would need functional rather than correlational relationships. But if we are studying a stochastic process, the results will only ever be probabilistic, confirmable only on a large amount of test data, and only until a new driver appears in the market... [Here, by the way, factual/logical awareness is also very important, not only robust analysis.]

Fitting to history is always present, as long as we rely on historical data... But we can always compare variances with an F-statistic: if the reduction in variance is much larger than the remaining unexplained variance, then a new regression is identified (with a different SLOPE)... And it works only up to some moment in the future and only in large numbers... or else switch the actor's state (if DL is used)... But it's better to know the driver than to wait until the current sample has accumulated enough data to confirm it.
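The F-comparison described above can be sketched as a standard nested-model F-test; whether this matches the poster's exact procedure is my assumption, and all names and data are illustrative:

```python
import numpy as np
from scipy import stats

def f_test_nested(y, X_small, X_big):
    """F-test: does the bigger model reduce residual variance enough
    to justify its extra parameters?"""
    def rss(X):
        A = np.column_stack([np.ones(len(y)), X])
        _, res, *_ = np.linalg.lstsq(A, y, rcond=None)
        return res[0], A.shape[1]          # residual sum of squares, n params
    rss0, p0 = rss(X_small)
    rss1, p1 = rss(X_big)
    dof = len(y) - p1
    # variance reduction per extra parameter vs remaining unexplained variance
    F = ((rss0 - rss1) / (p1 - p0)) / (rss1 / dof)
    p_value = stats.f.sf(F, p1 - p0, dof)
    return F, p_value

rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.5 * x1 + 0.8 * x2 + rng.normal(size=n)   # x2 genuinely matters

F, p = f_test_nested(y, x1.reshape(-1, 1), np.column_stack([x1, x2]))
print(f"F = {F:.1f}, p = {p:.2g}")  # small p: the extra regressor is justified
```

A small p-value means the added regressor (a "new driver", in the post's terms) explains far more variance than noise would, so the new regression is accepted.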

If you make Feature Engineering logical, as you have rightly noted, "theoretically" logical (behind any statistical processing there are real physical and logical laws and human knowledge; patterns don't appear out of thin air) [though someone may lack that knowledge], then feature selection during modelling won't trouble either the modeller or the developer very much... And you can't get anywhere without history: to know what became a driver, and when, and what didn't, you don't need great skill in higher mathematics, only an understanding of the laws of the money and commodity markets, of the private and state sectors (and that is not higher math). Otherwise we will use the apparatus of applied higher mathematics only "after the fact", to learn that the news that changed the world has already been heard... It's just that the market's reaction is usually lagging.

P.S.

Those to whom the words and letters are unknown shouldn't read an unfamiliar topic written in unfamiliar letters, so as not to nitpick at the letters; go look for a higher-math machine for your feature selection... And if you then also prove the statistical validity of your results, and not just a percentage of hits (not always an unbiased measure, by the way), the conversation will be different... But for now, yes, everyone has their own terminology...

 
JeeyCi #:


And what are the statistical tests on the regression coefficients for? Or testing hypotheses about equality of means and variances? (If PCA additionally shows that the 1st PC explains an acceptable share of the variance [the residual variance is very small], then accept it and check whether the significance of the regression coefficients is confirmed)...

Ideally, it's clear that to get 100% certainty we would need functional rather than correlational relationships. But if we are studying a stochastic process, the results will only ever be probabilistic, confirmable only on a large amount of test data, and only until a new driver appears in the market... [Here, by the way, factual/logical awareness is also very important, not only robust analysis.]

Fitting to history is always present, as long as we rely on historical data... But we can always compare variances with an F-statistic: if the reduction in variance is much larger than the remaining unexplained variance, then a new regression is identified (with a different SLOPE)... And it works only up to some moment in the future and only in large numbers... or else switch the actor's state (if DL is used)... But it's better to know the driver than to wait until the current sample has accumulated enough data to confirm it.

If you make Feature Engineering logical, as you have rightly noted, "theoretically" logical (behind any statistical processing there are real physical and logical laws and human knowledge; patterns don't appear out of thin air) [though someone may lack that knowledge], then feature selection during modelling won't trouble either the modeller or the developer very much... And you can't get anywhere without history: to know what became a driver, and when, and what didn't, you don't need great skill in higher mathematics, only an understanding of the laws of the money and commodity markets, of the private and state sectors. Otherwise we will use the apparatus of applied higher mathematics only "after the fact", to learn that the news that changed the world has already been heard... It's just that the market's reaction is usually lagging.

P.S.

Those to whom the words and letters are unknown shouldn't read an unfamiliar topic written in unfamiliar letters, so as not to nitpick at the letters; go look for a higher-math machine for your feature selection... And if you then also prove the statistical validity of your results, and not just a percentage of hits (not always an unbiased measure, by the way), the conversation will be different... But for now, yes, everyone has their own terminology...

It all started with people wanting to collaborate )) They began to sort things out, and it turned out that everyone rejects everything the others do. Plus they invent new definitions. In the end nobody understood anything. I generally like Sanych's approach, which is why I asked for specifics. And with the notation, too, it's trouble: there's "relation" and "connection", if not "correlation".

Obviously he cherishes his know-how and doesn't reveal the details.