Bayesian regression - Has anyone made an EA using this algorithm? - page 6

 
Dmitry Fedoseev:
According to this formula, on a trend, the variance will be 0. Is that what you want?
It won't be 0, try substituting values :)
 
Mike:
Will not equal 0, try substituting values. :)
Let's say the trend is perfect, i.e. there is the same increment on every bar. So the graph of increments is a straight line. So what will be the variance of a straight line?
 
Dmitry Fedoseev:
Let's assume the trend is perfect, i.e. every bar has the same increment. So the graph of increments is a straight line. So what would be the variance of a straight line?
You first wrote "on the trend". :) I don't know about ideal ones, I haven't met any...
 
Haven't seen a gopher either? Yet it exists.
 
Dmitry Fedoseev:
Let's assume that the trend is ideal, i.e. every bar has the same increment. So the increment graph is a straight line. So what is the variance of the straight line?

Yes, then zero.

And the variance is maximised on increments with uniform density. For the market this is, to put it in jargon, the period of greatest entropy, when small, medium and large increments occur equally often.
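As a quick numerical check of the first claim (a sketch with made-up numbers, not anything from the thread): on a perfect trend the increments are constant, so their sample variance is exactly zero.

```python
import numpy as np

# A "perfect trend": the price rises by the same amount on every bar.
prices = np.array([100.0 + 0.5 * i for i in range(20)])
increments = np.diff(prices)        # every increment equals 0.5

# The sample variance of constant increments is exactly zero.
var = np.var(increments, ddof=1)
print(var)                          # -> 0.0
```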

 
Alexey Burnakov:

.......................

So when economists say that they have measured, for example, the variance of an instrument, they do the following: variance = sum((Xi - X̄)^2) / (N - 1),

where Xi is the increment calculated by one of the formulas,

X̄ (X with a bar) is the sample estimate of the mean increment in the available sample,

N - 1 is the sample size minus one,

and the whole formula is an unbiased estimate of the variance.

And then these economists assume that the density of the increments is normal and compute something like: sqrt(variance) * sqrt(m) * 1.96,

where the square root of the variance is an estimate of the standard deviation, and the whole formula stretches a consequence of normality onto a non(!)-normal series in order to estimate the extreme bound of the price spread m steps ahead with 95% probability. Naturally, errors follow.

I hope that explains it roughly. Note that the original price series does not look normal even to a first approximation, unlike the increments.
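The calculation described above can be sketched in Python (synthetic increments standing in for real returns; the normal-i.i.d. assumption is exactly the one the post criticizes):

```python
import numpy as np

rng = np.random.default_rng(0)
increments = rng.normal(0.0, 0.001, size=500)   # stand-in for log-returns

n = increments.size
x_bar = increments.mean()
# Unbiased sample variance: sum((Xi - X_bar)^2) / (N - 1)
variance = np.sum((increments - x_bar) ** 2) / (n - 1)

m = 10  # steps ahead
# Under the (criticised) normal i.i.d. assumption, a 95% bound on the
# price spread after m steps:
bound = np.sqrt(variance) * np.sqrt(m) * 1.96
```

On genuinely non-normal, fat-tailed increments this bound is exceeded far more often than the nominal 5% of the time, which is the error the post describes.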

In this post, section 5 ('Removing the trend')
https://www.mql5.com/ru/articles/363
the author shows quite an acceptable approximation of the sample of increments to the normal distribution. Points that do not lie on a straight line have long been dealt with: about 7-10% of the largest-modulus values are excluded from the sample. After that even the Kolmogorov goodness-of-fit test (which is very sensitive to the distribution's shape) shows the sample to be normal. As for the excluded values, these are the points where the current trend broke down. The source of this methodology (something I read in English a long time ago; I don't remember where) essentially suggests forming samples of increments from the points between trend-break points; that is what it proposes to call the current trend.
Box-Cox Transformation
  • 2012.01.17
  • Victor
  • www.mql5.com
The article introduces the reader to the Box-Cox transformation. It briefly touches on questions related to its use and gives examples that allow assessing the effectiveness of this transformation on random sequences and real quotes.
 
Alexey Burnakov:

Yes, then zero.

And the variance is maximised on increments with uniform density. For the market this is, to put it in jargon, the period of greatest entropy, when small, medium and large increments occur equally often.

And what is the point? One can, of course, be found if you look: over the markets' existence many different indicators have been invented, each of which can be attached to something with greater or lesser success.
 
Mike:
In this post, section 5 ('Removing the trend')
https://www.mql5.com/ru/articles/363
the author shows a perfectly acceptable transformation of the sample of increments. Points that do not lie on a straight line have long been dealt with: about 7-10% of the largest-modulus values are excluded from the sample. After that even the Kolmogorov goodness-of-fit test (which is very sensitive to the distribution's shape) shows the sample to be normal. As for the excluded values, these are the points where the current trend broke down. The source of this methodology (something I read in English a long time ago; I don't remember where) essentially suggests forming samples of increments from the points between trend-break points; that is what it proposes to call the current trend.

What a reversal of fortune here.

I read: "The presence of such obvious 'trendiness' suggests trying to rule out a trend first".

As if you've fallen from the moon. As if identifying waves is difficult. The main problem of technical analysis, and therefore of trading, is identifying the trend.

 
Dmitry Fedoseev:

What a reversal of fortune here.

I read: "The presence of such obvious 'trendiness' suggests trying to rule out a trend first".

As if you've fallen from the moon. As if identifying waves is difficult. The main problem with the analysis, and therefore trading, is identifying the trend.

The author of the above post simply expressed himself unfortunately. Statistical packages have standard procedures for time series analysis: identifying the trend, extracting the seasonal component, and differencing. The author meant the third, which is used to move to a stationary series (in practice, an approximation to stationarity).
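Differencing, the third of those standard procedures, can be sketched as follows (a synthetic trending series, not code from any statistical package):

```python
import numpy as np

# A series with a linear trend: clearly non-stationary.
t = np.arange(200)
series = 0.3 * t + np.random.default_rng(2).normal(0, 1, 200)

# The first difference removes the linear trend; the differenced series
# fluctuates around the slope (0.3) with a stable variance.
diffed = np.diff(series)
```

Taking the difference is exactly the move from prices to increments discussed earlier in the thread.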
 
Dmitry Fedoseev:
Suppose the trend is perfect, i.e. on each bar the increment is the same. So the graph of increments is a straight line. So what will be the variance of the straight line?

When we try to apply statistics, the cornerstone, the foundation, is the question of the APPLICABILITY of a particular tool from that science.

Your example contains no random variables, only a constant. Variance is defined ONLY for random variables. In your particular case the result is unique for statistics: the variance calculation showed that the input was a constant rather than random numbers.

The uniqueness of your example is that the result is correct and easily explainable. Usually, if you do not carefully justify the applicability of a tool such as linear regression, you obtain a result that has nothing to do with reality and is therefore completely unusable in practice: the numbers are there, they can be seen (the gopher is visible), yet in reality they mean nothing! Just a numbers game.

Using linear regression as an example: a standard implementation (not a home-made one) calculates the regression coefficients, and the rightmost column usually tells us whether the coefficients we see actually exist. If that column shows 0.5 (50%), it is practically certain the printed figures mean nothing. If it shows 10%, the result is foggy at best; but if it is below 5%, the numbers really exist. And even this can be believed only if you have first managed to justify the POSSIBILITY of applying linear regression at all.
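What that "rightmost column" (the p-value of a coefficient) looks like in practice can be sketched with `scipy.stats.linregress`, one standard, non-homemade implementation (the data here are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = np.arange(100, dtype=float)

# Case 1: a genuine linear relationship plus noise.
y_real = 2.0 * x + rng.normal(0, 5, 100)
# Case 2: pure noise, with no relationship to x at all.
y_noise = rng.normal(0, 5, 100)

res_real = stats.linregress(x, y_real)
res_noise = stats.linregress(x, y_noise)

# res_real.pvalue is far below 0.05: the slope really exists.
# res_noise.pvalue is typically large: the printed slope is a numbers game.
```

The p-value tells you whether a nonzero slope exists at all; it says nothing about whether a linear model was an appropriate tool in the first place, which is the post's main point.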