Bayesian regression - Has anyone made an EA using this algorithm? - page 28

 

On the topic of the thread: Bayes, mql

Bayes formula

The linear dependence y=ax+b;

The formula for the normal distribution. (You could, in principle, take another distribution.)

Let's rewrite the formula

P(a,b|x,y)=P(x,y|a,b)*P(a)*P(b)/P(x,y); (1)

Next, as far as I understand it, you need to enumerate all possible combinations of a and b. Those a and b that give the maximum probability according to formula (1) will be the coefficients.
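
A minimal sketch of that enumeration (Python rather than mql; the data series, the grid bounds and the error sigma are assumptions for illustration, and the priors P(a) and P(b) are taken as uniform):

import numpy as np

# Assumed data series for illustration; replace with real quotes.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.5 * x + 2.0 + rng.normal(0.0, 1.0, x.size)

sigma = 1.0                              # assumed sigma of the normal distribution
a_grid = np.linspace(-5.0, 5.0, 401)     # enumeration grid for a
b_grid = np.linspace(-5.0, 5.0, 401)     # enumeration grid for b

best_a, best_b, best_log_p = None, None, -np.inf
for a in a_grid:
    for b in b_grid:
        # log P(x,y|a,b) for independent normal errors around ax+b;
        # P(a), P(b) are uniform and P(x,y) is the same for every (a,b),
        # so both can be dropped when looking for the maximum of (1).
        log_p = -np.sum((y - (a * x + b)) ** 2) / (2.0 * sigma ** 2)
        if log_p > best_log_p:
            best_a, best_b, best_log_p = a, b, log_p

print("a, b giving the maximum probability by formula (1):", best_a, best_b)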

 
Yuri Evseenkov:

On the topic of the thread: Bayes, mql

Bayes formula

The linear dependence y=ax+b;

The formula for the normal distribution. (You could, in principle, take a different distribution.)

Let's rewrite the formula

P(a,b|x,y)=P(x,y|a,b)*P(a)*P(b)/P(x,y); (1)

Next, as far as I understand it, you need to enumerate all possible combinations of a and b. Those a and b that give the maximum probability according to formula (1) will be the coefficients.

There is some suspicion that this is not the case.
 
Dmitry Fedoseev:
There is some suspicion that this is not the case at all.
Share your suspicion, please.
 
Yuri Evseenkov:
Share your suspicion, please.
No. If I knew for sure, I would put it in the code, but one can rattle on endlessly. There are such megalodons in this thread; let them show off their eloquence in practice.
 
Dmitry Fedoseev:
No. If I knew for sure, I would put it in the code, but one can rattle on endlessly. There are such megalodons in this thread; let them show off their eloquence in practice.
A pity. You are the most specific one on the subject. The competent comrades who write off-topic are very interesting to me, but I am afraid of getting lost in the "woods".
 
Yuri Evseenkov:

On the topic of the thread: Bayes, mql

Bayes formula

The linear dependence y=ax+b;

The formula for the normal distribution. (You could, in principle, take another distribution.)

Let's rewrite the formula

P(a,b|x,y)=P(x,y|a,b)*P(a)*P(b)/P(x,y); (1)

Next, as far as I understand it, you need to enumerate all possible combinations of a and b. Those a and b that give the maximum probability according to formula (1) will be the coefficients.

You seem to be thinking in the right direction. I've already forgotten it, but the explanation is this.

Suppose we have a time series (prices, if you like), Y = {y[1], y[2], ..., y[n]}. We also have unknown model parameters W = {w[1], w[2], ..., w[m]}. Suppose the model is a regression, that is

y[i] = SUM_j w[j]*f_j(X) + e[i]

where the f_j() are approximating functions (the terms of a polynomial, for example), X is the input data, and e[] is the error.

Let's use the maximum likelihood principle to find the model parameters W:

W = argmax ln(P(W|Y))

Now apply Bayes' theorem:

P(W|Y) = P(Y|W)*P(W)/P(Y)

Dividing by P(Y) is a normalization that does not depend on W and can be neglected. We get

(1) W = argmax {ln(P(W|Y))} ~ argmax {ln(P(Y|W)) + ln(P(W))} ~ argmin {-ln(P(Y|W)) - ln(P(W))}

P(Y|W), the probability of Y given the parameters W, can be calculated as follows:

P(Y|W) = P(SUM_j w[j]*f_j(X) + e[i] | W) = P(E)

If the errors have a normal distribution and are independent of each other, then

(2) P(Y|W) = P(E) ~ exp(-SUM{e[i]^2}/(2*sigma^2))

Substitute (2) into (1) and obtain

W ~ argmin {-ln(P(Y|W)) - ln(P(W))} ~ argmin SUM{e[i]^2} - ln(P(W))

P(W) is usually taken to be constant (a flat prior), but we can instead take a Laplacian distribution:

P(W) ~ exp(-lambda*||W||_1)

We get

W ~ argmin SUM{e[i]^2} - ln(P(W)) ~ argmin SUM{e[i]^2} + lambda*||W||_1

As a result, applying maximum likelihood and Bayes' theorem to the regression of our series with Gaussian errors leads to the least squares method, with the regularizing term lambda*... or without it. The maths is convoluted, but the result is simple. If you don't like the normal error distribution, replace it with another one, e.g. a Laplacian, and you get:

W ~ argmin SUM|e[i]| + lambda*||W||_1.

You can also replace it with a super-Gaussian distribution, which gives

W ~ argmin SUM|e[i]|^p + lambda*||W||_1

By the way, the regularizing term as written here turns the least squares method into a sparse coding method. Without it, this is classical linear regression, solved by differentiating with respect to W and setting the derivative to zero.
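
A small numerical check of the final formulas, as a sketch only (not the poster's code): it assumes a polynomial basis f_j(x) = x**j, synthetic data with Gaussian errors, and an arbitrary lambda; scipy is used just to minimize the L1-regularized objective.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, m = 200, 3                                  # n points, m basis functions f_j(x) = x**j
x = np.linspace(-1.0, 1.0, n)
F = np.vander(x, m, increasing=True)           # design matrix with columns f_j(x)
w_true = np.array([0.5, -1.0, 2.0])
y = F @ w_true + rng.normal(0.0, 0.1, n)       # Gaussian errors e[i]

# Classical least squares: argmin SUM{e[i]^2}, closed-form solution.
w_ls = np.linalg.lstsq(F, y, rcond=None)[0]

# The regularized objective from the derivation: SUM{e[i]^2} + lambda*||W||_1.
lam = 0.05                                     # assumed regularization strength
objective = lambda w: np.sum((y - F @ w) ** 2) + lam * np.sum(np.abs(w))
w_l1 = minimize(objective, np.zeros(m)).x      # generic numerical minimizer, good enough for a rough check

print("least squares:      ", np.round(w_ls, 4))
print("with lambda*||W||_1:", np.round(w_l1, 4))

With lam = 0 the two results coincide; with lam > 0 the L1 term pulls small coefficients toward zero, which is the sparse-coding effect mentioned above.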

 
Vladimir:

You seem to be thinking in the right direction. I'm starting to forget it, but here's the explanation.

Suppose we have a time series (prices, if you like), Y = {y[1], y[2], ..., y[n]}. We also have unknown model parameters W = {w[1], w[2], ..., w[m]}. Suppose the model is a regression, that is

y[i] = SUM_j w[j]*f_j(X) + e[i]

where the f_j() are approximating functions (the terms of a polynomial, for example), X is the input data, and e[] is the error.

Let's use the maximum likelihood principle to find the model parameters W:

W = argmax ln(P(W|Y))

Now apply Bayes' theorem:

P(W|Y) = P(Y|W)*P(W)/P(Y)

Dividing by P(Y) is a normalization that does not depend on W and can be neglected. We get

(1) W = argmax {ln(P(W|Y))} ~ argmax {ln(P(Y|W)) + ln(P(W))} ~ argmin {-ln(P(Y|W)) - ln(P(W))}

P(Y|W), the probability of Y given the parameters W, can be calculated as follows:

P(Y|W) = P(SUM_j w[j]*f_j(X) + e[i] | W) = P(E)

If the errors have a normal distribution and are independent of each other, then

(2) P(Y|W) = P(E) ~ exp(-SUM{e[i]^2}/(2*sigma^2))

Substitute (2) into (1) and obtain

W ~ argmin {-ln(P(Y|W)) - ln(P(W))} ~ argmin SUM{e[i]^2} - ln(P(W))

P(W) is usually taken to be constant (a flat prior), but we can instead take a Laplacian distribution:

P(W) ~ exp(-lambda*||W||_1)

We get

W ~ argmin SUM{e[i]^2} - ln(P(W)) ~ argmin SUM{e[i]^2} + lambda*||W||_1

As a result, applying maximum likelihood and Bayes' theorem to the regression of our series with Gaussian errors leads to the least squares method, with the regularizing term lambda*... or without it. The maths is convoluted, but the result is simple. If you don't like the normal error distribution, replace it with another one, e.g. a Laplacian, and you get:

W ~ argmin SUM|e[i]| + lambda*||W||_1.

You can also replace it with a super-Gaussian distribution, which gives

W ~ argmin SUM|e[i]|^p + lambda*||W||_1

By the way, the regularizing term as written here turns the least squares method into a sparse coding method. Without it, this is classical linear regression, solved by differentiating with respect to W and setting the derivative to zero.

Thank you!
 
Vladimir:

You seem to be thinking in the right direction. I'm starting to forget it, but here's the explanation.

Suppose we have a time series (prices, if you like), Y = {y[1], y[2], ..., y[n]}. We also have unknown model parameters W = {w[1], w[2], ..., w[m]}. Suppose the model is a regression, that is

y[i] = SUM_j w[j]*f_j(X) + e[i]

where the f_j() are approximating functions (the terms of a polynomial, for example), X is the input data, and e[] is the error.

Let's use the maximum likelihood principle to find the model parameters W:

W = argmax ln(P(W|Y))

Now apply Bayes' theorem:

P(W|Y) = P(Y|W)*P(W)/P(Y)

Dividing by P(Y) is a normalization that does not depend on W and can be neglected. We get

(1) W = argmax {ln(P(W|Y))} ~ argmax {ln(P(Y|W)) + ln(P(W))} ~ argmin {-ln(P(Y|W)) - ln(P(W))}

P(Y|W), the probability of Y given the parameters W, can be calculated as follows:

P(Y|W) = P(SUM_j w[j]*f_j(X) + e[i] | W) = P(E)

If the errors have a normal distribution and are independent of each other, then

(2) P(Y|W) = P(E) ~ exp(-SUM{e[i]^2}/(2*sigma^2))

Substitute (2) into (1) and obtain

W ~ argmin {-ln(P(Y|W)) - ln(P(W))} ~ argmin SUM{e[i]^2} - ln(P(W))

P(W) is usually taken to be constant (a flat prior), but we can instead take a Laplacian distribution:

P(W) ~ exp(-lambda*||W||_1)

We get

W ~ argmin SUM{e[i]^2} - ln(P(W)) ~ argmin SUM{e[i]^2} + lambda*||W||_1

As a result, applying maximum likelihood and Bayes' theorem to the regression of our series with Gaussian errors leads to the least squares method, with the regularizing term lambda*... or without it. The maths is convoluted, but the result is simple. If you don't like the normal error distribution, replace it with another one, e.g. a Laplacian, and you get:

W ~ argmin SUM|e[i]| + lambda*||W||_1.

You can also replace it with a super-Gaussian distribution, which gives

W ~ argmin SUM|e[i]|^p + lambda*||W||_1

By the way, the regularizing term as written here turns the least squares method into a sparse coding method. Without it, this is classical linear regression, solved by differentiating with respect to W and setting the derivative to zero.

Thanks for the detailed comment. The key words and formulas are given, I'll look into it.

"In summary, applying maximum likelihood and Bayes Theorem to our series regression with Gaussian errors leads to a least squares method with a lambda* adjusting term... or without. The maths is convoluted and the result is simple. "

Convinced. Almost. There remains a shadow of doubt as to whether the coefficients a and b of the line y=ax+b, calculated by the two different methods, will be equal, numerically or at least approximately. Here one needs either to painstakingly compare the formulas of the two methods or to write a program. The main thing is that the formulas, the algorithm and the code itself should be faithful to the theory. The program must:

- calculate the coefficients a and b of the linear regression y=ax+b using the method of least squares;

- obtain the coefficients a and b at which the probability given by Bayes' theorem is maximal, using a normal distribution with mathematical expectation equal to ax+b.

Then we need to compare these coefficients and, if they differ considerably, watch how the two lines based on these a and b behave over time, for example in the strategy tester in visualization mode.

The program can later be reused with other models, regressions and distributions in the Bayes formula. Maybe something will really take off.
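
A rough sketch of such a comparison program (Python rather than mql; the price-like data, sigma and grid ranges are assumptions for illustration, with uniform priors for a and b):

import numpy as np

# Assumed price-like series for illustration.
rng = np.random.default_rng(2)
x = np.arange(100, dtype=float)
y = 0.03 * x + 1.10 + rng.normal(0.0, 0.05, x.size)

# 1) Coefficients a, b of y = ax + b by the method of least squares.
a_ls, b_ls = np.polyfit(x, y, 1)

# 2) Coefficients a, b maximizing the probability by Bayes' theorem with a
#    normal distribution whose expectation equals ax + b and uniform priors.
sigma = 0.05
a_grid = np.linspace(a_ls - 0.02, a_ls + 0.02, 201)   # narrow grid around a rough estimate, just to keep the enumeration small
b_grid = np.linspace(b_ls - 0.50, b_ls + 0.50, 201)
A, B = np.meshgrid(a_grid, b_grid)
log_post = -((y[None, None, :] - (A[..., None] * x + B[..., None])) ** 2).sum(-1) / (2.0 * sigma ** 2)
i, j = np.unravel_index(np.argmax(log_post), log_post.shape)
a_bayes, b_bayes = A[i, j], B[i, j]

print("least squares:    a=%.5f  b=%.5f" % (a_ls, b_ls))
print("Bayes (max prob): a=%.5f  b=%.5f" % (a_bayes, b_bayes))

If the two methods agree, the printed coefficients should differ by no more than the grid step, which is exactly the check described in the list above.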

P.S. My favourite example comes to mind:

"Chances are you've already used Bayesian thinking, even though you didn't know it. Discuss
an example I took from Neil Manson: You are a soldier in battle who is hiding in a foxhole. Youknow for a fact
that there is only one enemy soldier left on the battlefield, about 400
yards
away.You also know that if it is a regular soldier, he won't be able to hit you from that
distance.However, if that soldier is a sniper, it is quite possible that he can hit you
.But there aren't many snipers in the enemy army, so it's probably a regular soldier. You
lift your head out of the trench, trying to get a better look around.Bam! A bullet grazes your helmet
and you fall back into the foxhole.
Good, you think. I know snipers are rare, but this guy hit me from four hundred
yards.There's still a good chance it's a regular soldier, but the chance of it being a sniper is already
higher, since he hit me from such a long distance.After a few minutes you
dare to look out again and raise your head above the foxhole.Bam! A second bullet
grazes your helmet! You fall back down. Oh, shit, you think. It's definitely a sniper. No matter how rare they are,
however, the average soldier can't hit twice in a row from that distance
. It's definitely
a sniper. I better call for backup. If this is a rough approximation of what you would think in a
similar situation, then congratulations! You're already thinking like a Bayesian, at least
sometimes."
(Author not specified).
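
The same reasoning can be written out numerically; all the probabilities below are made-up illustrative values, not taken from the quoted text:

# Hypothetical numbers: 2% of enemy soldiers are snipers; a sniper hits from
# 400 yards with probability 0.9, a regular soldier with probability 0.1.
p_sniper, p_hit_sniper, p_hit_regular = 0.02, 0.9, 0.1

for shot in (1, 2):
    # Bayes' formula: P(sniper|hit) = P(hit|sniper)*P(sniper) / P(hit)
    p_hit = p_hit_sniper * p_sniper + p_hit_regular * (1.0 - p_sniper)
    p_sniper = p_hit_sniper * p_sniper / p_hit
    print("P(sniper) after hit %d: %.3f" % (shot, p_sniper))

With these assumed numbers the probability of facing a sniper jumps from 2% to about 16% after the first hit and to about 62% after the second, which is the soldier's chain of thought in the example.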

 
Yuri Evseenkov:


- Calculate the coefficients a and b of the linear regression y=ax+b by the method of least squares;

- Obtain the coefficients a and b at which the probability given by Bayes' theorem is maximal, using a normal distribution with expectation equal to ax+b.


They will be equal, or very close to it. The question is whether there is any sense in trying to set an a priori distribution for the coefficients when applied to financial markets.

I've often seen regularisation applied in regression (L1, L2). It might work better than ordinary linear regression.
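
For reference, a compact sketch of what L2 (ridge) regularization looks like next to plain least squares; the data and lambda are assumed for illustration:

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))                        # assumed regressors
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + rng.normal(0.0, 0.3, 100)

lam = 1.0                                            # assumed L2 strength
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]                            # ordinary least squares
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)  # ridge (L2), closed form

print("ordinary LS:", np.round(w_ols, 3))
print("ridge (L2): ", np.round(w_ridge, 3))

The L1 case has no closed form, which is why the earlier sketch used a numerical minimizer.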

 
Alexey Burnakov:

They will be equal, or very close to it. The question is whether it makes sense to try to set an a priori distribution for the coefficients as applied to financial markets.

I have often seen regularisation applied in regression (L1, L2). It might work better than ordinary linear regression.

As I understand it, the coefficients a and b need to be enumerated to find the combination that gives the maximum probability according to Bayes' formula P(a,b|x,y)=P(x,y|a,b)*P(a)*P(b)/P(x,y); (1). The probabilities P(a) and P(b) are set by the steps of the enumeration loops and are constant, so their distribution is uniform.

P.S. I hold the opinion that the nature of real financial markets and of forex differ significantly. Forex is more of a gambling business, a kind of multiplayer online computer simulation. So for forex it is possible to apply the laws that are relevant in those areas: the law of normal distribution, for example.