Econometrics: one step ahead forecast - page 116

 
faa1947:

I don't understand it myself.

Probit: zz_high eurusd(-1 to -100) c @trend

I am forecasting a dependent variable that takes the values "0" (no signal) and "1" (downward-reversal signal). As explanatory variables we take 100 lagged bars of EURUSD, i.e. a random variable. After estimating the regression coefficients we obtain:

ZZ_HIGH = 1-@CNORM(-(1033.56764818*EURUSD(-1) + 361.005725087*EURUSD(-2) - 659.271726689*EURUSD(-3) + 1289.20797453*EURUSD(-4) - 1024.9175822*EURUSD(-5) - 173.354947231*EURUSD(-6) - 500.755211559*EURUSD(-7) + 487.538133239*EURUSD(-8) - 1741.90012073*EURUSD(-9) + 1250.27027863*EURUSD(-10) + 1204.01840496*EURUSD(-11) - 625.209628018*EURUSD(-12) - 88.4193896778*EURUSD(-13) - 821.374855285*EURUSD(-14) - 754.491291165*EURUSD(-15) + 538.519551372*EURUSD(-16) + 3220.86311608*EURUSD(-17) - 518.070207767*EURUSD(-18) - 2332.53473806*EURUSD(-19) + 569.684891562*EURUSD(-20) - 1619.61207529*EURUSD(-21) + 1641.76931445*EURUSD(-22) - 1414.74117489*EURUSD(-23) - 114.280781428*EURUSD(-24) + 450.449461697*EURUSD(-25) - 337.460964818*EURUSD(-26) + 908.232164753*EURUSD(-27) + 601.738993689*EURUSD(-28) + 861.74494980071*EURUSD(-29) + 259.833316285*EURUSD(-30) - 46.5215488696*EURUSD(-31) - 820.583809759*EURUSD(-32) - 1423.98506887*EURUSD(-33) + 935.969451579*EURUSD(-34) - 803.436564451*EURUSD(-35) + 221.143701299*EURUSD(-36) + 335.777492236*EURUSD(-37) + 650.456824302*EURUSD(-38) + 350.318958532*EURUSD(-39) - 467.384535354*EURUSD(-40) - 1463.62960078*EURUSD(-41) + 1023.33692559*EURUSD(-42) + 531.53858297*EURUSD(-43) - 1804.43807812*EURUSD(-44) + 505.327400995*EURUSD(-45) - 20.3151847226*EURUSD(-46) + 1454.71062626*EURUSD(-47) + 149.481921853*EURUSD(-48) - 1985.4346906*EURUSD(-49) + 8.64522845766*EURUSD(-50) + 1301.22397609*EURUSD(-51) + 1398.9062339*EURUSD(-52) - 1812.25415112*EURUSD(-53) - 815.17727151*EURUSD(-54) - 465.973849717*EURUSD(-55) + 891.665097704*EURUSD(-56) - 33.8677278433*EURUSD(-57) + 1802.96642724*EURUSD(-58) + 103.739651059*EURUSD(-59) + 395.877119657*EURUSD(-60) - 1358.3140469*EURUSD(-61) + 17.0144218275*EURUSD(-62) + 645.959444744*EURUSD(-63) - 1935.40489961*EURUSD(-64) + 847.657103772*EURUSD(-65) - 348.287297241*EURUSD(-66) + 1674.82953896*EURUSD(-67) - 1399.09585978*EURUSD(-68) + 442.848712733*EURUSD(-69) + 498.667519817*EURUSD(-70) + 175.460595585*EURUSD(-71) - 3.23177058628*EURUSD(-72) - 502.970783886*EURUSD(-73) - 486.45378574*EURUSD(-74) - 1284.12753179*EURUSD(-75) + 2212.99339275*EURUSD(-76) + 1011.83438787*EURUSD(-77) - 2762.97407148*EURUSD(-78) + 1603.46426721*EURUSD(-79) - 441.847609369*EURUSD(-80) - 173.0306096*EURUSD(-81) - 672.051786135*EURUSD(-82) - 1106.57500684*EURUSD(-83) + 337.977251734*EURUSD(-84) + 1392.23135411*EURUSD(-85) + 1222.020799*EURUSD(-86) + 327.446848701*EURUSD(-87) - 1208.41468022*EURUSD(-88) + 741.85661795*EURUSD(-89) + 1585.08937121*EURUSD(-90) - 2098.86445785*EURUSD(-91) + 58.0598765644*EURUSD(-92) - 166.744222595*EURUSD(-93) + 67.6457712184*EURUSD(-94) + 98.7949064574*EURUSD(-95) + 1406.32082135*EURUSD(-96) - 1658.83294022*EURUSD(-97) - 273.851042947*EURUSD(-98) + 93.5879401275*EURUSD(-99) + 243.060588194*EURUSD(-100) - 1295.0210728 + 0.08150857192*@TREND))

Everything seems to be there.

The forecast calculation somehow coincides exactly with the actual values.
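For reference, the fitted value of such a probit is just the standard normal CDF of the linear index, since 1 - CNORM(-z) = CNORM(z), which is exactly what the 1-@CNORM(-(...)) expression above computes. A minimal sketch with placeholder numbers (not the estimated coefficients above):

```python
# Minimal sketch (placeholder numbers, not the estimates above): the probit
# fitted probability  p = 1 - CNORM(-x'b)  is the same as  CNORM(x'b).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
b = rng.normal(scale=0.1, size=100)       # stand-in for the 100 lag coefficients
const, trend_coef = 0.05, 1e-4            # stand-ins for the constant and @TREND terms

lags = rng.normal(1.30, 0.01, size=100)   # stand-in for EURUSD(-1) .. EURUSD(-100)
t = 500                                   # stand-in for the @TREND value of the bar

index = lags @ b + const + trend_coef * t # the linear index x'b
p_reversal = norm.cdf(index)              # identical to 1 - norm.cdf(-index)
print(p_reversal)                         # probability of the "1" (reversal) signal
```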


I would like to understand the model. What is CNORM? On which interval was the model trained, and on which was it tested? Do I understand correctly that the first 100 rows of the Excel spreadsheet are the training data? Why are there so few of them (equal to the number of explanatory variables in the model)?
 
gpwr:

I would like to understand the model. What is CNORM? On which interval was the model trained, and on which was it tested? Do I understand correctly that the first 100 rows of the Excel spreadsheet are the training data? Why are there so few of them (equal to the number of explanatory variables in the model)?

@cnorm(x)

the standard normal cumulative distribution function (CDF)


500 bars are taken. The first 100 bars are not used, because they enter the formula as lags, like the period of a moving average. It is not training; the coefficients are estimated on 500 bars.
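A small sketch of that window arithmetic, with made-up data (my reading of the answer, not the author's file): with 100 lags in the formula, the first 100 bars have no complete set of regressors, so 600 bars of history leave 500 usable observations for the probit.

```python
# Sketch of the estimation-window arithmetic (made-up data): 100 lags consume
# the first 100 bars, so 600 bars of history give 500 usable observations.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_bars, n_lags = 600, 100
eurusd = pd.Series(1.30 + 0.001 * rng.standard_normal(n_bars).cumsum(), name="EURUSD")

X = pd.concat({f"lag{k}": eurusd.shift(k) for k in range(1, n_lags + 1)}, axis=1)
X["trend"] = np.arange(n_bars)        # the @TREND regressor
usable = X.notna().all(axis=1)        # rows that have a full set of 100 lags
print(usable.sum())                   # 500 -- the sample the coefficients are estimated on
```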

 
faa1947:

@cnorm(x)

the standard normal cumulative distribution function (CDF)


500 bars are taken. The first 100 bars are not used, because they enter the formula as lags, like the period of a moving average. It is not training; the coefficients are estimated on 500 bars.

Such models can sometimes give accurate forecasts simply because of a high probability of coincidence, given the small number of possible final states, and that in itself means nothing. Just take, for example, the actual close increments over the next 500 bars, compute the forecast from the coefficients estimated on the previous 500 bars, and measure the correlation between the forecast increments and the actual increments, i.e. between the differenced values. The resulting coefficient will give an objective estimate of the forecast quality; in your case an estimate of quantitative correlation can be used. But again, this does not solve the forecasting problem, because to use it as a "useful" system you must be able not only to "enter" the market successfully but also to "exit" it. You are also wrong to assume that the prediction error will accumulate one bar ahead if more bars are taken; in fact that is not guaranteed... Likewise, selecting significant variables is a separate, solvable knowledge-extraction task (Data Mining) for which methods already exist, and it is not decided by whether 100 or 500 bars are taken...
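A sketch of the suggested check, using a plain linear model on increments purely to illustrate the procedure (hypothetical data and lag count, not the thread's probit specification): estimate on the previous 500 bars, produce one-step-ahead forecasts over the next 500, and correlate forecast increments with actual increments.

```python
# Walk-forward correlation check (hypothetical linear model on increments,
# shown only to illustrate the evaluation procedure described above).
import numpy as np

rng = np.random.default_rng(2)
price = 1.30 + 0.001 * rng.standard_normal(1100).cumsum()
dp = np.diff(price)                         # increments of the close

n_lags, n_fit = 10, 500

def lag_matrix(x, p):
    # row t holds [x[t-1], ..., x[t-p], 1] for every t >= p
    cols = [x[p - k:len(x) - k] for k in range(1, p + 1)]
    return np.column_stack(cols + [np.ones(len(x) - p)])

X, y = lag_matrix(dp, n_lags), dp[n_lags:]
beta, *_ = np.linalg.lstsq(X[:n_fit], y[:n_fit], rcond=None)  # fit on the previous 500 bars
y_hat = X[n_fit:n_fit + 500] @ beta                           # one-step forecasts, next 500 bars
corr = np.corrcoef(y_hat, y[n_fit:n_fit + 500])[0, 1]         # out-of-sample correlation
print(corr)   # near zero for this random-walk series, as expected
```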
 
dasmen:
Such models can sometimes give accurate forecasts simply because of a high probability of coincidence, given the small number of possible final states, and that in itself means nothing. Just take, for example, the actual close increments over the next 500 bars, compute the forecast from the coefficients estimated on the previous 500 bars, and measure the correlation between the forecast increments and the actual increments, i.e. between the differenced values. The resulting coefficient will give an objective estimate of the forecast quality; in your case an estimate of quantitative correlation can be used. But again, this does not solve the forecasting problem, because to use it as a "useful" system you must be able not only to "enter" the market successfully but also to "exit" it. You are also wrong to assume that the prediction error will accumulate one bar ahead if more bars are taken; in fact that is not guaranteed... Likewise, selecting significant variables is a separate, solvable knowledge-extraction task (Data Mining) for which methods already exist, and it is not decided by whether 100 or 500 bars are taken...

The whole thread is richer than the last post you commented on. The question of variable significance has been dealt with many times here. Accumulation of the prediction error is an established fact: for the next prediction one has to take the previous predicted value, because the actual value is not yet available. If the actual value is taken, it is a one-step-ahead prediction.
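A toy AR(1) sketch of that distinction (hypothetical model, not the thread's): a static forecast always uses the actual previous value (one step ahead), while a dynamic forecast feeds its own previous forecast back in, which is where the error accumulates over the horizon.

```python
# Static (one-step-ahead) vs dynamic (recursive) forecasting on a toy AR(1).
import numpy as np

rng = np.random.default_rng(3)
a = 0.9
x = np.empty(200)
x[0] = 0.0
for t in range(1, 200):
    x[t] = a * x[t - 1] + rng.standard_normal()

a_hat = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])  # crude AR(1) estimate

static = a_hat * x[99:199]        # each forecast uses the actual previous value
dynamic = np.empty(100)           # each forecast uses the previous forecast
dynamic[0] = a_hat * x[99]
for h in range(1, 100):
    dynamic[h] = a_hat * dynamic[h - 1]

print(np.mean((x[100:200] - static) ** 2))    # stays near the innovation variance
print(np.mean((x[100:200] - dynamic) ** 2))   # grows toward the variance of the series
```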

But these are minor and technical issues.

Increments have been tried. Nothing works, because there is no trend in the increments, yet it is the trend that is being predicted. And here is the main question of the thread: what properties of a model guarantee predictability? A whole set of such properties has been proposed for an ordinary regression model. What you are commenting on is a probit model, and there are other models here that I do not understand.

I would be grateful if you could comment on any of the many points in this thread.

 
faa1947:

...Increments have been tried. Nothing works, because there is no trend in the increments, yet it is the trend that is being predicted. And here is the main question of the thread: what properties of a model guarantee predictability? A whole set of such properties has been proposed for an ordinary regression model. What you are commenting on is a probit model, and there are other models here that I do not understand...

  1. It was mathematically proven many years ago that if the model and the process being analysed coincide (in your terms, the "correct model"), then the best one-step-ahead prediction is given by the Kalman filter. Search the forum for it...
  2. You have been told many times that your model is wrong. Are you so stuck on this regression model that no other models exist?... Is the whole world, in all its diversity, described by this simple model?...
  3. And about the shape of the ACF and its properties you have already been told here more than once...

Here is a link to my work, which has been there for a long time (https://www.mql5.com/ru/code/8295): the ACF, and if you look closely there are two curves, the first one a blue line and the second, red one, the ACF of the quotes. You can compare them (visually)...

There is a model (it exists and has been known for a long time) with which the quotes can be described with sufficient accuracy for practical purposes, and it is not a regression model. True, econometrics textbooks probably say nothing about this model... look for other textbooks.

Good luck to everyone. Happiness and health in the coming year!

 

Happy New Year! May all the models yield to you.

 
faa1947:

Increments have been tried. Nothing works, because there is no trend in the increments, yet it is the trend that is being predicted.

You state:

1. That by switching to increments the trend is lost. This is not true, as the presence of a trend will directly affect the conditional expectation of the increments. Thus the prediction of one is equivalent to the prediction of the other.

2. That the incremental model does not have the property of reversibility. Again, this is not true, because we know the last price level. By predicting the increments, taking their cumulative sum and adding the last known price value, we get a one-to-one mapping back into time/price space (both points are sketched below).
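A small sketch of both points with made-up numbers: a trend in the levels survives as a non-zero mean of the increments, and predicted increments are mapped back to price levels with a cumulative sum plus the last known price.

```python
# Trend in increments, and reconstruction of levels from predicted increments
# (made-up numbers, shown only to illustrate the two points above).
import numpy as np

rng = np.random.default_rng(4)
trend = 0.0005
price = 1.30 + trend * np.arange(1000) + 0.001 * rng.standard_normal(1000)

dp = np.diff(price)
print(dp.mean())                  # ~0.0005: the trend shows up as the mean increment

pred_increments = np.full(20, dp.mean())              # naive forecast of the next 20 increments
pred_levels = price[-1] + np.cumsum(pred_increments)  # back to the price scale
print(pred_levels[:5])
```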

We have a primitive regression model. In-sample it is shown to have a profit factor well above 10; out of sample it is barely above 1, and even that is doubtful. Yet this model is "correctly" constructed.

Question: why does this "correct" model not have the property of stability or predictability?

One can be a theorist, build models with R^2 close to one, and earn nothing from it. Or one can be a practitioner and evaluate models in terms of expected profit and the associated risks. The first case is fine if you want to write an article/dissertation/whatever. If you want to make money, evaluate models by profit/risk first, and only then by R^2 and other statistics.

You can only look at tests inside the sample after you have obtained a stable positive result outside the sample. Otherwise you are wasting your time.
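For reference, the profit factor mentioned above is simply gross profit divided by gross loss; a sketch with hypothetical trade results, computed separately in and out of sample:

```python
# Profit factor = gross profit / gross loss (hypothetical trade results).
import numpy as np

def profit_factor(trade_pnl):
    gains = trade_pnl[trade_pnl > 0].sum()
    losses = -trade_pnl[trade_pnl < 0].sum()
    return np.inf if losses == 0 else gains / losses

rng = np.random.default_rng(5)
in_sample = rng.normal(0.5, 1.0, 200)    # stand-in: the fitted model looks great in-sample
out_sample = rng.normal(0.0, 1.0, 200)   # stand-in: the edge disappears out of sample
print(profit_factor(in_sample), profit_factor(out_sample))
```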

Next step: the applicability of stochastic differential equations to the market. Links, please.

Stochastic differential equations are particularly popular for valuing derivatives such as options. There are applications in statistical arbitrage as well.

The same goes for you. Neural networks in packages (EViews does not have them, but others do) take the place of smoothing, and that is only a small part of the problem, and not the most important part to solve. With neural networks it is an art; with splines and wavelets it is mathematics.
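As one concrete illustration of the splines route (my example, not from the thread): a cubic smoothing spline from SciPy applied to a noisy price-like series, with the parameter s controlling how closely the smooth follows the data.

```python
# Cubic smoothing spline as a non-neural smoother (illustrative example).
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(6)
t = np.arange(300, dtype=float)
price = 1.30 + 0.001 * rng.standard_normal(300).cumsum()

spline = UnivariateSpline(t, price, k=3, s=len(t) * 1e-6)  # s sets the smoothing strength
smoothed = spline(t)
print(np.abs(price - smoothed).mean())    # average deviation of the smooth from the data
```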

Neural networks take the place of non-linear regression models.

 
gpwr:

I would like to understand the model. What is CNORM? On which interval was the model trained, and on which was it tested? Do I understand correctly that the first 100 rows of the Excel spreadsheet are the training data? Why are there so few of them (equal to the number of explanatory variables in the model)?

I would like to understand one thing: if you are predicting the ZZ, how do you calculate the predicted ZigZag step?
 
Trolls:
  1. It was mathematically proven many years ago that if the model and the process being analysed coincide (in your terms, the "correct model"), then the best one-step-ahead prediction is given by the Kalman filter. Search the forum for it...
  2. You have been told many times that your model is wrong. Are you so stuck on this regression model that no other models exist?... Is the whole world, in all its diversity, described by this simple model?...
  3. And about the shape of the ACF and its properties you have already been told here more than once...

Here is a link to my work, which has been there for a long time (https://www.mql5.com/ru/code/8295): the ACF, and if you look closely there are two curves, the first one a blue line and the second, red one, the ACF of the quotes. You can compare them (visually)...

There is a model (it exists and has been known for a long time) with which the quotes can be described with sufficient accuracy for practical purposes, and it is not a regression model. True, econometrics textbooks probably say nothing about this model... look for other textbooks.

Good luck to everyone. Happiness and health in the coming year!!!

(in your terms "correct model")

"Correct" means it has certain properties. Nobody here discusses those properties.

then the best one-step-ahead prediction is given by the Kalman filter. Search the forum for it...

EViews has a model class called state space, i.e. the Kalman filter. But I cannot formulate such a model, although by all accounts it is the most promising one.
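For what it is worth, the simplest state-space ("local level") model is x_t = x_{t-1} + w_t, y_t = x_t + v_t, and its Kalman one-step-ahead prediction takes only a few lines. A hand-rolled sketch with assumed noise variances (this is not EViews sspace syntax):

```python
# Local-level state-space model and its Kalman one-step-ahead predictions
# (assumed noise variances; illustrative, not EViews sspace code).
import numpy as np

rng = np.random.default_rng(7)
q, r = 1e-6, 1e-4                  # state and observation noise variances (assumed)
y = 1.30 + 0.001 * rng.standard_normal(500).cumsum()   # stand-in quote series

x_hat, p = y[0], 1.0               # initial state estimate and its variance
one_step = np.empty(len(y))
for t, obs in enumerate(y):
    one_step[t] = x_hat            # prediction of y_t made before seeing it
    p_pred = p + q                 # predict step (random-walk state)
    k = p_pred / (p_pred + r)      # Kalman gain
    x_hat = x_hat + k * (obs - x_hat)   # update with the new observation
    p = (1 - k) * p_pred

print(np.mean((y[1:] - one_step[1:]) ** 2))   # one-step-ahead MSE
```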

And about the shape of the ACF and its properties you have already been told here more than once...

Here is a link to my work, which has been there for a long time (https://www.mql5.com/ru/code/8295): the ACF, and if you look closely there are two curves, the first one a blue line and the second, red one, the ACF of the quotes. You can compare them (visually)...

I did not understand anything in your post about the ACF.

There is a model (it exists and has long been known) that can describe the quotes with sufficient accuracy for practical purposes, and it is not a regression model.

And what if you drop the intrigue and just name it?

 

anonymous:



1. That the trend is lost in the transition to increments. This is not true, as the presence of a trend will directly affect the conditional expectation of the increments. Thus the prediction of one is equivalent to the prediction of the other.

2. That the incremental model does not have the property of reversibility. Again, this is not true, because we know the last price level. By predicting the increments, taking their cumulative sum and adding the last known price value, we get a one-to-one mapping back into time/price space.

I assert only one thing: my result on increments is much worse than on levels. That is my own result, and I do not generalise it; perhaps someone else will manage to make it work.

One can be a theorist, build models with R^2 close to one, and earn nothing from it. Or one can be a practitioner and evaluate models in terms of expected profit and the associated risks. The first case is fine if you want to write an article/dissertation/whatever. If you want to make money, evaluate models by profit/risk first, and only then by R^2 and other statistics.

You can only look at tests inside the sample after you have obtained a stable positive result outside the sample. Otherwise you are wasting your time.

I disagree with many on the forum on this point. If you want a truck and you get a bicycle, a successful bicycle test will not prove that you have a truck. It is all randomness, which is bound to show up on a real account.

Neural networks take the place of non-linear regression models.

Once again: neural networks do not solve problems in all their diversity, and your comment about non-linearity shows this. Non-linear regression models: non-linear in the variables or in the parameters? (For example, y = a + b*x^2 is linear in the parameters, while y = a*exp(b*x) is not.) And are the parameters constants or random variables? And if they are random, what are their characteristics? That is the question about neural networks. They have their place, but it is their own.