Dependency statistics in price quotes (information theory, correlation and other feature selection methods)

 
alexeymosc:

Well said! Alexei, we are on the side of market inefficiency. And we already have practical results showing this, though they are not visible through the prism of the classical statistical-econometric approach.

As for your result: you ignored my post.

So be it. But.

Millions of people, not just me, know what to do with the classical ACF. After detrending, the ACF rarely shows dependencies beyond 10 lags, and if it does, the detrending was most likely poor. However, if dependencies remain and the number of lags exceeds 40 (135 in your picture), that points to fractional integration models (FARIMA). And what follows from your non-classical approach? What models follow once information-theoretic dependencies are detected?
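As an aside for the reader, here is a minimal sketch of this kind of ACF check in Python with statsmodels; the synthetic data is a stand-in for the detrended series from the attached file, and the lag counts are illustrative, not the thread's actual numbers.

```python
# Sketch: count how many ACF lags are significant at the 5% level.
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)          # placeholder for the detrended returns

nlags = 200
r, confint = acf(x, nlags=nlags, alpha=0.05)

# Lag k is significant if zero lies outside its 95% confidence interval.
significant = [k for k in range(1, nlags + 1)
               if not (confint[k, 0] <= 0.0 <= confint[k, 1])]
print(len(significant), significant[:20])
# Many significant lags far beyond ~40 would hint at long memory (FARIMA-type).
```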

 
faa1947:

Could be.

Any confidence-interval statement sounds like this: at the 5% level (for example), the null hypothesis is (or is not) confirmed.

What is your null hypothesis? Where is the confidence interval? And so on. While the ACF is something I understand, your graph is not. If the maximum is 2.098 bits, then 0.05 out of 2.098 is not worth discussing. And the questions raised at the beginning of the thread have not been resolved.

By the way, what did you calculate the ACF on?

What I calculated the ACF on, I already wrote: the data from the attached file, except that I take the entire series, not 100 points like you. By the way, I don't understand why you take only 100 data points. That's not enough, IMHO.

About the confidence interval. My result sounds like this: at the 0.01 level, the null hypothesis that the mutual information statistics between the zero bar and its lags do not differ between the random and the source series is not confirmed.

Sorry I didn't answer right away. It just slipped my mind, and I was a bit busy.
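For readers who want to try this at home, a minimal sketch of a test in this spirit, under assumptions of my own (equal-width binning at 8 bins, 1000 shuffled surrogates, a single lag); the article's actual procedure may differ.

```python
# Sketch: is the mutual information between the zero bar and lag k larger on
# the source series than on shuffled (memoryless) surrogates?
import numpy as np

def mutual_info_bits(x, y, bins=8):
    """MI of two samples in bits, estimated from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
returns = rng.standard_normal(5000)    # placeholder for the file's return series
lag = 5
mi_real = mutual_info_bits(returns[lag:], returns[:-lag])

# Null distribution: MI on shuffled copies, where any memory is destroyed.
mi_null = []
for _ in range(1000):
    s = rng.permutation(returns)
    mi_null.append(mutual_info_bits(s[lag:], s[:-lag]))
p_value = (np.array(mi_null) >= mi_real).mean()
print(mi_real, p_value)                # p < 0.01 would reject "no memory" at lag 5
```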

 
Do I understand correctly that the ACF here is computed for a linear sequence of the series? And is it possible to move to the notion of correlation between distribution plots as the series length grows?
 
alexeymosc:

What I calculated the ACF on, I already wrote: the data from the attached file, except that I take the entire series, not 100 points like you. By the way, I don't understand why you take only 100 data points. That's not enough, IMHO.

About the confidence interval. My result sounds like this: at the 0.01 level, the null hypothesis that the mutual information statistics between the zero bar and its lags do not differ between the random and the source series is not confirmed.

Thank you, you have provided me with complete clarity.
 
faa1947:
Thank you, you have provided me with complete clarity.
You're welcome. That was the main message of my article. I deliberately ran tests at the end: Kolmogorov-Smirnov and the Mann-Whitney U-test, i.e. tests for samples that do not assume any particular type of distribution. Both tests showed that the null hypothesis is not confirmed. How to interpret this is a much broader topic.
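For readers who want to reproduce this kind of comparison, a minimal SciPy sketch, under my own assumption that we compare two samples of MI values, one from the source series and one from shuffled surrogates; the numbers here are synthetic placeholders.

```python
# Sketch: distribution-free two-sample tests, as mentioned above.
import numpy as np
from scipy.stats import ks_2samp, mannwhitneyu

rng = np.random.default_rng(1)
mi_source = rng.gamma(2.0, 0.05, size=400)   # placeholder: MI values, source series
mi_random = rng.gamma(1.8, 0.05, size=400)   # placeholder: MI values, shuffled series

print(ks_2samp(mi_source, mi_random))        # Kolmogorov-Smirnov
print(mannwhitneyu(mi_source, mi_random))    # Mann-Whitney U
# Small p-values mean the two samples of MI values differ in distribution.
```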
 
alexeymosc:
You're welcome. That was the main message of my article. I deliberately ran tests at the end: Kolmogorov-Smirnov and the Mann-Whitney U-test, i.e. tests for samples that do not assume any particular type of distribution. Both tests showed that the null hypothesis is not confirmed. How to interpret this is a much broader topic.
So what did you compute the tests and the ACF in, anyway?
 
faa1947:
So what did you compute the tests and the ACF in, anyway?
Ah, now I understand the question. Statistica.
 
alexeymosc:
Ah, now I understand the question. Statistica.
The penultimate step is EViews, and then the last one is R.
 
faa1947:
The penultimate step is EViews, and then the last one is R.

I've heard a lot about EViews from you already; I'll give it a try. R I have also heard of, and even seen; I'll try it too when I have time. I read on a medical forum that, unfortunately, the results of test calculations sometimes differ between programs.

And Excel in general even differs from Statistica in the quality of its PRNG: I have myself observed differences in the smoothness of the normal distribution bell curve.

 

When I have time, I would like to do the following in this thread: by analogy with partial autocorrelation (where the influence of intermediate lags is cut off), cut off the influence of intermediate lags when calculating mutual information.

Here is an example. This is the autocorrelation of EURUSD H1 volatility (absolute returns) at depths of up to 480 lags:

And this is what the chart of partial autocorrelations looks like, i.e. with the influence of intermediate lags (spurious correlations) removed:

You can see that many of the correlations are cut off at once.
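A minimal sketch of how such a pair of charts could be produced with statsmodels; the data here is a synthetic stand-in for the hourly absolute returns, not the actual EURUSD series.

```python
# Sketch: ACF vs. partial ACF of absolute returns, as in the charts above.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(2)
abs_returns = np.abs(rng.standard_normal(10000))  # placeholder for |EURUSD H1 returns|

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6))
plot_acf(abs_returns, lags=480, ax=ax1)    # raw autocorrelations
plot_pacf(abs_returns, lags=480, ax=ax2)   # intermediate-lag influence removed
plt.show()
```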

Here I want to do a similar thing, only for the series of signed returns. At the very least it will show up to which bar there really is memory. A rough sketch of what such a "partial" mutual information could look like follows below.
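A minimal sketch of one possible reading of this idea, under my own assumptions (coarse equal-width binning and conditioning on a single intermediate lag rather than all of them): the conditional mutual information I(x_t; x_{t-k} | x_{t-j}) estimated from a 3-D histogram.

```python
# Sketch: conditional MI I(x_t ; x_{t-k} | x_{t-j}), a "partial" analogue of MI.
import numpy as np

def cond_mutual_info_bits(x, y, z, bins=6):
    """I(X;Y|Z) in bits: sum of p(x,y,z) * log2[p(z)p(x,y,z) / (p(x,z)p(y,z))]."""
    pxyz, _ = np.histogramdd(np.column_stack([x, y, z]), bins=bins)
    pxyz /= pxyz.sum()
    pxz = pxyz.sum(axis=1, keepdims=True)      # p(x,z)
    pyz = pxyz.sum(axis=0, keepdims=True)      # p(y,z)
    pz = pxyz.sum(axis=(0, 1), keepdims=True)  # p(z)
    nz = pxyz > 0
    return float((pxyz[nz] * np.log2((pz * pxyz)[nz] / (pxz * pyz)[nz])).sum())

rng = np.random.default_rng(3)
r = rng.standard_normal(20000)                 # placeholder for signed returns
k, j = 10, 5                                   # target lag and an intermediate lag
x, y, z = r[k:], r[:-k], r[k - j:-j]           # x_t, x_{t-k}, x_{t-j}
print(cond_mutual_info_bits(x, y, z))          # near zero for a memoryless series
```

Conditioning on all intermediate lags at once, the way the PACF effectively does, would need far more data than a histogram estimator can handle, which is why this sketch conditions on a single intermediate lag.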