Zero sample correlation does not necessarily mean there is no linear relationship - page 40

 

Correlation coefficient = 0.766654

Everything was calculated in Excel. One caveat: I took the gold quotes from MT, because I was too lazy to manually convert the decimal commas to points in your data.

 

I double-checked the data: I had gotten it slightly wrong. First, the coefficients there were calculated not on total open interest but on the open interest of gold hedgers; second, I have 3 zero values for OM at the end of the data, which could also have a strong impact. Anyway, the updated coefficients:

Pearson: 0.1968

Spearman: 0.2135

Kendall: 0.1430.

As you can see, it's gotten better.
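For anyone who wants to check such figures outside Excel, here is a minimal sketch of the three coefficients in plain Python. The data are made up for illustration (they are not the gold or open-interest series discussed here), the helper names are my own, and Kendall is computed as the simple tau-a variant without tie correction, which is an assumption about which variant was used above. The second series shows how three trailing zeros can distort Pearson's coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Pearson coefficient: sum of centered products over the product of norms."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # tied pairs contribute 0
    return s / (n * (n - 1) / 2)


# Made-up toy data, NOT the series from this thread.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2]
b_with_zeros = b[:-3] + [0.0, 0.0, 0.0]  # last three values replaced by zeros

for name, series in (("clean", b), ("trailing zeros", b_with_zeros)):
    print(name,
          round(pearson(a, series), 4),
          round(spearman(a, series), 4),
          round(kendall_tau_a(a, series), 4))
```

On this toy sample the three trailing zeros are enough to flip the sign of Pearson's coefficient, so it is worth cleaning such values before comparing numbers.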

 
Demi:

Correlation coefficient = 0.766654

Everything was calculated in Excel. One caveat: I took the gold quotes from MT, because I was too lazy to manually convert the decimal commas to points in your data.

You can't compute the correlation on the series themselves, only on their first differences.
 
Why not?
 
Demi:

Why not?

About half of the posts in this thread are devoted to a discussion of this issue (started here)

My opinion: estimating correlation with Pearson's correlation coefficient, like estimating the expectation with the arithmetic mean and the variance with the RMS deviation, is acceptable only for elements of sets from a linear space. Otherwise, you need either to transform the original data (for example, for price time series, convert the measurements from an absolute scale to an interval scale) or to adjust the estimation formulas.
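As an illustration of such a transformation, here is a minimal sketch in plain Python. Treating log returns as the "interval scale" version of a price series is my own reading, not something stated above, and the prices and function names are made up. The point it shows is that log returns are additive, so an arithmetic mean over them is meaningful.

```python
from math import log, isclose


def first_differences(series):
    """Absolute changes between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def log_returns(prices):
    """Log of consecutive price ratios; requires strictly positive prices."""
    return [log(b / a) for a, b in zip(prices, prices[1:])]


# Made-up prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]

diffs = first_differences(prices)
rets = log_returns(prices)

# Additivity check: the sum of log returns equals the log of the
# overall price ratio from the first to the last observation.
additive = isclose(sum(rets), log(prices[-1] / prices[0]))
print(diffs, additive)
```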

 
GaryKa:

About half of the posts in this thread are devoted to discussion of this issue (started here)

My opinion: estimating correlation with Pearson's correlation coefficient, like estimating the expectation with the arithmetic mean and the variance with the RMS deviation, is acceptable only for elements of sets from a linear space. Otherwise, you need either to transform the original data (for example, for price time series, convert the measurements from an absolute scale to an interval scale) or to adjust the estimation formulas.

Actually here.

There is a lot of text there. The correlation can be calculated both between the series themselves and between their first differences. The author posted two charts and showed correlation coefficients on the order of 0.00... That struck me, so I recalculated. But the author has since corrected himself.

P.S. We should keep things simpler....

 

C-4:

Obviously, first differences, i.e. I(0) series, are required for the calculation, because with I(1) we are in for an ambush: the series we are dealing with are always positive (the price is always greater than zero). But more on that later.


Heh, not obvious. For Pearson's correlation coefficient it doesn't matter whether the series are positive or negative; what matters is whether there is covariance, i.e. similarity of the dynamics. Uncorrelated first differences do not at all imply that the original series are uncorrelated. Moreover, taking the difference destroys the very components of linear dependence that Pearson's coefficient captures. There is therefore nothing unusual in the result obtained, and the conclusion is

1. As you can see the I(1) series cannot be used at all. For series whose correlation is not obvious and not rigidly functional, correlation coefficients are absolutely useless.

The claim that the correlation coefficient is overestimated is simply wrong: the calculation centers the process (the sample mean is subtracted), so the coefficient can be positive or negative. That is, 15% in your case is a perfectly realistic coefficient, about what I would guess from looking at the chart.
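A minimal sketch of the point about differencing, in plain Python. The two series are a made-up deterministic construction (a shared linear trend plus out-of-phase oscillations), not market data, and the names are my own. The levels come out almost perfectly correlated, while the first differences are nearly uncorrelated, because differencing turns the shared trend into a constant that centering then removes.

```python
from math import sin, cos, sqrt


def pearson(x, y):
    """Pearson coefficient; the sample means are subtracted (centering)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def first_differences(series):
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical series: common trend t, plus sin(t) in one and cos(t) in the other.
n = 200
x = [t + sin(t) for t in range(n)]
y = [t + cos(t) for t in range(n)]

r_levels = pearson(x, y)
r_diffs = pearson(first_differences(x), first_differences(y))

print("levels:", round(r_levels, 4))
print("first differences:", round(r_diffs, 4))
```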

 
alsu:

I.e. 15% in your case is a perfectly realistic coefficient, which is about what I would give looking at the graph visually.

I do agree with this.

alsu:

Heh, not obvious. For Pearson's correlation coefficient it doesn't matter whether the series are positive or negative; what matters is whether there is covariance, i.e. similarity of the dynamics. Uncorrelated first differences do not at all imply that the original series are uncorrelated. Moreover, taking the difference destroys the very components of linear dependence that Pearson's coefficient captures. There is therefore nothing unusual in the result obtained...

OK, then why is it that if we generate 100 independent I(1) time series with a slight positive drift (so that most of them stay in the region > 0), build their correlation matrix, and plot a histogram of the coefficients, we see nothing resembling a normal distribution on that histogram, but this instead:

We can see that among the 10,000 pairwise combinations (100*100), there are about as many with a correlation of 0.5 or -0.5 as with a correlation near 0. In other words, the probability that two independent, positive random walks have a correlation coefficient of 0.0 is the same as for any other value from -1.0 to +1.0. Which means I(1) cannot be used. Somehow.
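The experiment can be reproduced roughly as follows. This is a sketch under my own assumptions: Gaussian steps, zero drift by default, 300 independent pairs of 500 steps instead of the full 100*100 matrix, and a fixed seed. Pearson's coefficient does not change when a constant is added to a series, so shifting the walks into the positive region would not alter the result. The sketch only measures how often the absolute coefficient exceeds 0.5; it makes no claim about the exact shape of the histogram.

```python
import random
from math import sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def random_walk(n_steps, rng, drift=0.0):
    """Cumulative sum of independent Gaussian steps (an I(1) series)."""
    level, path = 0.0, []
    for _ in range(n_steps):
        level += rng.gauss(drift, 1.0)
        path.append(level)
    return path


def first_differences(series):
    return [b - a for a, b in zip(series, series[1:])]


def share_above(values, threshold):
    """Fraction of values whose absolute value exceeds the threshold."""
    return sum(abs(v) > threshold for v in values) / len(values)


rng = random.Random(42)  # fixed seed, chosen arbitrarily
n_pairs, n_steps = 300, 500

r_levels, r_diffs = [], []
for _ in range(n_pairs):
    a = random_walk(n_steps, rng)
    b = random_walk(n_steps, rng)  # generated independently of a
    r_levels.append(pearson(a, b))
    r_diffs.append(pearson(first_differences(a), first_differences(b)))

print("share of |r| > 0.5, levels:", share_above(r_levels, 0.5))
print("share of |r| > 0.5, first differences:", share_above(r_diffs, 0.5))
```

Setting drift to a small positive value in random_walk brings the setup closer to the one described above.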

 

The problem of correlation is on a completely different plane.

When a correlation coefficient is calculated, we always get a number. The algorithm has no provision for a coefficient of NA, i.e. "no value". Not zero, but "no value". This is why it is possible to get a correlation of quotes with Saturn's rings, and at the same time with nose problems.

A correlation coefficient should only be calculated for pairs that you know, from their substance, to be potentially related. At a minimum. And in general there needs to be a substantive justification for the existence of such a relationship. In that case the resulting figure can be interpreted as a quantitative measure of that relationship.

I am silent about the rest of the subtleties of the calculation.

 
faa1947:

The problem of correlation is on a completely different plane.

When a correlation coefficient is calculated, we always get a number. The algorithm has no provision for a coefficient of NA, i.e. "no value". Not zero, but "no value". This is why it is possible to get a correlation of quotes with Saturn's rings, and at the same time with nose problems.

A correlation coefficient should only be calculated for pairs that you know, from their substance, to be potentially related. At a minimum. And in general there needs to be a substantive justification for the existence of such a relationship. In that case the resulting figure can be interpreted as a quantitative measure of that relationship.

I am silent about all other subtleties of calculation.

This is all nonsense. Everything in this world is "potentially connected". Even the ocean temperature off the coast of Mexico has a functional effect on wheat yields in France.

A correlation coefficient can also be calculated between phenomena that are not causally related. The question is how to interpret that coefficient.