Discussing the article: "Two-sample Kolmogorov-Smirnov test as an indicator of time series non-stationarity" - page 4

 
The Smirnov test by definition applies to samples that are (a) independent and (b) identically distributed, so that each sample is fully described by its univariate distribution.

Trying to use this test to detect violations of (a) and (b) in the samples is not a good idea. Each sample can violate the conditions in its own way, so the number of possible combinations of violations is large.
 
Aleksey Nikolayev #:

We meant fractality as such, not a specific indicator of it. It is usually associated with persistence/antipersistence of a series, which comes from the dependence between neighbouring increments, which in turn is determined by their joint distribution.

As for specific indicators of fractality, FDI is not a very good one: it needs a lot of data to compute and gives no confidence interval for the dimension.

I think you are mixing up two things: the algorithm for calculating a particular statistic, and the conclusion we can draw if it takes certain values.
You probably meant that if the fractality index rejects the random walk (SB) hypothesis, then the joint distribution function of the two-dimensional vector (Xn, Xn-1) is not equal to the product of the one-dimensional distribution functions. Do I understand correctly?

F(Xn,Xn-1) != F(Xn)*F(Xn-1)
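
A rough way to look at this inequality on data is to compare the empirical joint distribution function of the pairs (Xn, Xn-1) with the product of the empirical marginals at some point. The sketch below is only an illustration: the function names, the AR(1) model with coefficient 0.8, the evaluation point (0, 0) and the sample size are all assumptions made for the example, not something taken from the article.

```python
import random

def empirical_factorization_gap(x, a, b):
    """Return |F_joint(a, b) - F(a) * F(b)| estimated from the pairs (x[n], x[n-1])."""
    pairs = list(zip(x[1:], x[:-1]))  # (X_n, X_{n-1})
    m = len(pairs)
    joint = sum(1 for u, v in pairs if u <= a and v <= b) / m
    f_a = sum(1 for u, _ in pairs if u <= a) / m
    f_b = sum(1 for _, v in pairs if v <= b) / m
    return abs(joint - f_a * f_b)

def ar1(n, phi, rng):
    """Hypothetical dependent series: x[t] = phi * x[t-1] + e[t], e[t] ~ N(0, 1)."""
    out, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        out.append(prev)
    return out

rng = random.Random(1)
iid = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # independent increments
dep = ar1(20000, 0.8, rng)                          # dependent increments

gap_iid = empirical_factorization_gap(iid, 0.0, 0.0)
gap_dep = empirical_factorization_gap(dep, 0.0, 0.0)
print("gap for independent series:", gap_iid)
print("gap for AR(1) series:      ", gap_dep)
```

For the independent series the gap should be of the order of sampling noise, while for the dependent one it should be clearly larger. A single evaluation point is of course not a test, only a picture of what the inequality means.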



 
Aleksey Nikolayev #:
The Smirnov test by definition applies to samples that are (a) independent and (b) identically distributed, so that each sample is fully described by its univariate distribution.

Trying to use this test to detect violations of (a) and (b) in the samples is not a good idea. Each sample can violate the conditions in its own way, so the number of possible combinations of violations is large.
Independence, yes. But that is no reason to abandon the Smirnov criterion. Look at how it can work. You compare two homogeneous independent samples and the null hypothesis is not rejected time after time; then at some point, say, dependencies appear in the series, and the Smirnov criterion reacts by rejecting the null hypothesis, because it "does not like" dependencies. So for a trader this undesirable side effect is only a plus.

And what does it mean to require the samples to be identically distributed???

Unfortunately, economic time series cannot afford such a luxury. This requirement is a priori impossible and will never be met. We do not control the experiment the way physicists do, who can achieve ideal conditions for obtaining identically distributed data.





 
Евгений Черныш #:
I think you are mixing up two things: the algorithm for calculating a particular statistic, and the conclusion we can draw if it takes certain values.
You probably meant that if the fractality index rejects the random walk (SB) hypothesis, then the joint distribution function of the two-dimensional vector (Xn, Xn-1) is not equal to the product of the one-dimensional distribution functions. Do I understand correctly?

F(Xn,Xn-1) != F(Xn)*F(Xn-1)



Reasoning about fractality, FDI and the like takes us outside mathematical statistics. What distribution does FDI have under a random walk (SB)? I don't know (and nobody knows, although the asymptotic distribution can probably be calculated). So the term "statistic" hardly applies to FDI. In the few sound studies on real prices (where the Monte Carlo method was used to calculate the p-value for Hurst under SB), the SB null hypothesis could not be rejected.

We only have empirical guesses about the relationship between fractal dimension and correlations of increments. At this level, yes, you have understood me correctly: once we speak of fractality, the dependence {F(Xn,Xn-1) != F(Xn)*F(Xn-1)} appears and we can no longer speak of the applicability of the Smirnov test. That is why I think Smirnov and FDI are not similar. At most, at the empirical level, one can consider Smirnov more applicable when FDI is close to its theoretical value under SB (although there is some doubt about this for trending assets).

 

Евгений Черныш #:

What does it mean to require the samples to be identically distributed?

Each of the two samples must be drawn from an i.i.d. set of random variables. As I have already written, daily fluctuations in volatility (due to market sessions, for example) violate the i.i.d. condition.

 
Евгений Черныш #:
Independence, yes. But that is no reason to abandon the Smirnov criterion. Look at how it can work. You compare two homogeneous independent samples and the null hypothesis is not rejected time after time; then at some point, say, dependencies appear in the series, and the Smirnov criterion reacts by rejecting the null hypothesis, because it "does not like" dependencies. So for a trader this undesirable side effect is only a plus.

You will not be able to tell a violation of "i." from a violation of "i.d." within the i.i.d. condition :) Nor will you be able to determine for which of the two samples it is violated. Do the math yourself: each sample can be in one of four states (no violation, i. violated, i.d. violated, both violated), which gives 16 = 4*4 variants in total, of which only one has no violation of the conditions.
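
The count can be spelled out by brute force. The four per-sample states used here are an assumed reading of the "4" in 4*4 (each of the two conditions either holds or is violated), and the labels are made up for the illustration.

```python
from itertools import product

# Assumed per-sample states: each of "i." and "i.d." either holds or is violated.
STATES = ["no violation", "i. violated", "i.d. violated", "both violated"]

# Every combination of (state of sample 1, state of sample 2).
combos = list(product(STATES, repeat=2))
clean = [c for c in combos if c == ("no violation", "no violation")]

print(len(combos), "variants in total,", len(clean), "without violations")
for first, second in combos:
    print(f"sample 1: {first:<14} sample 2: {second}")
```

A rejection by the test says only that we are somewhere among the 15 "bad" variants (or that a false alarm occurred), not which one.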

So Smirnov is designed precisely to "dislike" deviations in i.d. :) And you want to confuse it :)

 
Aleksey Nikolayev #:

Reasoning about fractality, FDI and the like takes us outside mathematical statistics. What distribution does FDI have under a random walk (SB)? I don't know (and nobody knows, although the asymptotic distribution can probably be calculated). So the term "statistic" hardly applies to FDI. In the few sound studies on real prices (where the Monte Carlo method was used to calculate the p-value for Hurst under SB), the SB null hypothesis could not be rejected.

We only have empirical guesses about the relationship between fractal dimension and correlations of increments. At this level, yes, you have understood me correctly: once we speak of fractality, the dependence {F(Xn,Xn-1) != F(Xn)*F(Xn-1)} appears and we can no longer speak of the applicability of the Smirnov test. That is why I think Smirnov and FDI are not similar. At most, at the empirical level, one can consider Smirnov more applicable when FDI is close to its theoretical value under SB (although there is some doubt about this for trending assets).

You are taking the independence requirement for the Smirnov criterion too literally. The requirement is needed so that the distribution of the statistic converges to the Kolmogorov distribution, which is what lets us test the null hypothesis. Therefore, if independence is violated, the Smirnov criterion can serve as an indicator of statistical relationships in the data. In other words, the independence requirement in no way forbids applying the Smirnov criterion to data that may contain statistical relationships. Furthermore, nobody forbids calculating the correlation of the data under study. If no linear dependence is found, then consider the independence requirement practically satisfied, and any difference in the distribution of Smirnov distances is then caused solely by heterogeneity of the data. For nonlinear dependencies, the distribution of Smirnov distances differs only insignificantly from the Kolmogorov distribution (at least this is so for the logistic map). So it is clear that the Smirnov criterion cannot be used on its own; additional methods of analysis are needed.
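
To make the logistic map example concrete, here is a small sketch of a series with a purely nonlinear dependence: each value is fully determined by the previous one, yet the ordinary lag-1 correlation is close to zero. The parameter r = 4, the starting point 0.3, the series length and the helper names are assumptions chosen for the illustration.

```python
def logistic_map(n, x0=0.3, r=4.0):
    """Generate n values of x[t+1] = r * x[t] * (1 - x[t])."""
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(x)
    return out

def pearson(u, v):
    """Plain sample Pearson correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

x = logistic_map(20000)

# Linear check: correlation between x[t] and x[t-1].
linear = pearson(x[1:], x[:-1])

# Nonlinear check: for r = 4, x[t] = 1 - 4 * (x[t-1] - 0.5)**2,
# so x[t] is an exact linear function of the squared deviation.
nonlinear = pearson(x[1:], [(v - 0.5) ** 2 for v in x[:-1]])

print("lag-1 correlation:            ", linear)
print("correlation with squared term:", nonlinear)
```

So a correlation check by itself only speaks about linear dependence; how much a nonlinear dependence of this kind shifts the distribution of Smirnov distances is a separate question that the sketch does not answer.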

As for FDI, under SB it most likely has exactly the same distribution as the Hurst exponent, i.e. normal. Everything there can be calculated with the Monte Carlo method; Peters did this in his book "Fractal Market Analysis". FDI is no different from any other statistic in that it is itself a random variable, like the sample mean or the sample variance, so you can easily find out how it behaves under SB, on small samples, on large samples, and so on.
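
The Monte Carlo idea can be sketched as follows: simulate many series under the SB hypothesis (independent Gaussian increments), compute the statistic on each, and see where the observed value falls. The classical rescaled range R/S is used here as a simple stand-in for FDI or the Hurst exponent; the sample size, the number of trials and the AR(1) "persistent" series are assumptions for the illustration, not the procedure from Peters' book.

```python
import random

def rescaled_range(increments):
    """Classical R/S: range of cumulative mean-adjusted sums divided by the std."""
    n = len(increments)
    mean = sum(increments) / n
    cum, lo, hi = 0.0, 0.0, 0.0
    for d in increments:
        cum += d - mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = (sum((d - mean) ** 2 for d in increments) / n) ** 0.5
    return (hi - lo) / std

def monte_carlo_p_value(observed, n, trials, rng):
    """Share of simulated SB series whose R/S is at least as large as `observed`."""
    hits = 0
    for _ in range(trials):
        sim = rescaled_range([rng.gauss(0.0, 1.0) for _ in range(n)])
        if sim >= observed:
            hits += 1
    return hits / trials

rng = random.Random(7)
n = 500

# Increments of a genuine random walk.
walk = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical persistent increments: AR(1) with coefficient 0.9.
persistent, prev = [], 0.0
for _ in range(n):
    prev = 0.9 * prev + rng.gauss(0.0, 1.0)
    persistent.append(prev)

p_walk = monte_carlo_p_value(rescaled_range(walk), n, 500, rng)
p_persistent = monte_carlo_p_value(rescaled_range(persistent), n, 500, rng)
print("p-value, random walk increments:", p_walk)
print("p-value, persistent increments: ", p_persistent)
```

The same loop works for any other statistic: replace `rescaled_range` with the function that computes it, and the simulated values give its distribution under SB for the chosen sample size.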

 
Aleksey Nikolayev #:

Each of the two samples must be drawn from an i.i.d. set of random variables. As I have already written, daily fluctuations in volatility (due to market sessions, for example) violate the i.i.d. condition.

The identical-distribution requirement is fine for proving theorems and for rigorous work inside a department of mathematical statistics, but for real data it is too strict. To meet it you would have to control the course of the experiment and make sure that the conditions under which the random variable is observed do not change over time. Clearly, with stock quotes we control nothing. We simply watch the invisible hand of the market pull a number (a price increment) out of the box, but we do not know whether the contents of the box change from one moment to the next (and nobody ever will). This is the reality, and we have to work with what we have.

In my opinion, comparing one day with another is correct, because each sample then contains the Asian, European and American sessions. Comparing the Asian session with the American one would be wrong. Of course, everyone decides for themselves.

 
Aleksey Nikolayev #:

You will not be able to tell a violation of "i." from a violation of "i.d." within the i.i.d. condition :)

I can, and so can you, at least for model data.

Is an autoregressive process identically distributed? Yes, it is.

Is it independent? No.

Does the Smirnov criterion "see" that? Yes.
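
This can be checked on model data with a small simulation. The sketch below compares pairs of samples that all have the same N(0, 1) marginal distribution, once with independent values and once generated by a stationary AR(1) process, and counts how often the two-sample Smirnov test rejects at the nominal 5% level. The coefficient 0.8, the sample size, the number of trials and the use of the asymptotic critical constant 1.358 are assumptions made for the illustration.

```python
import math
import random

def ks_two_sample(a, b):
    """Two-sample Smirnov distance D = sup |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def rejects_at_5pct(a, b):
    """Asymptotic 5% rule: reject if D > 1.358 * sqrt((n + m) / (n * m))."""
    n, m = len(a), len(b)
    return ks_two_sample(a, b) > 1.358 * math.sqrt((n + m) / (n * m))

def ar1(n, phi, rng):
    """Stationary AR(1) with N(0, 1) marginals; phi = 0 gives i.i.d. values."""
    sigma = math.sqrt(1.0 - phi * phi)  # keeps the marginal variance equal to 1
    prev = rng.gauss(0.0, 1.0)          # start from the stationary distribution
    out = []
    for _ in range(n):
        prev = phi * prev + sigma * rng.gauss(0.0, 1.0)
        out.append(prev)
    return out

def rejection_rate(phi, n, trials, rng):
    """Fraction of trials in which two independent AR(1) samples are 'different'."""
    hits = 0
    for _ in range(trials):
        if rejects_at_5pct(ar1(n, phi, rng), ar1(n, phi, rng)):
            hits += 1
    return hits / trials

rng = random.Random(42)
rate_iid = rejection_rate(0.0, 200, 300, rng)
rate_ar = rejection_rate(0.8, 200, 300, rng)
print("rejection rate, i.i.d. samples:", rate_iid)
print("rejection rate, AR(1) samples: ", rate_ar)
```

In both cases the samples are identically distributed, so any excess of rejections in the AR(1) case comes from the dependence alone. Note that the test output itself does not say whether a rejection was caused by dependence or by a difference in distributions.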

 
Евгений Черныш #:

You are taking the independence requirement for the Smirnov criterion too literally. The requirement is needed so that the distribution of the statistic converges to the Kolmogorov distribution, which is what lets us test the null hypothesis. Therefore, if independence is violated, the Smirnov criterion can serve as an indicator of statistical relationships in the data. In other words, the independence requirement in no way forbids applying the Smirnov criterion to data that may contain statistical relationships.

Imho, there is a clear problem with the logic here: a tautology from which something further is then deduced.