Discussing the article: "Two-sample Kolmogorov-Smirnov test as an indicator of time series non-stationarity" - page 2

 
It is interesting to compare iSmirnovDistance with a fractal dimension (like this https://www.mql5.com/en/code/20586).
Fractal dimension index (Sevcik/Matulich)
Mandelbrot describes the Fractal Dimension Index (FDI) as a way to measure "how convoluted and irregular" something is. The FDI can be used as a stock market indicator. The closer prices move in a one-dimensional straight line, the closer the FDI moves to 1.0. The more closely prices resemble a two-dimensional plane, the closer the FDI moves to 2.0.
 
Stanislav Korotky #:
It is interesting to compare iSmirnovDistance with fractal dimension (like this https://www.mql5.com/en/code/20586).
The Smirnov criterion (and similar ones) is, so to speak, a zero-level, basic indicator. It does not tell you whether to buy or sell; it tells you how much data to take when analysing first-level indicators such as FDI, which are the ones that actually give trading signals. At least that is how I see it.
 
Aleksey Nikolayev econometrics.

In general, the article is good.

There are indeed many tests of heterogeneity, as the topic is very important.
I understand that Pettitt's test is based on ranks, but I have found almost no information about it.
 

For me, the window for observations is too small.

However, even if we take this small window, maybe it makes sense to compare it not with the neighbouring window, but with the windows of the last year or five years? That would give a kind of chessboard showing how many windows were similar, which we could then group and perhaps classify, and then evaluate for patterns and their probabilistic outcomes.

Eugene Chernysh, have you done something like this?
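
A minimal sketch of the "chessboard" idea, assuming each window is simply a list of numbers (returns, for example). The names `smirnov_distance` and `distance_board` are hypothetical and not taken from the article; the statistic is the plain two-sample Smirnov distance, i.e. the largest gap between the two empirical distribution functions.

```python
from bisect import bisect_right


def smirnov_distance(x, y):
    """Largest absolute gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    # The gap can only change at observed values, so checking those is enough.
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


def distance_board(windows):
    """Symmetric matrix of Smirnov distances between every pair of windows."""
    n = len(windows)
    board = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            board[i][j] = board[j][i] = smirnov_distance(windows[i], windows[j])
    return board
```

Cells close to 0 mark pairs of windows with similar distributions, cells close to 1 mark very different ones; any grouping or classification would then work on this matrix.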

 

Eugene Chernysh #:

I understand that Pettitt's test is based on ranks, but I have found almost no information about it.

I usually use its implementation from the trend package in R. There are references to sources in the description.

 
Aleksey Vyazmikin #:

To me, the window for observation is too small.

However, even if we take this small window, maybe it makes sense to compare it not with the neighbouring window, but with the windows of the last year or five years? That would give a kind of chessboard showing how many windows were similar, which we could then group and perhaps classify, and then evaluate for patterns and their probabilistic outcomes.

Imho, this would be typical p-hacking.

 
Aleksey Vyazmikin #:

To me, the window for observation is too small.

However, even if we take this small window, maybe it makes sense to compare it not with the neighbouring window, but with the windows of the last year or five years? That would give a kind of chessboard showing how many windows were similar, which we could then group and perhaps classify, and then evaluate for patterns and their probabilistic outcomes.

Eugene Chernysh, have you done something like this?

No, I have not tried it the way you describe, but it seems to me that with this approach the distribution of Smirnov distances would be the same as when comparing two consecutive days. What can be done, though, is to collect statistics on the average number of days between two rejections of the null hypothesis of homogeneity. That would give an idea of how much time we have, on average, before a new distribution establishes itself in the market.
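
Collecting the gaps between rejections could look roughly like the sketch below. It is only an illustration under stated assumptions: each day is a list of numbers, consecutive days are compared with the two-sample Smirnov distance, and the critical value uses the standard large-sample approximation rather than exact tables. The function names and the synthetic data are hypothetical, not taken from the article.

```python
import math
import random
from bisect import bisect_right


def smirnov_distance(x, y):
    """Largest absolute gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


def critical_value(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample Smirnov distance."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


def rejection_gaps(days, alpha=0.05):
    """Number of days between successive rejections of homogeneity,
    testing each day against the previous one."""
    rejected = [
        i
        for i in range(1, len(days))
        if smirnov_distance(days[i - 1], days[i])
        > critical_value(len(days[i - 1]), len(days[i]), alpha)
    ]
    return [later - earlier for earlier, later in zip(rejected, rejected[1:])]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "days": the mean switches every 10 days (purely illustrative).
    days = [[rng.gauss(d // 10 % 2, 1.0) for _ in range(200)] for d in range(60)]
    gaps = rejection_gaps(days)
    if gaps:
        print("mean gap between rejections:", sum(gaps) / len(gaps))
```

The list returned by `rejection_gaps` can be averaged or plotted as a histogram to see how long a distribution typically lasts.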



 
Aleksey Nikolayev #:

Imho, it would turn out to be typical p-hacking.

How do you see it? I'm talking about a study on the similarity of days, and the similarity of predictor behaviour on those days.

I don't know the outcome in advance, so there is no point in fitting the study to a desired result.

If we can classify such groups, even within a day, we can use separate models for them on predictors with higher probability.

 
Eugene Chernysh #:
the distribution of Smirnov distances would be the same as when comparing two consecutive days.

How is that possible? Do I understand correctly that comparing the last day with a day 100 days ago would give roughly the same estimate as comparing the last day with the day before it? I.e. the difference varies within a narrow range?

Eugene Chernysh #:
What can be done, though, is to collect statistics on the average number of days between two rejections of the null hypothesis of homogeneity. That would give an idea of how much time we have, on average, before a new distribution establishes itself in the market.

Well, it would also be interesting to look at a histogram of how often the distribution changes.

 
Aleksey Vyazmikin #:

How do you see it?


As usual: multiple repetitions of the same test on the same data. With N days, the test is repeated N*(N-1)/2 times (the number of pairs of days). It should be N/2, i.e. each day used in only one pair.

Not that I'm trying to forbid anyone from doing this :) It's just that, imho, it is the first step towards self-deception.
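
To put numbers on this, here is a small sketch. The function names are hypothetical, and it assumes every individual test rejects a true null hypothesis with probability exactly alpha. The expected count of false rejections is then simply the number of tests times alpha, whether or not the tests are independent.

```python
from math import comb


def all_pairs(n_days):
    """Number of tests when every day is compared with every other day."""
    return comb(n_days, 2)  # equals N*(N-1)/2


def disjoint_pairs(n_days):
    """Number of tests when each day is used in only one pair."""
    return n_days // 2


def expected_false_rejections(n_tests, alpha=0.05):
    """Expected rejections if homogeneity actually holds in every test."""
    return n_tests * alpha


if __name__ == "__main__":
    n = 250  # hypothetical: roughly one year of trading days
    for label, tests in (("all pairs", all_pairs(n)), ("disjoint pairs", disjoint_pairs(n))):
        print(label, tests, expected_false_rejections(tests))
```

With 250 days this is 31125 tests against 125, so at alpha = 0.05 one should expect about 1556 "similar or different" findings by pure chance in the first case and about 6 in the second.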