Volumes, volatility and Hurst index - page 5

 
Yurixx:

You see, even if I use a coefficient other than 1 and determine it somehow for the euro on TF = H1, that does not mean it will be the same for the pound or on another TF. And that is not interesting. It is like fitting the scale for each pair separately. If so, we can work with volumes.

Well, we may consider Hurst in the old way, as the slope of regression, and then this coefficient won't matter. In fact, you are not bound to standard TFs, so it is not a problem to find points for regression.


P.S. It wasn't a laugh, it was a smile. In the sense of a certain skepticism. Although, maybe I'm wrong and forum users can easily solve this problem.

 

I have calculated the (High-Low)/(Close-Open) ratio using a simple script for 1.5 million minute bars.

For AUDUSD on the interval from 2005.11.02 07:49 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.65539495
For USDJPY on the interval from 2006.04.11 20:21 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.72965927
For USDCHF on the interval from 2006.01.24 04:23 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.69927897
For USDCAD on the interval from 2005.05.19 13:31 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.62680742
For GBPUSD on the interval from 2006.02.21 23:31 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.65294349
For EURUSD on the interval from 2006.03.08 13:41 to 2010.08.20 22:59 the average (H-L)/(C-O) = 1.69371256

It's not that much of a spread. Although I was hoping it would be even smaller.

By the way, it would be interesting to see whether the local value of this ratio can help distinguish a trend from a flat. At the very least, it ought to detect impulses.
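As a rough illustration of the "local value" idea, the sketch below computes the same ratio over a sliding window of bars. It is written in Python rather than MQL4, and the function name `local_ratio`, the `window` parameter and the sample bars are all hypothetical. Whether the result really separates trend from flat is not something this sketch demonstrates.

```python
def local_ratio(bars, window):
    """Sliding-window sum(High-Low) / sum(|Close-Open|).

    bars is a list of (open, high, low, close) tuples, oldest first.
    Returns one value per full window; None where sum|Close-Open| is zero.
    """
    out = []
    for end in range(window, len(bars) + 1):
        chunk = bars[end - window:end]
        sum_hl = sum(h - l for _, h, l, _ in chunk)
        sum_co = sum(abs(c - o) for o, _, _, c in chunk)
        out.append(sum_hl / sum_co if sum_co != 0 else None)
    return out


# Made-up bars: two directional bars followed by two indecisive ones.
bars = [
    (1.0, 2.0, 1.0, 2.0),
    (2.0, 3.0, 2.0, 3.0),
    (3.0, 3.5, 2.5, 3.0),
    (3.0, 3.5, 2.5, 3.25),
]
ratios = local_ratio(bars, 2)
print(ratios)
```

On such data the ratio stays near 1 while the bars are directional and grows when the bodies shrink relative to the range.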

 

(High-Low)/(Close-Open) ?

Sorry, has the modulus (absolute value) been lost?

 
Svinozavr:

Let me explain the method. ...

It's certainly an interesting approach. And, presumably, in the author's hands it is effective.

But all of these indicators continue to retain their time settings. Which, as I understand it, are set by taste.

That is, if we are looking for objective indicators here, it should be the criteria for selecting values for these parameters that should be the subject of discussion.

Meanwhile, that's exactly what Peter never mentioned. Or maybe I missed it.

It would be interesting to hear about that.

 
NorthAlec:

(High-Low)/(Close-Open) ?

Sorry, has the modulus (absolute value) been lost?

No, the modulus is not lost:

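  double SumCO = 0, SumHL = 0;    // running sums of |Close-Open| and High-Low
  // Walk the bars from the oldest (Bars-1) down to bar 1; bar 0 is skipped
  for (int i=Bars-1;i>0;i--) {
    double res = Close[i]-Open[i];
    if (res < 0) res = -res;      // modulus of the bar's increment
    SumCO += res;
    SumHL += High[i]-Low[i];
  }
  // The reported value is the ratio of the two sums
  if (SumCO != 0) Alert("For ",Symbol()," on the interval from ",TimeToStr(Time[Bars-1])," to ",TimeToStr(Time[0])," the average (H-L)/(C-O) = ",DoubleToStr(SumHL/SumCO,8));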
 
Candid:

I calculated with a simple script the ratio (High-Low)/(Close-Open) on 1.5 million minute bars.


And what might this ratio mean in substance? By definition it cannot be less than 1, since High-Low is never smaller than |Close-Open|. Nor can it be too high, because the price (almost always) moves at a finite speed. Clearly, there is an average value somewhere in between. And it should not differ much between instruments: the market mechanism is the same everywhere. If you plot the distribution of (Close-Open) (without the modulus) within a bar, you will most likely get a uniform distribution. And that would be the best confirmation that the value is purely random.

Maybe I am missing something, but I stopped paying attention to Close and Open as sources of statistical data long ago. First, their values are purely random (relative to the data set of the corresponding minute), and second, they depend entirely on where the time count starts, which is not good. Shift the starting point by a few seconds and these values will change. The High and Low pair is another matter. This pair defines the corridor in which the price moves. That matters, of course, only if we are not trading inside the bar. But if we are, then all of our indicator approaches are irrelevant anyway. In addition, this pair sets the range and the volatility. IMHO, these are very important characteristics, which we just have to learn to use.

 
Yurixx:

And what might this ratio mean in substance?

I wrote about this in another thread:
 

So, there are a lot of unanswered questions about the Hurst exponent. I had not planned to take this up, but the criticism, questions and comments from Nikolay (Candid), for which I am very grateful, convinced me that it needs to be dealt with properly. Otherwise the formula proposed above for calculating the Hurst exponent looks as if it were simply pulled out of thin air.

I also needed to answer (for myself as well) the following remark:

Candid:

But so far there are no sufficient grounds for comparing the absolute values of this quantity with the "calibration" for Hurst, i.e. for assuming that at 0.5 the series is random, above it trending and below it mean-reverting.

This characteristic needs its own calibration.



I will not describe the details of the investigation, I will simply tell you what I arrived at.

We will talk about a random walk (SB for short) that serves as a model of the tick flow: each tick changes the price by +/- 1 point. The model is, of course, very rough, but here we are studying Hurst, not the market. First of all we need to deal with the equiprobable flow, i.e. a pure SB, where ticks of +1 and -1 each have probability 50%. This will also provide the calibration Nikolay mentioned.

The calculation of the Hurst exponent is based on the average range, i.e. the difference between the maximum and minimum price in the interval. Besides this value, there are two other closely related ones: the average modulus of the increments and the variance of the increments. All three were involved in the study. The notation used below is as follows:

N - number of ticks in the interval. The first point of the interval (the initial price) is the last tick of the previous interval and is not counted in the current one. Therefore the number of price changes within the interval equals its number of ticks.

K - number of intervals in the statistics.

R - average price range over K intervals.

M - average modulus of the increment over K intervals.

D - variance of the increments over K intervals.

The price increment over an interval is a convenient quantity that is easily written analytically: it equals the difference between the final and the initial price of the interval. So M and D can be calculated without any trouble. The range R is much harder. Since the min and max prices can be reached at any point of the interval, the range depends on the entire price path and cannot be expressed in analytic form at all. In other words, it is impossible to get a general formula for it (as Nikolay slyly asked for).

Nevertheless, the task is to investigate how the Hurst exponent behaves for SB, so we need exact results rather than approximate experiments.

In this situation there is nothing left but to compute the values of the range by brute force, straight from its definition.

 

For this purpose I had to write a script which, for a given number of ticks N in the interval, constructs all possible price trajectories. Since all these trajectories are equally probable for SB, it remains to determine the range for each of them and average it over all trajectories. This gives its "theoretical" value, i.e. its expected value (MO for short). Obviously, the total number of possible price trajectories for an interval of length N is 2^N. The script's run time and memory consumption grow by the same law, so the MO of the range can be calculated only for small values of N. The average modulus and the variance of the increments are calculated to complete the picture and as an indirect check that the calculations are correct.

N R M D
1 1.0000 1.0000 1.0000
2 1.5000 1.0000 2.0000
3 2.0000 1.5000 3.0000
4 2.3750 1.5000 4.0000
5 2.7500 1.8750 5.0000
6 3.0625 1.8750 6.0000
7 3.3750 2.1875 7.0000
8 3.6484 2.1875 8.0000
9 3.9219 2.4609 9.0000
10 4.1680 2.4609 10.0000
11 4.4141 2.7070 11.0000
12 4.6396 2.7070 12.0000
13 4.8652 2.9326 13.0000
14 5.0747 2.9326 14.0000
15 5.2842 3.1421 15.0000
16 5.4806 3.1421 16.0000
17 5.6769 3.3385 17.0000
18 5.8624 3.3385 18.0000
19 6.0479 3.5239 19.0000
20 6.2241 3.5239 20.0000
21 6.4003 3.7001 21.0000
22 6.5685 3.7001 22.0000
23 6.7367 3.8683 23.0000
24 6.8978 3.8683 24.0000
25 7.0590 4.0295 25.0000
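The enumeration described above can be sketched as follows. This is a Python illustration, not the author's script (which is not shown in the thread); the function name `exact_stats` is hypothetical. It assumes that the starting price counts toward the max/min of the interval and that D is taken as the mean squared increment, which equals the variance here because the mean increment of a symmetric walk is zero.

```python
from itertools import product


def exact_stats(n_ticks):
    """Return (R, M, D) averaged over all 2**n_ticks equally likely paths."""
    total_range = total_abs = total_sq = 0
    for steps in product((1, -1), repeat=n_ticks):
        price = hi = lo = 0          # assumption: start price counts toward max/min
        for s in steps:
            price += s
            hi = max(hi, price)
            lo = min(lo, price)
        total_range += hi - lo       # range of this trajectory
        total_abs += abs(price)      # modulus of the increment
        total_sq += price * price    # squared increment (mean increment is 0)
    paths = 2 ** n_ticks
    return total_range / paths, total_abs / paths, total_sq / paths


# Keep N small: the loop visits 2**N trajectories.
for n in range(1, 11):
    r, m, d = exact_stats(n)
    print(f"{n:2d} {r:.4f} {m:.4f} {d:.4f}")
```

Because the work doubles with each extra tick, a plain loop like this is practical only for the small N covered by the table.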

For the SB in question there is a simple formula which relates the variance of increments D to the number of ticks N:

D = N .
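For completeness, this follows directly from the independence of the ticks. The symbols \xi_i (a single tick, +1 or -1 with equal probability) and \Delta P (the increment over the interval) are notation introduced here for illustration, not taken from the original post:

```latex
\Delta P = \sum_{i=1}^{N} \xi_i, \qquad
\mathbb{E}[\xi_i] = 0, \qquad \xi_i^2 = 1,
\\
D = \mathbb{E}\big[(\Delta P)^2\big]
  = \sum_{i=1}^{N} \mathbb{E}[\xi_i^2]
  + \sum_{i \ne j} \mathbb{E}[\xi_i]\,\mathbb{E}[\xi_j]
  = N + 0 = N .
```

Since the mean increment is zero, the mean squared increment and the variance coincide.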

Apparently Hurst, postulating his formula for the average variance, relied on this theoretical result.

The table shows that the obtained values of D are in full agreement with this formula. It means that the algorithm for generating the whole set of price trajectories and the arithmetic for calculation of averages are written correctly. The calculation of max and min prices on the interval and their differences is so simple that the error probability is close to zero.

 

Now that we have something to compare with, we can see how the Hurst exponent behaves for SB at different interval lengths N.

Let me remind you of the formula used to calculate the Hurst exponent as defined by its author.

H = (Log(R2) - Log(R1))/ (Log(N2) - Log(N1))

The two-point calculation scheme is needed to get rid of the unknown multiplicative factor present in the Hurst formula.

To simplify the calculations, make things clearer and maximize the range of the study, the number of ticks in the interval N was varied in powers of two, i.e. N = 2^n. The base of the logarithm in the formula for H does not matter, so it was taken to be 2, giving Log(N) = n.
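A minimal sketch of the two-point formula, in Python for illustration (the function name `hurst_two_point` is hypothetical). The sample inputs are the exact R values for N = 8 and N = 16 from the table of enumerated averages earlier in this post.

```python
import math


def hurst_two_point(r1, n1, r2, n2):
    """H = (Log(R2) - Log(R1)) / (Log(N2) - Log(N1)), with base-2 logarithms."""
    return (math.log2(r2) - math.log2(r1)) / (math.log2(n2) - math.log2(n1))


# R for N = 8 and N = 16, taken from the table of exact averages.
h_8_16 = hurst_two_point(3.6484, 8, 5.4806, 16)
print(f"H(8 -> 16) = {h_8_16:.4f}")

# Any common multiplicative factor in R cancels out of the difference of logs.
h_scaled = hurst_two_point(10 * 3.6484, 8, 10 * 5.4806, 16)
```

The second call shows why two points are enough: a factor common to both ranges drops out of the difference of logarithms.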

The calculation algorithm was as follows:

  1. Set the number n, the initial price p=0 and the calculation accuracy acc=0.001.
  2. Calculate the number of ticks in the interval, N = 2^n.
  3. Use the built-in PRNG to generate the K-th interval: N unit tick increments.
  4. For this interval, calculate the range and the modulus of the price increment.
  5. Add the range, the modulus and the squared increment to their running sums.
  6. Calculate the means and the variance over the K intervals.
  7. Check whether the accuracy condition is met. If not, add one to K and go to step 3. If so, finish the script.
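The steps above can be sketched roughly as follows, in Python for illustration. The post does not say what the accuracy condition is, so the stopping rule here (standard error of the mean range below `acc`) is purely an assumption, as are the names `simulate`, `min_k`, `max_k` and `seed`.

```python
import math
import random


def simulate(n, acc=0.001, min_k=100, max_k=2_000_000, seed=None):
    """Monte Carlo estimate of (R, M, D, K) for intervals of N = 2**n ticks."""
    rng = random.Random(seed)
    n_ticks = 2 ** n                          # step 2
    sum_r = sum_r2 = sum_m = sum_sq = 0.0
    k = 0
    while k < max_k:
        price = hi = lo = 0                   # step 1: p = 0 at the interval start
        for _ in range(n_ticks):              # step 3: N unit tick increments
            price += rng.choice((1, -1))
            hi = max(hi, price)
            lo = min(lo, price)
        k += 1
        spread = hi - lo                      # step 4: range of this interval
        sum_r += spread                       # step 5: running sums
        sum_r2 += spread * spread
        sum_m += abs(price)
        sum_sq += price * price
        if k >= min_k:                        # steps 6-7
            mean_r = sum_r / k
            var_r = max(sum_r2 / k - mean_r * mean_r, 0.0)
            if math.sqrt(var_r / k) < acc:    # ASSUMED stopping rule
                break
    # D is the mean squared increment (the mean increment is 0 for this walk).
    return sum_r / k, sum_m / k, sum_sq / k, k


# A loose accuracy keeps the run short; acc=0.001 needs far more intervals.
r, m, d, k = simulate(3, acc=0.05, seed=1)
print(f"N=8: R={r:.3f} M={m:.3f} D={d:.3f} after K={k} intervals")
```

For N = 8 the estimates can be compared with the exact row of the enumeration table (R = 3.6484, M = 2.1875, D = 8).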

The results are in the table.

(Unfortunately, I could not paste the whole table: the editor does not accept text of this size. I had to split it into 2 tables, keeping the first two columns in both for convenience. The first one will be referred to as 2a, and the second one as 2b.)