Comparison of two quotation charts with non-linear distortions on the X-axis

 

How this problem is solved via DTW (example):

  1. We need to find situations in the history that are similar to the most recent 100 bars (the pattern).
  2. The available history is 1,000,000 bars.
  3. First, we take 1,000,000 sequences of 50 bars each and compare them with the pattern via DTW.
  4. Then we take another 1,000,000 sequences, now of 55 bars each, and compare them with the pattern via DTW.
  5. Then 60 bars.
  6. .....
  7. Then 100 bars.
  8. .....
  9. Finally 300 bars, at which point we can stop.

In total we have performed about 50,000,000 comparisons using the DTW algorithm, which has complexity O(N^2). Very roughly, that is 5 * 10^11 (500 billion) elementary computational operations.
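To make the arithmetic concrete, here is a minimal sketch of that brute-force scan (an illustration, not code from the thread), assuming a scalar price series and an absolute-difference point cost:

    import numpy as np

    def dtw_distance(a, b):
        # Classic DTW: O(len(a) * len(b)) accumulated-cost recursion.
        n, m = len(a), len(b)
        d = np.full((n + 1, m + 1), np.inf)
        d[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
        return d[n, m]

    history = np.random.rand(1_000_000)   # stand-in for 1,000,000 real bars
    pattern = history[-100:]              # the most recent 100 bars
    best = (np.inf, -1, -1)               # (distance, start, length)
    for length in range(50, 301, 5):      # 51 window lengths: 50, 55, ..., 300
        for start in range(len(history) - length + 1):
            dist = dtw_distance(pattern, history[start:start + length])
            if dist < best[0]:
                best = (dist, start, length)

Run as written, the inner loops perform the roughly 50,000,000 DTW calls counted above, which is precisely why this approach is impractical at full scale.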

Now a new bar arrives - and another 500 billion calculations have to be done.

Suppose we decide to run this over history, starting 200,000 bars from the end. Roughly, such a run takes 200,000 passes of 500 billion calculations each. That is 10^17 calculations in total.

Even if there is a tricky optimization, it will not yield a gain of more than two orders of magnitude. That is, at least 10^15 calculations would still have to be performed.

 
hrenfx:

How this problem is solved via DTW (example):

You see, at this point we are more interested in DTW itself than in how to apply it.
 
hrenfx: How this problem is solved via DTW (example):

If you don't mind, please show the solution of this problem in code - it is not so much of practical interest as of sporting interest.

 

No one in their right mind would undertake to implement an algorithm whose result one would simply never wait long enough to see.

A plain Pearson correlation coefficient would be just as unsuitable as DTW, since its computational complexity is also O(N^2). But there is a significant algorithmic optimization of the Pearson correlation computation, with complexity O(N * log(N)), which makes it possible to solve this problem in a reasonable time. An implementation of this algorithm has been posted in the Codebase. To solve the problem raised here, it remains to apply the same algorithm to the set of ZigZag-transformed time series.
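The Codebase implementation is not reproduced in this thread, but the O(N * log(N)) idea can be sketched as follows (an illustration, assuming numpy/scipy; the name sliding_pearson is hypothetical): the dot products of the pattern with every window come from a single FFT cross-correlation, and the per-window means and variances come from cumulative sums.

    import numpy as np
    from scipy.signal import fftconvolve

    def sliding_pearson(x, pattern):
        # Pearson correlation of `pattern` with every same-length window
        # of x, in O(N log N) instead of O(N^2).
        w = len(pattern)
        p = (pattern - pattern.mean()) / pattern.std()  # z-normalised pattern
        # sum(p[k] * x[s + k]) for every window start s, via one FFT pass.
        cross = fftconvolve(x, p[::-1], mode="valid")
        # Per-window sums and sums of squares via cumulative sums.
        c1 = np.concatenate(([0.0], np.cumsum(x)))
        c2 = np.concatenate(([0.0], np.cumsum(x * x)))
        mean = (c1[w:] - c1[:-w]) / w
        var = (c2[w:] - c2[:-w]) / w - mean * mean
        std = np.sqrt(np.maximum(var, 1e-12))
        # With p z-normalised, sum(p) == 0, so the window mean cancels out
        # of the covariance term and r = cross / (w * std).
        return cross / (w * std)

For a history of N bars this costs a handful of FFTs and linear passes, instead of N * w multiplications for every window start.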

 
hrenfx:

No one in their right mind would undertake to implement an algorithm whose result one would simply never wait long enough to see.

A plain Pearson correlation coefficient would be just as unsuitable as DTW, since its computational complexity is also O(N^2). But there is a significant algorithmic optimization of the Pearson correlation computation, with complexity O(N * log(N)), which makes it possible to solve this problem in a reasonable time. An implementation of this algorithm has been posted in the Codebase. To solve the problem raised here, it remains to apply the same algorithm to the set of ZigZag-transformed time series.


You should first read the problem that the author is facing and his answers.
 
Integer:


I tried it. I don't understand how to use it either. The output should be either a transformation path or the transformed data.

The output of the algorithm is an "accumulated cost matrix" (rather than just the local distance matrix shown in the figure), with the method returning the value of the bottom-right cell (by construction, the largest value along the optimal path). To find the path itself, we move from the bottom-right cell (n,m) back towards cell (1,1), each time stepping to the neighbour with the lowest value. The two sketches below show the matrix construction and the backtracking.
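The matrix can be built like this (a minimal sketch, not the poster's code, assuming an absolute-difference local cost and 0-based numpy indexing):

    import numpy as np

    def accumulated_cost_matrix(a, b):
        # dtw[i, j] = cheapest cost of aligning a[:i+1] with b[:j+1].
        n, m = len(a), len(b)
        dtw = np.zeros((n, m))
        for i in range(n):
            for j in range(m):
                cost = abs(a[i] - b[j])
                if i == 0 and j == 0:
                    dtw[i, j] = cost
                elif i == 0:
                    dtw[i, j] = cost + dtw[i, j - 1]
                elif j == 0:
                    dtw[i, j] = cost + dtw[i - 1, j]
                else:
                    dtw[i, j] = cost + min(dtw[i - 1, j],
                                           dtw[i, j - 1],
                                           dtw[i - 1, j - 1])
        return dtw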

And the backtracking itself, in runnable form:

    def backtrack_path(dtw):
        # Walk from the bottom-right cell back to the top-left cell,
        # at each step moving to the predecessor with the lowest
        # accumulated cost (0-based indices).
        i = dtw.shape[0] - 1
        j = dtw.shape[1] - 1
        path = [(i, j)]
        while i > 0 or j > 0:
            if i == 0:
                j -= 1
            elif j == 0:
                i -= 1
            elif dtw[i - 1, j] == min(dtw[i - 1, j], dtw[i, j - 1], dtw[i - 1, j - 1]):
                i -= 1
            elif dtw[i, j - 1] == min(dtw[i - 1, j], dtw[i, j - 1], dtw[i - 1, j - 1]):
                j -= 1
            else:
                i -= 1
                j -= 1
            path.append((i, j))
        return path[::-1]  # ordered from (0, 0) to (n-1, m-1)
 
The resulting path array contains the optimal path for transforming one signal into the other; the transformation can be applied in either direction.
 
The degree of similarity of the signals is given by the value of the bottom-right cell of the matrix: the larger it is, the more different the signals are.
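Putting the sketches above together on a toy pair of series:

    a = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
    b = np.array([1.0, 1.0, 2.0, 3.0, 2.0, 1.0])
    dtw = accumulated_cost_matrix(a, b)
    print(dtw[-1, -1])          # total alignment cost: lower = more similar
    print(backtrack_path(dtw))  # warping path from (0, 0) to (n-1, m-1)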
 
alsu: The degree of similarity of the signals is given by the value of the bottom-right cell of the matrix: the larger it is, the more different the signals are.
OK, thank you, you have explained it quite clearly. One more question: is it necessary to normalise the data or reduce it to the same order of magnitude, and does that affect the result?
 
IgorM:
OK, thank you, you have explained it quite clearly. One more question: is it necessary to normalise the data or reduce it to the same order of magnitude, and does that affect the result?
dtw[n][m] depends on the scale of the data, so if many pairwise comparisons are to be made, the series should preferably be scaled and centred first.
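In code, that preprocessing can be as simple as z-normalising each series before building the matrix (a sketch reusing the toy series and the accumulated_cost_matrix helper above):

    def znormalize(x):
        # Centre and scale so the DTW score does not depend on the units
        # or the absolute price level of the series.
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / x.std()

    score = accumulated_cost_matrix(znormalize(a), znormalize(b))[-1, -1]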