Machine learning in trading: theory, models, practice and algo-trading - page 3260

 
Forester #:

This is the input matrix.
The output will be 15000 rows by 15000 columns. As in all the other examples, that is about 1.7 GB (with 8-byte doubles).

I agree that this is not how it is calculated.

 
fxsaber #:

So far, I don't see any technical obstacle to calculate a million-by-million matrix on a simple home machine. But the comparison of NumPy vs MQL5 is very important for me.

Are you sure?


For example, an input matrix with 50,000 columns and 100 rows will give a correlation matrix of 50,000 x 50,000 x 8 bytes / (1024 x 1024 x 1024) = 18.63 GB.
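
The arithmetic above is easy to check for other sizes. A minimal sketch, assuming a dense square matrix of 8-byte doubles and nothing else in memory; the helper name corr_matrix_gib is made up for illustration:

```python
def corr_matrix_gib(n_series: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense n_series x n_series correlation matrix."""
    return n_series * n_series * bytes_per_value / 1024**3

# Sizes mentioned in the thread: 15,000, 50,000 and a million series.
for n in (15_000, 50_000, 1_000_000):
    print(f"{n:>9} series: {corr_matrix_gib(n):,.2f} GiB")
```

By the same formula, a million-by-million matrix of doubles comes to roughly 7.3 TiB, so it would not fit in the RAM of an ordinary home machine without block-wise processing.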

 
input int inRows = 100; // Row length
input int inCols = 15000; // Number of rows

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}

#define  TOSTRING(A) #A + " = " + (string)(A) + " "

void OnStart()
{  
  double Array[];  
  Print(FileLoad("qwe\\arr.csv", Array)); // RAM-drive. https://www.mql5.com/ru/forum/86386/page3258#comment_49549438
  
  matrix<double> Matrix;  
  Matrix.Assign(Array);
  Matrix.Init(inCols, inRows);
  Matrix = Matrix.Transpose();
  
  ArrayFree(Array);  
  Print(FileLoad("qwe\\matr.csv", Array)); // RAM-drive. https://www.mql5.com/ru/forum/86386/page3258#comment_49549438

  matrix<double> Matrix2;
  Matrix2.Assign(Array);
  Matrix2.Init(inCols, inCols);
  Matrix2 = Matrix2.Transpose();
    
  ArrayFree(Array);
  
  matrix<double> Matrix1 = CorrMatrix(Matrix); // https://www.mql5.com/ru/forum/86386/page3256#comment_49538685

  Print(TOSTRING(IsEqual(Matrix1, Matrix2)));
}


The NumPy values match the MQL5 calculation exactly.

1500000
225000000
IsEqual(Matrix1, Matrix2) = true 
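
IsEqual compares the mean of the difference, so errors of opposite sign could in principle cancel out. A stricter element-wise check might look like the sketch below; it is written in Python only for illustration, both function names and the tolerances are assumptions, and the toy matrices are invented:

```python
import numpy as np

def is_equal_mean(a, b, tol=1e-15):
    """Mirror of the mean-based check: |mean(a - b)| < tol."""
    return bool(abs(np.mean(a - b)) < tol)

def is_equal_elementwise(a, b, tol=1e-12):
    """Stricter check: the largest absolute difference must be below tol."""
    return bool(np.max(np.abs(a - b)) < tol)

# Toy example: two errors of opposite sign cancel in the mean.
a = np.array([[1.0, 0.5], [0.5, 1.0]])
b = a + np.array([[0.0, 0.25], [-0.25, 0.0]])

print("mean-based check:   ", is_equal_mean(a, b))
print("element-wise check: ", is_equal_elementwise(a, b))
```

Note that a - b allocates a temporary of the same size as the matrices, which matters at 15000 x 15000.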
 
Forester #:

This is the input matrix.
The output will be 15000 rows by 15000 columns. As in all the other examples, that is about 1.7 GB (with 8-byte doubles).

In general, alas, Python apparently cannot do this in int: it converts the data to double.

import numpy as np
import time

def calc_corr_matrix():
    arr = np.random.randint(1, 101, size=(15000,100), dtype=np.int32)
    corr_matrix = np.corrcoef(arr)
    size_in_mb = corr_matrix.nbytes / 1024**2
    print("Array size:", size_in_mb, "MB")
    return corr_matrix

np.random.seed(123)

start_time = time.time()
corr_matrix = calc_corr_matrix()
end_time = time.time()

print("Time taken:", end_time - start_time, "seconds")
Array size: 1716.61376953125 MB
Time taken: 4.62926459312439 seconds
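
The 1716 MB figure corresponds to 8 bytes per element, i.e. np.corrcoef returns float64 even for int32 input. A small sketch that shows the dtypes directly; the reduced 200 x 100 size is chosen only to keep it quick, and the dtype keyword assumes NumPy 1.20 or later:

```python
import numpy as np

rng = np.random.default_rng(123)
arr = rng.integers(1, 101, size=(200, 100), dtype=np.int32)

corr64 = np.corrcoef(arr)                    # default: float64 result
corr32 = np.corrcoef(arr, dtype=np.float32)  # dtype keyword: NumPy >= 1.20

print(arr.dtype, "->", corr64.dtype, corr64.nbytes, "bytes")
print(arr.dtype, "->", corr32.dtype, corr32.nbytes, "bytes")
print("max abs difference:", float(np.max(np.abs(corr64 - corr32))))
```

With float32 the result takes half the memory, at the cost of precision in the coefficients.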
 
Aleksey Vyazmikin #:

In general, alas, Python apparently cannot do this in int: it converts the data to double.

Stop spamming rubbish. Nobody computes correlation in ints.

 
Maxim Dmitrievsky #:

Stop spamming rubbish. Nobody computes correlation in ints.

No need to discover America. It is not customary to compute it that way, but it is worth thinking about how it could be done.

 
Aleksey Vyazmikin #:

No need to discover America. It is not customary to compute it that way, but it is worth thinking about how it could be done.

Think it up in a new thread.

 
Maxim Dmitrievsky #:

Think it up in a new thread.

What people. I go and spend my time on him, and he is rude.

What the hell...

 
Aleksey Vyazmikin #:

No need to discover America. It is not customary to compute it that way, but it is worth thinking about how it could be done.

I have already described the way: take the ALGLIB functions (there are 8 of them, called from PearsonCorrM) and change the data types. Even to 1-byte uchar. 4-byte ints won't give much gain.
Do it for yourself if you need to.
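
To illustrate what "correlation in ints" could mean arithmetically, here is a rough sketch. It is not the ALGLIB code and not PearsonCorrM; the function name corr_from_int_sums is hypothetical. It assumes small non-negative values stored as uint8, accumulates all sums exactly in int64, and switches to floating point only for the final division:

```python
import numpy as np

def corr_from_int_sums(x_u8: np.ndarray) -> np.ndarray:
    """Pearson correlation between rows, with all sums accumulated in int64."""
    x = x_u8.astype(np.int64)          # widen so products cannot overflow
    n = x.shape[1]
    s_xy = x @ x.T                     # exact integer sums of products
    s_x = x.sum(axis=1)                # exact integer row sums
    # n*Sxy - Sx*Sy equals n^2 times the covariance and is still an exact integer
    cov_n2 = n * s_xy - np.outer(s_x, s_x)
    std_n = np.sqrt(np.diag(cov_n2).astype(np.float64))
    return cov_n2 / np.outer(std_n, std_n)

rng = np.random.default_rng(123)
data = rng.integers(1, 101, size=(50, 100), dtype=np.uint8)

r_int = corr_from_int_sums(data)
r_ref = np.corrcoef(data)
print("max abs difference:", float(np.max(np.abs(r_int - r_ref))))
```

In this sketch only the input is stored in 1 byte per value; the intermediate sums are widened to int64 and the resulting matrix is still float64, so the saving is on the input side only.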
 
Aleksey Vyazmikin #:

I go and spend my time on him, and he is rude.

What the hell...

I didn't ask you to waste your time for me.
