Machine learning in trading: theory, models, practice and algo-trading - page 3269

 
mytarmailS #:
Also testing and visualisation and fast action, probably

A final optimisation is fine too, for picking the trade parameters

 
fxsaber #:

At first I tried the brute-force variant: recomputing all rows every time. I got the impression that there is a bug in Alglib, because I couldn't find one in my own code.


The results often coincide.


But in some situations they don't.


If they never matched, it would clearly be my mistake. But something fishy is going on here.

I printed both matrices:

Print(Matrix1);
Print("------------------");
Print(Matrix2);


In Matrix2 the first 5 values match, then come zeros.
I changed Res.Col(Corr.Row(0), i); to Res.Row(Corr.Row(0), i);

The matrices became similar, but IsEqual still fails. Something else is wrong somewhere...
 
Forester #:

Changed Res.Col(Corr.Row(0), i); to Res.Row(Corr.Row(0), i);

That change seems wrong.

 
Forester #:
I printed both matrices:

Print(Matrix1);
Print("------------------");
Print(Matrix2);


In Matrix2 the first 5 values match, then come zeros.
I changed Res.Col(Corr.Row(0), i); to Res.Row(Corr.Row(0), i);

The matrices became similar, but IsEqual still fails. Something else is wrong somewhere...

Found the problem

It should be like this

for (int i = 0; i < (int)Matrix.Cols(); i++)
{
  if (i)
    Vector.SwapCols(0, i);

  CBaseStat::PearsonCorrM2(Vector, MatrixIn, MatrixIn.Rows(), 1, MatrixIn.Cols(), Corr);

  Res.Col(Corr.Row(0), i);
}

 
PearsonCorrM2 can also be sped up about 2 times by computing only a triangle, i.e. going from the end. Correlate row 100 with all rows, then row 99 with rows 0-99 (the 99/100 pair is already computed, so you can just copy it), ... row 50 with all rows up to 50, and so on. And do not correlate a row with itself, because the result is 1.

P.S.
In general, your code is beautiful and concise. I would do everything in loops))))
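
The triangle idea above could look roughly like the following sketch. It is plain Python, not MQL5 and not the Alglib API; the helper names pearson and corr_matrix_triangle are made up for illustration, and rows with zero variance are assumed not to occur.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Assumption: neither sequence is constant, so sxx * syy > 0.
    return sxy / math.sqrt(sxx * syy)


def corr_matrix_triangle(rows):
    """Correlation matrix of the rows, computing every pair only once."""
    n = len(rows)
    # The diagonal is 1 by definition, so it is filled in without computing.
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):      # strictly upper triangle only
            c = pearson(rows[i], rows[j])
            corr[i][j] = c
            corr[j][i] = c             # mirror the value instead of recomputing
    return corr
```

This does n*(n-1)/2 pair computations instead of n*n, which is where the roughly twofold gain comes from.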
 
Forester #:
PearsonCorrM2 can also be sped up about 2 times by computing only a triangle, i.e. going from the end. Correlate row 100 with all rows, then row 99 with rows 0-99 (the 99/100 pair is already computed, so you can just copy it), ... row 50 with all rows up to 50, and so on.
I read somewhere that correlation can be computed quickly via the fast Fourier transform... That is another option for speeding this up.
 
residuals_a = a_mat - a_mat.column_means
residuals_b = b_mat - b_mat.column_means
a_residual_sums = residuals_a.column_sums
b_residual_sums = residuals_b.column_sums
residual_products = dot_product(residuals_a.transpose, residuals_b)
sum_products = sqrt(dot_product(a_residual_sums, b_residual_sums))
correlations = residual_products / sum_products
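
A runnable reading of this pseudocode, as a sketch in plain Python rather than MQL5. Two things here are assumptions, not something the post states: the "residual sums" are taken to be sums of squared residuals (plain column sums of mean-centred data are always zero), and the product in the denominator is taken to be an outer product, one entry per column pair. The function name column_correlations is made up for illustration.

```python
import math


def column_correlations(a_mat, b_mat):
    """Pearson correlation of every column of a_mat with every column of b_mat.

    Both arguments are lists of rows and must have the same number of rows.
    Returns a list of lists: result[i][j] = corr(a column i, b column j).
    """
    n = len(a_mat)

    def centred_columns(mat):
        # Transpose to columns and subtract each column's mean.
        cols = []
        for col in zip(*mat):
            mean = sum(col) / n
            cols.append([v - mean for v in col])
        return cols

    residuals_a = centred_columns(a_mat)
    residuals_b = centred_columns(b_mat)

    # Assumption: sums of SQUARED residuals per column.
    a_sq_sums = [sum(v * v for v in col) for col in residuals_a]
    b_sq_sums = [sum(v * v for v in col) for col in residuals_b]

    return [
        [
            sum(x * y for x, y in zip(ra, rb)) / math.sqrt(sa * sb)
            for rb, sb in zip(residuals_b, b_sq_sums)
        ]
        for ra, sa in zip(residuals_a, a_sq_sums)
    ]
```

With a matrix library the same steps collapse into one matrix product plus an element-wise division, which is why this form tends to be fast in practice.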
 
mytarmailS #:
I read somewhere that correlation can be computed quickly via the fast Fourier transform... That is another option for speeding this up.

I've done it. It pays off when the rows are long. I'll show you sometime.
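
The thread does not show the FFT variant, so the following is only one common reading of it: sliding a short pattern along a long series and getting all the dot products at once through the convolution theorem, then normalising them into Pearson correlations with prefix sums. It is a plain Python sketch with a textbook radix-2 FFT; the names fft, sliding_dot_fft and sliding_pearson are made up for illustration, and windows with zero variance are assumed not to occur.

```python
import cmath
import math


def fft(x, invert=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.
    The inverse is left unnormalised (caller divides by len(x))."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def sliding_dot_fft(series, pattern):
    """Dot product of pattern with every window of series, via FFT."""
    n, m = len(series), len(pattern)
    size = 1
    while size < n + m - 1:            # pad so the convolution does not wrap
        size *= 2
    fa = fft([complex(v) for v in series] + [0j] * (size - n))
    # Convolving with the reversed pattern equals sliding the pattern itself.
    fb = fft([complex(v) for v in reversed(pattern)] + [0j] * (size - m))
    conv = fft([p * q for p, q in zip(fa, fb)], invert=True)
    return [v.real / size for v in conv[m - 1:n]]


def sliding_pearson(series, pattern):
    """Pearson correlation of pattern with every window of series."""
    m = len(pattern)
    mean_p = sum(pattern) / m
    centred = [v - mean_p for v in pattern]
    ss_p = sum(v * v for v in centred)
    # With a centred pattern the window mean drops out of the numerator.
    num = sliding_dot_fft(series, centred)
    s1, s2 = [0.0], [0.0]              # prefix sums of values and squares
    for v in series:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)
    out = []
    for k, d in enumerate(num):
        w_sum = s1[k + m] - s1[k]
        ss_w = (s2[k + m] - s2[k]) - w_sum * w_sum / m
        out.append(d / math.sqrt(ss_w * ss_p))
    return out
```

The FFT route costs on the order of N log N for a series of length N regardless of the pattern length, which fits the remark that it only makes sense for long rows.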

 
Forester #:

Found the problem

It should be like this

Exactly, thanks! I don't understand why the wrong option worked with inCols < 100.
 

Forum on trading, automated trading systems and testing of trading strategies

Machine Learning in Trading: Theory, Models, Practice and Algorithm Trading

Maxim Dmitrievsky, 2023.10.01 10:55 AM

residuals_a = a_mat - a_mat.column_means
residuals_b = b_mat - b_mat.column_means
a_residual_sums = residuals_a.column_sums
b_residual_sums = residuals_b.column_sums
residual_products = dot_product(residuals_a.transpose, residuals_b)
sum_products = sqrt(dot_product(a_residual_sums, b_residual_sums))
correlations = residual_products / sum_products

This looks like a straightforward, brute-force calculation of the correlation matrix.