Machine learning in trading: theory, models, practice and algo-trading - page 3255

 
mytarmailS #:
A correlation matrix is built between the rows of the given features, then the most correlated rows are selected.


So a (large) correlation matrix of the features is built, and then the correlated rows are selected from it? Are these something like patterns?

Sort of
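For the record, a minimal R sketch of that idea (my own toy illustration, not mytarmailS's actual code): correlate the rows of a feature matrix with each other and, for a reference row, pick the rows most correlated with it.

#  toy feature matrix: 100 observations, 5 features each
set.seed(1)
X <- matrix(rnorm(100 * 5), ncol = 5)

#  correlation matrix between ROWS (hence the transpose)
corX <- cor(t(X))

#  for a reference row, take the 10 rows most correlated with it
ref  <- 1
ord  <- order(abs(corX[ref, ]), decreasing = TRUE)
best <- setdiff(ord, ref)[1:10]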

 

By the way (or maybe not), isn't this again just general self-congratulation over theoretical results? :-))

Everyone knows that in most cases training and testing are in practice done for BO, yet you are trying to apply them on Forex? So the initial business rules are mixed up and confused from the start. One of those damn nuances.

 
Maxim Kuznetsov #:

By the way (or maybe not), isn't this again just general self-congratulation over theoretical results? :-))

everyone knows that in most cases training and testing are in practice done for BO, yet you are trying to apply them on Forex? So the initial business rules are mixed up and confused from the start. One of those damn nuances.

What is BO? The abbreviations you use so often - others don't use them and have no idea what you mean. Normal authors write a term out in full the first time and give the abbreviation, and only then use the abbreviation.

 
Maxim Dmitrievsky #:

sort of

For the life of me, I don't see any reason to convert the data into a correlation matrix.


Here's a comparison of what the algorithm will see when it searches for patterns in ordinary rows of features versus in a correlation matrix of those features.

I don't see any advantage...

#  features
set.seed(1)
x <- sample(1:5, 1000, replace = TRUE)
X <- embed(x, 5)[, 5:1]

#  correlation matrix between the rows
corX <- cor(t(X))

#  dimensionality reduction for visualization
um_corX <- umap::umap(corX)$layout
um_X <- umap::umap(X)$layout

#  clustering to assess clusters of nearby points
db_corX <- dbscan::hdbscan(um_corX, minPts = 5)$cluster
db_X <- dbscan::hdbscan(um_X, minPts = 5)$cluster

par(mfrow = c(2, 1), mar = c(2, 2, 2, 2))
plot(um_X, col = db_X, main = "surface of the regular data matrix", pch = 20, lwd = 2)
plot(um_corX, col = db_corX, main = "surface of the correlation matrix", pch = 20, lwd = 2)


I changed the data to something similar to time series, the result is the same.
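For reference, one way to make the toy data "time-series-like" (my guess at what was meant here, not the author's actual change) is to replace the i.i.d. sample with a random walk:

set.seed(2)
x <- cumsum(rnorm(1000))     # random-walk series instead of i.i.d. integers
X <- embed(x, 5)[, 5:1]      # same sliding-window embedding as above
#  the rest of the pipeline (cor, umap, hdbscan, plots) stays unchanged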


 
Maxim Dmitrievsky #:

sort of

So just look for patterns in a regular dataset with features, without the correlation matrix; the result will be the same, almost guaranteed.


P.S. And why did I spend so much time on this... I could have watched it on YouTube...

 
mytarmailS #:

So just look for patterns in a regular dataset with features, without the correlation matrix; the result will be the same, almost guaranteed.


P.S. And why did I spend so much time on this... I could have watched it on YouTube...

Oh, that's it.

 
Maxim Dmitrievsky #:
Oh, that's it.

agree

 
mytarmailS #:

agree

go watch YouTube ))

 
Forester #:

What is BO?

I was under the impression that on this site it is a well-established abbreviation: BO = binary options. Curves/slopes/derivatives, but it is really about discrete readings and the +/- in them. Some of the initial business rules come from options, and some don't, which results in a muddle that works neither here nor there.

In reply to my namesake: I am all for ML and any movement here. But if there has been no result for a very long time, the reason needs to be looked for - maybe something was set up wrong from the start. I have never seen even a demo "made using machine learning" in this thread. A methodical search also implies criticism (questions/comments about) the basics.

 
Maxim Dmitrievsky #:
Memory overflow on small TFs. Memory overflows with 16 GB of RAM plus a 30 GB swap file (swap, on a Mac). For example, you get a 50k by 50k correlation matrix.

Apparently it's some peculiarity of Python, because in MQL the algorithm is like this:

  1. Walk along the 1d-array with a Pos variable.
  2. The window [Pos-n, Pos] is the current pattern.
  3. Apply something correlation-like between this pattern and the 1d-array, getting corr[].
  4. Find the places where MathAbs(corr[i]) > 0.9.
  5. At those places, look at the price behaviour m bars ahead and average it.
  6. Found many such places and the average looks good? Then save the pattern data (the values from step 2).
  7. Pos++ and go to step 2.

This is the brute-force (head-on) variant. With a sieve it is even faster.
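A rough R sketch of that brute-force loop, as I read the steps above; n, m, the 0.9 threshold and the "> 20 matches" cutoff are illustrative assumptions, not anyone's actual code:

n   <- 10      # pattern length (step 2), assumed
m   <- 5       # look-ahead horizon in bars (step 5), assumed
thr <- 0.9     # |corr| threshold from step 4

set.seed(1)
price <- cumsum(rnorm(1000))   # toy 1d-array (step 1); the loop is O(N^2), so keep it small
N    <- length(price)
ends <- n:(N - m)              # admissible end positions of a length-n window

found <- list()
for (Pos in ends) {                                           # steps 1 and 7: slide Pos
  pattern <- price[(Pos - n + 1):Pos]                         # step 2: window [Pos-n, Pos]
  corr <- sapply(ends, function(i)                            # step 3: correlate the pattern
    cor(pattern, price[(i - n + 1):i]))                       #         with every other window
  hit <- ends[abs(corr) > thr & ends != Pos]                  # step 4: |corr| > 0.9, skip self-match
  if (length(hit) > 20) {                                     # step 6: "a lot of places" (my cutoff)
    fwd <- sapply(hit, function(e) price[e + m] - price[e])   # step 5: price move m bars ahead
    found[[length(found) + 1]] <- list(pattern = pattern,     # step 6: save the pattern
                                       mean_move = mean(fwd))
  }
}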


Let's assume one million bars and a pattern length of 10. Then a 1d-array of 10 million double values takes 80 MB. Step 3 - well, let it be 500 MB of memory consumption. What haven't I taken into account?
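A quick back-of-the-envelope check of these figures (assuming 8 bytes per double; the 50k side comes from the post above):

10e6 * 8 / 2^20      # 1d-array of 10 million doubles: ~76 MB, roughly the 80 MB quoted
50000^2 * 8 / 2^30   # full 50k x 50k correlation matrix: ~18.6 GB, which indeed does not fit in 16 GB of RAM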
