Machine learning in trading: theory, models, practice and algo-trading - page 1923

 
mytarmailS:


Am I right that you are doing the dimensionality reduction without a teacher (unsupervised)? I'm talking about "uwot" (umap).

 
Vladimir Perervenko:

Am I right that you are doing the dimensionality reduction without a teacher (unsupervised)? I'm talking about "uwot" (umap).

Yes, only I use the "umap" package.

 
mytarmailS:

Yes, only I use the "umap" package.

Not quite. Reducing the dimensionality into a space that corresponds to your target is the whole point of the transformation. Besides, these are the only two packages that can process new data (valid/test) and not only train, unlike tSNE.

After the transformation, cluster with dbscan. Introduce the resulting clusters as an additional predictor alongside the variables. There are possible variants here.

Good luck
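The pipeline described above (supervised UMAP, then dbscan, then the clusters as an extra predictor) could be sketched roughly like this, assuming the "uwot" and "dbscan" R packages; the object names and eps/minPts values here are illustrative, not anyone's actual code:

```
library(uwot)    # umap() with supervised mode and ret_model
library(dbscan)  # density-based clustering

# Supervised UMAP: passing the target y makes the embedding respect the classes
emb <- umap(X = x_train, y = factor(y_train),
            n_components = 3, ret_model = TRUE)

# Cluster the embedded points with DBSCAN
cl <- dbscan(emb$embedding, eps = 0.5, minPts = 15)

# Attach the cluster labels to the original predictors as an extra column
x_train_ext <- cbind(as.data.frame(x_train), cluster = factor(cl$cluster))
```

The eps and minPts values would have to be tuned to the scale of the embedding.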

 
mytarmailS:

Yes, but I use the "umap" package

What is the name of the method itself? And what is it anyway? I'd like to look at it in Python.

The pictures look like some kind of amoeba and cell life.

The magician showed similar transformations, by the way. He had dots stretching and shrinking into ellipses, I remember something like that.

 
Maxim Dmitrievsky:

What is the name of the method itself? And what is it anyway? I'd like to look at it in Python.

The pictures look like some kind of amoeba and cell life.

In Python the umap package has the same name.

 
Vladimir Perervenko:

In Python, the umap package has the same name.

Thanks, I'll take a look

 
Vladimir Perervenko:

To reduce the dimensionality to a space that corresponds to your target is the purpose of the transformation.

Well, how do you do that, where do you get this target match from, and what do you mean by that?

Vladimir Perervenko:

Besides, these are the only two packages that can process new data (valid/test) and not only train, unlike tSNE.

I know, that's why I chose this package.

Vladimir Perervenko:

After the transformation, cluster with dbscan. Introduce the resulting clusters as an additional predictor alongside the variables. There are possible variants here.

I know)) I wrote about dbscan on the previous page.)

But it will be a hassle to work with too: first, you still have to fiddle with the clusters, and second, it is slow at recognizing new data.

I read somewhere that either a package was planned, or the feature was supposed to appear in RStudio, that lets you select clusters manually with the mouse. Have you heard anything about it?
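On the slow recognition of new data: as far as I know, the R "dbscan" package provides a predict() method that assigns new points to the already-fitted clusters (via the nearest points within eps, with label 0 for noise). A hedged sketch, worth checking against the package documentation; the variable names are illustrative:

```
library(dbscan)

# Fit DBSCAN once on the training embedding
cl <- dbscan(train_emb, eps = 0.5, minPts = 15)

# Assign new (already embedded) points to the frozen clusters;
# `data` must be the matrix the model was fitted on
new_labels <- predict(cl, newdata = test_emb, data = train_emb)
```

This avoids re-clustering from scratch every time new data arrives.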

 
Maxim Dmitrievsky:

Thanks, I'll check it out.

I am just using it, or rather its wrapper in R.

 
mytarmailS:

Well, how do you do that, where do you get this target match from, and what do you mean by that?

I know, that's why I chose this package.

I know)) I wrote about dbscan on the previous page.)

But it will be a hassle to work with too: first, you still have to fiddle with the clusters, and second, it is slow at recognizing new data.

I read somewhere that either a package was planned, or the feature was supposed to appear in RStudio, that lets you select clusters manually with the mouse. Have you heard anything about it?

In order:

You set the constants:

#---const-------------------------------
evalq({
  n_n <- 15L
  min_d <- 0.1
  n_c <- 3L
  metr <- "euclidean"  # "euclidean" (the default), "cosine", "manhattan"
  lr <- 0.75
  scale <- "none"
  #   "none"
  #   "scale" or TRUE: scale each column to zero mean and unit variance
  #   "maxabs": center each column to mean 0, then divide each element by the
  #             maximum absolute value over the entire matrix
  #   "range":  scale each column to the range [0, 1]
  init <- "spectral"
  #   "spectral", "normlaplacian", "random",
  #   "lvrandom", "laplacian", "pca", "spca"
}, env)

For supervised learning we just need to pass the target y into the call and specify that the model should be returned (ret_model = TRUE).

#-----supervised------------------
evalq({
  y <- factor(denoiseX1pretrain$origin$y)
  origin.sumap <- umap(X = x, y = y, approx_pow = TRUE, n_neighbors = n_n,
                       learning_rate = lr, min_dist = min_d, n_components = n_c,
                       ret_model = TRUE, metric = metr, init = init,
                       n_threads = 4L, scale = scale)
}, env)

With the model in hand we can transform the remaining train/test/test1 subsets of the origin data group to 3 dimensions as well. The code is below.

#---train--------------------------------
evalq({
  set.seed(1337)
  umap_transform(X = X1$train$x, model = origin.sumap, n_threads = 4L,
                 verbose = TRUE) -> train.sumap
}, env)
#---test--------------------------------
evalq({
  set.seed(1337)
  umap_transform(X = X1$test$x, model = origin.sumap, n_threads = 4L,
                 verbose = TRUE) -> test.sumap
}, env)
#---test1-------------------------------
evalq({
  set.seed(1337)
  umap_transform(X = X1$test1$x, model = origin.sumap, n_threads = 4L,
                 verbose = TRUE) -> test1.sumap
}, env)

Substitute your x/y and you get three-dimensional data divided into two groups. This is taken from an unfinished article. I have some pictures somewhere but I can't find them right now; if needed, I'll look for them tomorrow. But I think you can produce your own.

Good luck

 

Found

(attached image: resdimX1_origin)