Machine learning in trading: theory, models, practice and algo-trading - page 141

 
Dr.Trader:

I don't know a standard way to do it, but here is one way, for example with a library:


Thank you very much!!! You really know your packages. On another forum I was told, for the same question, to rewrite the format() class from scratch, and was shown an example with ~300 lines of code; I was already starting to think about workarounds, and here is a neat solution... thanks!

 

One more question: I have three dataframes of slightly different lengths, because the observations start at different times.

How can I synchronize them by time, keeping only the observations that are present in all three frames and dropping those found in only some of them?

> head(sec1)
        date  time   open   high    low  close vol
1 2016.09.06 08:45 3081.5 3082.5 3080.5 3080.5   6
2 2016.09.06 08:50 3081.5 3081.5 3079.5 3080.5   6
3 2016.09.06 08:55 3081.5 3082.5 3081.5 3082.5  19
4 2016.09.06 09:00 3083.5 3083.5 3081.5 3082.5  19
5 2016.09.06 09:05 3083.5 3085.5 3082.5 3085.5   8
6 2016.09.06 09:10 3086.5 3086.5 3084.5 3086.5  15
> head(sec2)
        date  time  open  high   low close vol
1 2016.09.13 13:00 95.34 95.40 95.33 95.39  36
2 2016.09.13 13:05 95.40 95.43 95.39 95.41  40
3 2016.09.13 13:10 95.42 95.44 95.40 95.42  37
4 2016.09.13 13:15 95.41 95.42 95.39 95.39  25
5 2016.09.13 13:20 95.40 95.41 95.38 95.38  21
6 2016.09.13 13:25 95.39 95.42 95.38 95.42  32
> head(sec3)
        date  time    open    high     low   close vol
1 2016.09.14 18:10 1.12433 1.12456 1.12431 1.12450 137
2 2016.09.14 18:15 1.12444 1.12459 1.12424 1.12455 139
3 2016.09.14 18:20 1.12454 1.12477 1.12446 1.12469 148
4 2016.09.14 18:25 1.12468 1.12474 1.12442 1.12453 120
5 2016.09.14 18:30 1.12452 1.12483 1.12442 1.12482 156
6 2016.09.14 18:35 1.12481 1.12499 1.12472 1.12474 126
 
https://www.mql5.com/en/blogs/post/650079

Interesting. Improving an existing strategy with machine learning. The article suffers from a lack of information about the sampling, but the idea is interesting.
 
mytarmailS:

One more question: I have three dataframes of slightly different lengths, because the observations start at different times.

How can I synchronize them by time, keeping only the observations that are present in all three frames and dropping those found in only some of them?

A head-on approach would be like this:


a <- data.frame(c1 = c('a','b','c','d','e','f'), c2 = c(1,2,3,4,5,6))
b <- data.frame(c1 = c('a','b','c','d','e'),     c2 = c(1,2,3,4,5))
c <- data.frame(c1 = c('b','c','d','e','f'),     c2 = c(2,3,4,5,6))

# build a row key by concatenating all columns
a$concat <- do.call(paste0, a[1:2])
b$concat <- do.call(paste0, b[1:2])
c$concat <- do.call(paste0, c[1:2])

# pool the keys of all three frames and count how often each one occurs
concat_vec     <- c(unique(a$concat), unique(b$concat), unique(c$concat))
concat_vec_tbl <- as.data.frame(table(concat_vec))

# a key occurring 3 times is present in every frame
concat_vec_tbl <- concat_vec_tbl[concat_vec_tbl$Freq == 3, ]

# keep only the rows whose key occurs in all three frames
a <- a[a$concat %in% concat_vec_tbl$concat_vec, ]
b <- b[b$concat %in% concat_vec_tbl$concat_vec, ]
c <- c[c$concat %in% concat_vec_tbl$concat_vec, ]
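
A shorter route for the OHLC frames shown above, as a hedged sketch (it assumes sec1, sec2 and sec3 are the frames from the head() output; this is my suggestion, not code from the thread): merge() performs an inner join, so joining on the date and time columns keeps only the bars present in both inputs, and chaining two merges intersects all three frames.

# inner join on the (date, time) key; suffixes disambiguate the
# repeated OHLCV column names coming from sec1 and sec2
synced <- merge(sec1, sec2, by = c("date", "time"),
                suffixes = c(".sec1", ".sec2"))
synced <- merge(synced, sec3, by = c("date", "time"))

nrow(synced)  # number of bars common to all three frames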

 
Alexey Burnakov:
https://www.mql5.com/en/blogs/post/650079

Interesting. Improving an existing strategy with machine learning. The article suffers from a lack of information about the sampling, but the idea is interesting.
Good article. I have also been experimenting with SMM lately, but in a more conventional way.
 
Alexey Burnakov:

A head-on approach would be like this:

thanks
 
Alexey Burnakov:
https://www.mql5.com/en/blogs/post/650079

Interesting. Improving an existing strategy with machine learning. The article suffers from a lack of information about the sampling, but the idea is interesting.

Alexey!

What an interesting person you are!

I've written here a hundred times that I use rf (random forest) to improve the performance of a real TS based on indicators, and you never responded.

Moreover, I have expressed this idea several times:

1. Take a real TS.

2. Identify the problems of that TS and start solving them using R.

In the case I mentioned, I used rf to solve the problem of indicator lag: an indicator usually gives information as of the previous bar (-1), while rf gives information one bar ahead. For H4 that is 8 hours! As a result I managed to reduce the drawdown significantly.
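
To make that concrete, here is a minimal sketch of the idea as I read it, on synthetic data (the random-walk series, the 8-bar SMA standing in for "the indicator", and all variable names are my own assumptions, not SanSanych's actual TS): instead of waiting for the indicator to arrive a bar late, a random forest is trained to output its value one bar ahead.

library(randomForest)

set.seed(42)
# synthetic stand-ins: a random-walk price and an 8-bar SMA that,
# like most indicators, only becomes known with a lag
price <- cumsum(rnorm(500))
ind   <- as.numeric(stats::filter(price, rep(1/8, 8), sides = 1))

d <- data.frame(ind      = ind,
                ind_lag1 = c(NA, head(ind, -1)),
                ahead    = c(tail(ind, -1), NA))  # indicator one bar ahead
d <- na.omit(d)

train <- 1:350
rf    <- randomForest(ahead ~ ind + ind_lag1, data = d[train, ])
prd   <- predict(rf, d[-train, ])  # forecast of the indicator one bar ahead

On H4 data, a one-bar-ahead forecast like prd is exactly the 8-hour head start described above.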

 
Alexey Burnakov:
https://www.mql5.com/en/blogs/post/650079

Interesting. Improving an existing strategy with machine learning. The article suffers from a lack of information about the sampling, but the idea is interesting.

The idea of this article is implemented somewhat differently in https://www.mql5.com/ru/articles/1628.

I wonder if everyone reads the machine learning articles on this site with a year's delay? (A rhetorical question.)

Good luck

Deep neural network with Stacked RBM. Self-training, self-control
  • 2016.03.31
  • www.mql5.com
The article continues the previous articles on deep neural networks and predictor selection. It examines a neural network initialized with Stacked RBM and its implementation in the "darch" package.
 
SanSanych Fomenko:

Alexey!

What an interesting person you are!

I've written here a hundred times that I use rf (random forest) to improve the performance of a real TS based on indicators, and you never responded.

Moreover, I have expressed this idea several times:

1. Take a real TS.

2. Identify the problems of that TS and start solving them using R.

In the case I mentioned, I used rf to solve the problem of indicator lag: an indicator usually gives information as of the previous bar (-1), while rf gives information one bar ahead. For H4 that is 8 hours! As a result I managed to reduce the drawdown significantly.

I understand. It's just that without specifics it is difficult to assess the depth of the idea. And that article had pictures. Perervenko's article is similar, and I have read it too.
 
Alexey Burnakov:
I understand. It's just that without specifics it is difficult to assess the depth of the idea. And that article had pictures. Perervenko's article is similar, and I have read it too.

Well, now you're offended...

My goal is to steer the conversation in a practical direction, not to offend anyone in any way...

So far we have scattered bits and pieces.

Your approach is academic... The value of your calculations is unquestionable to me, but... I expressed my doubts above.

I follow Vladimir Perervenko's work carefully. I have never seen evidence there that the models are not overtrained. Take the last link: the importance of variables is determined by an algorithm from one of the tree variants. But because noise predictors offer plenty of convenient split values, trees tend to use those noise predictors more often, and as a result noise pops up in the importance estimates...

So you have to start with algorithms that remove the noise predictors. Without that, all the other steps make no practical sense, because none of the model's estimates can be extrapolated into the future.
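
For what it's worth, here is a minimal sketch of one such screening step, using the Boruta package (my choice of tool, not necessarily the algorithm SanSanych has in mind): it compares each predictor's importance against shuffled "shadow" copies and rejects the ones that do no better than noise.

library(Boruta)

set.seed(1)
# synthetic example: two informative predictors buried among eight noise ones
X <- as.data.frame(matrix(rnorm(300 * 10), ncol = 10))
y <- factor(ifelse(X$V1 + X$V2 + rnorm(300, sd = 0.5) > 0, "up", "down"))

scr <- Boruta(X, y)         # importance vs. shadow (permuted) features
print(scr)                  # Confirmed / Tentative / Rejected per predictor
getSelectedAttributes(scr)  # predictors that survive the screening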

Then comes training the model in windows, where the window width must somehow be justified. Then comes applying the trained model, with the predictors pre-selected for the working window...
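
And a bare-bones sketch of that windowed training (synthetic data again; the window width and step are the free parameters that, as noted, must be justified somehow):

library(randomForest)

set.seed(7)
d   <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
d$y <- d$x1 - d$x2 + rnorm(400, sd = 0.3)

window <- 100   # training window width: a free parameter to justify
step   <- 20    # how far the window advances each iteration
preds  <- rep(NA_real_, nrow(d))

# retrain on each window, predict the bars just after it
for (start in seq(1, nrow(d) - window - step + 1, by = step)) {
  train_idx <- start:(start + window - 1)
  test_idx  <- (start + window):(start + window + step - 1)
  fit <- randomForest(y ~ ., data = d[train_idx, ])
  preds[test_idx] <- predict(fit, d[test_idx, ])
}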

Something like this.
