Machine learning in trading: theory, models, practice and algo-trading - page 18
Just a few ideas:
I tried to make such a system purely in MQL4:
formed a vector of inputs (simply price differences with a lag);
formed ideal inputs and outputs (there are many methods, I chose the one I liked best).
All of this on 5-minute bars, for example.
Then on every new bar I extended the array, looked for similar patterns in the past and calculated the percentage of buy and sell entries inside a multidimensional sphere of variable radius. It took a very long time to test, and the results were unstable.
I want to try this research again in R sometime, i.e. a search for entries and exits by Euclidean distance; a rough sketch of the idea is below.
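If it helps, here is a minimal R sketch of that idea as I read it (toy data; the lag depth, the radius rule and the outcome definition are my own illustrative choices, not the original MQL4 code):

# Minimal sketch of the Euclidean-distance pattern search described above.
# 'price' is a toy 5-minute close series; lag depth, radius rule and the
# outcome definition are illustrative assumptions.
set.seed(1)
price <- cumsum(rnorm(2000))              # toy price series
depth <- 10

pat <- embed(diff(price), depth)          # each row = vector of lagged price differences
n   <- nrow(pat)

# outcome of each historical pattern: did price rise on the next bar?
future_ret <- diff(price)[(depth + 1):(n + depth - 1)]
hist_pat   <- pat[1:(n - 1), ]
outcome    <- ifelse(future_ret > 0, "buy", "sell")

current <- pat[n, ]                       # the freshest pattern
d <- sqrt(rowSums((hist_pat - matrix(current, nrow(hist_pat), depth, byrow = TRUE))^2))

radius <- quantile(d, 0.05)               # "sphere of variable radius": here the 5% nearest
inside <- outcome[d <= radius]
prop.table(table(inside))                 # share of buy vs sell entries among similar patterns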
Clustering is a bit different. Say the market is now in cluster number 5 and the next candle will be in cluster number 18; that gives us nothing, because we have no time to trade cluster number 5. In an SMM (hidden Markov model) there is the concept of a state, and a state can last for a certain time.
Or maybe I am not following your idea?
Long sequences of the series (even 100 candlesticks) are clustered. A whole bunch of inputs can be built on top of these long sequences. And at some point the system switches from cluster 5 to cluster 45, but this happens, so to speak, slowly.
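For what it's worth, a rough sketch of what clustering long windows could look like in R (window length, number of clusters and the normalization are arbitrary choices here):

# Rough sketch: cluster overlapping 100-candle windows with k-means.
# Window length, number of clusters and the use of returns are illustrative.
set.seed(1)
price   <- cumsum(rnorm(3000))            # toy series
returns <- diff(price)

win <- 100
X   <- embed(returns, win)                # each row = one 100-candle window
X   <- t(apply(X, 1, scale))              # normalize each window

km    <- kmeans(X, centers = 50, nstart = 5, iter.max = 30)
state <- km$cluster                       # cluster label of every window

# how "slowly" the system switches clusters: run lengths of the same label
head(rle(state)$lengths)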
And another question for the R connoisseurs:
library(kza)
DAT <- rnorm(1000)
KZP <- kzp(DAT,m=100,k=3)
summary(KZP, digits=2,top=3)
How can I get the numbers out of summary (http://prntscr.com/bhtlo9) so that I can work with them?
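summary() mostly just prints, so the usual way is to dig into the kzp object itself. A hedged sketch (the component name below is an assumption; check what str() actually shows in your kza version):

library(kza)

DAT <- rnorm(1000)
KZP <- kzp(DAT, m = 100, k = 3)

str(KZP)                           # shows which components the kzp object actually contains

# Assuming (not guaranteed across kza versions) that the object stores the
# periodogram values as a plain list component:
pg   <- KZP$periodogram            # hypothetical component name - verify with str(KZP)
top3 <- order(pg, decreasing = TRUE)[1:3]
pg[top3]                           # the three largest periodogram values, as numbers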
I tried to read the article on r-bloggers, but since I don't know English I didn't understand anything. Can you explain in plain language what the essence of this selection method is and how it works?
Purely intuitively, and judging by the first test results, I suspect that this method is very similar to the principal component method, maybe even the same thing...
First test: I had a sample with 30 predictors, I trained RF and got Accuracy: 0.6511.
Then I selected predictors with your method;
we got 14 predictors, and the result was Accuracy: 0.6568.
In effect we threw away half of the predictors and slightly improved the forecast, which is not bad.
I will try other datasets...
Another question: why, if I load the same data into rattle, do I get a 3-6% error on the validation data? How should I understand that?
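For reference, this is roughly how such a before/after comparison can be set up in plain R with randomForest; the selection step itself is not reproduced, the data are toy and all settings are illustrative:

# Sketch of the before/after comparison (the predictor-selection step itself
# is whatever method was used above and is not reproduced here).
library(randomForest)

set.seed(42)
n <- 2000
X <- as.data.frame(matrix(rnorm(n * 30), ncol = 30))   # 30 toy predictors
y <- factor(ifelse(X$V1 + X$V2 + rnorm(n) > 0, "buy", "sell"))

train <- 1:1500; test <- 1501:n
acc <- function(cols) {
  fit <- randomForest(x = X[train, cols, drop = FALSE], y = y[train])
  mean(predict(fit, X[test, cols, drop = FALSE]) == y[test])
}

acc(1:30)                       # accuracy with all 30 predictors
selected <- c(1, 2, 5, 7)       # stand-in indices for whichever predictors selection kept
acc(selected)                   # accuracy with the reduced set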
The thing is, if we cluster a sequence of 100 candles, we only find out it was cluster #5 at candle 101, i.e. we have already burned 100 candles. Whereas in an SMM, being in state (cluster) number 45, we already know that we will probably move to state number 5.
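That persistence idea can be checked directly. A hedged sketch with the depmixS4 package (just one possible choice, not what anyone here used; the number of states and the toy data are illustrative):

# Hedged sketch of the SMM/HMM idea: states persist over time, and the fitted
# transition matrix says where we are likely to go next.
library(depmixS4)

set.seed(7)
returns <- c(rnorm(700, sd = 0.5), rnorm(800, sd = 2))   # toy series with a volatility shift
df      <- data.frame(r = returns)

mod <- depmix(r ~ 1, data = df, nstates = 3, family = gaussian())
fm  <- fit(mod, verbose = FALSE)

post <- posterior(fm)                      # most likely state per bar plus probabilities
tail(post$state)                           # the current state
summary(fm, which = "transition")          # transition probabilities between states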
The principal component method with some modification is used.
The main idea of using this method is not to increase prediction accuracy. The main idea is that the prediction accuracy obtained should remain roughly the same in the future. In practice, and we know this from our tester, it is almost always possible to squeeze amazing results out of Expert Advisors through optimization. However, with dull regularity those grails then drain the deposit. This happens because during training the Expert Advisor picks up particularities that do not repeat in the future. Predictors that have no relation to the target variable, i.e. noise, are especially "useful" as such particularities. During optimization, or model fitting in R, it is always possible to extract values from this noise that radically improve performance. But this does not happen in the future, and the EA fails.
Once again: this tool lets you eliminate the difference between the model's results on the training sample and on future quotes. As I see it, only after solving the overfitting problem can we move on.
An error below 10%, let alone 5%, is clear evidence that the model is overfitted. And overfitting is caused by the set of input predictors, not by the model itself.
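The point about noise predictors is easy to demonstrate: a model fitted on pure noise can look excellent on the data it has seen and collapses to coin flipping on new data. A small illustrative sketch (toy data, randomForest as an example model):

# Illustration of the overfitting point: with purely random predictors a model
# can look good on the data it was fitted to, yet is useless on new data.
library(randomForest)

set.seed(123)
n     <- 500
noise <- as.data.frame(matrix(rnorm(n * 50), ncol = 50))   # 50 pure-noise predictors
y     <- factor(sample(c("up", "down"), n, replace = TRUE))

fit <- randomForest(x = noise, y = y)

# accuracy when the training rows are fed back through the model looks impressive...
mean(predict(fit, noise) == y)

# ...but on genuinely new data of the same kind it drops to coin-flip level
new_noise <- as.data.frame(matrix(rnorm(n * 50), ncol = 50))
new_y     <- factor(sample(c("up", "down"), n, replace = TRUE))
mean(predict(fit, new_noise) == new_y)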
Question one: why are the results in R and rattle different on the same data and the same model?
Question two: what is the point of checking the model "out of sample" in rattle if it shows such nonsense?
You probably have different forest parameters in R and in rattle, hence the different results. In rattle itself you can also change the number of trees and the number of variables.
So you get a 34% error in rattle on the training data and a 3% error on the validation data? Something is wrong with the test data then: either it somehow already appeared in the training data, or your dataset is very small and it just happened to turn out that way.
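One way to make the R-versus-rattle comparison fair is to pin down the forest settings and the split explicitly and rerun both on exactly the same partitions. A minimal sketch in plain R (toy data; ntree/mtry/seed values are illustrative):

# Sketch: fix the seed, the split and the forest settings so the validation
# error can be compared like for like across tools.
library(randomForest)

set.seed(2016)
n   <- 1000
dat <- data.frame(matrix(rnorm(n * 10), ncol = 10))
dat$target <- factor(ifelse(dat$X1 + rnorm(n) > 0, "up", "down"))

idx   <- sample(n, 0.7 * n)                 # explicit, reproducible 70/30 split
train <- dat[idx, ];  valid <- dat[-idx, ]

fit <- randomForest(target ~ ., data = train, ntree = 500, mtry = 3)
mean(predict(fit, valid) == valid$target)   # validation accuracy on the held-out part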