Machine learning in trading: theory, models, practice and algo-trading - page 91
a package that is able to select time series (BP) that can be predicted and those that cannot, if I understand correctly
http://www.gmge.org/2012/05/foreca-forecastable-component-analysis/
http://www.gmge.org/2015/01/may-the-forec-be-with-you-r-package-foreca-v0-2-0/
Open to all comers. The z1 archive contains two files, train and test. Build a model for the Target on train, apply it to test, and post the results in % (successfully predicted cases) for both samples (train = xx%, test = xx%). Methods and models do not need to be announced, just the numbers. Any data manipulation and mining methods are allowed.
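A minimal sketch of the requested workflow in R, assuming the archive unpacks into train.csv and test.csv with a TFC_Target column (file and column names are assumptions; any model can be substituted):

library(randomForest)

# Load both samples from the unpacked z1 archive (file names assumed)
train <- read.csv("train.csv")
test  <- read.csv("test.csv")
train$TFC_Target <- as.factor(train$TFC_Target)
test$TFC_Target  <- as.factor(test$TFC_Target)

# Fit any model on train; a random forest is used here only as an example
model <- randomForest(TFC_Target ~ ., data = train, ntree = 500)

# Percentage of successfully predicted cases on both samples
acc <- function(data) mean(predict(model, data) == data$TFC_Target) * 100
cat(sprintf("train = %.1f%%, test = %.1f%%\n", acc(train), acc(test)))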
1. All of your predictors have no predictive power - without exception, they are all noise.
2. Three models were built: rf, ada, SVM. Here are the results
rf
Call:
randomForest(formula = TFC_Target ~ .,
    data = crs$dataset[crs$sample, c(crs$input, crs$target)],
    ntree = 500, mtry = 3, importance = TRUE, replace = FALSE,
    na.action = randomForest::na.roughfix)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 3
OOB estimate of error rate: 49.71%
Confusion matrix:
        [0,0] (0,1] class.error
[0,0]     197   163   0.4527778
(0,1]     185   155   0.5441176
ada
Call:
ada(TFC_Target ~ ., data = crs$dataset[crs$train, c(crs$input,
crs$target)], control = rpart::rpart.control(maxdepth = 30,
cp = 0.01, minsplit = 20, xval = 10), iter = 50)
Loss: exponential Method: discrete Iteration: 50
Final Confusion Matrix for Data:
Final Prediction
True value (0,1] [0,0]
(0,1] 303 37
[0,0] 29 331
Train Error: 0.094
Out-Of-Bag Error: 0.157 iteration= 50
SVM
Summary of the SVM model (built using ksvm):
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 1
Gaussian Radial Basis kernel function.
Hyperparameter : sigma = 0.12775132444179
Number of Support Vectors : 662
Objective Function Value : -584.3646
Training error : 0.358571
Probability model included.
Time taken: 0.17 secs
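For reference, a summary like the one above is what kernlab prints; a hedged sketch of a call that would produce this kind of model (data frame and target names assumed from the thread, sigma left to the automatic estimate):

library(kernlab)

# C-svc with a Gaussian RBF kernel; sigma is estimated automatically
# (kpar = "automatic"), C = 1 as in the summary above
svm_model <- ksvm(TFC_Target ~ ., data = train,
                  type = "C-svc", kernel = "rbfdot",
                  C = 1, prob.model = TRUE)
svm_model   # prints SV type, sigma, number of support vectors, training error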
On the test set (meaning the one loaded into Rattle as validation, not yours):
Error matrix for the Ada Boost model on test.csv [validate] (counts):
Predicted
Actual (0,1] [0,0]
[0,0] 33 40
(0,1] 35 42
Error matrix for the Ada Boost model on test.csv [validate] (proportions):
Predicted
Actual (0,1] [0,0] Error
[0,0] 0.22 0.27 0.55
(0,1] 0.23 0.28 0.45
Overall error: 50%, Averaged class error: 50%
Rattle timestamp: 2016-08-08 15:48:15 user
======================================================================
Error matrix for the Random Forest model on test.csv [validate] (counts):
Predicted
Actual [0,0] (0,1]
[0,0] 44 29
(0,1] 44 33
Error matrix for the Random Forest model on test.csv [validate] (proportions):
Predicted
Actual [0,0] (0,1] Error
[0,0] 0.29 0.19 0.40
(0,1] 0.29 0.22 0.57
Overall error: 49%, Averaged class error: 48%
Rattle timestamp: 2016-08-08 15:48:15 user
======================================================================
Error matrix for the SVM model on test.csv [validate] (counts):
Predicted
Actual [0,0] (0,1]
[0,0] 41 32
(0,1] 45 32
Error matrix for the SVM model on test.csv [validate] (proportions):
Predicted
Actual [0,0] (0,1] Error
[0,0] 0.27 0.21 0.44
(0,1] 0.30 0.21 0.58
Overall error: 51%, Averaged class error: 51%
Rattle timestamp: 2016-08-08 15:48:15 user
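The error matrices above can be reproduced from any model's predictions on the hold-out file; a minimal sketch (object names assumed, as in the earlier snippet):

# Counts, proportions and overall error for a set of hold-out predictions
pred   <- predict(model, test)              # any of the fitted models
actual <- test$TFC_Target

counts <- table(Actual = actual, Predicted = pred)
props  <- round(prop.table(counts), 2)      # proportions of all cases
overall_error <- 1 - sum(diag(counts)) / sum(counts)

counts; props
cat(sprintf("Overall error: %.0f%%\n", 100 * overall_error))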
ROC analysis for the random forest confirms the above.
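A hedged sketch of that ROC check using the pROC package and the random forest's class probabilities (assuming a binary factor target as above):

library(pROC)

# Probability of the positive class from the random forest on the hold-out set
prob <- predict(model, test, type = "prob")[, 2]

roc_rf <- roc(response = test$TFC_Target, predictor = prob)
auc(roc_rf)        # AUC near 0.5 means no better than a coin flip
plot(roc_rf)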
Conclusion.
Your set of predictors is hopeless.
a package that can select time series (BP) which can be predicted and which cannot, if I understand correctly
I read the description; judging by it, this is a very good package (ForeCA; it is even in the R repository, no need to download anything from GitHub). Its main feature is that it rates the "predictability" of the data.
And, also important, it can be applied to reduce the dimensionality of the data. That is, from the existing predictors the package will make a couple of new ones with surprisingly good predictability, sifting out the garbage along the way. It is reminiscent of principal component analysis, only instead of principal components it produces components of its own.
Put very simply: give the package a table with a bunch of predictors (prices, indicators, deltas, garbage, etc.). ForeCA will return a new table in place of the original one, and we use that new table to train our predictive model (gbm, rf, nnet, etc.); see the sketch after this post.
Put a little more technically, it is another package for kernel-style transformation of the data, with a bias toward the stock market.
It all sounds great, just great, even too great; I'll have to check it out.
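A hedged sketch of that workflow with the ForeCA package (function names from its documentation; the accessor for the extracted components and the predictor matrix X are assumptions, and the series should be stationary, e.g. deltas rather than raw prices):

library(ForeCA)

# X: numeric matrix of predictors (indicators, deltas, ...), converted to a
# multivariate time series as ForeCA expects
series <- ts(as.matrix(X))

# Omega() rates the forecastability of each original series (0% = white noise)
Omega(series)

# Extract a couple of maximally forecastable components to replace the
# original predictors, roughly analogous to PCA scores
fc <- foreca(series, n.comp = 2)
new_table <- as.data.frame(fc$scores)   # accessor name assumed

# new_table then replaces the original predictors when training gbm/rf/nnet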
a package that is able to select time series (BP) that can be predicted and those that cannot, if I understand correctly
http://www.gmge.org/2012/05/foreca-forecastable-component-analysis/
http://www.gmge.org/2015/01/may-the-forec-be-with-you-r-package-foreca-v0-2-0/
Extremely curious.
The package is installed, documentation is available.
Maybe someone will try it and post the result?
Wouldn't that require a pre-screening?
Folks, have a go at it!
Conclusion.
Your set of predictors is hopeless.
"post results in % (successfully predicted cases) for both samples (train = xx%, test = xx%). Methods and models don't need to be announced, only numbers".
We are waiting for more results. It will be interesting to see what conclusions Mihail Marchukajtes comes to.
Okay)))) but read the conditions carefully -
"post results in % (successfully predicted cases) for both samples (train = xx%, test = xx%). Methods and models don't need to be announced, only numbers".
You don't need a test!
The model cannot be trained! You can't test empty space.
I read the description and it seems to be a very good package (ForeCA, ..............