Machine learning in trading: theory, models, practice and algo-trading - page 35

 

Can someone explain to me, in very simple but reasonably accurate language, by what principles RF builds its ranking of predictor importance?

I have a set with two classes in the target variable; one class has hundreds of times more observations than the other. I remember reading somewhere that one of RF's criteria for predictor importance is the frequency of occurrence of observations.

So I wonder whether RF suppresses the class with few observations when calculating predictor importance.

 
An interesting article - or rather, there is almost no article there, just interesting pictures about non-linear "PCA": https://imdevsoftware.wordpress.com/tag/non-linear-pca/
Discriminating Between Iris Species (imdevsoftware.wordpress.com)
 
mytarmailS:

I suspect you are trading forex; there are no real brokers in forex and they do not actually trade - they are shops operating under bookmaker licenses.

P.S. What do you think of my suggestion for feature selection?

Well yes, it is a dealing center and not a broker. But nobody has cancelled the route to the interbank market.

Your selection approach seems logical. But I decided not to select indicators based on my own notions, because that has never improved the model. I would rather give the algorithm many indicators and let it decide what is good and what is not. Sometimes even my moving averages make it into the final set of predictors; I think they may carry information not by themselves but in combination with other indicators. But my results are still unstable and I cannot guarantee their usefulness yet.
Also, I would not try to predict the reversal itself: in the training data the "business as usual" class will have dozens of times more cases than "reversal", and it is said that the class ratio for training is best kept around 50/50.

 
SanSanych Fomenko:

I have an algorithm that determines the predictive power of a predictor for a particular target variable. In short, it is oscillators and different increments. If a particular predictor has predictive power for a particular target variable, it does not follow that it will have predictive power for another target variable. Moreover, a predictor may have predictive ability in one window and not in another.

The algorithm works well. The predictors it selects do not lead to overfitting of the models.

PS

According to my algorithm, moving averages of any kind have no predictive ability, however ridiculous that may sound.

All models, with any dataset, can be overfitted.

It is another matter that with the right choice and transformation of predictors the probability of overfitting is greatly reduced.

The probability of overfitting depends equally on the dataset and on the type of model.

There is no need for illusions.

Take a look at the pbo package; this question is considered there in an interesting way.

Good luck

 
Vladimir Perervenko:

All models, with any dataset, can be overfitted.

It is another matter that with the right choice and transformation of predictors the probability of overfitting is significantly reduced.

The probability of overfitting depends equally on the dataset and on the type of model.

No need for illusions.

Check out the pbo package; this question is considered there in an interesting way.

Good luck

I looked at it. The premise is completely incomprehensible. Especially "an increase in the number of observations leads to overfitting"???

I use a perfectly understandable and, most importantly, practically valuable criterion.

I have stated its essence many times. Let me repeat.

I use the following criterion of overfitting (overtraining): if the error during training is NOT approximately equal to the error on other data outside the training sample, i.e. on other time intervals, then the model is overfitted. That is, during training the model picked up some specifics that it did not encounter in the subsequent time intervals.

How this is implemented in practice.

We take a quote history, for example 10,000 bars.

We split it mechanically by bar number, without any tricks. That is very important to me, because in practice it will happen exactly this way and not otherwise.

So I take the first bars, from number 1 to 7000. These bars are used for training, testing and validation. To divide them into three sets I use, for example, sample() or whatever is built into the model itself.

I get three model performance figures. If the model is not overfitted, these figures are approximately equal.

Then the most important thing.

I take the file with bars 7001 to 10,000. I apply the model trained on the previous bars and get an error. If that error differs only slightly from the previous three, then the model is NOT overfitted. I usually consider a difference of 15-20% acceptable; if any of the figures differs from any other by more than 50%, the model is overfitted.

So with my methodology I select a subset of predictors from some initial set of predictors. If such predictors are found (which is not guaranteed), then models such as randomForest, SVM, ada and their variants are NOT overfitted! I don't know about other models - I don't use them.

This is not an illusion. This is a fact.
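A minimal R sketch of the procedure described above (randomForest as the model, classification error as the metric, and the variable names are my illustrative assumptions, not the exact code used):

library(randomForest)

# 'quotes' is assumed to be a data.frame of 10,000 bars with predictor columns and a factor column 'target'
n_train <- 7000
first   <- quotes[1:n_train, ]                    # bars 1..7000: training/testing/validation
outSet  <- quotes[(n_train + 1):nrow(quotes), ]   # bars 7001..10000: never touched during training

# mechanical split of the first 7000 bars into three sets with sample()
idx      <- sample(rep(c("train", "test", "valid"), length.out = n_train))
trainSet <- first[idx == "train", ]
testSet  <- first[idx == "test",  ]
validSet <- first[idx == "valid", ]

model <- randomForest(target ~ ., data = trainSet)

# classification error on each subset
err <- function(m, d) mean(predict(m, d) != d$target)
errors <- c(train = err(model, trainSet),
            test  = err(model, testSet),
            valid = err(model, validSet),
            out   = err(model, outSet))
print(errors)

# rough check: if all four errors stay within ~15-20% of each other the model is not overfitted;
# a gap of more than ~50% between any two of them signals overfitting
max(errors) / min(errors)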

 
mytarmailS:

Can someone explain to me, in very simple but reasonably accurate language, by what principles RF builds its ranking of predictor importance?

I have a set with two classes in the target variable; one class has hundreds of times more observations than the other. I remember reading somewhere that one of RF's criteria for predictor importance is the frequency of occurrence of observations.

So I'm wondering, doesn't RF suppress the class with few observations when calculating predictor importance?

You have wildly unbalanced classes, and that is not good. There are algorithms for balancing classes, but for a case like yours I did not succeed with them. I tried to mark the ZZ (ZigZag) reversal not with a single bar but with several bars before and after the reversal. This reduced the imbalance but did not solve the problem.

I have not found models that are guaranteed to work on unbalanced classes.
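For illustration, a minimal base-R sketch of downsampling the majority class so a two-class training set becomes roughly 50/50 (the data frame df and the column target are hypothetical names):

set.seed(1)
counts   <- table(df$target)                    # df$target is a factor with two classes
minority <- names(which.min(counts))
majority <- names(which.max(counts))

# keep every minority-class row, draw an equal number of majority-class rows at random
minRows  <- df[df$target == minority, ]
majRows  <- df[df$target == majority, ]
balanced <- rbind(minRows, majRows[sample(nrow(majRows), nrow(minRows)), ])

table(balanced$target)                          # now approximately 50/50

The caret package also offers downSample() and upSample() helpers for the same task.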

 
SanSanych Fomenko:

You have wildly unbalanced classes, and that is not good. There are algorithms for balancing classes, but for a case like yours they did not work for me. I tried to mark the ZZ (ZigZag) reversal not with a single bar but with several bars before and after the reversal. This reduced the imbalance but did not solve the problem.

I have not found models that are guaranteed to work on unbalanced classes.

I'm interested in the feature selection question
 
mytarmailS:
I'm interested in the question about feature selection

I have answered everything I thought necessary.

If you have a set of features with a target variable, send it to me: I'll run the selection, then build models on the selected features and we'll see the result.

 
Has anyone by any chance tried to use the non-linear PCA I linked above? I'm having trouble applying it to new data; it gives an error.
 
mytarmailS:
Has anyone by any chance tried to use the non-linear PCA I linked above? I'm having trouble applying it to new data; it gives an error.

I don't think this package alone is enough to build a model capable of predicting the target variable. All I found in the help is how to build a PCA model from the predictors; the target variable does not appear there at all.

# install the package; run this once and restart R
source("https://bioconductor.org/biocLite.R")
biocLite("pcaMethods")

# create the pca object
library(pcaMethods)
browseVignettes("pcaMethods")   # help files
data(metaboliteDataComplete)
mdC <- prep(metaboliteDataComplete, scale = "none", center = TRUE)
resNipals <- pca(mdC, method = "nipals", center = FALSE, nPcs = 5)

This creates a resNipals object (NIPALS, Nonlinear Iterative Partial Least Squares) with 5 principal components for the metaboliteDataComplete table. Instead of metaboliteDataComplete you can substitute your own table of predictors. It is important not to feed the target variable in here; it will be needed later.

But that is only enough to analyze the relationships between the variables by examining the various graphs. To build a predictive model on top of this, a linear regression is then fitted that uses the principal components PC1, PC2, PC3, PC4, PC5 as input variables (x1, x2, x3, ...), and the target variable Y is fed to that linear model as the desired output. The problem is that resNipals is an object of class "pcaRes" from the pcaMethods package, and I could not find in the help how to do all of this with it.

If it were a PCA model from the caret package, it would go like this:

# http://www.win-vector.com/blog/2016/05/pcr_part2_yaware/ (section "Is this the same as caret::preProcess?")
newVars <- colnames(resNipals)
resNipals$y <- dTrain$y   # "y" here is the name of the target-variable column, dTrain is the original data table
modelB <- lm(paste('y', paste(newVars, collapse = ' + '), sep = ' ~ '), data = resNipals)
print(summary(modelB)$r.squared)
# then use predict(modelB, newdata = <table with the validation sample>) to forecast on new data

But this does not work with resNipals; in theory the pcaMethods package should have its own functions for working with this object, but I haven't found anything.
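If pcaMethods does provide the usual accessors for pcaRes objects, namely scores() to extract the component scores and a predict() method to project new data (worth checking in the package help, I have not verified it), one possible sketch would look like this; dTrain, dValid and the column y are hypothetical names:

# component scores of the training data, columns PC1..PC5
pcTrain   <- as.data.frame(scores(resNipals))
pcTrain$y <- dTrain$y                       # attach the target variable
modelB    <- lm(y ~ ., data = pcTrain)      # linear model on the principal components
print(summary(modelB)$r.squared)

# project the validation predictors into the same component space
# (dValid holds only predictor columns and should be prepared the same way as the training data, e.g. with prep())
pcValid <- as.data.frame(predict(resNipals, newdata = as.matrix(dValid))$scores)
predict(modelB, newdata = pcValid)          # forecast on new data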