Machine learning in trading: theory, models, practice and algo-trading - page 103

 
I also wonder how the foreca package determines the dependence of inputs and outputs. Maybe it's trivial, maybe not.
 
Alexey Burnakov:
I also wonder how the foreca package determines the dependence of inputs and outputs. Maybe it's trivial, maybe not.

I only skimmed it, so correct or confirm me, but as I understand it the package uses some internal algorithm to evaluate the predictive ability of a particular random variable, i.e. how well its values can be extrapolated into the future. For a trending plot we get nearly one, and for price increments we get 0.83%, which to me says it all. And that is the whole point of the method: in practice it is exactly the opposite, increments are predicted much better, as they are much closer to stationarity than trends in a non-stationary time series. We have to understand the tool itself and apply it only to the objects the tool fits.

In systems-analysis terms, this is a known class of error.

Mistake #1: applying the right method to the wrong problem.

It is extremely common in statistics.
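As a toy check of that trend-vs-increments point (a sketch, assuming the ForeCA package and its Omega() forecastability measure; exact numbers will differ from the 0.83% quoted above):

library(ForeCA)   # Omega(): 0% = white noise, 100% = perfect sinusoid
set.seed(42)
incr  <- rnorm(1000)    # "increments": white noise, Omega should be near 0%
trend <- cumsum(incr)   # random walk that looks like a trending price plot
Omega(incr)
Omega(trend)            # expected much higher: low frequencies dominate the spectrum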

 
SanSanych Fomenko:

In general, I think the original point has been lost. As I saw it, the original idea was that we need model-independent methods to determine the ability of each predictor to predict the target variable.

An example of model-independent selection is the vtreat package. It somehow analyzes the data and then scores how well each predictor matches the target values.

data(iris)
iris[,5] <- as.numeric(iris[,5]) # the package cannot handle factors, so convert them to numbers
iris_rand <- runif(nrow(iris)*10, min(iris[,1:4]), max(iris[,1:4])) # 10 new predictors with random values
dim(iris_rand) <- c(nrow(iris), 10)
colnames(iris_rand) <- paste0("rand_", c(1:10))
iris <- cbind(iris_rand, iris) # random predictors first, the 4 real ones and the target last
library(vtreat)
treatments <- designTreatmentsN(dframe = iris, varlist = colnames(iris)[1:(ncol(iris)-1)], outcomename = colnames(iris)[ncol(iris)], verbose = TRUE)
treatments
format(treatments)
significance <- treatments$scoreFrame[,"sig"] # per-predictor significance scores, lower is better
names(significance) <- treatments$scoreFrame[,"origName"]
barplot(significance)

For the iris table, this creates 10 new predictors with random values. designTreatmentsN scores each predictor; the lower the score, the better. In this example the 4 original predictors (the last ones on the graph) clearly stand out with almost-zero scores, which is very good. When selecting predictors, the first thing to do is remove the ones with the highest scores.
If the target takes only 2 values (0/1, TRUE/FALSE, -1/1, a factor with 2 levels, etc.), there is the designTreatmentsC function for that case.
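For the binary case the call is analogous; a minimal sketch with made-up synthetic data (the data frame d and its columns are hypothetical, only designTreatmentsC itself comes from vtreat):

library(vtreat)
set.seed(1)
d <- data.frame(x1 = runif(100), x2 = runif(100))      # hypothetical toy data
d$y <- as.numeric(d$x1 + rnorm(100, sd = 0.1) > 0.5)   # synthetic 0/1 target driven by x1
treatmentsC <- designTreatmentsC(dframe = d, varlist = c("x1", "x2"),
                                 outcomename = "y", outcometarget = 1, verbose = FALSE)
treatmentsC$scoreFrame[, c("origName", "sig")]          # as above: lower sig is better (expect x1 << x2)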


This package is also used for y-aware PCA. vtreat rescales the predictors into y-aware units, and the components are then built almost as usual (just without re-scaling and centering). So, if you want, you can use this package for interesting things like a y-aware random forest.
Read more here: https://cran.r-project.org/web/packages/vtreat/vignettes/vtreatScaleMode.html
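For what it's worth, the y-aware PCA step from that vignette boils down to something like this sketch (reusing the treatments object from the code above; details may differ from the vignette):

# y-aware scaling: prepare() with scale = TRUE puts each predictor into "y units";
# PCA is then run without any further centering or rescaling
iris_scaled <- prepare(treatments, iris, scale = TRUE)
vars <- treatments$scoreFrame$varName   # names of the treated predictor columns
pca <- prcomp(iris_scaled[, vars], center = FALSE, scale. = FALSE)
plot(pca)   # the y-informative components should dominate the variance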

 
Dr.Trader:

An example of model-independent selection is the vtreat package. It somehow analyzes the data and then scores how well each predictor matches the target values.

For the iris table, this creates 10 new predictors with random values. designTreatmentsN scores each predictor; the lower the score, the better. In this example the 4 original predictors (the last ones on the graph) clearly stand out with almost-zero scores, which is very good. When selecting predictors, the first thing to do is remove the ones with the highest scores.
If the target takes only 2 values (0/1, TRUE/FALSE, -1/1, a factor with 2 levels, etc.), there is the designTreatmentsC function for that case.

This package is also used for y-aware PCA. vtreat rescales the predictors into y-aware units, and the components are then built almost as usual (just without re-scaling and centering). So, if you want, you can use this package for interesting things like a y-aware random forest.
Read more here: https://cran.r-project.org/web/packages/vtreat/vignettes/vtreatScaleMode.html

Well, there we go, back to basics.

Now the next step.

Use any model ONLY after the data has been preprocessed for it by these packages. The hope is that with this preprocessing the models will NOT suffer from overfitting.

 
SanSanych Fomenko:

Applying the right method to the wrong problem.

I did not quite get how ForeCA evaluates a time series; there are a lot of formulas in the description. I caught something about the correlation of new values with old ones, and about analyzing the frequency spectrum after a Fourier transform and how that spectrum changes on new data. You need to understand radiophysics there more than forex :)

There are examples in the package description where it is applied to the DAX, SMI, CAC and FTSE indices; exactly the D1 increments are used, i.e. both the package and the problem are the right fit.
There is a small nuance: in the example the price series is first passed through log() and only then are the deltas taken: diff(log(EuStockMarkets[c(100:200),])) * 100
However, I have not noticed any difference between using and not using log(); the estimation result did not change. It was done rather for a more convenient display of the data on the chart.
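For reference, the package example boils down to roughly this (a sketch assuming the ForeCA package; EuStockMarkets ships with base R):

library(ForeCA)
ret <- ts(diff(log(EuStockMarkets)) * 100)   # daily log-returns in percent, as in the package example
Omega(ret)                                   # forecastability of each index, in % (0% = white noise)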

 
Dr.Trader:

I did not quite get how ForeCA evaluates a time series; there are a lot of formulas in the description. I caught something about the correlation of new values with old ones, and about analyzing the frequency spectrum after a Fourier transform and how that spectrum changes on new data. You need to understand radiophysics there more than forex :)

There are examples in the package description where it is applied to the DAX, SMI, CAC and FTSE indices; exactly the D1 increments are used, i.e. both the package and the problem are the right fit.
There is a small nuance: in the example the price series is first passed through log() and only then are the deltas taken: diff(log(EuStockMarkets[c(100:200),])) * 100
However, I have not noticed any difference between using and not using log(); the estimation result did not change. It was done rather for a more convenient display of the data on the chart.

Regarding this package, the question for me is fundamental:

  • does it give the predictive ability of an individual predictor,
  • or the predictive ability of the target variable using the predictors?

 
SanSanych Fomenko:

With respect to this package, the question for me is fundamental:

  • does it give the predictive ability of an individual predictor,
  • or the predictive ability of the target variable using the predictors?

It does not look for a relationship between the target variable and the predictors. The author of the package describes two applications:

1) evaluate a time series to see whether it is forecastable at all (on a scale from 0% = "white noise" to 100% = sinusoid) with Omega(). If the result is 0%, then trying to predict the behaviour of that time series is hopeless, no matter what predictors are used.
2) take some predictors, evaluate them with the same function, then create new predictors (in the spirit of PCA) so that the new predictors have even better Omega() scores than the originals; see the sketch below. Whether that helps you predict the target values better is a matter of luck, because the package does not care what you want to predict with these predictors. The point is only that if a predictor is not noise, models that use it should predict more stably.
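Point 2 corresponds to the package's foreca() function, which extracts the most forecastable linear combinations much like prcomp() extracts the highest-variance ones; a minimal sketch (n.comp chosen arbitrarily):

library(ForeCA)
ret <- ts(diff(log(EuStockMarkets)) * 100)
mod <- foreca(ret, n.comp = 2)   # the 2 most forecastable components
summary(mod)                     # compare their Omega() scores with those of the raw series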

 
Dr.Trader:

It does not look for a relationship between the target variable and the predictors. The author of the package describes two applications:

1) evaluate a time series to see whether it is forecastable at all (on a scale from 0% = "white noise" to 100% = sinusoid) with Omega(). If the result is 0%, then trying to predict the behaviour of that time series is hopeless, no matter what predictors are used.
2) take some predictors, evaluate them with the same function, then create new predictors (in the spirit of PCA) so that the new predictors have even better Omega() scores than the originals. Whether that helps you predict the target values better is a matter of luck, because the package does not care what you want to predict with these predictors. The point is only that if a predictor is not noise, models that use it should predict more stably.

You have substantially confirmed my suspicions.

Thank you.

I think the package is useless for classification.

But for extrapolation-type forecasting it can be useful.

For example, take the forecast package. It decomposes the series into three components, extrapolates each of them forward, and adds them back up. That gives a forecast one step or more ahead.
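If that three-component scheme means an STL-style decomposition (trend + seasonal + remainder), a minimal sketch with the forecast package would be (AirPassengers used only as a stand-in series):

library(forecast)
fit <- stlf(AirPassengers, h = 3)   # STL decomposition, each part extrapolated and recombined
fit$mean                            # point forecasts 1 to 3 steps ahead
plot(fit)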

Now the question is: which currency pair to take? We take ForeCA and use it to calculate the predictive ability of several currency pairs. I suspect that within a limited window this predictive ability changes as the window moves; a rolling check is sketched below. We choose a currency pair, forecast it with the forecast package (or another one - there are plenty of them), trade it, and after closing all positions we choose a currency pair again.
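That moving-window idea could look like this (rolling_omega is a hypothetical helper; window and step sizes are made up):

library(ForeCA)
rolling_omega <- function(x, width = 250, step = 50) {   # hypothetical helper
  starts <- seq(1, length(x) - width + 1, by = step)
  sapply(starts, function(i) Omega(x[i:(i + width - 1)]))  # forecastability per window
}
ret <- ts(diff(log(EuStockMarkets)) * 100)
rolling_omega(as.numeric(ret[, "DAX"]))   # watch how forecastability drifts over time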

 
Dr.Trader:

An example of model-independent selection is the vtreat package. It somehow analyzes the data and then scores how well each predictor matches the target values.

For the iris table, this creates 10 new predictors with random values. designTreatmentsN scores each predictor; the lower the score, the better. In this example the 4 original predictors (the last ones on the graph) clearly stand out with almost-zero scores, which is very good. When selecting predictors, the first thing to do is remove the ones with the highest scores.
If the target takes only 2 values (0/1, TRUE/FALSE, -1/1, a factor with 2 levels, etc.), there is the designTreatmentsC function for that case.

This package is also used for y-aware PCA. vtreat rescales the predictors into y-aware units, and the components are then built almost as usual (just without re-scaling and centering). So, if you want, you can use this package for interesting things like a y-aware random forest.
Read more here: https://cran.r-project.org/web/packages/vtreat/vignettes/vtreatScaleMode.html

I look at your code with the irises and random predictors and realize I can't program at all: what took me 10 lines, you did in three...

And this selection by vtreat, does it differ from the importance built into RF?

 
mytarmailS:

And this selection by vtreat, does it differ from the importance built into RF?

vtreat is better. It evaluates everything statistically: how good or bad a predictor is overall for predicting the target variable, without tying the estimate to any particular prediction model. It is recommended to keep only predictors with a score of no more than 1/(number of predictors). For example, if you have 200 predictors, you take only those with a score below 1/200. You can also score the predictors up front, and if all the scores are above that threshold, then instead of fruitlessly trying to train a model and predict new data it is better to start looking for other predictors right away.
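In code, that rule of thumb is roughly (reusing the scoreFrame from the iris example earlier in the thread):

sf <- treatments$scoreFrame
keep <- sf$origName[sf$sig < 1 / nrow(sf)]   # keep scores below 1/(number of predictors)
keep                                         # on the iris example, expect the 4 real predictors to survive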

There are a couple of disadvantages: the package works with predictors one at a time and does not take their interactions into account. I also dislike that even fully identical or highly correlated predictors are not removed - vtreat will not deduplicate them, which is sometimes very annoying.
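A common workaround for that last point (outside vtreat) is to drop near-duplicates by correlation first, e.g. with caret::findCorrelation; a sketch, using the predictor columns from the iris example:

library(caret)
X <- iris[, 1:(ncol(iris) - 1)]                  # all candidate predictors, target excluded
drop <- findCorrelation(cor(X), cutoff = 0.95)   # indices of highly correlated columns
X_reduced <- if (length(drop) > 0) X[, -drop] else X   # deduplicated predictor set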