Machine learning in trading: theory, models, practice and algo-trading - page 188

 

New jPrediction 11 Release

Fixed a minor glitch (the decimal comma in CSV files was not being converted to a point for numbers). Improved the algorithm for selecting significant predictors for models.

You can download it from my site (indicated in my profile), first post on the main page.

 

And I also wanted to write about the selection of predictors...

In addition to the main experiment, I'm doing a little bit of analysis on one stock market asset.

That asset has real volumes, which I also added to the features.

Then I fitted an ordinary linear model (OLS regression), Target ~ Predictor, to each predictor separately, for each of the different outputs (there are 11 of them).

I calculated the F-statistics of the models and got the following picture:

And here is the surprise: all the blocks of predictors related to volumes turned out to be unnecessary. Predictors based on the autocorrelation of price increments were not needed either.

It is also clear that the greater the lag of the output variable, the worse the significance.

Then I filtered out all the noise using the critical F value (at the 0.01 level).

It turned out like this:

This is without taking into account possible interactions, unfortunately...

But for some inputs the significance of the linear model is not bad.
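
Just to make the procedure concrete, a minimal R sketch of this kind of per-predictor screening; the data frame dat and its column names are purely hypothetical stand-ins, and the only point is fitting Target ~ Predictor for each predictor and keeping those whose F-statistic exceeds the critical value at the 0.01 level:

# Hypothetical data: one target and a few candidate predictors
set.seed(42)
n   <- 500
dat <- data.frame(target = rnorm(n),
                  p1 = rnorm(n), p2 = rnorm(n), p3 = rnorm(n))

# F-statistic of a single-predictor OLS model, one model per predictor
f_stats <- sapply(setdiff(names(dat), "target"), function(p) {
  fit <- lm(reformulate(p, response = "target"), data = dat)
  summary(fit)$fstatistic[["value"]]
})

# Critical F value at the 0.01 level for a model with 1 and n - 2 degrees of freedom
f_crit <- qf(0.99, df1 = 1, df2 = n - 2)

# Predictors that survive the screening
names(f_stats)[f_stats > f_crit]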

 
Alexey Burnakov:


I try not to analyze the importance of predictors one by one. There was a good example in the thread:

There are two predictors. On a plot showing both of them at once it is easy to see that the second target class forms clear clusters that models can find. If you use these predictors one at a time, each of them is useless for prediction.

The picture is purely hypothetical. But as for Forex, judging from a number of signs, good predictors form similar clusters there too, only not 2 but about 30 predictors are needed.
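
A purely illustrative R sketch of that picture, with simulated data and made-up names: the class depends only on the interaction of the two predictors, so each of them is useless on its own, while a model that sees both at once finds the clusters.

set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- factor(ifelse(x1 * x2 > 0, "up", "down"))  # class depends only on the interaction

# Taken one at a time, neither predictor explains the target (slopes near zero):
summary(glm(y ~ x1, family = binomial))$coefficients
summary(glm(y ~ x2, family = binomial))$coefficients

# A model that uses both predictors at once can recover the cluster structure,
# e.g. a random forest (its out-of-bag error is far below the 50% of a coin flip):
library(randomForest)
rf <- randomForest(y ~ x1 + x2, ntree = 200)
rf

# The clusters are obvious once both predictors are plotted together
plot(x1, x2, col = y, pch = 19, cex = 0.4)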
 
Once again, Reshetov has outdone you all. Version 11 is simply a miracle. Previously, in version 10, giving the predictor more inputs did not increase generalization ability and I had to retrain; now, as the number of predictors in the model grows, the generalization ability of the model as a whole grows too, and such a model keeps working longer and better. So kudos to you, Yura, while the others can keep grumbling and reinventing the wheel. Good luck!!!!
 
Mihail Marchukajtes:

...

Previously, in version 10, jPrediction with more inputs did not gain generalization ability, so I had to retrain; now, with more predictors in the model, generalization ability increases, and such models keep working longer and better...

Thanks for the feedback!

Testing version 11 on the samples I have, I came to a similar conclusion. It was necessary to confirm this hypothetical conclusion with independent research (reproduction of the experiment). After all, everyone has different tasks. Therefore, there was a potential risk that the classifier would give opposite results for some problems. Moreover, the time needed for selection of significant predictors was significantly increased in the new version, which was not to everyone's liking.

As for model training time, it can potentially be reduced without degrading quality (generalization ability); that is already a matter of engineering. The main thing is to get constructive feedback in time, so as to understand whether it is worth improving jPrediction in this direction, or whether the direction turned out to be wrong and should be rolled back. Otherwise we would risk wasting time and effort on futile features.

 
Vizard_:

...

I get 92.3% (oos) on the data I use.

...

My sincere congratulations! (If you are not lying).

And my regret that it exists only on your side and not in the public domain.

Discussing what is not in the public domain is pointless, because it is impossible either to prove or to disprove your "claims" about jPrediction.

 

I just came across an article on the subject which I think will be especially interesting for fans of neural networks.

What I found interesting was the end of the article, which compares the in-sample prediction error and the out-of-sample prediction error through the correlation between these errors. In my terminology this means that if the correlation is high (0.8 in the article), the model is not overfitted.

Predictability in Network Models
Jonas Haslbeck, www.r-bloggers.com
"Network models have become a popular way to abstract complex systems and gain insights into relational patterns among observed variables in almost any area of science. The majority of these applications focuses on analyzing the structure of the network. However, if the network is not directly observed (Alice and Bob are friends) but estimated..."
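
To roughly sketch that idea in R (everything here is simulated and all names are invented): fit the same kind of simple model many times, record the in-sample and out-of-sample error of each fit, and then correlate the two error vectors; a high correlation is read as a sign that the models are not overfitted.

set.seed(7)
n_models <- 50
n_obs    <- 400
in_err   <- numeric(n_models)
oos_err  <- numeric(n_models)

for (i in seq_len(n_models)) {
  x <- rnorm(n_obs)
  y <- 0.5 * x + rnorm(n_obs, sd = runif(1, 0.5, 2))  # noise level differs per data set

  train <- 1:(n_obs / 2)
  test  <- (n_obs / 2 + 1):n_obs

  fit        <- lm(y[train] ~ x[train])
  in_err[i]  <- mean(residuals(fit)^2)                  # in-sample MSE
  pred_oos   <- coef(fit)[1] + coef(fit)[2] * x[test]
  oos_err[i] <- mean((y[test] - pred_oos)^2)            # out-of-sample MSE
}

cor(in_err, oos_err)  # high here, because these toy models are not overfitted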
 
Dr.Trader:

I try not to analyze the importance of predictors one by one. There was a good example in the thread:

There are two predictors. On a plot showing both of them at once it is easy to see that the second target class forms clear clusters that models can find. If you use these predictors one at a time, each of them is useless for prediction.

The picture is purely hypothetical. But as for Forex, judging from a number of signs, good predictors form similar clusters there too, only not 2 but about 30 predictors are needed.

Generally speaking, this is all true. Interactions carry additional information beyond the sum of the information in the marginal inputs.

Decision trees, bagging, and boosting model interactions easily, that is, without any additional effort from the user. For linear models there are many problems: OLS regression is sensitive to the order in which predictors are entered... Greedy stepwise addition of predictors works in principle, but the greediness produces a lopsided model. The same applies to forests and trees.

As for including dozens of predictors in an interaction, I'd be cautious. Can you imagine an interaction of 30 variables? For a tree that would mean a depth of at least 30. You would need a huge amount of data to model that without wild overfitting...

In practice an interaction depth of 3-5 is already enough.
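
As one possible way to act on that, a hedged R sketch using the gbm package, whose interaction.depth parameter caps how many predictors can interact within a single tree; the data frame and all of its columns are invented, and gbm is only one of several boosting implementations that expose such a knob.

library(gbm)

# Invented data: a binary target and ten noise predictors named X1..X10
set.seed(123)
dat <- data.frame(y = rbinom(500, 1, 0.5),
                  matrix(rnorm(500 * 10), ncol = 10))

fit <- gbm(y ~ ., data = dat,
           distribution = "bernoulli",
           n.trees = 500,
           interaction.depth = 3,  # limit interactions to a depth of about 3
           shrinkage = 0.01,
           cv.folds = 5)

best_iter <- gbm.perf(fit, method = "cv")   # number of trees chosen by cross-validation
summary(fit, n.trees = best_iter)           # relative influence of the predictors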

 
Alexey Burnakov:

Generally speaking, this is all true. Interactions carry additional information beyond the sum of the information in the marginal inputs.

Decision trees, bagging, and boosting model interactions easily, that is, without any additional effort from the user. For linear models there are many problems: OLS regression is sensitive to the order in which predictors are entered... Greedy stepwise addition of predictors works in principle, but the greediness produces a lopsided model. The same applies to forests and trees.

As for including dozens of predictors in an interaction, I'd be cautious. Can you imagine an interaction of 30 variables? For a tree that would mean a depth of at least 30. You would need a huge amount of data to model that without wild overfitting...

In practice an interaction depth of 3-5 is already enough.

To me, predictor interaction is an extremely questionable thing. There are so many issues there.....

And interactions in OLS are simply unthinkable. Take a piece of paper and carefully write out all the conditions under which OLS can be applied, and then compare what is written on that paper with the reality of financial time series.
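
For anyone who wants to do exactly that exercise, a small R sketch (simulated data, every name hypothetical) that fits an OLS model on a deliberately heteroscedastic series and formally tests a few of the classical assumptions with the lmtest package:

library(lmtest)

set.seed(5)
n   <- 500
x   <- rnorm(n)
eps <- rnorm(n, sd = rep(c(0.5, 2), each = n / 2))  # non-constant error variance on purpose
y   <- 0.1 * x + eps

fit <- lm(y ~ x)

bptest(fit)                   # Breusch-Pagan: is the error variance constant?
dwtest(fit)                   # Durbin-Watson: are the residuals uncorrelated?
shapiro.test(residuals(fit))  # are the residuals normal (needed for exact inference)?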

PS.

If you take almost any book on data mining, procedures for removing correlated predictors are invariably described there.
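
For completeness, that textbook recipe in a short R sketch; the predictors data frame is invented, and caret::findCorrelation is just one common implementation of dropping highly correlated columns:

library(caret)

# Invented predictors: 'b' is almost an exact copy of 'a'
set.seed(99)
predictors   <- data.frame(a = rnorm(300))
predictors$b <- predictors$a + rnorm(300, sd = 0.05)
predictors$c <- rnorm(300)

corr_mat  <- cor(predictors)
drop_cols <- findCorrelation(corr_mat, cutoff = 0.9)   # indices of columns to drop
reduced   <- predictors[, -drop_cols, drop = FALSE]
names(reduced)  # the highly correlated duplicate is gone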

 
SanSanych Fomenko:

If you take almost any book on data mining, procedures for removing correlated predictors are invariably described there.

If you take almost any book on data mining and apply what you read to the market, you will see that it does not work... Maybe to hell with the established stereotypes?