Machine learning in trading: theory, models, practice and algo-trading - page 90
I have tried various self-written validation methods, including those described in these articles. My conclusions are as follows:
In forex there is no strict dependence between the target variable and the predictors; forex is not a formula that can be found and applied to compute new data. All the model can do is find some regularity and extrapolate it for trading on new data.
In other words, there is a multidimensional space (its dimensionality equals the number of predictors) containing a number of points (the known target variables). The model constructs a hyperplane in this space that separates the points of the "buy" class from those of the "sell" class. There are infinitely many ways to construct such a hyperplane (in a simple case: draw four points on a sheet of paper, then draw a curved line between them so that two points lie to the right of the curve and two to the left; the ways to draw such a curve are endless). Therefore there is no guarantee that the constructed model reflects the true dependence of the target variable on the predictors. Validation is used to check the adequacy of the model: some of the points are withheld during training, so you can easily find out whether the model has coped, i.e. whether it gives the correct result on these test points.
If the model fails validation, there can be many reasons for it, for example:
- the model found non-existent dependencies that are present only in the training examples
- there was a dependency in the training data that is absent from the test data, for example when all the test data is taken from a later period and the behaviour of the forex symbol has changed
- the model itself was initialized with an unlucky seed; it often happens that a model trained repeatedly on the same data gives different validation results from one attempt to the next
We cannot know what caused a bad result in any particular case. All we can do is estimate how good the model is on average: build the model dozens of times and evaluate it on validation, re-splitting the data into training/validation sets each time.
What I consider a valid approach is to split the data randomly in a 50%/50% ratio (not by time, but so that everything is evenly mixed, e.g. rows 1, 2, 5, 7 for training and 3, 4, 6, 8 for validation), train the model on the first part, then validate on the second; I use accuracy to evaluate the model. Repeat this 50 times (re-splitting the data into two random halves each time), then compute the average accuracy on the training data and the average accuracy on the validation data. Say the average training accuracy was 90% and the average validation accuracy 80%. The accuracy on the fronttest will be lower still; I use this rule of thumb: take the difference (90% - 80% = 10%) and subtract it from the validation result (80% - 10% = 70%). So such a model will have an average accuracy of about 70% on the fronttest. I then genetically tune the model parameters and predictors to raise this estimate above 70% (it is much harder than it sounds; it is difficult to get even above 50%).
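The procedure above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the data arrays `X`, `y` and the choice of `LogisticRegression` as the model are assumptions; any classifier could be substituted.

```python
# Sketch of the repeated random-split validation described above:
# 50/50 random splits, averaged accuracies, and the pessimistic
# fronttest estimate (validation minus the train/validation gap).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def repeated_split_estimate(X, y, n_repeats=50, seed=0):
    rng = np.random.RandomState(seed)
    train_accs, val_accs = [], []
    for _ in range(n_repeats):
        # Random (not chronological) 50/50 split, rows mixed each time
        X_tr, X_val, y_tr, y_val = train_test_split(
            X, y, test_size=0.5, random_state=rng.randint(10**6))
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        train_accs.append(accuracy_score(y_tr, model.predict(X_tr)))
        val_accs.append(accuracy_score(y_val, model.predict(X_val)))
    mean_train = np.mean(train_accs)
    mean_val = np.mean(val_accs)
    # Rule of thumb: subtract the train/validation gap from the
    # validation accuracy to get a pessimistic fronttest estimate.
    return mean_val - (mean_train - mean_val)
```

The returned number corresponds to the "about 70%" estimate in the example above.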
But I don't like that this result is only an average, with no guarantees. The real accuracy in trading will be anywhere from 60% to 80%, or even 50% to 90%, depending on how unlucky you are. No matter how I try, I cannot pick out the best single model by any indicator. Probably the only solution is to build dozens of models with the best parameters and predictors found, and take the result that the majority of the models points to (a congress of models).
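A "congress of models" is essentially a majority vote. Here is a minimal sketch assuming scikit-learn; the choice of `DecisionTreeClassifier` members and the per-member seeding are illustrative, not from the post.

```python
# A "congress of models": train several copies of the best model found
# and let the majority of their predictions decide. Differing random
# seeds mean a single unlucky seed no longer dominates the outcome.
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier


def build_congress(X_train, y_train, n_models=10):
    members = [(f"m{i}", DecisionTreeClassifier(max_depth=3, random_state=i))
               for i in range(n_models)]
    # "hard" voting = each member casts one vote, the majority wins
    congress = VotingClassifier(estimators=members, voting="hard")
    return congress.fit(X_train, y_train)
```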
This is closely related to what SanSanych said at the beginning of the thread. Following his advice, you can also set aside the last part of the known data as a final control sample: do not use this data for training or validation, just keep it separate until model training is finished, then test the finished model (or congress) on it. The upside is that this shows how the model performs on data that comes later in time. The downside is that less data remains for training and validation, and the model will already be slightly outdated by the start of trading. There is one subtlety: if you don't like the result on this control data and start selecting a model that performs well on this segment, you have begun using the data for validation; the model is then selected with it in mind, so there is a small peek into the future, the control loses its meaning, and in that case it would have been simpler not to make a control sample at all.
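The control-sample split described above can be sketched like this (the function name and the 20% fraction are illustrative assumptions; the post does not specify a fraction):

```python
# Split off the most recent part of the data chronologically as a
# control sample, to be touched exactly once, after the final model
# (or congress) is fixed. Looking at it more than once turns it into
# another validation set and defeats its purpose.
def chronological_holdout(X, y, control_fraction=0.2):
    # Rows are assumed ordered by time; the last rows become the control set.
    cut = int(len(X) * (1 - control_fraction))
    X_work, y_work = X[:cut], y[:cut]        # for training + validation
    X_control, y_control = X[cut:], y[cut:]  # sealed until the very end
    return (X_work, y_work), (X_control, y_control)
```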
Not yet)))
Looked at version 7. The cuts are no better than in the version from half a year ago (or whenever I last looked, I don't remember exactly). The statistics in the window and in the file are written differently. The selection of input importance is questionable: compared head-on with rf and a couple of other methods, it can assign high priority even to very unimportant inputs. Even taking the best cut (from the window), it is still no good.
On this data I get at least 92%. The gadget (as is) is still of little use in practice. For the effort put into development and the flight of ideas - kudos.
All IMHO, of course. Bye for now)))
When we are dealing with a man of Reshetov's level, we can safely demand:
1. A review of analogues.
2. An indication of the shortcomings of these analogues that are supposed to be overcome.
3. An indication of the mechanism for eliminating these shortcomings (in a market economy the specifics may be kept secret).
4. A comparison of the analogues with his own development, proving that all the above-mentioned drawbacks of the existing analogues have been eliminated and that we have obtained a tool that is NOT worse than the analogues.
If a person of Reshetov's level does not do this, then all that remains is: for Reshetov's effort in development and flight of fancy - respect.
I see. Notepad compressed it, and I didn't scroll down))) But for comparison I took it from the window.
I deleted it right away for lack of use, though it may be useful to someone...
I wonder if this will help us... As I understand it, the processing power of such a thing is an order of magnitude higher, if not several...
https://hi-tech.mail.ru/news/compact-quantum-computer/?frommail=1
And all comers. The z1 archive contains two files, train and test. For Target, build a model on train, apply it to test, and post the results in % (successfully predicted cases) for both samples (train = xx%, test = xx%). Methods and models do not need to be announced, just the numbers. Any data-manipulation and mining methods are allowed.
Thank you! I'll give it a try.
Let's agree not to look at the test set until the trained model has been evaluated. I used to be guilty of this myself.
That is, we train the best model on train until we are blue in the face; maybe two or three models. Then test them on test, one time only.