Machine learning in trading: theory, models, practice and algo-trading - page 1354
Here are the archives. See attachment.
Learn.csv - inputs. The first value in each line is a reference to the position in history; it should be removed.
Cell.csv - target.
After training on the data, you should get the following chart.
The filter is roughly equivalent to EMA(16), and the forecast horizon is 5 min.
I will do the test later, when I need it.
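For anyone who wants to build a similar sample themselves, here is a minimal sketch of one possible reading of the setup: inputs are the last 20 close values, and the target is the change of an EMA(16) filter 5 bars ahead. The function names, the use of the change rather than the level, and the lack of any scaling are my assumptions, not the contents of the attached files.

```python
def ema(values, period):
    """Exponential moving average with the usual alpha = 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    out = []
    prev = values[0]  # seed the filter with the first value
    for v in values:
        prev = alpha * v + (1.0 - alpha) * prev
        out.append(prev)
    return out


def make_dataset(closes, window=20, period=16, horizon=5):
    """Build (inputs, target) pairs from a list of close prices.

    Assumption: the target is the filter value `horizon` bars ahead
    minus the current filter value. The thread does not spell this out.
    """
    smoothed = ema(closes, period)
    inputs, targets = [], []
    for t in range(window - 1, len(closes) - horizon):
        inputs.append(closes[t - window + 1 : t + 1])  # last `window` closes
        targets.append(smoothed[t + horizon] - smoothed[t])
    return inputs, targets
```

On 1-minute bars, horizon=5 corresponds to the 5 min forecast mentioned above.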
It's not quite clear which sample this chart comes from: training or test?
Here is the CatBoost on the test - the last 100 values.
Histogram of deviations.
I took 4000 lines for training, 2000 for validation, and 100 for the test. I trained 1000 trees of depth 6 with the RMSE loss function (replaced by Poisson).
I've attached the sample and settings. To reproduce, download CatBoost (CB) and put it in the Setup directory.
On the training sample the distribution doesn't look like yours either.
Added: I'm using the model incorrectly - the charts I got show probabilities...
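A rough sketch of the split and settings described above, in case someone prefers Python. The attached setup seems to use the standalone CB build, so the use of the catboost Python package here is my assumption; split_rows and fit_catboost are illustrative names. The test part is the last 100 rows, as in the post.

```python
def split_rows(rows, n_train=4000, n_val=2000, n_test=100):
    """Chronological split: first rows for training, next for validation,
    and the last n_test rows for the test."""
    if len(rows) < n_train + n_val + n_test:
        raise ValueError("not enough rows for the requested split")
    train = rows[:n_train]
    val = rows[n_train : n_train + n_val]
    test = rows[-n_test:]
    return train, val, test


# Settings from the post: 1000 trees, depth 6, RMSE loss.
CB_PARAMS = {"iterations": 1000, "depth": 6, "loss_function": "RMSE"}


def fit_catboost(X_train, y_train, X_val, y_val, seed=0):
    """Train a regressor with the settings above (requires the catboost package)."""
    from catboost import CatBoostRegressor  # third-party, imported lazily

    model = CatBoostRegressor(random_seed=seed, verbose=False, **CB_PARAMS)
    model.fit(X_train, y_train, eval_set=(X_val, y_val))
    return model
```

The split is kept in time order on purpose: shuffling a price series before splitting would leak future information into training.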
My chart is from training on the whole sample only. I haven't run a test on it. The result should be roughly the same as in training.
Yes, I haven't done regression before. Unlike classification, there are many fitness (loss) functions whose meaning is unclear to me, they give different results, and I picked the wrong one.
Here is what I got on the test sample.
And here is the training sample of 4,000 lines.
Histogram of deviations for the test sample
Here is the overall graph for the 3 samples
The metric that was used for training, on the test sample
It shows that training could have been stopped at 250 iterations and that the model is overfitted.
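To make the "could have stopped at 250 iterations" reading concrete, here is a minimal sketch of picking the best iteration from a per-iteration validation error curve. The function name and the patience value are my own choices for illustration. As far as I know, CatBoost can do this itself through use_best_model and early_stopping_rounds.

```python
def best_iteration(val_errors, patience=50):
    """Return (index, error) of the lowest validation error.

    Scanning stops once the error has not improved for `patience`
    consecutive iterations, which is the usual early-stopping rule.
    """
    best_i, best_err = 0, float("inf")
    for i, err in enumerate(val_errors):
        if err < best_err:
            best_i, best_err = i, err
        elif i - best_i >= patience:
            break  # no improvement for `patience` iterations
    return best_i, best_err
```

If the training error keeps falling after this point while the validation error rises, the extra trees are fitting noise.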
Seems OK.
Well, yes, it can be improved if you want - I just have no experience with regression models.
So the main thing is the predictors, the working tools :)
I've attached the final version with settings - it trains 10 models with different seeds.
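One common reason to train several models with different seeds is to see how much the result depends on the random initialization. A minimal sketch of that check follows; train_and_score is a hypothetical callback that trains one model with the given seed and returns its test error, and toy_score is only a stand-in so the example runs without any data.

```python
import random
import statistics


def seed_spread(train_and_score, seeds):
    """Run training once per seed; return (scores, mean, std of scores)."""
    scores = [train_and_score(seed) for seed in seeds]
    return scores, statistics.mean(scores), statistics.pstdev(scores)


def toy_score(seed):
    """Stand-in for real training: an error near 1.0 that varies with the seed."""
    return 1.0 + random.Random(seed).uniform(-0.1, 0.1)


scores, mean_score, std_score = seed_spread(toy_score, range(10))
```

A small spread relative to the mean suggests the result is not an accident of one lucky seed.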
The input is a scaled price series: 20 close values and that's it. The issue is not the predictors but the problem statement - it is solvable. And your forest will come up with the predictors by itself :)
Yes, it's about the problem statement, I agree. I just don't regard the price as dough from which pies are molded; to shape those pies, we need predictors.
One of the classic techniques that can improve the model. Or rather, find the optimal one. The original application of Monte Carlo.
https://en.wikipedia.org/wiki/Importance_sampling
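For those who haven't met the method, a minimal textbook illustration (not tied to any trading model): estimating the small probability P(X > 3) for a standard normal X. Instead of sampling from N(0, 1), where almost no draws land in the tail, we sample from a proposal N(3, 1) and reweight each draw by the density ratio p(x)/q(x). The function names and the choice of proposal are mine.

```python
import math
import random


def tail_prob_importance_sampling(threshold=3.0, n=20000, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) using draws from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal q
        if x > threshold:
            # density ratio p(x) / q(x) = exp(-c*x + c^2 / 2), with c = threshold
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


def exact_tail_prob(threshold=3.0):
    """Closed-form tail probability of the standard normal, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))
```

The same reweighting idea is what off-policy RL relies on: data collected under one distribution are corrected to estimate an expectation under another.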
Isn't this the method you used in your article?
For off-policy (policy gradient) RL
https://medium.com/@jonathan_hui/rl-importance-sampling-ebfb28b4a8c6
Can you explain the idea in plain Russian, in your own words? In simple terms, so to speak :)
We have predicted the output of the LPF (low-pass filter) quite successfully. By now both of us have, and not only with a neural network (NS) but with a forest too. Now let's try to predict the price, which is generally a pointless thing to do :) It would be better to predict the low-frequency (LF) component of the expected change in the price expectation, which (the expectation) is unknown at the present moment. And that amid all sorts of movements, high-frequency (HF) fluctuations and everything else.
I obtained the following: the forecast horizon is 5 min on the 1-minute timeframe.
As usual: x is the forecast, y is the actual value. Well, a rectangle tilted at 45 degrees looks a bit like a circle, but thank God it's not a circle. If you step a little to the right or left of zero on x, you can even trade with a win probability of slightly over 50% (see the marked areas).
Of course, it would be nice to plot all sorts of regression lines and distributions, but that requires taking slices, at least a few - that's for later.
PS And here is a forecast from a slightly modified algorithm. Same 5 min on the 1-minute timeframe.
This is already much better :) For forecasts above 2 or below -2 on x, losing trades are hardly to be expected if we simply close after 5 minutes.
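The reading of the scatter plot above can be checked numerically. Here is a minimal sketch, assuming we have (forecast, actual) pairs where the actual value is the signed outcome after 5 minutes; hit_rate is an illustrative name and the threshold of 2 is taken from the post.

```python
def hit_rate(pairs, threshold):
    """Share of forecasts with |forecast| > threshold whose sign matched the outcome.

    Returns (rate, count); rate is None when no forecast passes the threshold.
    """
    selected = [(f, a) for f, a in pairs if abs(f) > threshold]
    if not selected:
        return None, 0
    wins = sum(1 for f, a in selected if f * a > 0)  # same sign = winning trade
    return wins / len(selected), len(selected)
```

Running it over several thresholds shows the trade-off: a stricter threshold should raise the hit rate but leaves fewer trades, so the count matters as much as the rate.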