Machine learning in trading: theory, models, practice and algo-trading - page 3522
It's a "symbol fri" study
When you increase the forecast horizon, you need to increase the length of the sub-series (the order) for PE. Then the readings even out.
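For reference, a minimal hand-rolled PE sketch (my own illustration, not code from this thread) showing where the sub-series length, i.e. the order, enters the calculation:

import numpy as np
from math import factorial, log

def permutation_entropy(x, order=3, delay=1, normalize=True):
    # Permutation entropy of a 1-D series for a given embedding order.
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    # Each sub-series of length `order` is reduced to its ordinal (rank) pattern.
    patterns = np.array([np.argsort(x[i:i + order * delay:delay]) for i in range(n)])
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    pe = -np.sum(p * np.log(p))
    if normalize:
        pe /= log(factorial(order))  # scale to [0, 1]
    return pe

A longer forecast horizon then simply means passing a larger order, so longer ordinal patterns are compared.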
You should read up on why it is needed and what it actually measures. Yeah, the entropy of the labels doesn't show anything on its own. Still, the feature-label relationship is stronger.
Let's investigate mutual information between features and labels then :)
The point is to find a faster way to screen datasets than through model training. For example, if you want to test a million different labelings.
I made a small re-selection of settings - the data is the same as last time - 100 models
Looked for any dependence of the exam-sample balance on the PE measured on the train sample - used the popular metrics.
And the same, but with PE measured on the test sample.
And with Recall instead of balance.
It seems that the impact is within the measurement error....
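A dependence check of that sort could be sketched roughly like this (one PE value and one exam-sample metric per model; the arrays here are placeholders, not the actual results):

import numpy as np
from scipy.stats import pearsonr, spearmanr

# One entry per trained model: PE on the train sample and the balance
# (or Recall) achieved on the exam sample. Placeholder values only.
pe_train = np.random.rand(100)
exam_metric = np.random.rand(100)

r, p = pearsonr(pe_train, exam_metric)
rho, p_s = spearmanr(pe_train, exam_metric)
print(f"Pearson r={r:.3f} (p={p:.3f}), Spearman rho={rho:.3f} (p={p_s:.3f})")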
https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_classif.html
from sklearn.feature_selection import mutual_info_classif
mi = mutual_info_classif(x, y).sum()
You can't see an explicit dependency either.
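To make the idea concrete, a self-contained sketch (made-up data, not the thread's dataset) of ranking candidate labelings by total feature-label mutual information instead of training a model on each one:

import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))  # hypothetical feature matrix
candidates = [rng.integers(0, 2, size=5000) for _ in range(10)]  # candidate labelings to screen

# Higher total MI is taken as a cheap proxy for a stronger feature-label relationship.
scores = [mutual_info_classif(X, y, random_state=0).sum() for y in candidates]
best = int(np.argmax(scores))
print("best candidate labeling:", best, "total MI:", scores[best])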
The point of the exercise is to find a faster way to screen datasets than through model training.
I think it can be done with my method. It is enough to estimate the probability bias in the quantum segments and to devise a metric that summarises the result over all predictors, for example the percentage of quantum segments that passed selection. If there are few of them, learning will be difficult, which indirectly means that the partitioning is not of high quality (if we believe that).
A single dataset is enough, and one can enumerate different markups.
However, this will only tell you how easy it is to train on the train set; it won't tell you what happens afterwards.
Still, I plan to collect statistics on predictors across different samples; then it will be clearer whether a predictor can be considered successful independently of the sample, or whether such a conclusion and choice can only be made by evaluating it relative to the markup.
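As I read it, the idea might look roughly like this; the equal-frequency quantisation, the z-score threshold on the per-segment class-rate shift and the minimum segment size are my assumptions, not the author's actual implementation:

import numpy as np

def passed_segment_share(X, y, n_bins=10, min_frac=0.01, z_thresh=2.0):
    # Share of quantisation ("quantum") segments per predictor whose class rate
    # deviates noticeably from the overall rate; low shares suggest hard learning.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    n, p0 = len(y), y.mean()  # overall positive-class rate
    shares = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1))
        idx = np.clip(np.searchsorted(edges, X[:, j], side="right") - 1, 0, n_bins - 1)
        passed = total = 0
        for b in range(n_bins):
            mask = idx == b
            m = int(mask.sum())
            if m < min_frac * n:
                continue  # skip segments that are too small to judge
            total += 1
            p = y[mask].mean()
            z = (p - p0) / np.sqrt(p0 * (1 - p0) / m)  # probability-bias z-score
            if abs(z) >= z_thresh:
                passed += 1
        shares.append(passed / total if total else 0.0)
    return np.array(shares)  # one value per predictor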
In principle, everything is optimised and works fast; I'm just playing with what can still be improved. The tester has been rewritten and now calculates quickly.
A basic calculation (one iteration, which is enough for a quick evaluation) takes about 2 seconds for a sample of 27,000 rows by 5,000 columns.
10 models (two in each: a main and a meta model).
And a ready-to-use trading system (TS) right away.
I run batches of 20-100 retrainings with different parameters. The markup has the biggest influence.
So I want to find a way to get the most correct markup.
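For context, a hedged sketch of what a "main + meta" pair could look like, in the spirit of meta-labeling; the data, the sklearn classifier and the thresholds are my assumptions, not the author's pipeline:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 20))  # hypothetical features
y_dir = rng.integers(0, 2, size=5000)  # direction labels: 1 = buy, 0 = sell

# Main model: predicts trade direction.
main = RandomForestClassifier(n_estimators=100, random_state=0)

# Meta labels: 1 where the main model's out-of-fold prediction was correct,
# so the meta model learns when the main signal can be trusted.
oof = cross_val_predict(main, X, y_dir, cv=3)
y_meta = (oof == y_dir).astype(int)

main.fit(X, y_dir)
meta = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y_meta)

def signal(x_row):
    x_row = np.asarray(x_row).reshape(1, -1)
    if meta.predict_proba(x_row)[0, 1] > 0.5:  # trade only if the meta model approves
        return int(main.predict(x_row)[0])  # 1 = buy, 0 = sell
    return None  # stay out of the market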