Machine learning in trading: theory, models, practice and algo-trading - page 2109
Select all the files and download them - they will be zipped.
The sample lengths will differ then, if a part...
Thank you, that's right - you can download the archive, which is nice!
But the different sample lengths are bad; I was thinking of picking out the most random columns, where small deviations are acceptable.
I think this method should not be applied to the sample - otherwise how would it be used in the real world?
I'm starting it for training, let's see what happens.
I don't need it for exams, but it may come in handy.
Too lazy to convert)
I'll explain the idea:
1) sort the column;
2) compute the average number of elements per quantum, e.g. 10000 elements / 255 quanta = 39.21;
3) in a loop, advance by 39.21 elements at each step and add the value from the sorted array to the array of quantum values. I.e. array value 0 = quantum 0, the 39th value = quantum 1, the 78th value = quantum 2, etc.
If a value is already in the array, i.e. if we land in an area with many duplicates, we skip the duplicate and do not add it.
At each step we add exactly 39.21 and take the integer part of the accumulated sum to select the element in the array, so the spacing stays even. I.e. instead of element 195 (39 * 5 = 195) we take element 196 ((int)(39.21 * 5) = (int)196.05 = 196).
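A minimal sketch of that procedure in Python (NumPy only; the function name and the 255-quanta default are just illustrative assumptions):

```python
import numpy as np

def build_quantum_borders(column, n_quanta=255):
    """Sketch of the splitting described above: sort the column,
    step through it by the fractional average quantum size and
    collect border values, skipping duplicates."""
    sorted_vals = np.sort(np.asarray(column, dtype=float))
    n = len(sorted_vals)
    step = n / n_quanta                      # e.g. 10000 / 255 = 39.21
    # accumulate the fractional step and truncate to pick the element:
    # indices 0, 39, 78, ..., 196 at the 5th step instead of 195, etc.
    idx = (np.arange(n_quanta) * step).astype(int)
    borders = []
    for i in idx:
        v = sorted_vals[i]
        if not borders or v != borders[-1]:  # skip duplicate values
            borders.append(v)
    return borders

# usage: borders = build_quantum_borders(df["feature_1"], n_quanta=255)
```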
With a uniform distribution it's clear - I would first create an array of unique values and use it for the cuts.
But there are other methods of splitting the grid:
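One possible reading of that unique-values idea, again only a hedged sketch (the helper name is made up):

```python
import numpy as np

def uniform_borders_on_unique(column, n_quanta=255):
    """Cut the sorted array of unique values into equal-sized groups
    and use the group edges as quantization borders."""
    uniq = np.unique(np.asarray(column, dtype=float))   # sorted unique values
    step = len(uniq) / n_quanta
    idx = np.unique((np.arange(1, n_quanta) * step).astype(int))
    return uniq[idx].tolist()
```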
There must be a lot of samples, otherwise the model won't learn anything.
These are the sample quantization methods for CatBoost - these are the borders along which the enumeration/learning then proceeds.
My experiments show that the grid should be chosen for each predictor separately - then a quality gain is observed - but CatBoost cannot do that, and I don't know how to make it do so, so I have to build the grids myself, upload them to CSV, and then go through them to assess the behaviour of the target within them. I think it's a very promising tool, but I need to translate the code into MQL.
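Not the author's code, but roughly what that build-grids-and-check-the-target workflow could look like in Python (pandas/NumPy; the column names, file names and helper are made up for illustration):

```python
import numpy as np
import pandas as pd

def target_rate_per_quantum(feature, target, borders):
    """Bin one predictor by a custom grid and summarise the target
    inside each quantum: how many rows fall there and what share of
    them are positive - a way to see whether the grid captures any
    logic rather than just fitting noise."""
    bins = np.digitize(feature, borders)
    return (pd.DataFrame({"quantum": bins, "target": target})
              .groupby("quantum")["target"]
              .agg(["count", "mean"]))

# hypothetical usage: a separate grid per predictor, dumped to CSV
# df = pd.read_csv("train.csv")
# for col in ["feature_1", "feature_2"]:
#     borders = np.quantile(df[col], np.linspace(0, 1, 256)[1:-1])
#     target_rate_per_quantum(df[col].values, df["target"].values,
#                             borders).to_csv(f"grid_{col}.csv")
```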
Is it in the settings of the model itself (parameters)? I don't know what that is.
If it's not in the settings, then it's bullshit.
It is in the settings, at least for the command line:
--feature-border-type
The quantization mode for numerical features.
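For what it's worth, the same option is exposed in the CatBoost Python package; a minimal sketch (the particular mode and border count here are arbitrary, not a recommendation):

```python
from catboost import CatBoostClassifier

# feature_border_type = quantization mode for numerical features
# (documented modes include Median, Uniform, UniformAndQuantiles,
#  MaxLogSum, MinEntropy, GreedyLogSum)
model = CatBoostClassifier(
    iterations=500,
    feature_border_type="MinEntropy",
    border_count=254,        # number of borders per numerical feature
    verbose=False,
)
# model.fit(X_train, y_train)
```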
Does it make a big difference? It should be within a percent
Choosing the right partitioning makes a big difference.
Here's an example on Recall - up to 50% variation - for me that's significant.
The number of borders is increased from 16 to 512 in steps of 16 - though not in order on the histogram; my labels get in the way a bit.
I'm still experimenting with grid selection, but it is already obvious that different predictors need different grids, so that the logic is captured rather than just fitted.
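A rough sketch of that kind of sweep in Python (synthetic data via scikit-learn just to make it runnable; the iteration count and seed are arbitrary):

```python
from catboost import CatBoostClassifier
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# synthetic stand-in data; replace with the real sample
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=42)

results = {}
for border_count in range(16, 513, 16):          # 16, 32, ..., 512
    model = CatBoostClassifier(iterations=200, border_count=border_count,
                               verbose=False, random_seed=42)
    model.fit(X_train, y_train)
    results[border_count] = recall_score(y_val, model.predict(X_val))

for bc, rec in sorted(results.items()):
    print(f"border_count={bc:3d}  recall={rec:.3f}")
```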