Machine learning in trading: theory, models, practice and algo-trading - page 3554

 
mytarmailS #:
Don't give me that shit.
There are lossless compression algorithms.

Did I say otherwise?

Don't we all use the same archivers? So why would this be news to us?

I don't understand why you chose this style of communication and what you are trying to show by it.

 
Aleksey Vyazmikin #:

Did I say otherwise?

Maybe I'm missing something, but what else could it mean?

Aleksey Vyazmikin #:

Nah, it's all about the same thing. By dividing into bins you lose information, and that is compression.

 
mytarmailS #:

Maybe I'm missing something, but what else could it mean?

There is no claim that compression must involve loss of information.

 
Dominik Egert #:
Just as general information.

This is really interesting.

https://www.forbes.com/sites/moorinsights/2024/06/17/ibms-instructlab-a-new-era-for-ai-model-creation-and-performance/

Interesting for general development, but for trading it's not clear how to use it...

Any ideas?

 
Aleksey Vyazmikin #:

There is no claim that compression must involve loss of information.

I apologise.
 
Aleksey Vyazmikin #:

I haven't stopped yet. The concept is to take only the useful information from a predictor, binarise it, and build a model on that data. But here we run into the problem of rare responses - an extremely sparse sample that standard models find hard to train on. The alternative is clustering these binary predictors; for that purpose I wrote the code for a clustering tree, but for now I have put that development on pause, because the main problem is that the selected quantum segments lose their effectiveness on new data in large numbers, which leads to errors in classical models. That is why I am now concentrating on raising the percentage of effective quantum segments collected.

How to measure effectiveness is also an open question, but I assume that a quantum segment should contain more members of one class than the sample average. Probability bias means that the percentage of class 1 (or class 0) representatives in the quantum segment exceeds the percentage in the subsample by at least a threshold value.
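As a rough illustration of that definition, here is a minimal Python sketch that flags the bins of a predictor whose class-1 share deviates from the subsample average by more than a threshold. The function name, the 5% threshold, and the toy data are my own choices, not from the post:

```python
import numpy as np

def biased_segments(x, y, edges, threshold=0.05):
    """Flag bins whose class-1 share deviates from the subsample
    average by at least `threshold` (illustrative default)."""
    base_rate = y.mean()  # class-1 share in the whole subsample
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x < hi)
        if not mask.any():
            continue
        rate = y[mask].mean()  # class-1 share inside this segment
        if abs(rate - base_rate) >= threshold:
            out.append((lo, hi, rate - base_rate))
    return out

# Toy data: the upper bins of x favour class 1, so they get flagged.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 5000)
y = (rng.uniform(0, 1, 5000) < 0.3 + 0.4 * x).astype(int)
print(biased_segments(x, y, edges=np.linspace(0, 1, 11)))
```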

Thus, if we have a set of quantum segments with a probability bias, we can build both new rules and ensembles, grouping the quantum segments by the probability of their triggering synchronously, which in theory should add confidence to the model.
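If the segments are encoded as 0/1 indicator columns, grouping by "synchronous triggering" could be read as clustering on a co-firing distance. A minimal sketch of one possible reading; the Jaccard metric, the linkage method, and the group count are my assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def group_by_co_firing(B, n_groups=4):
    # B: (n_samples, n_segments) 0/1 matrix; column j is 1 when segment j fires.
    n = B.shape[1]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = np.logical_or(B[:, i], B[:, j]).sum()
            inter = np.logical_and(B[:, i], B[:, j]).sum()
            d[i, j] = d[j, i] = 1.0 - (inter / union if union else 0.0)
    # Segments that tend to fire together land in the same group.
    return fcluster(linkage(squareform(d), method="average"),
                    t=n_groups, criterion="maxclust")

B = (np.random.default_rng(1).uniform(size=(500, 8)) < 0.3).astype(int)
print(group_by_co_firing(B))
```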

Even fitting a quantum table to a predictor can improve learning.

So far I am not building final models with this method - I am not satisfied with the selection of quantum segments.

Otherwise, CatBoost models on the binary sample are simpler and not inferior to those on the full data. But again, there is no guarantee that the model will be profitable - which is understandable, since the problem is the shift of probabilities on new data...
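For what it's worth, the "binary sample into CatBoost" part can be reproduced in a few lines. The data here is synthetic and the bin count arbitrary, so treat this as a sketch rather than the author's pipeline:

```python
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer

# Synthetic stand-in data; in reality these would be predictors and 0/1 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Binarise: each quantile bin becomes a 0/1 indicator column.
binarizer = KBinsDiscretizer(n_bins=16, encode="onehot-dense", strategy="quantile")
Xb_train = binarizer.fit_transform(X_train)
Xb_test = binarizer.transform(X_test)

model = CatBoostClassifier(iterations=300, depth=4, verbose=False)
model.fit(Xb_train, y_train)
print("accuracy on binarised features:", model.score(Xb_test, y_test))
```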

Apart from the main problem, there is a production problem - you need to think and code :)

Lately, unsuccessful ideas, once tested, knock me off track for a few days, sometimes weeks. It's still summer - I try to go for walks in the park more often.

There, in fact, I used a similar approach - a database of effective single settings for different filters/predictors is built, and then a subset of them is selected at random (not all are used at once) with specific settings. This approach saves a lot of resources, and the result is quite good when there are hundreds of settings to optimise. Essentially the same approach as with quantisation.
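A minimal sketch of that "database of vetted single settings, sample a random subset" idea; the entries and subset size below are invented purely for illustration:

```python
import random

# Hypothetical database of individually effective filter/predictor settings.
settings_db = [
    {"filter": "sma_cross", "period": 20},
    {"filter": "sma_cross", "period": 50},
    {"filter": "rsi", "period": 14, "level": 70},
    {"filter": "atr_stop", "period": 10},
    # ...hundreds more in practice
]

def sample_configuration(k=2, seed=None):
    # Pick a random subset of pre-vetted settings instead of
    # optimising every combination at once.
    return random.Random(seed).sample(settings_db, k)

print(sample_configuration(k=2, seed=42))
```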

Most of the predictors I use in ML are based on the logic of that EA.

As for ML, I will perhaps mass-produce low-priced bots, but a little later.

Are you fitting the tables to fixed labels, or are the labels searched over as well?

Actually, that's why I asked, because I see this as a further development. And with binary sampling, yes, that's quite an undertaking, of course. It seems to me that this question should be reconsidered.

Variants of discretising a sample via KBinsDiscretizer
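Since KBinsDiscretizer came up, here is a quick look at how its three built-in strategies place bin borders differently on the same skewed predictor (purely illustrative; the lognormal data is my choice):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# A skewed predictor, where the choice of binning strategy matters most.
x = np.random.default_rng(1).lognormal(size=(1000, 1))

# The three built-in strategies produce very different "quantum tables".
for strategy in ("uniform", "quantile", "kmeans"):
    disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy=strategy)
    disc.fit(x)
    print(strategy, np.round(disc.bin_edges_[0], 3))
```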
 
mytarmailS #:
I apologise.

I accept.

 
Maxim Dmitrievsky #:
Are you fitting the tables to fixed labels, or are the labels searched over as well?

I am not searching over different labelings at the moment. The concept is a basic strategy and improving it with ML.

But basic strategies can be different.

Maxim Dmitrievsky #:
And with binary sampling, yes, that's quite an undertaking. It seems to me that this question should be reconsidered.

Any other ideas?

 
Aleksey Vyazmikin #:

I am not searching over different labelings at the moment. The concept is a basic strategy and improving it with ML.

But basic strategies can be different.

Any other ideas?

No, I haven't gone into your approach, but there is always something you can redesign/simplify.

 
Maxim Dmitrievsky #:

No, I haven't gone into your approach, but there is always something you can redesign/simplify.

Definitely. But during development/research, on the contrary, a lot of additional features and complications appear. Once the work is completed and everything is clear and obvious, then you can optimise and cut things down/speed them up.