Supervised discretisation is a very curious thing, but it consumes a lot of computational resources - General

Maxim Dmitrievsky 2024.06.21 02:24 #35541

Aleksey Vyazmikin #:

Definitely. But in the process of development/research, on the contrary, a lot of additional tricks and complications appear. When the work is complete and everything is clear and obvious, then you can optimise and reduce/accelerate something.

I see that you can analyse bins, for example, as suggested in the article, and then select successful ones. It will take little code and will be very clear.

СанСаныч Фоменко 2024.06.21 07:33 #35542

Supervised discretisation is a very curious thing, but it consumes a lot of computational resources.

discretization::mdlp()
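For readers who do not use R, here is a minimal Python sketch of the core step behind entropy-based supervised discretisation: finding the single cut point with the highest information gain. It is not the mdlp() implementation; MDLP applies such a split recursively and adds a stopping criterion, both omitted here. The names `entropy` and `best_cut` are made up for illustration. The naive scan over every candidate cut also hints at why the method is computationally expensive.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (threshold, information_gain) of the best single binary split.

    Hypothetical helper: tries every boundary between distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best_thr, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between identical values
        left, right = ys[:i], ys[i:]
        # Weighted entropy of the two parts after the split
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        gain = base - cond
        if gain > best_gain:
            best_thr, best_gain = (lo + hi) / 2, gain
    return best_thr, best_gain


# Toy feature with a clean class boundary between 3.0 and 7.0
x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y = [0, 0, 0, 1, 1, 1]
threshold, gain = best_cut(x, y)
```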

Aleksey Vyazmikin 2024.06.21 08:38 #35543

Maxim Dmitrievsky #:

I can see that you could analyse bins, for example as suggested in the article, and then select the successful ones. It will take little code and will be very clear.

Which article? If you mean the link above, there is essentially no analysis there, only an enumeration of quantisation tables. In general, CatBoost has various quantisation methods built in (which scikit-learn's KBinsDiscretizer does not offer), so experiment with the settings. You can also save the quantisation tables and use them to transform a sample for other training methods.

Maxim Dmitrievsky 2024.06.21 09:05 #35544

Aleksey Vyazmikin #:

Which article? If you mean the link above, there is essentially no analysis there, only an enumeration of quantisation tables. In general, CatBoost has various quantisation methods built in (which scikit-learn's KBinsDiscretizer does not offer), so experiment with the settings. You can also save the quantisation tables and use them to transform a sample for other training methods.

You can select the number of bins and the individual bins that outperform the others. It is an analogue of clustering. That's why I asked earlier: why not just use clustering?

In any case, without a brute-force search or competent sampling of the targets, it is a shot in the dark, because the options are artificially limited.

When classifying (financial) time series, where you choose when to trade and when not to trade (i.e. you already have a discrete representation of the time series), the main emphasis should be on the labelling, not on the features.

I have already written that my approach lets you select bins (clusters) and labels such that you can limit yourself to only 2-5 features. And it takes minutes.

Aleksey Vyazmikin 2024.06.21 09:11 #35545

Maxim Dmitrievsky #:
You can select the number of bins and the individual bins that outperform the others. It is an analogue of clustering. That's why I asked earlier: why not just use clustering?

And by what selection criterion? I don't see such an option in the library... but again, this is binarisation, only under the bonnet.

Maxim Dmitrievsky 2024.06.21 09:12 #35546

Aleksey Vyazmikin #:

And what are the criteria for selection? I don't see it in the library... but then again, it's binarisation, only under the bonnet.

By a criterion on new data, after training.
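A minimal sketch of what selecting bins by a criterion on new data could look like. It assumes accuracy as the criterion, a single feature with known bin borders, and model predictions already computed on held-out data; the function names and thresholds are hypothetical and nothing here comes from a specific library.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_index(value, borders):
    """Map a value to its bin number, given sorted bin borders."""
    return bisect_right(borders, value)


def select_bins(values, y_true, y_pred, borders, min_accuracy=0.6, min_count=2):
    """Keep bins whose accuracy on held-out data reaches min_accuracy.

    min_accuracy and min_count are illustrative thresholds, not recommendations.
    """
    hits, counts = defaultdict(int), defaultdict(int)
    for v, t, p in zip(values, y_true, y_pred):
        b = bin_index(v, borders)
        counts[b] += 1
        hits[b] += int(t == p)
    return {
        b: hits[b] / counts[b]
        for b in sorted(counts)
        if counts[b] >= min_count and hits[b] / counts[b] >= min_accuracy
    }


# Held-out ("new") data: feature values, true labels, model predictions
borders = [0.0, 1.0]
new_values = [-0.5, -0.2, 0.3, 0.6, 1.4, 1.9]
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
good_bins = select_bins(new_values, y_true, y_pred, borders)
```

Trading would then be restricted to observations that fall into the bins in `good_bins`. Selecting and evaluating on the same held-out set overstates the result, so a further untouched sample is needed to check the chosen bins.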

Aleksey Vyazmikin 2024.06.21 09:16 #35547

Maxim Dmitrievsky #:

By a criterion on new data, after training.

So where is the selection of multiple bins from the set, or did I misunderstand you?

Maxim Dmitrievsky 2024.06.21 09:17 #35548

Aleksey Vyazmikin #:

So where is the selection of multiple bins from the set, or did I misunderstand you?

Make a choice )) just choose a bin and trade only on it, what's the problem?

Maxim Dmitrievsky 2024.06.21 09:21 #35549

The problem of searching for a trading system (or for regularities) reduces to a simple two-dimensional optimisation problem. This is dictated by the very nature of time series (their two-dimensionality).

When 2 errors are minimised:

classification error within a bin (buy/sell)
error in determining the current bin

The end result = a suboptimal trading system.

How exactly you solve this is a purely technical question.

And here, again, it is not the discretisation method that plays the absolutely critical role (although it can also matter), but the way trades are labelled inside the bins.
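The two errors named above can be measured separately. A minimal sketch, assuming both are plain misclassification rates and that the true bin of each observation is known in hindsight; the function name `two_errors` and this particular definition are assumptions, not something stated in the thread.

```python
def two_errors(true_bins, pred_bins, true_labels, pred_labels):
    """Return (bin_error, label_error_per_bin).

    bin_error: share of observations assigned to the wrong bin.
    label_error_per_bin: buy/sell misclassification rate inside each true bin.
    """
    n = len(true_bins)
    bin_error = sum(tb != pb for tb, pb in zip(true_bins, pred_bins)) / n

    per_bin = {}  # bin -> (wrong, total)
    for tb, tl, pl in zip(true_bins, true_labels, pred_labels):
        wrong, total = per_bin.get(tb, (0, 0))
        per_bin[tb] = (wrong + (tl != pl), total + 1)
    label_error = {b: w / t for b, (w, t) in per_bin.items()}
    return bin_error, label_error


# Toy example with two bins and buy/sell labels
true_bins = [0, 0, 1, 1]
pred_bins = [0, 1, 1, 1]
true_labels = ["buy", "sell", "buy", "buy"]
pred_labels = ["buy", "buy", "buy", "sell"]
bin_error, label_error = two_errors(true_bins, pred_bins, true_labels, pred_labels)
```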

Aleksey Vyazmikin 2024.06.21 09:26 #35550

Maxim Dmitrievsky #:

Make a choice )) just pick a bin and trade only on it, what's the problem?

You can always do it, I just thought that there is a ready-made functionality - I was interested in the selection criteria.

And so, in essence, if you need clustering, you can use any clustering method that can then be applied to new data: just extract the thresholds between clusters and record them in a quantisation table, which can then be fed into CatBoost. This will speed things up, as you won't have to recompute the clusters when experimenting.
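A minimal sketch of that idea for a single feature, assuming the clusters are described by one-dimensional centres (as in 1-D k-means), so that the threshold between two neighbouring clusters is the midpoint of their centres. The helper names are hypothetical; writing the borders to a file in the format CatBoost expects is left out and should be checked against the CatBoost documentation.

```python
from bisect import bisect_right


def borders_from_centres(centres):
    """Thresholds halfway between neighbouring 1-D cluster centres."""
    c = sorted(centres)
    return [(a + b) / 2 for a, b in zip(c, c[1:])]


def quantise(values, borders):
    """Replace each value by the index of the bin (cluster) it falls into."""
    return [bisect_right(borders, v) for v in values]


# Centres found once by any clustering method (values are illustrative)
centres = [10.0, 2.0, 6.0]
borders = borders_from_centres(centres)

# New data is transformed with the stored borders, without re-clustering
codes = quantise([1.0, 5.0, 9.5, 4.0], borders)
```

Because only the borders are stored, the same table can be reused across experiments, and a value that lies exactly on a border goes to the upper bin.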

Machine learning in trading: theory, models, practice and algo-trading - page 3555