Machine learning in trading: theory, models, practice and algo-trading - page 3511

 

Microsoft Copilot, which I use fairly often, has already started taking my past conversation sessions into account.

Today I asked it a question and it gave me R code, so I asked it:

So why did you give me a code example in R specifically?

and it said:

Copilot
I suggested a code example in R because, in the context of our previous conversations, you expressed interest in the R programming language,
 data analysis, machine learning and statistical tests. You asked questions about linear regression, PCA, FFT and
other statistical concepts in R. So I assumed that an example in R would be the most useful for you. If you
prefer a different programming language, please let me know and I will be happy to help! 😊


I'm scared.)

 
mytarmailS #:

Microsoft Copilot, which I use fairly often, has already started taking my past conversation sessions into account.

Today I asked it a question and it gave me R code, so I asked it why

and it said


I'm scared.)

Hang in there!

He doesn't know the colour of your pants yet.

 
mytarmailS #:
What, then, is shown on the x- and y-axes in the picture?

x - iteration number of the model building algorithm

y - percentage of selected quant splits whose probability bias on new data is the same as on the training data. In other words, the probability of picking a correct split at random from the selected candidates.
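To make the y-axis definition concrete, here is a minimal sketch of how such a percentage could be computed. It is not the actual code behind the chart. It assumes that "probability bias" means the mean of a binary target inside a range minus the mean over the whole sample, and that "the same" means the bias keeps its sign on new data. The names `probability_bias` and `share_of_stable_segments` are made up for the illustration.

```python
import numpy as np

def probability_bias(feature, target, lo, hi):
    """Mean target inside [lo, hi) minus the mean target of the whole sample.

    Assumption: this is what "probability bias" of a quant segment means.
    Returns NaN when no observation falls into the range.
    """
    feature = np.asarray(feature, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = (feature >= lo) & (feature < hi)
    if not mask.any():
        return np.nan
    return target[mask].mean() - target.mean()

def share_of_stable_segments(train_x, train_y, new_x, new_y, segments):
    """Percentage of segments whose bias has the same sign on both samples."""
    stable = 0
    for lo, hi in segments:
        b_train = probability_bias(train_x, train_y, lo, hi)
        b_new = probability_bias(new_x, new_y, lo, hi)
        # NaN or zero bias on either sample is counted as "not confirmed".
        if np.isfinite(b_train) and np.isfinite(b_new) and b_train * b_new > 0:
            stable += 1
    return 100.0 * stable / len(segments)

# Toy data: two selected segments of one predictor.
x = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9]
y_train = [1, 1, 0, 0, 0, 1]
y_new = [1, 1, 0, 1, 1, 0]
selected = [(0.0, 0.3), (0.5, 0.8)]
result = share_of_stable_segments(x, y_train, x, y_new, selected)
```

In the toy data the first segment keeps a positive bias on the new sample while the second one flips from negative to positive, so only half of the selected segments are confirmed.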

 
Maxim Dmitrievsky #:

About 1 minute is the time it takes to get such models automatically, not counting the time to develop the algo :)

Mean reversion clusters work especially well on flat pairs.

This approach has its own peculiarities, for example, which features to cluster on.

Yes, this is a promising direction. But again there is the question of an intelligent search for combinations of predictors for clustering.

Maxim Dmitrievsky #:

Do you pull out the "effective" segments or splits as separate rules at the final stage, or how do you separate them from the total mass, so that the final model trades only them?

That is, it needs to be explained as if to morons: you trained the model, defined the segments, then what? Is there simple logic for reproducing it?

There are now three options for further action:

1. Select an optimal base quant table for the predictor from the available set, based on the amount of useful data it reveals at each iteration. The idea is that this table may contain useful bounds that have not yet been sufficiently explored. An analogue with code is described in my articles.

2. Generate a unique quant table for each predictor based on the results of the analysis, with inefficient quant segments merged into large ranges, which makes it harder for the CatBoost learning algorithm to use them when building trees.

3. Binarisation, i.e. creating a separate sample from the selected quant segments and training on that data.
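The third option can be sketched roughly as follows. This is only an illustration under assumptions, not the actual implementation: each selected quant segment is taken to be a half-open range `[lo, hi)` of one predictor, and the new sample gets one 0/1 column per segment. The function name `binarise` and the predictor names are hypothetical.

```python
import numpy as np

def binarise(features, selected_segments):
    """Build a 0/1 sample with one column per selected quant segment.

    features: dict mapping predictor name -> 1-D array of values
    selected_segments: list of (predictor name, lo, hi) tuples, range [lo, hi)
    """
    columns = []
    for name, lo, hi in selected_segments:
        x = np.asarray(features[name], dtype=float)
        # 1 where the predictor value falls into the segment, otherwise 0
        columns.append(((x >= lo) & (x < hi)).astype(np.int8))
    return np.column_stack(columns)

# Hypothetical predictors and segments, for illustration only.
features = {
    "stoch": np.array([10, 25, 80, 95]),
    "rsi": np.array([30, 55, 70, 45]),
}
selected_segments = [("stoch", 20, 30), ("stoch", 90, 100), ("rsi", 50, 60)]
binary_sample = binarise(features, selected_segments)
```

The resulting matrix can then be used as the training sample for whatever model is trained next, so the model only ever sees membership in the selected segments.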

 
Aleksey Nikolayev #:
The fact that you imagine me as a beggar-formalist in no way affects the need to communicate only within a single conceptual space.

Why do you have such an idea of an erudite person? I didn't write anything about financial wealth at all.

Rather, I'm the poor man here.

Aleksey Nikolayev #:

You want to talk about complex concepts without agreeing on simple ones, but it doesn't work that way.

I am quite ready to reach agreement and explain the meaning I put into my words. But it matters more to you that I use different words, because for you the words I use are already filled with another meaning, and you categorically refuse to adjust that meaning in your mind in order to understand your interlocutor. Of course, I admit that I may be wrong about terms, and I don't mind being corrected.

Aleksey Nikolayev #:
I can also share my perception of you. You just want recognition for the great work you do. But this forum is a completely unsuitable place for that.

Positive results are what matter to me - that is recognition for me. And if I post on the forum about my work, it's not to brag about what I've done, but to share the results, especially when I find them meaningful. I wanted to discuss my results, as the forum is practically the only place for that. In fact, I was able to capture and visualise the reason for the poor results on new data after training. It is now more obvious what needs to be done to improve the results. Also, I have shown that on any data it is possible to get, purely by chance, a good model that trades profitably on new data. But it's apparently more interesting to discuss terms... a pity.

 
Aleksey Vyazmikin #:

So quantisation is cutting a feature into smaller data segments?

The upper left corner shows the original, say a stochastic indicator, and all the rest are quantised versions of it.

Right?


 
mytarmailS #:

So quantisation is slicing a feature into smaller pieces of data?

The upper left corner shows the original, say a stochastic indicator, and all the rest are quantised versions of it.

Right?


Not exactly - initially there are no segments, there is continuous data, which is then sliced into ranges in different ways. Training on ranges reduces the total number of possible combinations.
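A small sketch of what "slicing continuous data into ranges" can look like in practice. This is not the quantisation code used for the charts above, only an illustration with NumPy: `quantise` maps each continuous value to the index of the range it falls into, and `quantile_borders` is one of many possible ways to choose the borders. Both function names and the sample values are made up.

```python
import numpy as np

def quantise(values, borders):
    """Replace each continuous value with the index of its range.

    With increasing borders b0 < b1 < ..., np.digitize returns 0 for
    values below b0, 1 for b0 <= value < b1, and so on.
    """
    return np.digitize(np.asarray(values, dtype=float), borders)

def quantile_borders(values, n_ranges):
    """One possible way to choose borders: equal-frequency (quantile) cuts."""
    probs = np.linspace(0, 1, n_ranges + 1)[1:-1]
    return np.quantile(np.asarray(values, dtype=float), probs)

# Illustrative values of a continuous indicator (e.g. a stochastic, 0..100).
values = [3.2, 17.5, 48.0, 51.3, 79.9, 96.4]

fixed_codes = quantise(values, [20, 50, 80])                     # hand-picked borders
quantile_codes = quantise(values, quantile_borders(values, 4))   # data-driven borders

n_unique_before = len(set(values))
n_unique_after = len(set(fixed_codes.tolist()))
```

Six distinct continuous values collapse into four range codes here, which is the sense in which training on ranges leaves fewer possible combinations to search through.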

 
Aleksey Vyazmikin #:

Not really - initially there are no segments, there is continuous data, which is then sliced into ranges in various ways. Training on ranges reduces the total number of possible combinations.

So is the original the continuous data, or what do you mean by continuous?

 
mytarmailS #:

Well, is the original the continuous data, or what do you mean by continuous?

I was explaining why the statement "So quantisation is slicing a feature into smaller pieces of data?" is incorrect: continuous data already has the smallest possible pieces, whereas quantisation does the opposite and enlarges them, turning each range into a single value, with up to N such values.

 
Aleksey Vyazmikin #:

I was clarifying the incorrect statement "So quantisation is slicing a feature into smaller pieces of data?": continuous data already has the smallest possible pieces, whereas quantisation does the opposite and enlarges them, turning each range into a single value, with up to N such values.

Well, that is exactly slicing the fine-grained original into larger chunks, so what's wrong? What contradicts what?