Machine learning in trading: theory, models, practice and algo-trading - page 3509

 

An anecdote:

Who told you that your class labels help the model find the best quantile cuts (clusters), and do not make the selection worse than if you had simply used clustering?

So you have to decide whether your approach is model-based or model-agnostic. You can do both, but the second is preferable because it relies only on the data, not on the model structure.
 
Aleksey Nikolayev #:
Well, I asked you why you think your algorithm is not greedy (in the conventional sense) - I didn't see any answer.

Once again, you are focusing on the terms rather than the content. I explained earlier why I used this term and why I think it is appropriate. Instead of noting for yourself that you hold a different opinion about the term and moving on to the substance, you keep stressing that I use the term incorrectly, as if that would improve your understanding of the issue. It turns out that you are interested in discussing the form in which thoughts are presented, and not at all in the substance. That is why I conclude that you are not asking questions about the substance, but only want to prove that I use the term incorrectly, which will probably give you a sense of pride and satisfaction at knowing the terminology better. The discussion is therefore not on a substantive level; it is aimed at comparing one person with another and passing some qualitative judgement on that person. This style leads to conflict, because it resembles self-assertion at the expense of others.

Once again, to clarify the point I was making: when selecting a split, there are generally accepted metrics for evaluating the effect of the split, and the greedy method selects the candidate with the maximum effect. My algorithm reduces the number of candidates to choose from, so it is no longer a greedy method with respect to the standard metric, because the candidate that was the best may disappear from the selection. That was the point.

The next sentence is literally "And already from them we choose by some criterion - not necessarily by greed." Here I am saying that yes, if my algorithm is considered on its own, the selection can also be regarded as greedy, but by its own evaluation, which combines many other metrics and picks the maximum total. "Not necessarily by greed" means that the algorithm allows both random choice and a choice that is deliberately effective with respect to new data. In addition, there is an implementation that selects by economic efficiency, where the cost of the consequences of a choice is estimated (I posted GIFs of such an algorithm at work earlier).
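A minimal sketch of the distinction drawn above, under assumptions that are not in the post: a single numeric feature, Gini impurity decrease as the "generally accepted" split metric, and a hypothetical `keep` filter standing in for the candidate-reduction step, whose real criteria are not described here. It only illustrates that once candidates are dropped, the split that is best by the standard metric may no longer be available, and that the final pick need not be the maximum.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gain(xs, ys, threshold):
    """Impurity decrease obtained by splitting at x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(ys) - weighted


def greedy_choice(xs, ys, candidates):
    """Conventional greedy step: take the candidate with the maximum gain."""
    return max(candidates, key=lambda t: split_gain(xs, ys, t))


def filtered_choice(xs, ys, candidates, keep, rng=None):
    """Drop candidates first (hypothetical filter `keep`), then choose.

    With rng=None the surviving candidate with the best gain is taken;
    otherwise a survivor is drawn at random ("not necessarily by greed").
    """
    survivors = [t for t in candidates if keep(t)]
    if rng is not None:
        return rng.choice(survivors)
    return greedy_choice(xs, ys, survivors)


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
candidates = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]


def keep(t):
    """Hypothetical filter: pretend thresholds between 4 and 5 were rejected."""
    return not 4 < t < 5


best = greedy_choice(xs, ys, candidates)
restricted = filtered_choice(xs, ys, candidates, keep)
random_pick = filtered_choice(xs, ys, candidates, keep, rng=random.Random(0))
```

Here `best` is the split a standard greedy tree would take, while `restricted` and `random_pick` can only come from what survived the filter, so their gain by the standard metric is lower.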

If, even after such a detailed explanation of why I chose the term in my post, you still want to discuss whether its use is admissible, I will be very sorry that the desire to find flaws in the style of presentation outweighs the desire to discuss the results and methodology of the research.

 
mytarmailS #:
You are not the first)

He has a desire to discuss terms, and he tried to understand the essence. If you had written specifically where and what is unclear to you, you would have been given an answer.

 
Maxim Dmitrievsky #:

Had ChatGPT-4 evaluate your last few posts:

Well, at least ChatGPT appreciated my approach with a kind word.

I don't even know - it turns out that it captures the context better than the people here.....

Maxim Dmitrievsky #:

Self-explanatory:

Who told you that your class labels help the model find the best quantile cuts (clusters), and do not make the selection worse than if you had simply used clustering?

So you have to decide whether your approach is model-based or model-agnostic. You can do both, but the second is preferable because it relies only on the data, not on the model structure.

It's not about better or worse labels, it's about being able to find a pattern with any labels and selecting predictors to do so.

I don't understand your message - you want me to abandon my method and use only clustering? What's the point?

 
Aleksey Vyazmikin #:

Well, at least ChatGPT appreciated my approach with a kind word.

I don't even know - it turns out that it captures the context better than the people here....

It's not about better or worse labels, it's about the possibility of finding a pattern with any labels and selecting predictors for that.

I don't understand your message - you want me to abandon my method and use only clustering? What's the point?

No, it always goes along with whoever is asking, which is why it's often used on the forum as proof of what people are saying.

The point is to bring your work into the form of a scientific paper, with standard definitions that others understand.
 
Maxim Dmitrievsky #:

No, it always goes along with whoever is asking.

The point is to bring your work into the form of a scientific paper, with proper definitions.

You've been told the difference between clustering and clustering. I don't pretend to be a scientist.

Better to write what you see as the point of using hierarchical clustering, and how it is better than K-Means and my K-Means-based tree.

 
Aleksey Vyazmikin #:

You've been told the difference between clustering and clustering. I don't pretend to be a scientist.

Better to write what you see as the point of using hierarchical clustering, and how it is better than K-Means and my K-Means-based tree.

I don't see how it's better. I see the point of applying any clustering without labels to analyse and select data, and then, on top of that, labelling the data and training any model. It seems kind of intuitive.
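A rough sketch of that scenario, with every specific being an assumption for illustration: a hand-written 1-D k-means, a hypothetical label-free selection rule (keep clusters with at least three members), and a majority-vote lookup standing in for "any model". The labels are attached only after the clustering and selection steps.

```python
from collections import Counter


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's k-means on scalars with a deterministic start."""
    ordered = sorted(values)
    # Initial centroids: evenly spaced order statistics of the data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def assign(v, centroids):
    """Index of the centroid nearest to v."""
    return min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))


features = [0.1, 0.2, 0.15, 0.25, 5.0, 5.1, 4.9, 5.2, 9.8]
labels = [0, 0, 0, 1, 1, 1, 1, 0, 1]  # used only after clustering

# Step 1: cluster without looking at the labels.
centroids = kmeans_1d(features, k=3)
membership = [assign(v, centroids) for v in features]

# Step 2: label-free selection (hypothetical rule: enough members).
sizes = Counter(membership)
kept = {c for c, n in sizes.items() if n >= 3}

# Step 3: attach labels and fit a stand-in model (majority label per cluster).
votes = {c: Counter() for c in kept}
for c, y in zip(membership, labels):
    if c in kept:
        votes[c][y] += 1
model = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


def predict(v):
    """Predicted label, or None if v falls into a cluster that was not selected."""
    return model.get(assign(v, centroids))
```

The selection step never sees `labels`, which is what makes the approach model-agnostic in the sense used in this thread; any other clustering method or downstream model could be substituted.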

 
Maxim Dmitrievsky #:

I don't see how it's better. I see the point of applying any clustering without labels to analyse and select data, and then, on top of that, labelling the data and training any model. It seems kind of intuitive.

Yeah - normal basic scenario.

 
Aleksey Vyazmikin #:

Once again, you are focusing on the terms rather than the content. I explained earlier why I used this term and why I think it is appropriate. Instead of noting for yourself that you hold a different opinion about the term and moving on to the substance, you keep stressing that I use the term incorrectly, as if that would improve your understanding of the issue. It turns out that you are interested in discussing the form in which thoughts are presented, and not at all in the substance. That is why I conclude that you are not asking questions about the substance, but only want to prove that I use the term incorrectly, which will probably give you a sense of pride and satisfaction at knowing the terminology better. The discussion is therefore not on a substantive level; it is aimed at comparing one person with another and passing some qualitative judgement on that person. This style leads to conflict, because it resembles self-assertion at the expense of others.

Once again, to clarify the point I was making: when selecting a split, there are generally accepted metrics for evaluating the effect of the split, and the greedy method selects the candidate with the maximum effect. My algorithm reduces the number of candidates to choose from, so it is no longer a greedy method with respect to the standard metric, because the candidate that was the best may disappear from the selection. That was the point.

The next sentence is literally "And already from them we choose by some criterion - not necessarily by greed." Here I am saying that yes, if my algorithm is considered on its own, the selection can also be regarded as greedy, but by its own evaluation, which combines many other metrics and picks the maximum total. "Not necessarily by greed" means that the algorithm allows both random choice and a choice that is deliberately effective with respect to new data. In addition, there is an implementation that selects by economic efficiency, where the cost of the consequences of a choice is estimated (I posted GIFs of such an algorithm at work earlier).

If, even after such a detailed explanation of why I chose the term in my post, you still want to discuss whether its use is admissible, I will be very sorry that the desire to find flaws in the style of presentation outweighs the desire to discuss the results and methodology of the research.

As an epigraph, from Goethe's Faust: "Student: Yes, but words correspond to concepts."

I stress the correspondence mentioned in the epigraph. The questions of interest are: (1) do you think that your algorithm is guaranteed (or at least more likely) to reach the global maximum of your custom metrics? (2) If the answer to the first question is "yes", what makes that possible?

If the answer to the first question is "no", you are simply flushing the possibility of adequate communication down the toilet through an arbitrary substitution of concepts.
 
Are you saying that the whole point of this clever algorithm is to train not on ordinary data, but on the centroids of clusters of this data?

And then just select the best rules from the model based on some metric.