Machine learning in trading: theory, models, practice and algo-trading - page 2723

 
mytarmailS #:

you can decompose the series into components (decomposition), find the random part and the deterministic part, discard the random and keep the rest... a standard case in time series processing...

But no, it doesn't work either.

At some point, on new data, the random part becomes deterministic and the deterministic part becomes random...

So that doesn't work either )

Of course, the tutorials use simple time series to make the explanation easier. In reality it is more complicated than that.
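For reference, the textbook decomposition being talked about looks roughly like this (a minimal sketch on synthetic data using statsmodels, nothing to do with real quotes):

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# synthetic series: deterministic trend + daily cycle + random noise
idx = pd.date_range("2024-01-01", periods=500, freq="h")
series = pd.Series(
    0.01 * np.arange(500)                        # deterministic trend
    + np.sin(np.arange(500) * 2 * np.pi / 24)    # deterministic daily cycle
    + np.random.normal(0, 0.3, 500),             # random component
    index=idx,
)

parts = seasonal_decompose(series, model="additive", period=24)
deterministic = parts.trend + parts.seasonal     # what the textbook keeps
random_part = parts.resid                        # what the textbook discards

On a synthetic series like this the split is clean; the point is that on market data the split does not stay stable on new data.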

There are other, non-financial time series that are also hard to predict below some minimum achievable error. But in trading it is easier, because you don't have to predict all the time; you can make the model trade selectively. I did that in my last article; what remains is the topic of selecting informative features. I have already thought up a method, now I just need to experiment.
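Predicting selectively can be as simple as acting only when the classifier is confident enough and staying flat otherwise (a rough sketch; the threshold and the model are arbitrary placeholders, not the code from the article):

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def selective_signals(model, X, threshold=0.65):
    # return +1 (buy) / -1 (sell), or 0 (no trade) when confidence is low
    proba = model.predict_proba(X)        # columns: P(class 0), P(class 1)
    return np.where(proba[:, 1] > threshold, 1,
           np.where(proba[:, 0] > threshold, -1, 0))

# usage sketch:
# model = GradientBoostingClassifier().fit(X_train, y_train)
# signals = selective_signals(model, X_new)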

I want to finalise my approach, and in principle it should become universal, i.e. it should give results on any currency pair.

And you, with your Events, are essentially also making a third "do not trade" class; there is nothing new there. You divide the time series into what is predictable and what is not.
 
Maxim Dmitrievsky #:
And you, with your Events, are essentially also making a third "do not trade" class; there is nothing new there. You divide the time series into what is predictable and what is not.

That's your fantasy, not reality


The way I see it so far is this:

There is data, say 100k observations; I'll call it "X.big".

1) We identify what we are interested in (a pattern, rule, event, signal from the TS); I call it the "initial rule" (Alexey Funtz's "activation"; the name is extremely unfortunate, but...).

2) We filter the data by the "initial rule" and now have 100-1000 observations instead of 100k; we have reduced the search space to "X.small".

3) In "X.small" we start an exhaustive search for features. I see features as sequential rules; the rules are generated automatically via genetic programming. I want an exhaustive search, but I'm not sure I can pull it off.

4) The array of generated features is fed into the model, say 1000 at a time.

5) The model (some ML algorithm) ranks the features by importance and keeps the good ones, receives a new array, and so on...

6) As a result of the search we end up with several thousand working features for a specific "initial rule".


Thus, if we call all this a model, the model itself invents billions of features and selects what it needs.
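Roughly, the loop of points 3)-5) could be sketched like this; generate_rules() below is just a placeholder for the genetic-programming part, which is the hard bit and is not shown:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def generate_rules(n, n_cols):
    # placeholder: each "rule" is a random threshold on a random input column;
    # in the real scheme the rules would come from genetic programming
    return [(np.random.randint(0, n_cols), np.random.randn()) for _ in range(n)]

def apply_rules(rules, X):
    # each rule produces one binary feature: "column col is above threshold thr"
    return np.column_stack([(X[:, col] > thr).astype(int) for col, thr in rules])

def feature_search(X_small, y_small, n_rounds=5, batch_size=1000, keep_threshold=0.001):
    kept = []
    for _ in range(n_rounds):
        rules = generate_rules(batch_size, X_small.shape[1])   # point 3: candidates
        F = apply_rules(rules, X_small)                        # point 4: feed the batch
        model = RandomForestClassifier(n_estimators=100).fit(F, y_small)
        good = np.flatnonzero(model.feature_importances_ > keep_threshold)  # point 5
        kept.extend(rules[i] for i in good)                    # keep informative rules
    return kept

The importance threshold and the random "rules" are only there to make the loop concrete; all the value of the approach sits in what generate_rules() actually produces.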

 
mytarmailS #:

That's your fantasy, not reality


The way I see it so far is this:

There is data, say 100k observations; I'll call it "X.big".

1) We identify what we are interested in (a pattern, rule, event, signal from the TS); I call it the "initial rule" (Alexey Funtz's "activation"; the name is extremely unfortunate, but...).

2) We filter the data by the "initial rule" and now have 100-1000 observations instead of 100k; we have reduced the search space to "X.small".

3) In "X.small" we start an exhaustive search for features. I see features as sequential rules; the rules are generated automatically via genetic programming. I want an exhaustive search, but I'm not sure I can pull it off.

4) The array of generated features is fed into the model, say 1000 at a time.

5) The model (some ML algorithm) ranks the features by importance and keeps the good ones, receives a new array, and so on...

6) As a result of the search we end up with several thousand working features for a specific "initial rule".


Thus, if we call all this a model, the model itself invents billions of features and selects what it needs.

Here we go again... in essence this is no different from 3-class classification.

 
Maxim Dmitrievsky #:

Here we go again... in essence this is no different from 3-class classification.

Let me create a dataset and see whose algorithm is better.

 
Maxim Dmitrievsky #:
One says I stole his ideas, the other says he taught me... who are you guys anyway?))))

"Stole" - did you just say that? You're shameless, I can tell.

Forum on trading, automated trading systems and testing trading strategies.

Machine learning in trading: theory, practice, trading and more

Aleksey Vyazmikin, 2020.12.03 19:11

So the idea is precisely to evaluate the model: the model actually untangles the confused target labels, and we can evaluate how well it does that, rather than just looking at how confused everything is.

I'm thinking of trying a cascade learning method (a term I coined myself - it may already exist under a different name). The plots show there are regions where training is successful - keep that region, and retrain on whatever falls outside it, first removing from the sample the examples that fall into the distribution of the kept region. I have already tried doing this by hand - the effect was good; now I want to automate it, but two days in I still have no luck. I'm afraid the effect was accidental - I don't want to get my hopes up. What is your opinion on this? I think it should be easy to do in Python.
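As a rough sketch, that cascade idea could look something like this in Python (an illustration of the description above, not code I actually ran; the confidence threshold and the base learner are arbitrary):

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def cascade_train(X, y, n_stages=3, confidence=0.6):
    # each stage is trained only on the examples the previous stages
    # did not handle confidently and correctly
    stages, X_rest, y_rest = [], X, y
    for _ in range(n_stages):
        if len(np.unique(y_rest)) < 2:
            break
        model = GradientBoostingClassifier().fit(X_rest, y_rest)
        proba = model.predict_proba(X_rest).max(axis=1)
        handled = (proba >= confidence) & (model.predict(X_rest) == y_rest)
        stages.append(model)
        X_rest, y_rest = X_rest[~handled], y_rest[~handled]  # pass the rest to the next stage
    return stages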

It was discussed in the article.

I quote:

"

We want to write an algorithm that will be able to analyse and correct its own errors, iteratively improving its results. To do this, we propose to take a pair of classifiers and train them sequentially, as suggested in the diagram below.

"

"

The intuition behind this approach is that, in confusion-matrix terms, losing trades are type I classification errors for the base model, i.e. the cases it classifies as false positives. The metamodel filters out such cases, giving a score of 1 to true positives and 0 to everything else. By filtering the dataset with the metamodel before training the base model, we increase the base model's Precision, i.e. the share of correct buy and sell triggers. At the same time, the metamodel increases its Recall (completeness) by covering as many different outcomes as possible.

"

The ideas are the same, but you did the implementation and worked out the details - I only stated the concept, and I'm not sure I ever published my experiments and implementation code.

I reminded you of it to make the point that you may not understand what something is about now and then use it later, once understanding comes. And that not understanding is no reason to behave inappropriately and pass value judgements on people's personalities and their logic.
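To make the quoted scheme concrete, it can be sketched roughly like this (a simplified illustration, not the code from the article; the learner and the correctness-based meta target are just assumptions):

from sklearn.ensemble import GradientBoostingClassifier

def fit_base_and_meta(X_train, y_direction):
    # base model: predicts trade direction (e.g. 1 = buy, 0 = sell)
    base = GradientBoostingClassifier().fit(X_train, y_direction)
    # metamodel target: 1 where the base model's call was right, 0 otherwise
    meta_target = (base.predict(X_train) == y_direction).astype(int)
    meta = GradientBoostingClassifier().fit(X_train, meta_target)
    return base, meta

def filtered_signals(base, meta, X_new):
    direction = base.predict(X_new)      # proposed buys/sells
    allowed = meta.predict(X_new) == 1   # metamodel filters out likely false positives
    return direction, allowed

Filtering the training set by the metamodel and retraining the base model, as described in the quote, would be an additional iteration on top of this.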

 
mytarmailS #:

Let me create a dataset and see whose algorithm is better.

What will the dataset consist of? I only take quotes as input.

 
Aleksey Vyazmikin #:

"Stole" - did you just say that? You're shameless, I can tell.

It was discussed in the article.

I quote:

"

We want to write an algorithm that will be able to analyse and correct its own errors, iteratively improving its results. To do this, we propose to take a pair of classifiers and train them sequentially, as suggested in the diagram below.

"

"

The intuition behind this approach is that, in confusion-matrix terms, losing trades are type I classification errors for the base model, i.e. the cases it classifies as false positives. The metamodel filters out such cases, giving a score of 1 to true positives and 0 to everything else. By filtering the dataset with the metamodel before training the base model, we increase the base model's Precision, i.e. the share of correct buy and sell triggers. At the same time, the metamodel increases its Recall (completeness) by covering as many different outcomes as possible.

"

The ideas are the same, but you did the implementation and worked out the details - I only stated the concept, and I'm not sure I ever published my experiments and implementation code.

I reminded you of it to make the point that you may not understand what something is about now and then use it later, once understanding comes. And that not understanding is no reason to behave inappropriately and pass value judgements on people's personalities and their logic.

So take the code from the article and test it, then. Why should I have to understand what you are talking about when there is nothing there yet?

I have a lot of variants of such implementations, including ones with additional training rather than retraining from scratch, retraining against baselines, and so on.

Especially once you actually start doing it, the result turns out different from the fantasy you originally had in mind.

 
Maxim Dmitrievsky #:

What will the dataset consist of? I only take quotes as input.

OHLC prices for the last 200 bars and a label; time can be added.
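A minimal sketch of how such a dataset could be put together (the column names and the next-bar label are placeholders; the real labels would come from the trading logic):

import numpy as np
import pandas as pd

def make_dataset(bars: pd.DataFrame, window: int = 200):
    # each observation = flattened OHLC of the last `window` bars + a label
    ohlc = bars[["open", "high", "low", "close"]].to_numpy()
    X, y = [], []
    for i in range(window, len(ohlc) - 1):
        X.append(ohlc[i - window:i].ravel())          # 200 bars -> 800 values
        y.append(int(ohlc[i + 1, 3] > ohlc[i, 3]))    # placeholder label: next close up?
    return np.array(X), np.array(y)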

 
mytarmailS #:

OHLC prices for the last 200 bars and a label; time can be added.

200 bars? Why so few? I don't have labels, I use automatic labelling.

I'll have to rewrite it.

 
Maxim Dmitrievsky #:

200 bars? Why so few? I don't have labels, I use automatic labelling.

I'll have to rewrite it.

200 bars per observation.