Machine learning in trading: theory, models, practice and algo-trading - page 2637

 
Aleksey Vyazmikin #:

Suppose we have found patterns that occur periodically and accompany a particular price movement once they occur.

Has anyone done any research on the relationship between the frequency of occurrence of a pattern and the subsequent event?

We are talking about probability clusters, if there is such a term.

Suppose we can expect that if a pattern hasn't appeared for a long time, there will be a predictable (concomitant) price movement after it has occurred, and then there will be a fading, as the pattern has become visible to all and has thus eliminated market inefficiencies.

I think that the development of metrics to assess these transient states over time (from more likely to equally likely or even negative prediction) may help to find and select such patterns, and a model that can account for this may prove quite effective.

I am working in this direction, but I lack mathematical apparatus and theoretical knowledge.

One can only work with very simple patterns, like movement after a vertex (top) breakout. In the sense that there should be enough of them for their frequency to serve as a good estimate of probability.

From my observations, once the pattern markup is fully formalized, market inefficiencies (in the sense of deviations from SB, a random walk) become insignificant - conventionally speaking, they stay within the spread. There is a natural desire to make the pattern definition more complex, but this usually leads to a smaller sample and unstable results.
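A minimal sketch (Python) of one way to probe the "probability cluster" idea above: relate the gap since the previous occurrence of a pattern to the outcome of the next one. The table layout (a 'bar' index per occurrence, a binary 'outcome') and the median split are assumptions for illustration, not anything stated in the thread.

```python
# Hypothetical sketch: does the time since the last occurrence of a pattern
# say anything about the outcome of the next occurrence?
import numpy as np
import pandas as pd
from scipy import stats

def gap_vs_outcome(df: pd.DataFrame) -> None:
    """df: one row per pattern occurrence, chronological, with columns
    'bar' (bar index of the occurrence) and 'outcome' (1 = expected move)."""
    df = df.sort_values("bar").copy()
    df["gap"] = df["bar"].diff()          # bars since the previous occurrence
    df = df.dropna(subset=["gap"])

    # Point-biserial correlation between gap length and the binary outcome.
    r, p = stats.pointbiserialr(df["outcome"], df["gap"])
    print(f"gap vs outcome: r = {r:.3f}, p-value = {p:.3f}")

    # Compare win rates after short vs long gaps (split at the median gap).
    median_gap = df["gap"].median()
    short = df.loc[df["gap"] <= median_gap, "outcome"]
    long_ = df.loc[df["gap"] > median_gap, "outcome"]
    print(f"win rate after short gaps: {short.mean():.2%} (n={len(short)})")
    print(f"win rate after long gaps:  {long_.mean():.2%} (n={len(long_)})")
```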

 
Maxim Dmitrievsky #:
Think of features like increments, but more informative. For example, find the average price over the entire history and subtract the remaining prices from it. You need maximum variation, but it has to stay within a range that is known in advance on new data.

Reminds me of spread trading theory. And there's some pretty complicated maths there, judging by the plethora of contrived articles on the subject.

Maxim Dmitrievsky #:
Fractional differentiation works that way (keeping stationarity while preserving maximum spread), but I want something new.

Maybe some "slope lines" as functions of time with the prices subtracted from them, decibels (a log scale), any functions of time, anything at all, as long as the stationarity and maximum scatter conditions are met.

But won't it end up as something like a z-score, with stationarity only on history? Although, of course, attempts to bring the series to stationarity cannot be avoided in principle - without it you can't trade much at all.
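A minimal sketch of the fractional differentiation mentioned above, in its fixed-window form, together with an ADF test to check the stationarity condition. The order d = 0.4 and the weight cutoff are illustrative values, not something proposed in the thread.

```python
# Sketch: fixed-window fractional differentiation plus an ADF stationarity check.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def frac_diff_weights(d: float, threshold: float = 1e-4) -> np.ndarray:
    """Weights w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k, truncated when |w_k| < threshold."""
    w = [1.0]
    k = 1
    while abs(w[-1]) >= threshold:
        w.append(-w[-1] * (d - k + 1) / k)
        k += 1
    return np.array(w[:-1])  # drop the last weight that fell below the threshold

def frac_diff(series: pd.Series, d: float = 0.4) -> pd.Series:
    """Fractionally differenced series; NaN while the window is not yet full."""
    w = frac_diff_weights(d)
    width = len(w)
    values = series.to_numpy(dtype=float)
    out = np.full(len(values), np.nan)
    for i in range(width - 1, len(values)):
        window = values[i - width + 1 : i + 1][::-1]  # most recent value gets w_0
        out[i] = np.dot(w, window)
    return pd.Series(out, index=series.index)

# Usage (hypothetical price series):
# feat = frac_diff(np.log(prices), d=0.4).dropna()
# adf_stat, p_value, *_ = adfuller(feat)
# print(p_value, feat.std())   # low p-value = stationary; std shows remaining scatter
```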

 
Aleksey Nikolayev #:

Reminds me of spread trading theory. And there's some pretty complicated maths there, judging by the plethora of contrived articles on the subject.

But won't it end up as something like a z-score, with stationarity only on history? Although, of course, attempts to bring the series to stationarity cannot be avoided in principle - without it you can't trade much at all.

Something like that happens, and we should try to stabilize it for a while - so we take the entire history on a monthly chart and use it, so that the latest data does not end up at the extremes after such transformations - otherwise the ML model will stop working adequately.

Something like a long-term trendline that changes slowly, but that again is closer to increments. Maybe there are other, unexpected solutions :)

It could probably be tied to macro fundamentals like GDP, but I wouldn't want to do that.
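A minimal sketch of the "slowly changing long-term trendline" idea above: fit a single linear trend on monthly closes over the whole history and subtract it from the prices. The monthly resampling rule and the linear form are assumptions.

```python
# Sketch: subtract a slowly changing long-term trendline fitted on monthly closes.
import numpy as np
import pandas as pd

def detrend_with_monthly_trendline(prices: pd.Series) -> pd.Series:
    """prices: close prices with a DatetimeIndex. Returns price minus a linear
    trendline fitted once on the monthly closes of the full history."""
    monthly = prices.resample("M").last().dropna()
    # Fit price = a * t + b on monthly closes (t = days since the first bar).
    t_monthly = (monthly.index - prices.index[0]).days.to_numpy(dtype=float)
    a, b = np.polyfit(t_monthly, monthly.to_numpy(dtype=float), deg=1)
    # Evaluate the trendline on the original timestamps and subtract it.
    t_all = (prices.index - prices.index[0]).days.to_numpy(dtype=float)
    trend = a * t_all + b
    return prices - trend

# Usage (hypothetical data):
# detrended = detrend_with_monthly_trendline(close)  # close: pd.Series of prices
```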
 
Aleksey Nikolayev #:

One can only work with very simple patterns, like movement after a vertex (top) breakout. In the sense that there should be enough of them for their frequency to serve as a good estimate of probability.

How many is enough? Suppose simple "patterns" select from 5% to roughly 15% of the sample, and the training sample is, say, 15k examples - is that too little?

Aleksey Nikolayev #:

From my observations, once the pattern markup is fully formalized, market inefficiencies (in the sense of deviations from SB, a random walk) become insignificant - conventionally speaking, they stay within the spread. There is a natural desire to make the pattern definition more complex, but this usually leads to a smaller sample and unstable results.

The question is how best to formalize these observations, so as to get results quickly and to discard/classify patterns with and without a regularity behind them.

 
Aleksey Vyazmikin #:

How many is enough? Suppose simple "patterns" select from 5% to roughly 15% of the sample, and the training sample is, say, 15k examples - is that too little?

It's better to count patterns in absolute numbers. The same vertex breakouts (the ones significant for trading) occur no more than a few hundred per year. I would call that a limiting number. If you try to build more complicated patterns out of them - for example, pairs of consecutively broken vertices that also satisfy some conditions - you may get only dozens per year. And that's not enough.
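To attach numbers to "dozens per year is not enough": a minimal sketch showing how wide the 95% confidence interval for a pattern's win rate is at different occurrence counts (Wilson interval via statsmodels; the 55% win rate is an illustrative edge, not a figure from the thread).

```python
# Sketch: how precisely does the observed frequency estimate the probability
# for different numbers of pattern occurrences?
from statsmodels.stats.proportion import proportion_confint

observed_win_rate = 0.55  # illustrative edge over a coin flip
for n in (30, 100, 300, 1000, 15000):
    wins = int(round(observed_win_rate * n))
    low, high = proportion_confint(wins, n, alpha=0.05, method="wilson")
    print(f"n={n:>5}: win rate {wins / n:.2%}, 95% CI [{low:.2%}, {high:.2%}]")
# With only a few dozen occurrences the interval easily covers 50%,
# i.e. the "pattern" is statistically indistinguishable from a random walk.
```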

 
Aleksey Vyazmikin #:

The question is how best to formalize these observations, so as to get results quickly and to discard/classify patterns with and without a regularity behind them.

Something like a loop over all possible constructions of a pattern of a given type? I did something similar once with the same vertex breakouts. In principle, something can be devised, but the enumeration will (in the general case) be recursive rather than iterative. Again, most of the patterns will be meaningless because of their complexity and rarity. It's probably easier to collect a list of meaningful patterns manually and iterate over it in a regular loop, choosing the optimal one.
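A minimal sketch of the "manual list of meaningful patterns plus a regular loop" approach: patterns as predicate functions over a price window, each scored by its forward return. The two pattern definitions, the window and the horizon are placeholders.

```python
# Sketch: iterate over a hand-made list of pattern definitions and score each one.
import numpy as np
import pandas as pd

# Each pattern is a predicate over the last `window + 1` closes (placeholder logic).
PATTERNS = {
    "breakout_of_recent_high": lambda w: w[-1] > w[:-1].max(),
    "breakdown_of_recent_low": lambda w: w[-1] < w[:-1].min(),
}

def score_patterns(close: pd.Series, window: int = 20, horizon: int = 5) -> pd.DataFrame:
    """For every pattern, collect the forward return `horizon` bars after each signal."""
    c = close.to_numpy(dtype=float)
    rows = []
    for name, predicate in PATTERNS.items():
        fwd_returns = []
        for i in range(window, len(c) - horizon):
            if predicate(c[i - window : i + 1]):
                fwd_returns.append(c[i + horizon] / c[i] - 1.0)
        fwd = np.array(fwd_returns)
        rows.append({"pattern": name,
                     "signals": len(fwd),
                     "mean_fwd_return": fwd.mean() if len(fwd) else np.nan,
                     "hit_rate": (fwd > 0).mean() if len(fwd) else np.nan})
    return pd.DataFrame(rows).sort_values("mean_fwd_return", ascending=False)

# best = score_patterns(close)  # close: pd.Series of prices
```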

 
Without a tie to real life or to FA, the TA of complex systems can only work under stable conditions.
 
mytarmailS #:
Why?

There is too little meaningful real data.

PS: Generating data in a random environment creates the illusion of having more data. That is a mistake. There is only as much data as there is: 211 bars means 211 and no more.

 

Hi!

The truth is out there... (Fox Mulder, "The X-Files")

You're obviously close to the target. You need to push harder.

 
Aleksey Nikolayev #:

It's better to count patterns in absolute numbers. The same vertex breakouts (the ones significant for trading) occur no more than a few hundred per year. I would call that a limiting number. If you try to build more complicated patterns out of them - for example, pairs of consecutively broken vertices that also satisfy some conditions - you may get only dozens per year. And that's not enough.

Yes, I agree that there is not enough data, which is why I take as much history as possible. Of course, the more examples, the more reliable the result in theory, but it is what it is.

Aleksey Nikolayev #:

Something like a loop over all possible constructions of a pattern of a given type? I did something similar once with the same vertex breakouts. In principle, something can be devised, but the enumeration will (in the general case) be recursive rather than iterative. Again, most of the patterns will be meaningless because of their complexity and rarity. It's probably easier to collect a list of meaningful patterns manually and iterate over it in a regular loop, choosing the optimal one.

I just need a tool with a precise metric for assessing the trend/distribution/waviness of outcomes over time, in order to detect any tendency in them. For example, tendencies like these:

- If a pattern's outcomes were positive for a long time, a negative outcome is more likely when the pattern appears again;

- The outcomes are distributed evenly across all sections of the sample (how do you divide it into sections correctly?);

- If a pattern's outcomes have been negative for a long time, a positive outcome is more likely;

- If the pattern has not appeared for a long time, a positive/negative outcome is more likely when it does appear.

Basically something like that. It is simple statistics, but it has to be computed over intervals of history, and my question is how best to cut these intervals (measure over different intervals, use a sliding shift) so that the estimate is correct and, preferably, expressed as some generalized coefficient.
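A minimal sketch of one way to reduce the interval question to a single number: split a pattern's occurrences into consecutive blocks, test whether the win rate is the same in every block (chi-square p-value), and report Cramér's V as the "generalized coefficient" of instability. The number of blocks is an assumption.

```python
# Sketch: is the outcome of a pattern distributed the same way across history?
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def outcome_stability(outcomes: pd.Series, n_blocks: int = 5):
    """outcomes: binary series (1 = positive outcome) of pattern occurrences in
    chronological order. Splits them into n_blocks consecutive blocks and tests
    homogeneity of the win rate across the blocks."""
    blocks = np.array_split(outcomes.to_numpy(), n_blocks)
    table = np.array([[b.sum(), len(b) - b.sum()] for b in blocks])  # wins / losses per block
    chi2, p_value, dof, _ = chi2_contingency(table)
    n = table.sum()
    cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))  # 0 = stable, 1 = very unstable
    return p_value, cramers_v, table

# p, v, table = outcome_stability(outcomes, n_blocks=5)
# A small p-value / large V suggests the pattern's edge drifts over time.
```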