Machine learning in trading: theory, models, practice and algo-trading - page 1588

 
Aleksey Mavrin:

Have there been any attempts to apply statistical methods to charting, candlestick analysis, and other higher-level stuff?

I don't use returns.

Even if returns are used as the primary data (as a representation of the chart), additional predictors are still required to compress that information into proportions and vectors.
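As a hedged illustration of what compressing a chart into "proportions and vectors" might look like - hypothetical candlestick-shape features (body-to-range and wick proportions); the feature set is mine, not the poster's:

```python
# Hypothetical candlestick "proportion" predictors from OHLC bars.
# The specific features are illustrative, not the poster's actual set.
import numpy as np

def candle_features(o, h, l, c):
    """Compress OHLC bars into scale-free proportion features."""
    o, h, l, c = map(np.asarray, (o, h, l, c))
    rng = np.maximum(h - l, 1e-12)         # bar range, guarded against zero
    body = np.abs(c - o) / rng             # body-to-range proportion
    upper = (h - np.maximum(o, c)) / rng   # upper-wick proportion
    lower = (np.minimum(o, c) - l) / rng   # lower-wick proportion
    direction = np.sign(c - o)             # +1 bullish, -1 bearish
    return np.column_stack([body, upper, lower, direction])
```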

 
sibirqk:

In my opinion, there are periodic fluctuations in the planet's temperature due to natural causes. In the last hundred years, natural warming has begun, with anthropogenic factors superimposed on it.

To break it down briefly:

1. The greenhouse effect is just one of many factors affecting the average temperature on Earth.

2. To assess the anthropogenic impact, what matters is the share of anthropogenic CO2 relative to all CO2 in the atmosphere. It is currently around one percent, i.e. quite small. Far more is contributed by forest fires and grass burning, while deforestation reduces CO2 sequestration.

3. The balance of CO2 entering the atmosphere and being sequestered is almost like the balance of supply and demand in the forex market: many different channels with different input and removal timescales. It is not exactly easy to simulate, but there are experimental observations.

At the end of the 20th century, AMS machines - accelerator mass spectrometers - appeared, primarily for the needs of archaeologists. Their main feature is that the samples used to determine isotope ratios can be very small - milligrams. They were quickly adapted to other technological, medical, and in particular climatic research. These machines measure the C12/C14 ratio very accurately. In nature this ratio is determined by the cosmic-ray background and is quite stable.

But when the era of nuclear testing began, the concentration of C14 increased dramatically; it spread all over the world and was absorbed by trees. The locations and dates of the tests are known, and annual tree rings are easy to count, so you can determine exactly how the C14 concentration changed in the place where a tree was growing. By making such measurements around the world it was possible to track how quickly CO2 migrates in the atmosphere: it turned out that the concentration levels out worldwide very quickly, within half a year to a year. More importantly, it took less than ten years for the concentration to fall back to the background level, i.e. all atmospheric CO2 is constantly being renewed. This means the current concentration is an emission/absorption balance in which the role of man-made CO2 from burning coal, oil, and gas is not as significant as the media claim.
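As a rough illustration (my addition, not in the original post): a pulse of excess C14 relaxing back to background with atmospheric residence time τ follows

```latex
c(t) = c_{\mathrm{bg}} + \bigl(c_0 - c_{\mathrm{bg}}\bigr)\, e^{-t/\tau},
\qquad \tau \lesssim 10\ \text{yr per the bomb-pulse observations above}
```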

That is, in my opinion:

a) The amount of man-made CO2 does not add significantly to the natural concentration, contrary to how it is portrayed.

b) CO2 is not the only cause of the greenhouse effect.

c) The greenhouse effect is far from being the only cause of temperature change on Earth.


As far as I know, water vapor contributes noticeably more to the greenhouse effect than CO2, and in any case human influence on the climate is exaggerated. But that is not what I meant when talking about the substantive side of the article:

1) A formally deterministic but fairly complex system cannot be studied without the methods of mathematical statistics (matstat).

2) The answers given by mathematical statistics always carry some uncertainty. It cannot be avoided completely, because it is the very nature of the subject of this science.

3) There is always a temptation to use this uncertainty to get the "right" answer.

4) To avoid fitting the answer to the desired result, you must always evaluate the statistical significance of the conclusions (see the sketch below).
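A minimal sketch of such a significance check, assuming roughly independent per-trade returns (the test and threshold are my illustration, not the author's procedure):

```python
# Minimal significance check: is the strategy's mean trade return
# distinguishable from zero? Assumes roughly independent trades.
import numpy as np
from scipy import stats

def is_significant(trade_returns, alpha=0.05):
    """One-sample t-test of mean trade return against zero."""
    t_stat, p_value = stats.ttest_1samp(trade_returns, popmean=0.0)
    return p_value < alpha and np.mean(trade_returns) > 0, p_value

# Example with 200 hypothetical trade returns
rng = np.random.default_rng(0)
trades = rng.normal(0.0005, 0.01, size=200)
print(is_significant(trades))
```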

 
Aleksey Nikolayev:

In our case, we can meaningfully work only with non-stationarity that can in one way or another be reduced to stationarity: piecewise stationarity, autoregressive models, HMMs, etc.

The main reason is that only one realization of the process is ever known. Take speech recognition, for example: there, any word can be spoken as many times as we want. The quotes for a given instrument over a given period exist only as a single realization. Incidentally, this is probably why many people here do not distinguish a random process from its realizations.

This is why ML will never work on such data as-is; it has to be transformed into a series that repeats itself, and that is quite feasible (see the sketch below).

Why is this hardly ever discussed? Because it is question No. 1.
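One common way to "reduce to stationarity", as a hedged illustration (log-returns plus a unit-root check; not necessarily the transformation the poster has in mind):

```python
# Common stationarity-reduction step: log-returns instead of raw prices,
# checked with an Augmented Dickey-Fuller unit-root test.
import numpy as np
from statsmodels.tsa.stattools import adfuller

def to_stationary(prices):
    """Log-returns of a price series."""
    return np.diff(np.log(np.asarray(prices, dtype=float)))

rng = np.random.default_rng(1)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500)))  # toy random walk
returns = to_stationary(prices)
stat, pvalue = adfuller(returns)[:2]   # small p-value -> reject unit root
print(f"ADF p-value: {pvalue:.4f}")
```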
 
mytarmailS:
Why is this hardly ever discussed? ... Because it is question No. 1.
Check out my last link.
 
Aleksey Mavrin:

Have there been any attempts to apply statistical methods to charting, candlestick analysis, and other higher-level stuff?

Of course there are; for example: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0212320

But you know what the result will be if you use them for real...

Forecasting stock prices with a feature fusion LSTM-CNN model using different representations of the same data (journals.plos.org)
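For flavor, a minimal sketch of the feature-fusion idea in Keras: an LSTM branch over the OHLC sequence fused with a CNN branch over a chart-image representation. This is my simplified reading of the idea, not the paper's exact architecture; all shapes and layer sizes are illustrative.

```python
# Sketch of a feature-fusion LSTM-CNN: two representations of the same
# data, one sequential and one image-like, concatenated before the head.
from tensorflow.keras import layers, Model

seq_in = layers.Input(shape=(30, 4))       # 30 bars x OHLC
img_in = layers.Input(shape=(32, 32, 1))   # chart-image representation

x1 = layers.LSTM(64)(seq_in)               # sequence branch

x2 = layers.Conv2D(16, 3, activation="relu")(img_in)
x2 = layers.MaxPooling2D()(x2)
x2 = layers.Conv2D(32, 3, activation="relu")(x2)
x2 = layers.GlobalAveragePooling2D()(x2)   # image branch

fused = layers.Concatenate()([x1, x2])     # feature fusion
out = layers.Dense(1)(fused)               # next-bar price/return

model = Model(inputs=[seq_in, img_in], outputs=out)
model.compile(optimizer="adam", loss="mse")
```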
 

I used to do mathematical modeling (MM) and program optimization problems using the simplex method of linear programming.

When machine learning (ML) started to spread, I thought at first it was the same thing as MM. But it is not quite the same.


For forex, first of all you need to build a model of the trading strategy (TS) that takes many factors into account.

A robot cannot create a TS model from scratch by itself. The point is that no program can come up with a constraint on its own, i.e. identify the factor that affects the model or the TS.

The robot can only find the bounds of that constraint.

If you have built a bad model with bad constraints, no optimization will give the desired result.

You have to know which factors affect the TS, and here you cannot do without the human factor.
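As a hedged illustration of the MM side - a human supplies the constraints, the solver only finds the optimum within them. The position-sizing setup and every number below are hypothetical:

```python
# Toy linear program: allocate weights across 3 strategies to maximize
# expected return under human-chosen constraints. Numbers are hypothetical.
from scipy.optimize import linprog

expected = [0.04, 0.03, 0.05]     # expected return per strategy
risk     = [0.10, 0.05, 0.20]     # rough risk measure per strategy

res = linprog(
    c=[-r for r in expected],     # linprog minimizes, so negate returns
    A_ub=[risk], b_ub=[0.12],     # human-chosen cap on total risk
    A_eq=[[1, 1, 1]], b_eq=[1],   # weights sum to 1
    bounds=[(0, 1)] * 3,          # no shorting, no leverage
    method="highs",
)
print(res.x, -res.fun)            # optimal weights and expected return
```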


Let me give you an example of just one of the factors that I use in my trading robot. I'll share a little "secret" :)

This factor is well known to many: the speed of price change. However, in my calculations I determine not only the speed, but also its acceleration and its inertial deceleration.

The speed is measured over 1-second intervals. Not only the frequency of incoming ticks is taken into account, but also the distance in points between ticks.

What this factor is used for, and in which cases:

It is applied when opening an order. The speed limit prevents an order from being opened during sharp jumps in price movement.

It does not allow an order to be opened until the speed drops below a certain value and, additionally, until a certain amount of time has passed.

I also use it to estimate the slope of the trend: the higher the speed, the steeper the trend angle.
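A hedged sketch of how such a factor might be computed - a 1-second tick window tracking point distance between ticks, a crude acceleration estimate, and an entry gate. The thresholds and the exact formula are my assumptions, not the poster's actual code:

```python
# Hypothetical tick-speed factor: path length in points over a 1-second
# window, a crude acceleration estimate, and an entry gate that blocks
# orders during sharp jumps. All thresholds are illustrative.
from collections import deque

POINT = 0.00001        # point size (assumes a 5-digit symbol)
SPEED_LIMIT = 50.0     # max allowed speed, points/second
CALM_SECONDS = 3.0     # quiet time required after a spike

ticks = deque()        # (timestamp, price) pairs within the last second
prev_speed = 0.0
last_spike = float("-inf")

def on_tick(ts: float, price: float) -> bool:
    """Update the speed factor; return True if an entry is allowed."""
    global prev_speed, last_spike
    ticks.append((ts, price))
    while ticks and ts - ticks[0][0] > 1.0:
        ticks.popleft()                       # keep a 1-second window
    pts = list(ticks)
    path = sum(abs(pts[i + 1][1] - pts[i][1]) for i in range(len(pts) - 1))
    speed = path / POINT                      # points per second
    accel = speed - prev_speed                # change of speed tick-to-tick
    prev_speed = speed
    if speed > SPEED_LIMIT:
        last_spike = ts                       # remember the latest spike
    # allow entry only once speed is low AND enough quiet time has passed
    return speed <= SPEED_LIMIT and ts - last_spike >= CALM_SECONDS
```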

 

Hello, colleagues.

Sorry for the silly question, but does the OnBookEvent handler fire in the MT5 tester? I am trying to test it, but it seems to be ignored in the loop, as if the event never arrives. Yet in theory the quotes in Market Watch do change. Hmm...

 
Aleksey Nikolayev:

In our case, we can meaningfully work only with non-stationarity that can in one way or another be reduced to stationarity: piecewise stationarity, autoregressive models, HMMs, etc.

The main reason is that only one realization of the process is ever known. Take speech recognition, for example: there, any word can be spoken as many times as we want. The quotes for a given instrument over a given period exist only as a single realization. Incidentally, this is probably why many people here do not distinguish a random process from its realizations.

It is amusing to see how people mock the good old statistical (non-)stationarity, taking it to mean anything at all except the relative persistence of the distribution over time. Probably some econometrics "guru" of the past once made that throw-in, likely on some other occasion and in a narrow theoretical context, and the idea of non-stationarity as the main obstacle to creating the "grail" went viral. Clearly, a statistically non-stationary cumulative price would hardly interest anyone in its pure form, and even if returns were stationary (with an unchanging distribution), that by itself would not give much for trading (though options would perhaps disappear as an instrument).

It is probably worth defining and/or specifying the concept of "(non-)stationarity in forex", so that people familiar with classical statistics understand what we are talking about.

Markets by their nature exhibit not statistical but "game" non-stationarity with "disturbances" (fundamental factors): between disturbances the "crowd" predicts the price, each participant trying to predict the average of the rest of the crowd, and when "fundamentals" hit (politics, economics, regime shifts...) everything breaks down.

The whole problem is how to detect the "market change" as fast as possible and at the same time train the system on data "from the current market". Training on past markets will only confuse the system - the old markets no longer exist, so it is not just pointless but harmful. Yet training on very small data windows is not great either; it only makes sense for very short-term trading, and how ordinary people working with 15-minute and hourly charts manage is a mystery...
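One textbook way to attack the "detect the market change as fast as possible" part, as a hedged illustration (a two-sided CUSUM on standardized returns; the drift and threshold are illustrative, not anyone's method here):

```python
# Simple two-sided CUSUM change-point detector on a return series.
# Drift and threshold are illustrative and would need tuning.
import numpy as np

def cusum_change_points(returns, drift=0.5, threshold=5.0):
    """Return indices where cumulative deviation exceeds the threshold."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean()) / (r.std() + 1e-12)    # standardize the series
    pos = neg = 0.0
    changes = []
    for i, x in enumerate(z):
        pos = max(0.0, pos + x - drift)       # upward shift accumulator
        neg = min(0.0, neg + x + drift)       # downward shift accumulator
        if pos > threshold or neg < -threshold:
            changes.append(i)
            pos = neg = 0.0                   # restart after a detection
    return changes
```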

 
Andrew:

It is amusing to see how people mock the good old statistical (non-)stationarity, taking it to mean anything at all except the relative persistence of the distribution over time.

......

The whole problem is how to detect the "market change" as fast as possible and at the same time train the system on data "from the current market". Training on past markets will only confuse the system - the old markets no longer exist, so it is not just pointless but harmful. Yet training on very small data windows is not great either; it only makes sense for very short-term trading, and how ordinary people working with 15-minute and hourly charts manage is a mystery...

Not the "relative persistence of the distribution", but the independence of the MO, variance and distribution function from time.

And how do you "detect market change"?

Suppose you have detected a "market change" - you need a sample of sufficient length to train the system on the new data. And if, before or just as the sample becomes long enough, a "market change" occurs again - what then?

 
Dmitry:

Not "relative persistence of the distribution," but the independence of the MO, variance, and distribution function from time.

No, precisely the dependence, and a constant dependence at that. )

Dmitry:

And how do you "detect market change"?

Suppose you have detected a "market change" - you need a sample of sufficient length to train the system on the new data. And if, before or just as the sample becomes long enough, a "market change" occurs again - what then?

You can try to detect it with ML.

Dmitry:

Suppose you have detected a "market change" - you need a sample of sufficient length to train the system on the new data. And if, before or just as the sample becomes long enough, a "market change" occurs again - what then?

That is the right question. You don't need to do anything: wait until there is at least a minimally statistically significant sample. Any action in this situation is just a gamble unless you have inside information.
