Market model: constant throughput - page 3

 

The idea itself is curious, but the underlying message is really strange. Personally, I don't like it. If the amount of information is always roughly constant, then not much is going on in the marketplace. But that is not the case: the market regularly experiences catastrophes in which the amount of information definitely changes (like a transition to another phase state).

 

The author proposes to analyze bits of compressed information in order to predict the next "frame" in the stream. But how, for example, is an MPEG-4 frame any better than an MPEG-1 frame for predicting the next frame?

Maybe it's easier to follow the plot of the film :)

 
hrenfx:

Information is a set of bits that cannot be compressed in any way for transmission.

It is assumed that the market, as a relatively closed system, generates a constant (or slowly changing) amount of information per unit time.

What does this mean?

Market data is anything that can be obtained from the market. The simplest example is prices.

Let the unit of time be Time. It is assumed that in any interval of length Time the amount of information in the market is N. Put more simply:

Suppose we have collected market data over an interval Time. We compress it as much as possible (so it cannot be compressed any further), obtaining a set of incompressible bits. This is the information, and its amount is assumed to be constant (N) per time unit Time.

Compressing as much as possible is a theoretical ideal. There are many compression algorithms, and the stronger the algorithm, the closer it comes to estimating the amount of information contained in the available data. That is, we cannot determine the amount of information exactly, but we can estimate it.
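
As a rough illustration of this estimation idea (my own sketch, not from the original posts), one can take the smallest output of several off-the-shelf compressors as an upper-bound estimate of the amount of information in a block of data. In Python, using only the standard zlib, bz2 and lzma modules:

```python
import bz2
import lzma
import os
import zlib

def information_estimate(data: bytes) -> int:
    """Upper-bound estimate (in bits) of the information in `data`:
    the smallest size achieved by several lossless compressors."""
    candidates = [
        zlib.compress(data, level=9),
        bz2.compress(data, compresslevel=9),
        lzma.compress(data, preset=9),
    ]
    return 8 * min(len(c) for c in candidates)

# Highly redundant data yields a small estimate; random bytes stay close
# to their raw size, i.e. they are already almost pure information.
print(information_estimate(b"1.32510,1.32515,1.32512," * 500))
print(information_estimate(os.urandom(12000)))
```

Any real compressor only gives an upper bound, which is why a stronger algorithm gets closer to the "true" amount of information.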

How to use this model for trading is described here.

Verifying the adequacy of the model is not very difficult. It is enough to have a large amount of historical market data. Take a sliding window of size Time, and for each position of the window perform compression (different algorithms can be used), obtaining a number of bits. As a result we get a time series of information-amount estimates. It remains only to analyze this time series and draw the appropriate conclusions.
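
A minimal sketch of this verification procedure (again my own illustration with hypothetical inputs): prices are assumed to be available as a plain list of floats, serialized as text, and a single compressor (lzma) stands in for "different algorithms":

```python
import lzma

def sliding_window_information(prices: list[float], window: int, step: int = 1) -> list[int]:
    """Compress each window of serialized prices and return the compressed
    sizes in bits -- a time series of rough information-amount estimates."""
    estimates = []
    for start in range(0, len(prices) - window + 1, step):
        chunk = prices[start:start + window]
        raw = ",".join(f"{p:.5f}" for p in chunk).encode()
        estimates.append(8 * len(lzma.compress(raw, preset=9)))
    return estimates

# E.g. with M5 closes, window=288 corresponds to one trading day; the
# resulting series can then be inspected for stationarity.
```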



Claude Shannon, who incidentally abandoned his scientific career and is rumored to have taken up the stock market, introduced the notion of a measure of information:

an event that has M possible outcomes Xi, where P[Xi] is the probability of occurrence of the i-th outcome, carries an amount of information defined by the expression:

I[Xi] = ln(1/P[Xi]) = -ln P[Xi]

The expected (average) value of this information equals the entropy H:

H = Σ P[Xi]·I[Xi] = -Σ P[Xi]·ln P[Xi]

That is, entropy is a measure of uncertainty. Remember the "average temperature across the hospital"? :))) There it is: uncertainty, entropy :).
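
As a small numeric illustration of the two formulas above (my own example, not from the thread):

```python
import math

def self_information(p: float) -> float:
    """I[Xi] = ln(1/P[Xi]) = -ln P[Xi], in nats."""
    return -math.log(p)

def entropy(probabilities: list[float]) -> float:
    """H = sum over i of P[Xi] * I[Xi] = -sum of P[Xi] * ln P[Xi]."""
    return sum(p * self_information(p) for p in probabilities if p > 0)

# A fair coin is maximally uncertain; a heavily biased one is not.
print(entropy([0.5, 0.5]))    # ~0.693 nats (i.e. 1 bit)
print(entropy([0.99, 0.01]))  # ~0.056 nats
```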

"Information is a set of bits that in no way can be compressed to transmit" Sounds!

But it seems to me that the only way in which a set "cannot becompressed in any way" is

a set consisting of just one bit, i.e. when there's no redundancy, then there's

there is nothing to compress! That is, when this bit takes one of two values "0" or "1", but!

is a complete certainty! So you're expressing the hope that there are procedures that can

to bring the randomness contained in the forex market, that there are procedures that can completely

eliminate that randomness to the point where it can't go any further? Hmm. And it is all the more impossible because the forex market is not a closed-loop system .

This is proved by strong non-stationarity, i.e. volatility of statistical parameters, quotes of currency pairs and

The empirical view of the market as a combination of technical analysis and fundamental analysis, which as you know

are concerned with the internal mood of the market and the analysis of the situation outside the market respectively.

That's why I wrote so much, because your, er, hypotheses just seemed to me to be completely upside down.

 
TheVilkas:

I am familiar with the basics of Information Theory.

It seems I gave an ambiguous definition of information. Let me paraphrase:

The amount of information contained in data is the minimum number of bits needed to recover the data.

That is, the number of bits in the maximally compressed (but still recoverable) data is the amount of information in that data: the so-called pure information contained in the data.
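
To make the "recoverable" part concrete (a sketch with a hypothetical history file name, not the poster's actual data): the compressed form plus the decompressor restores the data exactly, so its size is an upper bound on the pure information.

```python
import lzma

# Hypothetical file of serialized quotes; any byte string would do here.
data = open("EURUSD_M5.csv", "rb").read()

packed = lzma.compress(data, preset=9)
assert lzma.decompress(packed) == data  # lossless: the data is fully recoverable

print(f"{8 * len(packed)} bits are enough to recover {8 * len(data)} bits of data")
```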

 
hrenfx:

Information is a set of bits that cannot be compressed in any way for transmission.

It is assumed that the market, as a relatively closed system, generates a constant (or slowly changing) amount of information per unit time.

What does this mean?

Market data is anything that can be obtained from the market. The simplest example is prices.

Let the unit of time be Time. It is assumed that in any interval of length Time the amount of information in the market is N. Put more simply:

Suppose we have collected market data over an interval Time. We compress it as much as possible (so it cannot be compressed any further), obtaining a set of incompressible bits. This is the information, and its amount is assumed to be constant (N) per time unit Time.

Compressing as much as possible is a theoretical ideal. There are many compression algorithms, and the stronger the algorithm, the closer it comes to estimating the amount of information contained in the available data. That is, we cannot determine the amount of information exactly, but we can estimate it.

How to use this model for trading is described here.

Verifying the adequacy of the model is not very difficult. It is enough to have a large amount of historical market data. Take a sliding window of size Time, and for each position of the window perform compression (different algorithms can be used), obtaining a number of bits. As a result we get a time series of information-amount estimates. It remains only to analyze this time series and draw the appropriate conclusions.


Lossless archiving amounts to building a new alphabet such that its description plus the encoding of the data being archived is smaller than the data itself. Roughly speaking, it is the extraction of certain patterns. This works well for models like regular grammars, where the rules are strict and unambiguous, or deviations from them are rare. If there is noise, archiving efficiency drops dramatically. If a word occurs 100 times in a text, but each time with a mistake or a couple of letters swapped, lossless compression algorithms cannot capture it as a single pattern. Lossy compression algorithms, such as those for images, video and sound, are effective here.

But none of them can take contextual rules into account, such as word endings changing with grammatical case, and so on. For example, they will pick out the most frequently used letter combinations in the text, and that's it. The same goes for the market: an archiver will isolate the most common elementary patterns, but it is not a given that using them will enable a probabilistic forecast, or, more precisely, a profitable one. Otherwise you get a forecast with 90% probability that such-and-such a scenario will continue, while the financial loss from the remaining 10% scenario is the same size as the profit from exploiting those 90%.
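
That noise sensitivity is easy to demonstrate (my own toy example using zlib, not from the post): exact repetitions collapse to almost nothing, while slightly corrupted copies of the same word compress much worse.

```python
import random
import zlib

random.seed(0)
word = b"correlation "
exact = word * 100  # the same word repeated 100 times

def with_typo(w: bytes) -> bytes:
    """Return the word with one random letter replaced (a 'typo')."""
    i = random.randrange(len(w) - 1)  # keep the trailing space intact
    return w[:i] + bytes([random.randint(97, 122)]) + w[i + 1:]

noisy = b"".join(with_typo(word) for _ in range(100))  # 100 corrupted copies

for name, data in (("exact", exact), ("noisy", noisy)):
    print(name, len(data), "->", len(zlib.compress(data, 9)), "bytes")
```

A human still sees "the same word 100 times" in the noisy version, but a general-purpose archiver does not.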

In short, everything depends on the archiver. Extracting deep rules is a job for artificial intelligence (or natural intelligence :)), not for rar :). And of course, the main thing is not how global those rules are, but whether they can be used profitably.

 

I don't quite see how to turn the first post of the topic into formulas, but imho you are trying to talk about entropy.

P.S. I hate information-transmission theory: because of one single slip (I mixed up bps with baud), my report card got an "O" instead of an "A".

 
Mathemat:

The idea itself is curious, but the underlying message is really strange. Personally, I don't like it. If the amount of information is always roughly constant, then not much is going on in the marketplace. But that is not the case: the market regularly experiences catastrophes in which the amount of information definitely changes (like a transition to another phase state).


I hope forum members remember this thread: https://www.mql5.com/ru/forum/105740

On its very first page it is noted that a special role in flow theory is played by a first-order moment function called the flow intensity.

Perhaps another way of putting it: the flow intensity is the amount of information per unit of time. The number of ticks per unit of time can be considered a rough analogue of this, if you do not also analyze the news. By the way, in my opinion, whether you compress the data or not, the amount of information in it does not change.
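
A sketch of the "ticks per unit of time" analogue (my own illustration, assuming tick timestamps are available as a sorted list of epoch seconds):

```python
from bisect import bisect_left

def flow_intensity(tick_times: list[float], t_start: float, t_end: float,
                   bucket: float = 60.0) -> list[int]:
    """Count ticks in each [t, t + bucket) interval -- a crude stand-in
    for 'amount of information per unit of time'."""
    counts = []
    t = t_start
    while t < t_end:
        counts.append(bisect_left(tick_times, t + bucket) - bisect_left(tick_times, t))
        t += bucket
    return counts

# Made-up timestamps (seconds): three ticks in the first minute, one in the second.
print(flow_intensity([0.5, 10.0, 59.9, 75.0], 0.0, 120.0))  # [3, 1]
```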

P.S. You'll have a hard time without a tick frame, and checking on history won't work either (see https://www.mql5.com/ru/forum/1031/page1#comment_6372): history in the form of minute bars kills this information...

 
hrenfx:

Verifying the adequacy of the model is not very difficult. It is enough to have a large amount of historical market data. Take a sliding window of size Time, and for each position of the window perform compression (different algorithms can be used), obtaining a number of bits. As a result we get a time series of information-amount estimates. It remains only to analyze this time series and draw the appropriate conclusions.

Tested. I took a sliding window of one day (288 M5 bars) and, shifting it by 5 minutes each time, applied RAR and 7Z LZMA compression to data from the beginning of 2010 until October 2010: almost 60,000 sliding windows compressed by each archiver. This is what the charts of compressed window sizes look like for the FOREX market sample (AUDUSD, EURUSD, GBPUSD, USDCHF, USDJPY, USDCAD, NZDUSD, SILVER, GOLD):

Surprisingly, RAR showed extremely unstable results: the size of the compressed windows fluctuates enormously. 7Z LZMA showed stable results and a smaller compressed window size, so 7Z LZMA was chosen for further research.

Then I started doing the same thing, but varied the market sample: first I took one symbol (AUDUSD), then added one more and one more, until I had 9 symbols (AUDUSD, EURUSD, GBPUSD, USDCHF, USDJPY, USDCAD, NZDUSD, SILVER, GOLD). The task was to find out how well the archiver finds interrelations as new symbols are introduced. If interrelations exist, the average size of a compressed window should grow non-linearly (less than proportionally) as a new symbol is added. This is how it turned out:

We can see that already with 8 instruments at least 20% of the data is redundant (contains no information), i.e. there is a correlation, and not a small one. Interestingly, adding the 9th financial instrument (GOLD) did not reveal any interrelations (the mean compressed size did not decrease). The standard deviation (RMS) of the compressed window size increased by more than 50% with 9 instruments relative to the starting single instrument.
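
A sketch of how this incremental test could be reproduced without RAR/7Z (my own illustration; `symbol_data` is a hypothetical mapping from symbol name to its serialized window bytes, and lzma stands in for the archiver): compare the jointly compressed size with the sum of individually compressed sizes as symbols are added.

```python
import lzma

def compressed_size(data: bytes) -> int:
    return len(lzma.compress(data, preset=9))

def redundancy_report(symbol_data: dict[str, bytes]) -> None:
    """Add symbols one by one; a joint compressed size noticeably below the
    sum of individual sizes indicates cross-symbol redundancy (correlation)."""
    joint = b""
    individual_sum = 0
    for name, data in symbol_data.items():
        joint += data
        individual_sum += compressed_size(data)
        joint_size = compressed_size(joint)
        saved = 1.0 - joint_size / individual_sum
        print(f"+{name}: joint {joint_size} B vs sum {individual_sum} B ({saved:.1%} redundant)")
```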

The graphs of the change in compressed window size (with the means normalized to one) look as follows for the different sets of financial instruments:

The distributions of these graphs:

What conclusions can be drawn?

It was impossible to refute or confirm the model. The compression algorithms do show very well the presence of elementary relationships between financial instruments (the algorithms are very simple, yet more than 20% of redundant data is eliminated across 8 financial instruments). Many would say this is natural because solid compression is used, but that is not quite so: the example is GOLD, for which the archiver could not find a connection with the other 8 symbols.

P.S. Crosses were intentionally not analyzed, because we know they are completely correlated with the majors and therefore contain no additional information. Hence only the majors.

P.P.S. All data on window sizes is attached.

P.P.P.S. It was interesting to solve the problem. I had to use some new methods. In particular, I had to use a RAM-disk to perform more than half a million compressions of various windows. In the end it was relatively quick.

Files:
4analyse.rar  497 kb
 
hrenfx:

...

If you don't mind, please do the same, but with artificially generated time series (with the same RMS). It would be very interesting to see what happens.
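
If "with the same RMS" means matching the standard deviation of the real increments (my reading of the request, so treat this sketch accordingly), the comparison could look like this: generate a Gaussian random walk with that RMS and compress it with the same window procedure.

```python
import lzma
import random

def random_walk(n: int, start: float, rms: float, seed: int = 0) -> list[float]:
    """Gaussian random walk whose increments have standard deviation `rms`."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(n - 1):
        prices.append(prices[-1] + rng.gauss(0.0, rms))
    return prices

def window_bits(prices: list[float]) -> int:
    raw = ",".join(f"{p:.5f}" for p in prices).encode()
    return 8 * len(lzma.compress(raw, preset=9))

# A synthetic one-day window (288 M5-like points) with an assumed increment RMS;
# compressed sizes comparable to the real windows would suggest the archiver sees
# the real data as no more structured than noise.
print(window_bits(random_walk(288, start=1.3000, rms=0.0004)))
```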
 
hrenfx:

Then I did the same thing, but varied the market sample: first I took one financial instrument (AUDUSD), then added one more and one more, until I had 9 financial instruments (AUDUSD, EURUSD, GBPUSD, USDCHF, USDJPY, USDCAD, NZDUSD, SILVER, GOLD).

And how exactly did the adding take place?