Machine learning in trading: theory, models, practice and algo-trading - page 3325

 
Aleksey Vyazmikin #:

What's the trolling?

Here's the video

It's like this.

 
mytarmailS #:

like this

On my end it was about a place on the internet, i.e. a link.

 
Aleksey Vyazmikin #:

On my end it was about a place on the internet, i.e. a link.

I don't remember the exact article,

but there aren't a million of them, look it up.

 
mytarmailS #:

I don't remember the exact article,

but there aren't a million of them, look it up.

So I did a search, and I haven't found anything yet.

 
Aleksey Vyazmikin #:

That's what I realised. Just asking if the cause of this has been determined. Not what's broken, but why the signals are missing.

The reason is simple, and it is by design - the signals disappear because, on new data, they fall outside a narrow acceptable range.

Well, it can be compared to classification: there are clear, known patterns and obscure, unknown ones. As time goes by there are more and more unknowns, and nothing is left in the "known" class.
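In code the effect looks roughly like this - a toy sketch (my own illustration, not my actual system): the model only gives a signal when its score falls inside a narrow range learned in training, and once the data drifts, almost nothing lands in that range any more.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training" scores: model outputs on the data the range was tuned on.
train_scores = rng.normal(loc=0.0, scale=1.0, size=5000)

# Narrow acceptable range: only the most confident scores trigger a signal.
lo, hi = np.quantile(train_scores, [0.95, 0.999])

def signal_rate(scores):
    """Fraction of observations that produce a trade signal."""
    return np.mean((scores >= lo) & (scores <= hi))

print("in-sample signal rate:", signal_rate(train_scores))

# New data drifts: the score distribution shifts, almost nothing falls
# inside the narrow range any more, and the signals "disappear".
new_scores = rng.normal(loc=-1.0, scale=1.2, size=5000)
print("new-data signal rate:", signal_rate(new_scores))
```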

 
Aleksey Vyazmikin #:

It is claimed that this algorithm made it possible to win first places on Kaggle; I don't think the tasks there were simple...

Shall we try to figure it out? I don't understand formulas - to my great regret.

I don't go by formulas either, but by ideas.
And if you break the idea down, it is not so good for market data.

It suggests deleting pairs of examples of different classes that are very close to each other. If we look at the 3rd example, ideally everything from 0.2 to 0.8 will be removed, and only the areas below 0.2 and above 0.8, with absolutely pure classes, will remain. Any model would then classify them easily.
I have already shown earlier that a tree splits such a simple example easily if you use leaves with high class purity (and do not split all the way down to one example per leaf).
But this is an artificial example.
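If I understand it correctly, this pair deletion is essentially the classic Tomek-links cleaning. A toy one-dimensional sketch in Python (my own illustration, not the article's code):

```python
import numpy as np

def remove_close_pairs(x, y, radius=0.05):
    """One cleaning round: drop pairs of opposite-class examples
    that sit within `radius` of each other along the feature."""
    keep = np.ones(len(x), dtype=bool)
    order = np.argsort(x)
    for a, b in zip(order[:-1], order[1:]):   # neighbours in sorted order
        if y[a] != y[b] and abs(x[a] - x[b]) <= radius:
            keep[a] = keep[b] = False
    return x[keep], y[keep]

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 2000)                            # one feature on [0, 1]
y = (x + rng.normal(0, 0.15, 2000) > 0.5).astype(int)  # classes overlap mid-range

xc, yc = x, y
for _ in range(20):                 # repeat until nothing changes
    n = len(xc)
    xc, yc = remove_close_pairs(xc, yc)
    if len(xc) == n:
        break

mid = (xc > 0.2) & (xc < 0.8)
print(f"kept {len(xc)} of {len(x)} examples; "
      f"{mid.sum()} survivors inside 0.2..0.8, {len(xc) - mid.sum()} outside")
# The overlap zone is hollowed out; mostly the near-pure tails survive,
# and any tree then splits the survivors trivially.
```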

On market data there will be no such pure blocks dominated by one class, i.e. you will have to clean out almost everything. Say there were 1000 points and 900 were cleaned away; the rest somehow reach a leaf purity of, say, 70% - that seems decent, and you could make money on it. But when you start trading for real, the kinds of examples you got rid of during cleaning come back (9 discarded for every 1 kept), the 70% drops to something like 53%, and you lose on spread, slippage, etc.
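The arithmetic behind that drop is easy to check. Assuming the cleaned-out examples are roughly coin flips (my assumption - that is what the overlap zone is), the blended purity lands near the number above:

```python
kept, removed = 100, 900    # 1000 points, 900 cleaned away
kept_purity = 0.70          # leaf purity measured on the survivors
removed_purity = 0.50       # the discarded overlap is ~50/50

# In live trading the discarded kind of examples come back,
# 9 of them for every 1 that was kept:
live_purity = (kept * kept_purity + removed * removed_purity) / (kept + removed)
print(f"live purity: {live_purity:.0%}")   # 52%, close to the 53% above
```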

I prefer a tree whose leaf shows an honest 53% purity for one of the classes - and then I simply won't use it.
 
Aleksey Vyazmikin #:

So I did a search, and I haven't found anything yet.

it happens

 
Aleksey Vyazmikin #:

I don't see the connection here. What does it follow from?

From reading the text at your link - there was even a theorem about their connection there. Don't be too lazy to read at least your own links.
 
Forester #:
I don't go by formulas either, but by ideas.
And if you break the idea down, it is not so good for market data.

It suggests deleting pairs of examples of different classes that are very close to each other. If we look at the 3rd example, ideally everything from 0.2 to 0.8 will be removed, and only the areas below 0.2 and above 0.8, with absolutely pure classes, will remain. Any model would then classify them easily.
I have already shown earlier that a tree splits such a simple example easily if you use leaves with high class purity (and do not split all the way down to one example per leaf).
But this is an artificial example.

On market data there will be no such pure blocks dominated by one class, i.e. you will have to clean out almost everything. Say there were 1000 points and 900 were cleaned away; the rest somehow reach a leaf purity of, say, 70% - that seems decent, and you could make money on it. But when you start trading for real, the kinds of examples you got rid of during cleaning come back (9 discarded for every 1 kept), the 70% drops to something like 53%, and you lose on spread, slippage, etc.

I prefer a tree whose leaf shows an honest 53% purity for one of the classes - and then I simply won't use it.

Here our thoughts agree as to the outcome. Yes, I expect a heavily thinned sample, but as I understand it the process is iterative, which means you can exercise moderation and stop much earlier, then use that data to build the same tree-based models, which will have fewer splits and more reliable values in the leaves (one possible stopping rule is sketched below).

Do I understand correctly that the initial centres are randomly located?
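As for stopping earlier: one way to exercise that moderation might be to stop as soon as an extra cleaning round stops improving a shallow tree on the thinned sample. A rough sketch (the stopping rule is my own guess, not from the article; the pair removal is the same as in the sketch above):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def remove_close_pairs(x, y, radius=0.05):
    """Drop pairs of opposite-class examples within `radius` of each other."""
    keep = np.ones(len(x), dtype=bool)
    order = np.argsort(x)
    for a, b in zip(order[:-1], order[1:]):
        if y[a] != y[b] and abs(x[a] - x[b]) <= radius:
            keep[a] = keep[b] = False
    return x[keep], y[keep]

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 2000)
y = (x + rng.normal(0, 0.15, 2000) > 0.5).astype(int)

xc, yc = x, y
prev_acc = 0.0
for step in range(20):
    tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50)
    acc = tree.fit(xc.reshape(-1, 1), yc).score(xc.reshape(-1, 1), yc)
    print(f"round {step}: {len(xc)} examples, shallow-tree accuracy {acc:.2f}")
    if acc - prev_acc < 0.01:   # gains flattened out - stop cleaning early
        break
    prev_acc = acc
    xc, yc = remove_close_pairs(xc, yc)
```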

 
Aleksey Nikolayev #:
From reading the text at your link - there was even a theorem about their connection there. Don't be too lazy to read at least your own links.

That's why I asked you to quote it...