Machine learning in trading: theory, models, practice and algo-trading - page 3481

 
Aleksey Vyazmikin #:

I don't even know if there were already synonyms in his native language back then...

Synonyms belong somewhere in poetry, in novels. But in the exact sciences there are no synonyms, although there are plenty of amateurs who don't know the exact meaning of terms and start talking rubbish.

 
Aleksey Vyazmikin #:

So I thought about it and ordered the values by probability shift after quantisation. Training became 7 times faster: only 60 trees instead of 400, but the financial result on the other two samples became much worse. It turns out that, because of the chaos in the class-membership probability distribution, randomisation makes learning slightly better.

This is the result of your shuffling. In essence, you have introduced additional noise, just like the Feature Permutation Importance method of estimating predictors, which shuffles a column and thereby turns it into noise. You shuffled it too, only in blocks/quanta.
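For reference, a minimal sketch of the column-shuffling idea Forester is referring to. The function name, data layout and use of logloss are illustrative assumptions, not either poster's actual code:

```python
import numpy as np
from sklearn.metrics import log_loss

def permutation_importance(model, X, y, seed=0):
    """Score each column by how much logloss degrades when that column is shuffled."""
    rng = np.random.default_rng(seed)
    base = log_loss(y, model.predict_proba(X)[:, 1])
    scores = {}
    for col in range(X.shape[1]):
        X_noisy = X.copy()
        rng.shuffle(X_noisy[:, col])   # shuffling breaks the link between this column and its rows
        scores[col] = log_loss(y, model.predict_proba(X_noisy)[:, 1]) - base
    return scores                      # larger degradation = more important column
```

Any fitted binary classifier with a predict_proba method (CatBoost included) could be passed in as model here.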

 
СанСаныч Фоменко #:

Synonyms belong somewhere in poetry, in novels. But in the exact sciences there are no synonyms, although there are plenty of amateurs who don't know the exact meaning of terms and start talking rubbish.

Even when you're not talking rubbish but saying something clever, it's impossible to understand you.

 
Forester #:

This is the result of your shuffling. In essence, you have introduced additional noise, just like the Feature Permutation Importance method of estimating predictors, which shuffles a column and thereby turns it into noise. You shuffled it too, only in blocks/quanta.

What does noise have to do with it? I simplified the training for the algorithm. It now uses fewer splits/trees to arrive at the "same" result.

 
СанСаныч Фоменко #:

0.57/0.448 = 1.2723, i.e. a 27% difference. The model can be discarded.

I might agree if we were talking about stationary systems and had a representative sample. Otherwise, it is an empty heuristic that can be easily fitted.

 
Aleksey Vyazmikin #:

What does noise have to do with it? I simplified the training for the algorithm. It now uses fewer splits/trees to arrive at the "same" result.

Shuffling the column turns it into noise. You wrote yourself that the financial result on the other two samples is significantly worse.

And how is that the "same" result?

 
Forester #:
Shuffling the column turns it into noise. You wrote yourself that the financial result on the other two samples is significantly worse.

And how is that the "same" result?

The noise would come from a random change; here we are, in effect, just relabelling the information. As you know, CatBoost makes its splits on a quantisation table built once. That is, we get discrete polyhedral (one dimension per predictor) cubes, each holding a range of values. What the tree does is group them along one of the faces, or a set of faces, in the order they come.

The cubes are originally scattered with no particular order, and I simply grouped them up front.

As I showed earlier, the probability of picking the right cube is within 20% in this sample. So it turns out that with a complex arrangement of the cubes you need more iterations than with an ordered one, which incidentally lets the model find some complex dependencies but worsens the efficiency of learning, efficiency here meaning the improvement of the logloss metric from iteration to iteration.

As for the result: on the train sample everything is quite good, even in financial terms, but on the test and exam samples the model very rarely produces a probability greater than 0.5, so the output is mostly zeros.

I will try to reduce the learning rate, but the nature of this phenomenon is not quite clear yet, because the logloss results are comparable.
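As a rough illustration of the quantisation table being described (a sketch with made-up borders, not CatBoost's internal implementation), each raw value is replaced by the index of the quantum it falls into, and reordering those indices changes which quanta a single threshold split groups together:

```python
import numpy as np

x = np.array([0.05, 0.30, 0.55, 0.80, 0.95, 0.10, 0.62, 0.71])  # toy predictor values
borders = np.array([0.25, 0.50, 0.75])   # a fixed "quant table": 3 borders -> 4 quanta

bins = np.digitize(x, borders)           # quantum index per observation: [0 1 2 3 3 0 2 2]

# A bijective relabelling of the quanta: no information is lost per row,
# but any split of the form "quantum <= t" now collects a different group.
relabel = np.array([2, 0, 3, 1])
print(bins, relabel[bins])
```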

 
Aleksey Vyazmikin #:

The noise would come from a random change; here we are, in effect, just relabelling the information. As you know, CatBoost makes its splits on a quantisation table built once. That is, we get discrete polyhedral (one dimension per predictor) cubes, each holding a range of values. What the tree does is group them along one of the faces, or a set of faces, in the order they come.

The cubes/quanta are initially sorted relative to each other. You change their order, i.e. you shuffle them. The OOS shows you that clearly: no pattern is found. And on the train sample it will learn well on any rubbish.

 
Forester #:

The cubes/quanta are initially sorted relative to each other. You change their order, i.e. you shuffle them. The OOS shows you that clearly: no pattern is found. And on the train sample it will learn well on any rubbish.

Learning is the set of rules for selecting cubes. Their order is set by the predictor's algorithm. If I swap them around, no information is lost. My algorithm finds all the cubes, yet for the tree it is an important change, because the algorithm works not with a single cube but with a group of cubes at once, and for it the content of that group has changed. When splitting groups into subgroups it will choose different places to cut, since the group statistics have changed.

We need a more pattern-rich sample to test this hypothesis.
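A small sketch of the "group statistics have changed" point, with hypothetical bin indices and labels: a bijective reordering of the cubes keeps every row's information, yet the class rate on either side of the same threshold split comes out different:

```python
import numpy as np

bins   = np.array([0, 1, 2, 3, 3, 0, 2, 2])   # hypothetical quantum index per row
labels = np.array([0, 1, 1, 0, 0, 0, 1, 1])   # hypothetical binary target

def left_rate(bin_ids, threshold):
    """Mean label of the rows falling on the left side of a 'quantum <= threshold' split."""
    return labels[bin_ids <= threshold].mean()

print(left_rate(bins, 1))                     # statistics of the original ordering: ~0.33

relabel = np.array([3, 0, 1, 2])              # swap the cubes around
print(left_rate(relabel[bins], 1))            # same rows, same labels, different split statistics: 1.0
```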

 
Aleksey Vyazmikin #:

Learning is the set of rules for selecting cubes. Their order is set by the predictor's algorithm. If I swap them around, no information is lost. My algorithm finds all the cubes, yet for the tree it is an important change, because the algorithm works not with a single cube but with a group of cubes at once, and for it the content of that group has changed. When splitting groups into subgroups it will choose different places to cut, since the group statistics have changed.

We need a more pattern-rich sample to test this hypothesis.

The information is shuffled/randomised, just like in permutation, only there it is not groups that are shuffled but every single element of a column, which effectively switches the predictor off; then you compare how much the model result has changed, and that is the estimate of the column's importance.
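To make the contrast concrete (a hypothetical sketch, not either poster's code): element-wise permutation ties each row to a random value and destroys the column, whereas swapping whole quanta only relabels the groups each row already belongs to:

```python
import numpy as np

rng = np.random.default_rng(0)
col = np.array([0.1, 0.1, 0.4, 0.4, 0.8, 0.8])       # toy column already quantised into 3 levels

elementwise = rng.permutation(col)                    # permutation-importance style shuffle
levels = np.unique(col)
block_map = dict(zip(levels, rng.permutation(levels)))
blockwise = np.array([block_map[v] for v in col])     # whole quanta swapped; rows keep their grouping
print(elementwise, blockwise)
```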

It's up to you what to spend your time on. Nothing more to say on this topic.