Is there a pattern to the chaos? Let's try to find it! Machine learning using a specific sample as an example.

 
Aleksey Vyazmikin #:

Can you check the model exactly on the exam.csv file?

Have you tried any manipulations with sampling?

Here is the balance on the exam sample after weeding out some of the predictors.

Of course, the graphs of the model response distribution show that the model has learned only a little - Recall is very low, but it is already a result.

train.csv


exam.csv
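For readers following along, here is a minimal sketch of how Recall (and Precision) can be computed for a binary trade/don't trade target. The labels are made up for illustration and are not taken from train.csv or exam.csv; the function name `precision_recall` is hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means "trade"."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, made-up labels: the model fires rarely, so it finds few of the real signals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A model like this can still show a rising balance: the few trades it takes are good, it just skips most of the opportunities.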

There are 9046 lines in exam. I have 9000. There will be almost no difference.

Your curve is much better. I'll try some more tinkering with the parameters.
 
elibrarius #:

What's the best balance you've got?

I've now searched through different variants, and it looks like this is the result. In theory, a round-trip commission of 3 points is also taken into account.


 
elibrarius #:
There are 9,046 lines. I have 9000. It won't make much difference.

You have a much better curve. I'll try some more tinkering with the parameters.

Well, if it is exam file data, then yes - there is not much difference, I just thought that maybe it is train file. Did you merge the three files together originally?

Try it.

 
Aleksey Vyazmikin #:

Well, if it's the exam file data, then yes - it doesn't make much difference, I just thought it might be the train file. Did you merge the three files together originally?

Try it.

Yes, I merged all 3, then I just specify the lengths of the sections.
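A rough sketch of the "merge, then specify section lengths" approach, in plain Python. The toy rows and the name `split_by_lengths` are assumptions for illustration; the thread does not show the actual loading code, and the name of the third file is assumed here.

```python
def split_by_lengths(rows, lengths):
    """Cut one concatenated list of rows back into consecutive sections."""
    if sum(lengths) != len(rows):
        raise ValueError("section lengths must add up to the number of rows")
    sections, start = [], 0
    for n in lengths:
        sections.append(rows[start:start + n])
        start += n
    return sections


# Toy stand-ins for the three files (the "test" name is an assumption).
train_rows = [f"train_{i}" for i in range(6)]
test_rows = [f"test_{i}" for i in range(3)]
exam_rows = [f"exam_{i}" for i in range(2)]

# Merge in chronological order, then recover the sections by length alone.
merged = train_rows + test_rows + exam_rows
lengths = [len(train_rows), len(test_rows), len(exam_rows)]
train_part, test_part, exam_part = split_by_lengths(merged, lengths)
print(len(train_part), len(test_part), len(exam_part))
```

With this layout, rounding a section length (for example 9000 instead of 9046 rows) only trims or shifts a few rows at a boundary, which is why it makes little difference.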
 
elibrarius #:
Yeah, I combine all three, then I just enter the lengths of the sections.

I see, that's fine then.

I think training could be improved by reducing the sample, say by training on 1/10 of it - that would let the model learn a single phase/structure of the market, though that hasn't been required yet.

 

Only by changing the learning rate was it possible to obtain two models out of 100 that met the criterion.

The first.

The second.

So yes, CatBoost is capable of a lot, but the settings need to be tuned more aggressively.
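A sketch of what such a learning-rate sweep might look like. Everything here is an assumption: the thread gives neither the range of rates nor the selection criterion, and `toy_train_and_score` is a stand-in for real training (with CatBoost one would train a model with the given `learning_rate` and score it on a held-out sample).

```python
import math


def learning_rate_grid(low, high, count):
    """Return `count` learning rates spaced geometrically between low and high."""
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** i for i in range(count)]


def sweep(train_and_score, rates, passes):
    """Train one model per rate and keep the (rate, score) pairs that pass."""
    kept = []
    for rate in rates:
        score = train_and_score(rate)
        if passes(score):
            kept.append((rate, score))
    return kept


def toy_train_and_score(rate):
    # Stand-in for real training: the score peaks at rate = 0.1, purely illustrative.
    return 1.0 - abs(math.log10(rate) + 1.0)


rates = learning_rate_grid(0.001, 1.0, 100)
selected = sweep(toy_train_and_score, rates, passes=lambda s: s > 0.98)
print(f"{len(selected)} of {len(rates)} models met the criterion")
```

A geometric grid is used because learning rates usually matter on a logarithmic scale; a strict criterion then leaves only a handful of the 100 candidates.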

 
Aleksey Vyazmikin #:

Right, well, that's fine then.

I think training could be improved by reducing the sample, say by training on 1/10 of it - that would let the model learn a single phase/structure of the market, though that hasn't been required yet.

I tried walk-forward training at 1000 and at 20000 - everything loses money.
 
Do you train a single trade/don't trade class?
Or buy and sell separately?
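For anyone unfamiliar with walk-forward training, here is a minimal sketch of how the rolling windows can be generated. It assumes that 1000 and 20000 above are training-window sizes in rows, which the post does not state; the function name `walk_forward_splits` and the tiny sizes in the demo are hypothetical.

```python
def walk_forward_splits(n_rows, train_size, test_size):
    """Yield (train_range, test_range) pairs that roll forward through time."""
    start = 0
    while start + train_size + test_size <= n_rows:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Step forward by one test block so the test blocks never overlap.
        start += test_size


# Tiny demo: 10 rows, train on 4, then test on the next 2.
splits = list(walk_forward_splits(n_rows=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

Each model only ever sees rows that precede its test block, so the combined test results approximate how retraining on fresh data would have behaved.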
 
elibrarius #:
Do you train a single trade/don't trade class?
Or buy and sell separately?

The results are shown from samples without target transformation, i.e. yes - trade and don't trade.

But really, separate buy and sell samples would be easier to train on.

elibrarius #:
I tried walk-forward training at 1000 and at 20000 - everything loses money.

Hmm, strange. What method do you use for training - random forest?

 
Aleksey Vyazmikin #:

Hmm, weird. What method do you use for training - random forest?

A reworked version of the ALGLIB one.
I'm running more trees now. By morning, I think it'll calculate a new version.

Or maybe I did something wrong, if the result is much worse than yours.