Machine learning in trading: theory, models, practice and algo-trading - page 235

 
Vizard_:
I see they've already started writing books on R )))
allitebooks.com/automated-trading-with-r-2/
Once a "how to make money" method gets described in a book, that method has either stopped working or never worked in the first place. Telling, though. That is, describing the method in a book earns more money than the method itself.
 
Vizard_:
Out of sporting interest I'll overtake them a little, and then I'll stop. I have a bit of free resource, but the desire isn't there and never was;

I only supported the idea of an "admission ticket to the thread". < 0.69, at a glance, isn't hard to reach. < 0.68 — dunno, you'd have to think about it)))

https://numer.ai


Please tell me, what model did you use, and how was it trained?
 
lucky_teapot:
Please tell me, what model did you use, and how was it trained?

The class prediction error is below 30%. It can be much lower than 30%, but I couldn't get it below 20%. More importantly, there are good reasons to say the model is NOT over-fitted. That is the main thing: an over-fitted model is worse than useless, it's dangerous garbage.

Models: random forest and ada. In general, the choice of model has little effect on the result, provided it trains at all. On my predictors, nnet doesn't train at all.
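For illustration, a minimal R sketch of training the two models mentioned (random forest and ada) and comparing in-sample vs out-of-sample class error — the usual check for over-fitting. The data frames `train` and `test` and the factor column `Class` are assumed names, not taken from the post:

```r
library(randomForest)
library(ada)

# train / test: assumed data frames with predictors and a two-level factor column Class
rf <- randomForest(Class ~ ., data = train, ntree = 500)
rf_out <- mean(predict(rf, newdata = test)  != test$Class)   # out-of-sample class error
rf_in  <- mean(predict(rf, newdata = train) != train$Class)  # in-sample class error

boost   <- ada(Class ~ ., data = train, iter = 100)
ada_out <- mean(predict(boost, newdata = test) != test$Class)

# A large gap between in-sample and out-of-sample error is the usual sign
# of the over-fitting the post warns about.
c(in_sample = rf_in, out_of_sample = rf_out, ada_out_of_sample = ada_out)
```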

 
SanSanych Fomenko:

The class prediction error is below 30%. It can be much lower than 30%, but I couldn't get it below 20%. More importantly, there are good reasons to say the model is NOT over-fitted. That is the main thing: an over-fitted model is worse than useless, it's dangerous garbage.

Models: random forest and ada. In general, the choice of model has little effect on the result, provided it trains at all. On my predictors, nnet doesn't train at all.

What kind of logloss do you have there?
 
lucky_teapot:
What kind of logloss do you have there?
What is logloss?
 
SanSanych Fomenko:

The class prediction error is below 30%. It can be much lower than 30%, but I couldn't get it below 20%. More importantly, there are good reasons to say the model is NOT over-fitted. That is the main thing: an over-fitted model is worse than useless, it's dangerous garbage.

Models: random forest and ada. In general, the choice of model has little effect on the result, provided it trains at all. On my predictors, nnet doesn't train at all.

SanSanych Fomenko:
What is logloss?

I guess you're talking about your score at https://numer.ai, but that is measured not as logloss but as a Hamming-style % error. Logloss is a tricky thing: you have to get not just the class right but also the probability.

 
lucky_teapot:

Well, I guess you're talking about your score at https://numer.ai, but that is measured not as logloss but as a Hamming-style % error. Logloss is a tricky thing: there you have to get not just the class right but also the probability.

I'm writing about my Expert Advisor. It has a model inside.

I use packages, and at a glance their built-in estimate has nothing to do with logloss. Besides, results from packages can be evaluated by other means.... I don't recall any logloss.

As for the class: in the packages I've seen, it is derived from the probability, i.e. the probability itself is computed first and then calibrated into a class. The standard rule for two classes is to split at one half, but you can get in and steer it.
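What is described here — a probability computed first, then cut at one half, with the option to move the cut-off — looks roughly like this in R. A sketch only; the model object `rf`, the test set and the class labels "up"/"down" are assumed names:

```r
# Class probabilities instead of hard labels (randomForest as an example)
prob_up <- predict(rf, newdata = test, type = "prob")[, "up"]  # "up" is an assumed class level

# Standard rule: split at one half
pred_default <- ifelse(prob_up > 0.5, "up", "down")

# "Get in and steer": demand more confidence before calling a class
pred_strict <- ifelse(prob_up > 0.65, "up", "down")
```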

 
SanSanych Fomenko:

I'm writing about my Expert Advisor. It has a model inside.

I use packages, and at a glance their built-in estimate has nothing to do with logloss. Besides, results from packages can be evaluated by other means.... I don't recall any logloss.

As for the class: in the packages I've seen, it is derived from the probability, i.e. the probability itself is computed first and then calibrated into a class. The standard rule for two classes is to split at one half, but you can get in and steer it.

Then I can't say anything; at the very least one would need the dataset on which you got those results. On logloss I agree, for our case it isn't really the right choice, it's a tribute to Kaggle. But a non-over-fitted 20-30% classification error sounds very impressive to me, frankly hard to believe.

The trick with logloss is that, for two classes, if your error is 0%, the predicted probabilities should be close to 100% and 0% — {0, 1, 0, 1, ...}. When the error is 10%, that affects not only the probability of the misclassified cases but also the probabilities of the correct answers: the correct answers that used to be 1 and 0 should now be, say, 0.8 and 0.2. And when the error is 45%, everything should oscillate around 0.5 ± 0.1 for the logloss to be optimal. Alchemy of that sort....
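For reference, the standard binary logloss being discussed, as a small R function (this is the generic formula, not anyone's Expert Advisor code):

```r
# y: true labels as 0/1; p: predicted probability of class 1.
# Probabilities are clipped so that log(0) never occurs.
logloss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

logloss(c(1, 0, 1), c(0.9, 0.1, 0.8))   # confident and right   -> ~0.14
logloss(c(1, 0, 1), c(0.9, 0.1, 0.05))  # one confident mistake -> ~1.07
```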

 
Dr.Trader:

I happened to look at lucky_teapot's profile; there was a forum thread there with a link to an article. I think it was carried over from the mql4.com forum, which I hardly ever looked at — thanks to MetaQuotes if it really came from there.
The article is almost 9 years old, but I found a lot of useful things in it that are worth trying now. I think I even finally understood the lag (delay) space that Alexei mentioned a couple of times in this thread.
The article itself I find very useful: https://www.mql5.com/ru/articles/1506

I read it... One thing interested me: the author says you can slightly modify the data, thereby enlarging the sample, and the model will then work better because its knowledge base will be wider...

I work with reversals, and there are still few reversals compared to the total sample.

I think if we draw a multi-million-row sample and train the model to catch reversals, there will be plenty of examples, and I think the reversal patterns repeat there (I mean there are many patterns).

I have another question, or rather an idea with no solution yet...

I think if we pull all the reversals out of this multi-million-row sample at once and keep only them as the training set, the model will learn the reversals themselves quite quickly. But when it has to distinguish reversals from non-reversals in new data, how will it do that if it has no idea what a non-reversal is...?
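(Not an answer from the thread, just one common way around this: keep the non-reversal rows but downsample them to the size of the reversal class, so the model sees both classes without being swamped by the majority. A sketch, assuming a data frame `bars` with a factor column `Class` taking the values "reversal" / "other":)

```r
# Balance the classes by downsampling the majority ("other") class
rev_rows   <- which(bars$Class == "reversal")
other_rows <- sample(which(bars$Class == "other"), length(rev_rows))
train_bal  <- bars[c(rev_rows, other_rows), ]
table(train_bal$Class)   # now roughly 50/50
```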

 
mytarmailS:

I read it... One thing interested me: the author says you can slightly modify the data, thereby enlarging the sample, and the model will then work better because its knowledge base will be wider...

I work with reversals, and there are still few reversals compared to the total sample.


What is a reversal? Is it one bar like in ZZ?

I really like the idea (I picked it up here in the thread) of treating a reversal as a certain sequence of bars after which, in the future, a predetermined profit is obtained. For one thing, this approach would greatly reduce the class imbalance. For another, the target (the teacher) itself then has a clear predictive meaning.
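A sketch of that labelling idea in R: a bar is marked as a reversal only if a predetermined profit is reached within the next `horizon` bars. The function name, horizon and target are illustrative choices, not the poster's actual rules:

```r
# close: vector of close prices; horizon and target are illustrative parameters
label_reversals <- function(close, horizon = 10, target = 0.002) {
  n <- length(close)
  stopifnot(n > horizon)
  labels <- rep("other", n)
  for (i in 1:(n - horizon)) {
    future_ret <- (close[i + horizon] - close[i]) / close[i]
    if (future_ret >= target)  labels[i] <- "long_reversal"
    if (future_ret <= -target) labels[i] <- "short_reversal"
  }
  factor(labels)
}
```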