Machine learning in trading: theory, models, practice and algo-trading - page 402

 
Alyosha:

XGB - https://en.wikipedia.org/wiki/Xgboost is the thermonuclear weapon of machine learning.

Three months is not quite enough for HFT, at least not for a full simulation cycle, since the model needs to be tested on different markets, regime changes, flash crashes and assorted black swans; synthetic stress testing can't substitute for that, only the real market can. The final model will mostly not use more than a week's worth of data, but to configure it you will need to run it over 1-3 year samples to make sure it doesn't blow up somewhere. In 3 months the model can be trained, and if the data scientists know their stuff, it will turn out to be a regular money maker, but one day, maybe in 3 months, maybe in half a year, everything can break abruptly for an "unknown" reason, or rather a known one: the model has simply never encountered such a meta-state of the market and starts acting like an amateur.
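For reference, a minimal walk-forward sketch of the kind of check described above: train on a short window, test on the period that follows, repeat over a longer history. xgboost, pandas and the synthetic features/labels are assumptions for illustration, not anyone's actual pipeline.

# Walk-forward sketch: fit on a rolling ~3-month window, test on the following ~week,
# repeat across a multi-year history. Placeholder data, hypothetical column names.
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 3 * 252                                     # roughly 3 years of daily bars (synthetic)
feats = [f"f{i}" for i in range(5)]
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=feats)
df["y"] = (rng.normal(size=n) > 0).astype(int)  # stand-in label: next-bar direction

train_len, test_len = 63, 5                     # ~3 months train, ~1 week test
scores = []
for start in range(0, n - train_len - test_len, test_len):
    tr = df.iloc[start:start + train_len]
    te = df.iloc[start + train_len:start + train_len + test_len]
    model = XGBClassifier(n_estimators=100, max_depth=3)
    model.fit(tr[feats], tr["y"])
    scores.append(model.score(te[feats], te["y"]))

print(f"mean out-of-sample accuracy: {np.mean(scores):.3f} over {len(scores)} windows")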


Ah well, you have to retrain systematically anyway. What's the sense of training HFT on 5 years of data? There are neither the nerves nor the resources for it.

Ah, gradient boosting... I've heard of it again and again, but never done it. The deeper into the woods you go, the more complicated the terms get.

 
Maxim Dmitrievsky:


What's the sense of training HFT on 5 years of data? There are neither the nerves nor the resources for it.

Perhaps the guys from LTCM argued the same way)) They say that if they had looked twice as far back in their models, they wouldn't have blown up so spectacularly.

In any case, training doesn't run over the entire 5-year sample at once; obviously a sliding window takes a sample, with constant relearning. But it is important to know how quickly the model "understands" that something has seriously changed in the market, that it is not an outlier or someone's fat finger, so that it doesn't bravely keep filling against a sudden trend right up to the margin call.
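A crude sketch of that idea: a sliding training window plus a test for "the market has changed, this is not just an outlier". The volatility ratio, thresholds and function names below are made up for illustration, not anyone's real settings.

# Sliding-window retraining with a crude regime-change flag: stop trusting the model
# when recent volatility drifts far outside what the training window contained.
import numpy as np

def regime_changed(train_returns, recent_returns, k=3.0):
    """Flag a regime break (as opposed to a single outlier) when the recent chunk's
    volatility exceeds k times the volatility of the training window."""
    return np.std(recent_returns) > k * np.std(train_returns)

def rolling_retrain_points(prices, window=2000, recent=50):
    """Walk the series with a sliding window and collect the bars where the crude
    volatility test says the market has changed and the model should be refit."""
    returns = np.diff(np.log(np.asarray(prices, dtype=float)))
    breaks = []
    for t in range(window, len(returns)):
        if regime_changed(returns[t - window:t], returns[t - recent:t]):
            breaks.append(t)   # a real system would flatten here and retrain
    return breaks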

 
Alyosha:

Perhaps the guys from LTCM argued the same way)) They say that if they had looked twice as far back in their models, they wouldn't have blown up so spectacularly.

In any case, training doesn't run over the entire 5-year sample at once; obviously a sliding window takes a sample, with constant relearning. But it is important to know how quickly the model "understands" that something has seriously changed in the market, that it is not an outlier or someone's fat finger, so that it doesn't bravely keep filling against a sudden trend right up to the margin call.


Hmm...... you manage to trash the work while talking about completely different things yourself. As a rule, the model is considered unfit when it breaks the support line of the balance curve, and then it is rebuilt, as I showed earlier... Whether or not to trust the model's work is a philosophical question. No one says that three months of data is too little. But I have a question: how do you know how I collect the data and what it refers to? Just curious. Why do you suddenly think the model will lose its meaning if during those three months it was shown all possible variants of market development?

You have to understand what drives the market before you can predict it. Yes, the market changes over the long run, but I use data that is the cause of the price; it is this data that moves the price, not the other way around. Moreover, after unloading I run a unique data-cleaning procedure to remove the garbage, and that is why I get such models. And my trading itself is not bad at all. I urgently need a robot. A question for the audience: does anyone have a skeleton of a trading robot that handles requotes, off quotes and the other realities of live trading?
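One possible reading of the "model is unfit when it breaks the support line of the balance curve" rule from the post above is a lower channel line under the recent equity curve. A minimal sketch under that assumption, not the author's exact procedure:

# "Balance curve breaks its support line" stop rule, read as: the latest balance
# closes below the lower channel of a regression fit through the recent curve.
import numpy as np

def breaks_support(balance, lookback=100, band=2.0):
    """True when the newest balance point falls below the regression trend of the
    last `lookback` balance values by more than `band` residual standard deviations."""
    b = np.asarray(balance[-lookback:], dtype=float)
    x = np.arange(len(b))
    slope, intercept = np.polyfit(x, b, 1)       # straight trend through the curve
    fit = slope * x + intercept
    support = fit - band * np.std(b - fit)       # lower channel line as "support"
    return b[-1] < support[-1]

# Usage idea: if breaks_support(account_balance_history): stop trading and refit the model.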

 
Mihail Marchukajtes:


No one says that three months of data is too little. But I have a question: how do you know how I collect the data and what it refers to? Just curious. Why do you suddenly think the model will lose its meaning if during those three months it was shown all possible variants of market development?

If you trained on 3 months of data, you can't expect the model to last much longer than that. Whatever market the model has seen is the market it will be able to trade. Your dataset is nonsense; trading on it is like reading coffee grounds. The same goes for "Reshetov's machine", which fits coefficients of a linear model while the data are not linear at all. You have to be pretty far gone to believe the nonsense that a linear model took weeks to learn on a dataset of <500 points because it is "AI")))))))))..... I don't know.... this is worse rubbish than martingale and "deposit boosting".

 
Alyosha:

If you trained on 3 months of data, you can't expect the model to last much longer than that. Whatever market the model has seen is the market it will be able to trade. Your dataset is nonsense; trading on it is like reading coffee grounds. The same goes for "Reshetov's machine", which fits coefficients of a linear model while the data are not linear at all. You have to be pretty far gone to believe the nonsense that a linear model took weeks to learn on a dataset of <500 points because it is "AI")))))))))..... I don't know.... this is worse rubbish than martingale and "deposit boosting".


What do you mean, weeks???? Alyosha, you really are ALYOSHA, ha-ha-ha.... What has become of people. Alyosha, our dear Alexey, first read the article here and you will understand that I collected my 500 points over three months, because I do not feed the classifier every bar; I do it at a certain time, and so my 500 points cover the market for 3 months. And the fact that your model could not train on them correctly is because you have a "crappy" AI system. I even put that in quotes as a kind of compliment :-) Eh, Alyosha, Alyosha...... I see the trolls have woken up for the weekend..... Okay, personally I don't care; at the end I'll let you in on one more secret, just for your development, so that you understand who you are spitting at.

To all those whose results on my dataset came out worse than 50%: either your AI system is built incorrectly or there is an error in it, OR!!!! and now the drum roll.... your system is limited in the amount of material it can learn from. There are nets that can train correctly on 2-3 weeks of data (let's assume there is no overfitting at all) and then work for a week or two. Such systems exist and there is nothing wrong with them, THIS WORKS!!!! BUT when you throw a big dataset at them, they start to overfit or underfit terribly, which eventually leads to big training errors, and you start to think the dataset sucks, which is quite a reasonable conclusion for your AI systems. BUT when an AI system is really good, it can build a model (without overfitting) even on this dataset that YOU failed on.... You see!!!! The model will simply consist of a larger number of inputs, I think 10-12, and the polynomial will be long enough, and believe me, such a model will show better than 50% profit. Or rather, even Reshetov's optimizer has a limit, but it is much higher than for those who failed..... Something like that.....

 

And the most interesting thing is that the main output of this optimizer is the generalization percentage, and I have built models with a 100% level of generalization. But as the sample grows, the percentage starts to fall, and there will come a point when it drops below 50%, and then the model will start to fail outright. More likely, though, it will just approach the 50% mark. In any case, if there is a fish in the data, it will find it; if not, too bad.....

It also answers very well the question of how relevant the data is to the chosen target variable: if the data is full of garbage, it shows immediately....

Just for fun, I split the dataset I posted; one of the samples is 138 rows. Ran it, waiting.... I'll post the results if anyone is interested....
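Regarding "generalization percentage": the closest standard reading is out-of-sample accuracy, as opposed to in-sample accuracy. A small sketch of measuring both on a ~500-row dataset; the logistic regression and synthetic data are generic stand-ins, not Reshetov's optimizer.

# Estimate a "generalization percentage" as cross-validated out-of-sample accuracy,
# contrasted with the (easily inflated) in-sample accuracy, on a small tabular dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))                  # ~500 rows, ~10 inputs, as in the post
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=500) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
in_sample = model.score(X, y)
out_of_sample = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()

print(f"in-sample accuracy:     {in_sample:.2%}")
print(f"out-of-sample accuracy: {out_of_sample:.2%}")   # the honest "generalization" figure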

 
Mihail Marchukajtes:


What do you mean, weeks???? Alyosha, you really are ALYOSHA, ha-ha-ha.... my 500 points cover the market for 3 months [...] Anyway, if there is a fish in the data, it will find it; if not, too bad.....

I'm not going to argue with you; there is nothing to argue about, you are talking nonsense, "100% generalization" )))))) I don't think you even understand the difference between a linear and a nonlinear model. And "Reshetov's machine" is linear, it can't even handle XOR; it's just dumb optimization of the coefficients of a separating hyperplane by some pseudo-genetics, children's babbling...

That's it, I don't mean to offend kids; I'm just the bad, evil uncle who told them Santa Claus doesn't exist)))
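The XOR point is easy to verify: no hyperplane classifies XOR perfectly, while even a simple nonlinear (kernel) model does. A quick sketch with generic scikit-learn classifiers, not with "Reshetov's machine" itself:

# Check of the XOR argument: a linear separator cannot fit XOR, a kernel model can.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])                      # XOR labels

linear = SVC(kernel="linear").fit(X, y)
nonlinear = SVC(kernel="rbf").fit(X, y)

print("linear hyperplane accuracy:", linear.score(X, y))     # no hyperplane gets all 4 right
print("rbf (nonlinear) accuracy:  ", nonlinear.score(X, y))  # fits XOR exactly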

 
Alyosha:

I'm not going to argue with you; there is nothing to argue about, you are talking nonsense, "100% generalization" )))))) I don't think you even understand the difference between a linear and a nonlinear model. And "Reshetov's machine" is linear, it can't even handle XOR; it's just dumb optimization of the coefficients of a separating hyperplane by some pseudo-genetics, children's babbling...

That's it, I don't mean to offend kids; I'm just the bad, evil uncle who told them Santa Claus doesn't exist)))


YES!!!!! So I am Santa Claus!!! And every year I congratulate the children on this wonderful holiday. You managed to lose to me even here :-)
 
Alyosha:

I'm not going to argue with you; there is nothing to argue about, you are talking nonsense, "100% generalization" )))))) [...]


Now I'll compute a model with a 100% level of generalization....
 
Mihail Marchukajtes:

Now I'll compute a model with a 100% level of generalization....
No need to give away the grail, just trade it, urgently, trade! Everyone will thank you :)