Machine learning in trading: theory, models, practice and algo-trading - page 450

 
 
Alexander Ivanov:
Did Reshetov introduce himself to God?
I didn't know...
Recently wrote something like.... In the winter...

That's just it, I wanted to know what happened to him.

 
The end of the world is coming, and God is taking His own away...
And the Antichrist will appear...
 
The output:

Hmmm... do you use the late Yury Reshetov's software? XGB grinds this set to 65-67% accuracy in a minute. When ML runs for more than an hour, I believe something is being done wrong, since interest in neural networks cooled off long ago.

No, it's not Yury's neural network. But I don't train the model just once; I try different combinations of predictors and different model parameters. The output should be statistics on the importance of each predictor, plus model parameters such that everything trains without overfitting.
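The loop described above (trying predictor subsets and model parameters, then tallying how often each predictor ends up in the best runs) could be sketched roughly like this. All names and the scoring stub are hypothetical, since the post shows no code:

```python
# Hypothetical sketch: search over predictor subsets and parameters,
# then count predictor appearances among the top-scoring combinations.
import itertools
import random

random.seed(1)
predictors = ["p1", "p2", "p3", "p4"]  # placeholder predictor names

def fit_and_score(subset, params):
    # Stand-in for real model training; returns a random score here
    # purely for illustration.
    return random.random()

results = []
for r in (2, 3):  # subset sizes to try
    for subset in itertools.combinations(predictors, r):
        for depth in (3, 5):  # example model parameter grid
            score = fit_and_score(subset, {"depth": depth})
            results.append((score, subset))

# Importance statistic: how often each predictor appears in the top runs.
top = sorted(results, reverse=True)[:3]
counts = {p: sum(p in s for _, s in top) for p in predictors}
print(counts)
```

With real data, `fit_and_score` would train and cross-validate a model on the chosen subset, and the counts would converge to the kind of per-predictor importance statistics the post describes.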

 

This is what I have so far; the selection of model parameters and predictor weights is still far from complete, and in the future it should be much better.

For training I took 10% of train.csv (randomly sampled), otherwise the process takes too long.
The predictor weights are:
0
0
3467.50163547078
0
0
184258.95892851
22315.6831463224
0.144079977475357
0
0
0.000324672622477092
39775.9969139879
6053.73861534689
0
0

Whatever is zero or close to zero is garbage and useless; the higher the weight, the greater the predictor's influence on the result.
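The rule above can be applied mechanically: keep only the predictors whose weight clears some small threshold. A minimal sketch using the weights quoted in the post (the threshold value here is an arbitrary choice for illustration):

```python
# The predictor weights reported above, in order.
weights = [0, 0, 3467.50163547078, 0, 0, 184258.95892851,
           22315.6831463224, 0.144079977475357, 0, 0,
           0.000324672622477092, 39775.9969139879,
           6053.73861534689, 0, 0]

# Keep predictors whose weight exceeds a small threshold
# (threshold chosen arbitrarily; zero or near-zero = noise).
threshold = 1.0
kept = [i for i, w in enumerate(weights) if w > threshold]
print(kept)  # → [2, 5, 6, 11, 12]
```

So of the 15 predictors, only five carry any real weight; the rest would be dropped before retraining.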

Logloss on training (10% of the lines from train.csv): 0.6895723, accuracy 0.6402786.

Logloss on the test (the whole test.csv): 0.6928974, accuracy 0.6239073.
I need to increase the number of training examples; the 10% I took is very little, which is why logloss worsened noticeably on the test. For numerai, for example, I need to take at least 50% of the training examples, otherwise the results on new data are nothing at all.
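For reference, the logloss metric used above is ordinary binary cross-entropy, which a few lines of stdlib Python reproduce. This is a generic sketch, not the poster's actual evaluation code:

```python
import math

def logloss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy, the metric behind the figures above."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# A model that always answers 0.5 scores ln 2 ~ 0.6931; both numbers
# reported above (0.6896 train, 0.6929 test) sit close to that
# coin-flip baseline.
print(round(logloss([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]), 4))  # → 0.6931
```

Seeing both scores hover just under ln 2 is a useful sanity check: the model is only slightly better than random guessing.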



XGB grinds this set to 65-67% accuracy in a minute.

Respect to XGB; in capable hands it's a strong thing. Mine takes 4 hours and does worse.


What kind of data is this? Forex, stock exchange, paid subscriptions? Would 62% really bring me a profit if I gathered a similar set of predictors?

 
Dr. Trader:

What kind of data is this, anyway? Forex, stock exchange, paid subscriptions? Would 62% really make a profit if I collected a similar set of predictors?


In my opinion, that question should have been asked at the beginning )) Predicting something without understanding the source of the data is, of course, high-level work :)

It's like going out with a girl you met through mutual friends, and only by the end of the evening asking, "Listen, what's your name?" :) And she says: well, you've got quick reactions, you'll go far in life.

 
Alexander Ivanov:
Did Reshetov introduce himself to God?
I did not know...
Recently wrote something like.... In winter...

I'm in shock myself.

May he rest in peace.

 
Vladimir Gribachev:

I'm in shock myself.

May he rest in peace.

God rest his soul.
 
Alexander Ivanov:
God rest his soul.
But his work will live on. I've read his work; a very interesting man with unconventional thinking. I was even surprised that, until I raised the topic again, no one had really discussed it, except Michael.
 
Dr. Trader:

For training I took 10% of train.csv.

Logloss on training (10% of the lines from train.csv): 0.6895723, accuracy 0.6402786.

Logloss on the test (the whole test.csv): 0.6928974, accuracy 0.6239073.

I need to increase the number of training examples; the 10% I took is very little, which is why logloss worsened noticeably on the test.

I haven't tried taking just 10%, but I think 62% is good. I had about 66% on the test, and Wizard said he had 67%, though of course trained on 100% of the samples.

For example, for numerai I need to take at least 50% of the training samples, otherwise the results on new data are nothing.

To be honest, everything with them is quite murky; you can't tell how good the score really is. That's why they mix known answers into the evaluation data, from which they compute the preliminary logloss, and it's not clear why they need it. People who were in first place suddenly plunge to 500th with a logloss >0.7; it all smells of randomness...

Respect to XGB; in the right hands it's a strong thing. Mine takes 4 hours and does worse.

It's strong, especially since I rebuilt it myself in C++.

What kind of data is this? Forex, stock exchange, paid subscriptions? Would 62% really make a profit if I were to gather myself a similar set of predictors?

The data are all from FORTS, via QUIK and Metatrader, plus freely parsed data from web pages like http://www.investing.com, etc. Porting it is realistic, but the trading infrastructure for moderate HFT (holding positions for 10 seconds to 1 minute) would have to be built on QUIK or Plaza; from scratch that is a man-year of work for a good C++/Java/C# coder ($25-50K if local). But bear in mind that the prospects for HFT are shrinking everywhere, especially ultra-HFT, since it has been monopolized by well-funded organizations and is out of reach for ordinary traders. Better to focus on forecasting the next minute, not the next second; there, accuracy of ~55% is the limit of dreams.
