Machine learning in trading: theory, models, practice and algo-trading - page 1120

 
toxic:
In this case, and with your dataset... I'm sorry, but people have told you many times that you need at least thousands of samples, and given how noisy market data is, it's desirable to have hundreds of thousands of points. Once you learn Java and use XGB, for example, you'll laugh at your past stubbornness))

That's an incorrect statement

 
Mihail Marchukajtes:

So I came up with an improved metric based on the Matthews coefficient, but what can I say when it blows here and blows there? :-(

I'm stuck with passing an array from one class to another, and I'm screwed... no one to ask :-(
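For reference, a minimal sketch of the plain Matthews correlation coefficient in Python (not Mihail's improved variant; the labels below are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

# Toy example: true labels vs. model predictions (illustrative only)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# MCC from the confusion matrix:
# (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
tp = np.sum((y_true == 1) & (y_pred == 1))
tn = np.sum((y_true == 0) & (y_pred == 0))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))
denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc_manual = (tp * tn - fp * fn) / denom if denom > 0 else 0.0

print(mcc_manual, matthews_corrcoef(y_true, y_pred))  # the two values should match
```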

why bother with metrics when the trading system has its own? for example, you can measure the profit factor on the samples and that's it

the internal model estimates are secondary, because the smallest error does not mean the greatest stability on new data

just choose an external criterion and evaluate through trading performance on new data

But until you have a big sample for the test you'll get nowhere... a small training sample is not critical, everything's fine there
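For illustration, a minimal sketch of such an external criterion - the profit factor computed straight from per-trade results on new data (the trade results here are made up):

```python
def profit_factor(trade_results):
    """Profit factor = gross profit / gross loss over a list of per-trade P/L values."""
    gross_profit = sum(r for r in trade_results if r > 0)
    gross_loss = -sum(r for r in trade_results if r < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss

# Toy example: per-trade results on new (out-of-sample) data
oos_trades = [12.0, -7.5, 4.2, -3.1, 9.8, -6.0]
print("Profit factor on new data:", round(profit_factor(oos_trades), 2))
```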

 
toxic:

You can argue, but in the market accuracy is not what matters, what matters is that it works, and for me, usually, the more samples the better the result)))

just take a forest or xgb - they don't care how many samples they are trained on :) it's just that the model will weigh gigabytes

but recursive feature selection against external metrics started to yield results even on small subsamples, and not only external ones
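A minimal sketch of what recursive feature selection against an external, out-of-sample criterion could look like (my own illustration on synthetic data, not toxic's code; hold-out accuracy stands in for whatever external metric is actually used):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))                       # synthetic features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def oos_score(cols):
    """External criterion: accuracy on the hold-out set (could be profit factor instead)."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_tr[:, cols], y_tr)
    return accuracy_score(y_te, model.predict(X_te[:, cols]))

kept = list(range(X.shape[1]))
best = oos_score(kept)
improved = True
while improved and len(kept) > 1:
    improved = False
    for col in list(kept):                            # try dropping each feature in turn
        trial = [c for c in kept if c != col]
        score = oos_score(trial)
        if score >= best:                             # keep the removal if no worse out of sample
            kept, best, improved = trial, score, True
            break

print("Selected features:", kept, "hold-out score:", round(best, 3))
```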
 
Vizard_:

Train = 100k lines. On the remaining 8k+ (test) you apply the model. The data is shuffled.
The metric is logloss. Post the result. Train = ... test = ...

Of course you don't do this with a time series. But just for fun, I fed it into LightGBM with almost default settings, without touching the data at all:

Train: 0.6879388421499111

Test: 0.6915181677127092


Source code of the test, plus CatBoost as a bonus:

https://yadi.sk/d/55DDn-hViNWP6Q


What are your results?

test_xz.ipynb
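For anyone who wants to reproduce that kind of check, a minimal sketch of the setup as described (shuffled split, near-default LightGBM, logloss on train and test); the data below is synthetic, the notebook above is the authoritative source:

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

# Synthetic stand-in for the real dataset (~100k train / ~8k test, shuffled)
rng = np.random.default_rng(42)
X = rng.normal(size=(108000, 20))
y = (X[:, 0] + rng.normal(scale=2.0, size=108000) > 0).astype(int)  # noisy target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=8000, shuffle=True, random_state=42)

model = lgb.LGBMClassifier()          # almost default parameters
model.fit(X_tr, y_tr)

print("Train:", log_loss(y_tr, model.predict_proba(X_tr)[:, 1]))
print("Test: ", log_loss(y_te, model.predict_proba(X_te)[:, 1]))
```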
 
Maxim Dmitrievsky:

Why bother with metrics when the trading system has its own? for example, the profit factor can be measured on the samples and that's it

the internal model estimates are secondary, because the smallest error does not mean the greatest stability on new data

just choose an external criterion and evaluate through trading performance on new data

But until you have a big sample for the test you'll get nowhere... a small training sample is not critical, everything's fine there

For the test, there should be an unlimited amount of data.

If someone wants the quality of their data checked, they should post it not as some murky CSV files, but as indicators.

Ideally with a template; even if the targets are not marked up, it is clear they have to be profitable.

Then it will be possible to train any model, build an Expert Advisor, and objectively test it together with the original indicator.

That is, if there is a desire to actually do something; if you just want to talk...

 
toxic:

The more noise, the more samples you need - that should be clear at the level of elementary statistics, and market data is very noisy. Overfitting is a separate issue: if you train on properly built features and a correct target, it is actually hard to overfit on a test of tens or hundreds of thousands of samples. That is what is good about big datasets - they are hard to overfit on, unless of course the data scientist or algotrader is a masochist or a near-marketer and mixes the target into the features. An obvious reason to worry is a feature that correlates with the target by more than 3-5% - it is probably peeking, and it is better to build features so that this is impossible in principle. It complicates the algorithms a bit, but it will probably get rid of the main mistake of beginner algotraders.
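A minimal sketch of that sanity check - correlating every feature with the target and flagging anything above the 3-5% threshold mentioned above (synthetic data; the "leaky" feature is deliberately built from the target to show what a flag looks like):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000
# Synthetic example: two honest noisy features and one feature that peeks at the target
target = rng.integers(0, 2, size=n)
features = {
    "feat_momentum": rng.normal(size=n) + 0.01 * target,    # weak, plausible signal
    "feat_volatility": rng.normal(size=n),                  # pure noise
    "feat_leaky": target + rng.normal(scale=3.0, size=n),   # built from the target
}

THRESHOLD = 0.05  # the 3-5% rule of thumb from the post above
for name, values in features.items():
    corr = np.corrcoef(values, target)[0, 1]
    flag = "  <-- suspicious, check for peeking" if abs(corr) > THRESHOLD else ""
    print(f"{name}: corr with target = {corr:+.3f}{flag}")
```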

You scold near-market traders, while you yourself are obviously a cool, market-savvy algotrader, and judging by your posts you know which period to train on and which to trade on.

I don't know, so when I saw the discussion yesterday I didn't get involved and just decided to try it: I trained on EURUSD M1 from October 8 to 18 - as much history as my broker had - and launched the Expert Advisor on a demo account in real time.

So far it trades at a profit, and the question to you, as an expert, is when it will start to lose. Login - 2096584180, password - na3tbvr, Tradize-Demo. But please be specific, not about spaceships plowing the expanses of the Bolshoi Theatre (c)).


 
There is real-time trading, an MT4 tester, a neural network:

The training sample is tiny, the tester and optimizer logic is not transparent (a black box)...

Conclusion: with 99.99999999999% probability the Expert Advisor is random, and the equity is a geometric random walk with a downward drift due to trading costs.

Depending on the trade frequency, I can only predict a negative Sharpe ratio, SR < -0.5, for the year.
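A minimal sketch of the null hypothesis described here - equity as a geometric random walk whose only drift is the trading costs - together with the annualized Sharpe ratio such a series would have (all parameters below are illustrative assumptions, not estimates of the actual EA):

```python
import numpy as np

rng = np.random.default_rng(7)

n_trades_per_year = 1000           # assumed trade frequency (illustrative)
cost_per_trade = 0.0001            # spread/commission as a fraction of equity (assumed)
vol_per_trade = 0.006              # per-trade return volatility (assumed)

# Null hypothesis: zero edge, so the expected per-trade return is just minus the costs
log_returns = -cost_per_trade + vol_per_trade * rng.normal(size=n_trades_per_year)
equity = np.exp(np.cumsum(log_returns))            # geometric random walk, drifting down

# Expected annualized Sharpe ratio under this null (no edge, only costs)
expected_sr = -cost_per_trade / vol_per_trade * np.sqrt(n_trades_per_year)

print("Final equity multiple after one year:", round(equity[-1], 3))
print("Expected annualized SR with no edge:", round(expected_sr, 2))
```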

1. There is real-time trading, an MT4 tester, a neural network.

2. The answer is 100% wrong - the Expert Advisor is not random.

3. It was trained on 8 days of trading data, and you forecast a whole year...? :)


P.S. I asked for something specific, for example: when the ratio of the trading period to the training period reaches 30%, the Expert Advisor will start losing the day after tomorrow, or at 10% - today. But since science is silent...

 
toxic:

Conclusion: with 99.99999999999% probability the Expert Advisor is random, and the equity is a geometric random walk with a downward drift due to trading costs.

Hmm, that's my picture, what do you see there?

;)

 
I'm not sure:

Random, I tell you.

I would also say that it can't be, because it can never be (c). And it's true that you can't really train on 50 trades, but we need to see another 30-40 trades (that's 3-4 days) to draw conclusions. If we see them, of course.

But, in general, it's already strange.

 
toxic:

Random, I'm telling you.

here's its tester run - and do you think that's random?
