Machine learning in trading: theory, models, practice and algo-trading - page 3467

 
Arty G #:

The problem is that the system works well in mean-reverting mode, but during strong trends it starts to misbehave: it builds up a position against the direction of the mid-price move, which causes a drawdown. Well, that is a classic problem for this kind of trading system.

Hello. Why not try to predict the trend expectation? Or just detect it, since there is such active trading? Or is a strong movement within a minute bar considered a trend?

Arty G #:
I decided to run 50-100 variants of more or less good parameter combinations through the backtester, collect features along the way and train a model that, looking at the features and parameter combinations, could predict whether a period of time traded with these parameters will be profitable or not.

It's not clear what predictors you use to describe these parameter sets? Or do you have a multiclassifier?

Arty G #:
but since this is essentially HFT and the data already weighs gigabytes after just a dozen days, it becomes a bit problematic.

You can use quantisation before saving the data, which will allow you to switch to the uchar data type.
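A minimal sketch of what such quantisation could look like in Python, assuming quantile-based bins and NumPy. The helper names `fit_bin_edges` and `quantise` are made up for illustration, and the feature column is synthetic; this is not the method from any particular article.

```python
import numpy as np


def fit_bin_edges(values: np.ndarray, n_bins: int = 256) -> np.ndarray:
    """Return the n_bins - 1 interior quantile edges of `values`."""
    probs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(values, probs)


def quantise(values: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Map each value to its bin index; up to 256 bins fit into uint8 (uchar)."""
    return np.searchsorted(edges, values, side="right").astype(np.uint8)


rng = np.random.default_rng(0)
feature = rng.normal(size=100_000)   # synthetic stand-in for one feature column

edges = fit_bin_edges(feature)       # fit once on training data, then reuse
codes = quantise(feature, edges)     # one byte per value instead of eight

print(feature.nbytes, codes.nbytes)
```

The bin edges have to be stored alongside the codes, so that new data is quantised with exactly the same borders.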

Try CatBoost - it can handle large amounts of data.

 
mytarmailS #:
I started the search for patterns on buys and left the computer running overnight; in the morning I will have the result and will backtest the buys as well.

While changing the script from sell to buy, I made a typo...

The computer spent the whole night crunching random data! (((((((((((

Oh, shit.

 
Ivan Butko #:

Here is the "if, then" rule.

If the price is lower and if there is something else - draw a line to the new extremum.


How can this be interpreted "differently"?

By that logic, you can trade by words alone: "buy cheap, sell expensive" :-)

What is an extremum? Why is this pip considered an extremum but the neighbouring one is not... and why is some point behind them a counter-extremum altogether?

The number of ways to define an abstract extremum is limited only by imagination. There is no standard zigzag, no common practice and no reference implementation. There is a multitude of variants in the code base, with a variety of parameters (and, of course, errors).

It is quite dangerous to label data with a zigzag: it may well turn out slightly different for everyone, it may vary from run to run, it may depend on the starting moment and on noise, and it is not identical in Python and MQL either.
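To illustrate the point, here is a sketch of just one of the many possible zigzag definitions: a swing is confirmed once price retraces by at least `threshold` from the running extreme. The function name, the threshold rule and the toy prices are all assumptions made for this example, not a reference implementation.

```python
def zigzag_pivots(prices, threshold):
    """Return indices of confirmed pivots under a simple retracement rule."""
    pivots = []
    ext_i = 0        # index of the current candidate extreme
    direction = 0    # 0 = unknown, +1 = up leg, -1 = down leg
    for i in range(1, len(prices)):
        p = prices[i]
        if direction == 0:
            # The first bar is the reference until price moves far enough.
            if abs(p - prices[ext_i]) >= threshold:
                pivots.append(ext_i)
                direction = 1 if p > prices[ext_i] else -1
                ext_i = i
        elif direction == 1:
            if p > prices[ext_i]:
                ext_i = i                      # new high extends the up leg
            elif prices[ext_i] - p >= threshold:
                pivots.append(ext_i)           # high confirmed by the retrace
                direction, ext_i = -1, i
        else:
            if p < prices[ext_i]:
                ext_i = i                      # new low extends the down leg
            elif p - prices[ext_i] >= threshold:
                pivots.append(ext_i)           # low confirmed by the retrace
                direction, ext_i = 1, i
    return pivots


prices = [100, 103, 101, 106, 102, 104, 99, 105]   # toy series

pivots_fine = zigzag_pivots(prices, threshold=2)
pivots_coarse = zigzag_pivots(prices, threshold=3)
# Same rule and threshold, but the history starts two bars later.
pivots_shifted = [i + 2 for i in zigzag_pivots(prices[2:], threshold=3)]

print(pivots_fine, pivots_coarse, pivots_shifted)
```

On this toy series a threshold of 2 marks seven pivots and a threshold of 3 marks only three, and starting the history two bars later changes which bar is marked first. Labels built on top of such pivots inherit all of these choices.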

 
Aleksey Vyazmikin #:

Hello. Why not try trend expectation forecasting?

It seems to me that forecasting the future is a half-measure.

You need to learn to exit a position in time.

And the price of a mistake is a small loss.

That is, we teach how to enter and teach how to exit.

Merely forecasting a trend means sitting in one direction and never seeing the forecast play out.


 
Maxim Kuznetsov #:


what is an extremum?

Are you pretending or what?


An extremum (Latin extremum, "extreme") in mathematics is the maximum or minimum value of a function on a given set. The point at which the extremum is reached is called the extremum point.


Maxim Kuznetsov #:

Why is this pip considered an extremum but the neighbouring one is not... and why is some point behind them a counter-extremum altogether?

And why is today Tuesday and tomorrow is not?

 
Ivan Butko #:

I think predicting the future is a half-measure.

you have to predict

Ivan Butko #:

One should learn to exit a position in time.
And the price of a mistake is a small loss.

this is also a forecast

 
Maxim Dmitrievsky #:

As far as I understand, you need to find the market regimes in which this TS (trading system) works well. For example, the regimes can be determined by volatility.

In the last article I suggested clustering by volatility and then selecting the best clusters for trading. If you have not read it, it may be useful.


Ran across this article. I'll read it again, thanks!


The goal is not really to find market regimes in which a strategy works well, but to determine for a regime a set of parameters for the strategy that will work well.

What I do is run a hundred backtests over the same time interval with different settings. Then I put everything into one dataframe and train the model. For the model, the combination of trading strategy settings is a feature just like the market features, such as volatility or, say, trade imbalance. The model thus learns which strategy parameters can be profitable under which market conditions. That is, X for the model looks like this: [volatility_short, volatility_long, rsi, trade_imbalance, strategy_param1, strategy_param2]. And y is the sign of the profit earned over the minute. Since up to 10 orders are sent per second, a minute contains enough trades to see how the parameters perform.

In the dataframe that holds all of this, the market features are duplicated as many times as there are unique combinations of strategy parameters. That is, the dataframe does not contain just the trading result of each whole backtest, but minute-by-minute data:

X:[volatility_short, volatility_long, rsi, trade_imbalance, strategy_param1, strategy_param2], y:[step_profit]
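The layout described above can be sketched in plain Python as follows. The market values, the parameter grid and the `step_profit` function are invented placeholders; in practice `step_profit` would be the per-minute result coming out of the backtester.

```python
from itertools import product

# Hypothetical per-minute market features; the numbers are invented.
market_rows = [
    {"minute": 0, "volatility_short": 0.8, "volatility_long": 1.1,
     "rsi": 55.0, "trade_imbalance": 0.10},
    {"minute": 1, "volatility_short": 1.4, "volatility_long": 1.2,
     "rsi": 48.0, "trade_imbalance": -0.25},
]

# Hypothetical grid of strategy settings that were backtested.
param_grid = {"strategy_param1": [1, 2], "strategy_param2": [0.5, 1.0, 2.0]}
param_combos = [dict(zip(param_grid, values))
                for values in product(*param_grid.values())]


def step_profit(minute, params):
    """Placeholder for the backtester: profit of one minute under `params`."""
    return (minute - 0.5) * (params["strategy_param1"] - 1.5) * params["strategy_param2"]


# One row per (minute, parameter combination): market features are repeated.
rows = [
    {**market, **params, "step_profit": step_profit(market["minute"], params)}
    for market in market_rows
    for params in param_combos
]

feature_names = ["volatility_short", "volatility_long", "rsi", "trade_imbalance",
                 "strategy_param1", "strategy_param2"]
X = [[row[name] for name in feature_names] for row in rows]
y = [1 if row["step_profit"] > 0 else 0 for row in rows]   # sign of the minute's profit

print(len(market_rows), len(param_combos), len(rows))
```

With 2 minutes and 6 parameter combinations the table has 12 rows, which is why the dataset grows so quickly on real HFT data.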


Now we have the model. When running the backtester or trading live, every minute we can loop over the available parameter combinations, passing the model the current market features together with each combination, until we find a combination for which the model sees a high probability of profitable trading.

That is, the hypothesis is that for different market regimes there is a different set of parameters at which the trading system will be profitable and also that it is possible to teach the ML model to find a suitable set of parameters for different regimes.
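A rough sketch of the training and per-minute selection steps, assuming scikit-learn's `RandomForestClassifier` and purely synthetic data. The labelling rule, the `pick_params` helper and the 0.6 probability cut-off are assumptions for illustration; the sketch scores all combinations at once and takes the best one, rather than stopping at the first acceptable one.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_minutes = 400

# Hypothetical grid: columns are [strategy_param1, strategy_param2].
param_combos = np.array([[p1, p2] for p1 in (1.0, 2.0, 3.0) for p2 in (0.5, 1.0)])

# Synthetic market features per minute:
# [volatility_short, volatility_long, rsi, trade_imbalance]
market = rng.normal(size=(n_minutes, 4))

# Cross every minute with every parameter combination.
X = np.hstack([np.repeat(market, len(param_combos), axis=0),
               np.tile(param_combos, (n_minutes, 1))])

# Toy labelling rule (an assumption, not a market fact): a minute is
# "profitable" when high short-term volatility meets a large strategy_param1,
# or low volatility meets a small one.
y = ((X[:, 0] > 0) == (X[:, 4] > 1.5)).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)


def pick_params(model, market_now, param_combos, min_proba=0.6):
    """Score every combination for the current minute and return the best one,
    or None if no combination reaches `min_proba`."""
    candidates = np.hstack([np.tile(market_now, (len(param_combos), 1)),
                            param_combos])
    proba = model.predict_proba(candidates)[:, 1]   # P(profitable minute)
    best = int(np.argmax(proba))
    if proba[best] < min_proba:
        return None, float(proba[best])
    return param_combos[best], float(proba[best])


market_now = np.array([2.0, 0.0, 0.0, 0.0])   # a high-volatility minute
params, p = pick_params(model, market_now, param_combos)
print(params, p)
```

Returning `None` when nothing clears the threshold gives a natural "do not trade this minute" outcome.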


Aleksey Vyazmikin #:

Hello. Why not try to predict the trend expectation? Or just detect it, since there is such active trading? Or is a strong movement within a minute bar considered a trend?

It is not quite clear what predictors you use to describe these parameter sets? Or do you have a multiclassifier?

You can use quantisation before saving the data, which will allow you to switch to the uchar data type.

Try CatBoost - it can work with large amounts of data.

In general, the idea is not only to escape the trend, but to select parameters for the current market regime that have a high probability of being profitable according to the training data. As I described above, the parameter combinations are also features when training the model. Before each new trading round we ask the model: the market features are such-and-such, and we have, for example, 100 parameter combinations. So, dear model, taking each parameter set of the trading strategy separately, which of them has a high chance of being profitable given the current market features?


I will read your article on quantisation. I used deciles, but when I switched to real values the accuracy of the predictions increased dramatically. I understand that data normalisation is not so important for trees, but a lot of the data's granularity was lost when using deciles.
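The loss of granularity can be made concrete with a small experiment on synthetic data: bin the same column into 10 and into 256 quantile bins and measure how far values sit from their bin's mean. The `quantile_codes` helper and the normal test data are assumptions for illustration only; this says nothing about the actual accuracy change on real features.

```python
import numpy as np


def quantile_codes(values, n_bins):
    """Assign each value to one of `n_bins` equal-frequency bins."""
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    return np.searchsorted(edges, values, side="right")


rng = np.random.default_rng(1)
x = rng.normal(size=50_000)   # synthetic feature column

errors = {}
for n_bins in (10, 256):
    codes = quantile_codes(x, n_bins)
    # Replace every value by the mean of its bin and measure what is lost.
    centers = np.array([x[codes == b].mean() for b in range(n_bins)])
    errors[n_bins] = float(np.abs(x - centers[codes]).mean())

print(errors)
```

With deciles a tree can place at most 9 distinct split points on the feature, against 255 with 256 bins, so fine structure inside a decile is invisible to the model.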


I will give CatBoost a try, thanks. It's just that my backtester and live trading code is written in Python with Numba, and trained sklearn models cannot be used there. To get around that, I found a way to export the random forest model into a set of lists and dicts, which let me use the trained model under Numba. I will investigate whether this or something similar is possible with CatBoost.
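One possible way to do such an export, sketched here as an assumption rather than the exact method used above: read the fitted trees' arrays from scikit-learn's `tree_` attribute and walk them with plain loops. The helper names `export_forest` and `predict_proba_one` are made up, and the Numba decorator is left out so the sketch runs without Numba installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def export_forest(forest):
    """Flatten each fitted tree into plain NumPy arrays (no sklearn objects)."""
    trees = []
    for est in forest.estimators_:
        t = est.tree_
        value = t.value[:, 0, :]                      # (n_nodes, n_classes)
        trees.append({
            "left": t.children_left.copy(),
            "right": t.children_right.copy(),
            "feature": t.feature.copy(),
            "threshold": t.threshold.copy(),
            # Normalise per node so each leaf holds class probabilities.
            "proba": value / value.sum(axis=1, keepdims=True),
        })
    return trees


def predict_proba_one(trees, x):
    """Average leaf probabilities over all trees for one sample `x`."""
    total = np.zeros(trees[0]["proba"].shape[1])
    for tree in trees:
        node = 0
        while tree["left"][node] != -1:               # -1 marks a leaf
            if x[tree["feature"][node]] <= tree["threshold"][node]:
                node = tree["left"][node]
            else:
                node = tree["right"][node]
        total += tree["proba"][node]
    return total / len(trees)


rng = np.random.default_rng(0)
# sklearn trees compare inputs as float32, so keep the data float32-exact.
X = rng.normal(size=(300, 4)).astype(np.float32).astype(np.float64)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)
trees = export_forest(forest)

print(predict_proba_one(trees, X[0]), forest.predict_proba(X[:1])[0])
```

To call this from jitted code, the per-tree dicts would probably need to be turned into Numba-friendly containers (for example `numba.typed.List`, or arrays padded to a common node count); that part is not shown here.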

About the data size: I meant that the backtest data weighs too much, and it takes a lot of space and time to run backtests for a hundred or so parameter sets on a year of HFT data (including all book updates and trades).

 
mytarmailS #:

you have to predict

that's also a prediction

Yes

That's the point: predict a trend, enter. The price starts to waver, predict an exit (HOLD), exit, fuck this trend, let's look for another one.

Traders do the same: they forecast a trend/impulse and enter. The price starts to waver, and in such a way that the trader doesn't like his entry. He estimates 50/50, or "I don't understand what's going on". And he's like "fuck it, I'd better get out". The cost is a small loss or a small gain.

He does not forecast a "reverse signal". He simply doesn't like how the trend/impulse is playing out.


That's how it should be done: predict entry, predict continuation of the trend, predict exit or reverse entry.

 
Arty G #:

What I do is run a hundred backtests on the same time interval with different settings. Then I put it all into one dataframe and train the model. For the model, the combination of trading strategy settings is a feature like any other

You're doing it in a somewhat makeshift way.

You run many TS settings through the model (many datasets) and the model gives a probability of the best TS settings, right?

You create a huge number of datasets at each new bar.


I think it is better to take a regression algorithm with multiple outputs, where the outputs are the ready-to-use parameters for the TS.


Your scheme is: many datasets + a model with one output.

Better: one dataset + a model with many outputs.


example

https://stackoverflow.com/questions/57704609/multi-target-regression-using-scikit-learn
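A minimal sketch of the "one dataset + model with many outputs" scheme, assuming scikit-learn's `RandomForestRegressor`, which accepts a two-dimensional target. The targets here are synthetic; in practice each row of `Y` would be the parameter pair that performed best in the backtests for that bar, which is itself an assumption about how the targets get built.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_bars = 500

# Synthetic market features per bar:
# [volatility_short, volatility_long, rsi, trade_imbalance]
X_market = rng.normal(size=(n_bars, 4))

# Hypothetical targets: the best-performing parameter pair for each bar.
best_param1 = 2.0 + X_market[:, 0] + rng.normal(scale=0.1, size=n_bars)
best_param2 = 1.0 - 0.5 * X_market[:, 3] + rng.normal(scale=0.1, size=n_bars)
Y = np.column_stack([best_param1, best_param2])

# One model, two outputs: strategy_param1 and strategy_param2.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_market, Y)

pred = model.predict(X_market[:3])   # one row per bar, one column per parameter
print(pred.shape)
```

The trade-off is that a regressor returns a single parameter set per bar and no probability of profit, so there is no built-in way to decide that no setting is worth trading.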

 
Ivan Butko #:

Yes

That's the point: predict a trend, enter. The price starts to waver, predict an exit (HOLD), exit, fuck this trend, let's look for another one.

Traders do the same thing: they predict a trend/impulse and enter. The price starts to waver, and in such a way that the trader doesn't like his entry. He estimates 50/50, or "I don't understand what's going on". And he's like "fuck it, I'd better get out". The cost is a small loss or a small gain.

He does not forecast a "reverse signal". He simply doesn't like how the trend/impulse is playing out.


That's how it should be done: predict entry, predict continuation of the trend, predict exit or reverse entry.

Yes, ideally it's like that