Machine learning in trading: theory, models, practice and algo-trading - page 803

 

What is the best way to balance classes, and who uses which methods?

I would also like to ask which predictors you find interesting. May I ask you in private?)

 
forexman77:
What is the best way to balance classes, and who uses which methods?

There are functions for this:

 caret::downSample() - trims the larger class down to the size of the smaller one

 caret::upSample() - tops up the smaller class to the size of the larger one

From the caret documentation: downSample will randomly sample a data set so that all classes have the same frequency as the minority class. upSample samples with replacement to make the class distributions equal.
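For readers who do not use R, the same two ideas can be sketched in plain Python. This is only an illustration of down-sampling and up-sampling, not caret's implementation: the names `down_sample` and `up_sample` and the `seed` parameter are hypothetical, and keeping the original minority rows before adding random copies is an assumption of this sketch.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(rows, labels):
    """Collect rows into one list per class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def down_sample(rows, labels, seed=0):
    """Randomly drop rows so every class is as small as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    n = min(len(g) for g in groups.values())
    out = []
    for label, group in groups.items():
        # sample() draws without replacement
        out.extend((row, label) for row in rng.sample(group, n))
    return out


def up_sample(rows, labels, seed=0):
    """Add random copies so every class is as large as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    n = max(len(g) for g in groups.values())
    out = []
    for label, group in groups.items():
        # choices() draws with replacement; k is 0 for the largest class
        extra = rng.choices(group, k=n - len(group))
        out.extend((row, label) for row in group + extra)
    return out


# Toy data: 7 "up" rows and 3 "down" rows.
rows = list(range(10))
labels = ["up"] * 7 + ["down"] * 3

smaller = down_sample(rows, labels)   # 3 rows per class
larger = up_sample(rows, labels)      # 7 rows per class
print(Counter(label for _, label in smaller))
print(Counter(label for _, label in larger))
```

Note that up-sampling repeats rows, so if it is applied before splitting into training and test sets, copies of the same row can end up in both.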


In general, I recommend studying caret: you can use it as a "what is out there" tutorial, since it contains tools for the full cycle:

  • preparation of the initial data,
  • selection of predictors (really excellent tools),
  • close to 200 models (regression and classification) and
  • model evaluation (nothing to do with the tester).

You can end up with quite industrial-grade solutions.



PS.

I don't think anyone will reveal predictors - that's the most important thing, and models are a matter of technique and diligence.

 
SanSanych Fomenko:


I don't think anyone will reveal the predictors - that's the most important thing, and the models are a matter of technique and diligence.

Well, I asked on the off chance. After all, people do create silly threads like "how to triple the deposit" or "tell me a profitable Expert Advisor".

But I do need advice. Indicators are the obvious choice, the first thing that comes to mind. What I am looking for is something unusual, where very few people have dug.

 
forexman77:

Well, I asked on the off chance. After all, people do create silly threads like "how to triple the deposit" or "tell me a profitable Expert Advisor".

But I do need advice. Indicators are the obvious choice, the first thing that comes to mind. What I am looking for is something unusual, where very few people have dug.

Take digital filters from the last part of the article. It will be useful, I think, to read all the previous parts as well.

Good luck

 
SanSanych Fomenko:

There are functions

downSample will randomly sample a data set so that all classes have the same frequency as the minority class. upSample samples with replacement to make the class distributions equal.


In general, I recommend studying caret: you can use it as a "what is out there" tutorial, since it contains tools for the full cycle:

  • preparation of the initial data,
  • selection of predictors (really excellent tools),
  • close to 200 models (regression and classification) and
  • model evaluation (nothing to do with the tester).

You can end up with quite industrial-grade solutions.



PS.

I don't think anyone will reveal predictors - that's the most important thing, and models are a matter of technique and diligence.

Models are assembled from ready-made components, whereas with predictors you start from a blank sheet.

 
forexman77:

Well, I asked on the off chance. After all, people do create silly threads like "how to triple the deposit" or "tell me a profitable Expert Advisor".

But I do need advice. Indicators are the obvious choice, the first thing that comes to mind. What I am looking for is something unusual, where very few people have dug.

In the simplest case, I use a set of bars as predictors: an integral value is calculated from them using a given formula and fed in as input data during training.


I do the balancing while sampling: starting from the end-of-training mark, I go back into the history in a loop until the number of samples in each class reaches a given value and the counts are equal.
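A rough sketch of how such a backward walk could look, assuming the labels are stored oldest-first in a list. The function name `balanced_backward_sample` and its parameters are hypothetical; this is not the actual code behind the post, only one reading of it.

```python
def balanced_backward_sample(labels, end_index, per_class):
    """Walk from end_index back into history, keeping the most recent
    per_class samples of every class.

    Returns (indices in chronological order, per-class counts).
    If history runs out first, the counts show which class fell short.
    """
    classes = set(labels[: end_index + 1])
    counts = {c: 0 for c in classes}
    picked = []
    for i in range(end_index, -1, -1):      # newest bar first
        label = labels[i]
        if counts[label] < per_class:       # this class still needs samples
            picked.append(i)
            counts[label] += 1
        if all(n == per_class for n in counts.values()):
            break                           # every class is full, stop early
    picked.reverse()                        # restore chronological order
    return picked, counts


# Toy example: class 1 is rare, class 0 is common.
labels = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
indices, counts = balanced_backward_sample(labels, end_index=9, per_class=2)
print(indices, counts)
```

With this approach the rare class reaches further back into history than the common one, so the two classes cover different time spans.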

 
Ivan Negreshniy:
In the simplest case, I use a set of bars as predictors: an integral value is calculated from them using a given formula and fed in as input data during training.

Actually, the best predictor is the price series itself. Any processing means a loss and a delay of information. And if you also consider that we do not really know what kind of information we need, then...

I work with 1m, and a delay of even 30 seconds is the death of the system. And even very simple indicators give a delay of 1-3 minutes.

Even if you work on an hourly timeframe, the delay on the indicators will be 1-3 hours.) In any case it is 1-3 candles, no matter how you spin it.

P.S. 2: Another plus of the time series is that we don't need to select predictors. We only have to learn to use the time series itself in that role.
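The delay in candles is easy to check for at least one simple indicator. The post does not say which indicators are meant, so the simple moving average below is an assumption chosen for illustration. On a price that rises by exactly one unit per bar, the gap between the last price and the average equals the lag in bars, which for a window of N bars is (N - 1) / 2.

```python
def sma(values, window):
    """Simple moving average; one output per bar from index window-1 on."""
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]   # slide the window by one bar
        out.append(running / window)
    return out


def sma_lag_in_bars(window):
    """Measure the SMA lag on a ramp that rises by 1 per bar."""
    ramp = [float(t) for t in range(window * 3)]
    averaged = sma(ramp, window)
    # Price moves 1 unit per bar, so the price gap is the lag in bars.
    return ramp[-1] - averaged[-1]


for window in (3, 5, 7):
    print(window, sma_lag_in_bars(window))   # lags of 1.0, 2.0 and 3.0 bars
```

So windows of 3 to 7 bars already give the 1-3 candles mentioned above, whatever the timeframe.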

 
Yuriy Asaulenko:

Actually, the best predictor is the price series itself. Any processing means a loss and a delay of information. And if you also consider that we do not really know what kind of information we need, then...

I work with 1m, and a delay of even 30 seconds is the death of the system. And even very simple indicators give a delay of 1-3 minutes.

Even if you work on an hourly timeframe, the delay on the indicators will be 1-3 hours.) In any case it is 1-3 candles, no matter how you spin it.

P.S. 2: Another plus of the time series is that we don't need to select predictors. We only have to learn to use the time series itself in that role.

How can one not respond to a statement like that? An hourly chart can be examined under a magnifying glass at 15m, 5m or 1m. It is sad, gentlemen.

 
Uladzimir Izerski:

How can one not respond to a statement like that? An hourly chart can be examined under a magnifying glass at 15m, 5m or 1m. It is sad, gentlemen.

Don't be sad, I know.) If that suits you, carry on.

 
Please tell me: when selecting data at the initial stage, is it enough to look for correlation with the target? If so, what correlation threshold should be used?
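For anyone who wants to try this kind of screening, here is a minimal sketch using Pearson correlation. The names `pearson` and `screen_by_correlation` are hypothetical, and the threshold of 0.5 in the example is arbitrary: nothing in this thread suggests a particular value. Linear correlation also says nothing about nonlinear dependence, so a low score does not prove a predictor is useless.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0                     # a constant series carries no signal
    return cov / math.sqrt(var_x * var_y)


def screen_by_correlation(predictors, target, threshold):
    """Keep predictors whose |correlation| with the target is >= threshold."""
    scores = {name: pearson(col, target) for name, col in predictors.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Toy data: "a" follows the target, "c" mirrors it, "b" is unrelated.
target = [1, 2, 3, 4, 5]
predictors = {
    "a": [2, 4, 6, 8, 10],
    "b": [1, -1, 1, -1, 1],
    "c": [5, 4, 3, 2, 1],
}
kept, scores = screen_by_correlation(predictors, target, threshold=0.5)
print(kept)
```

The absolute value is used so that strongly negative correlations are kept as well.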