Machine learning in trading: theory, models, practice and algo-trading - page 2846

 
Aleksey Nikolayev #:

Judging by the description, a subset of the best passes is first selected by one criterion, then a subset of those is selected by the second criterion, and so on.

"It allows you to select the best passes step by step: first by the number of trades, then from this sample by expected payoff, then by recovery factor, and so on."

The criterion is calculated immediately, at each optimisation pass, rather than at the end of optimisation, when the results of all passes are available. That is why the actual behaviour does not match the description in the help.
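
To make the difference concrete, here is a minimal Python sketch of the two readings. The pass records, the field names (`trades`, `expected_payoff`, `recovery_factor`), the keep fraction and the per-pass scoring formula are all hypothetical and chosen only for illustration; this is not the tester's actual formula.

```python
# Hypothetical optimisation passes; field names and values are made up.
passes = [
    {"id": 1, "trades": 120, "expected_payoff": 1.5, "recovery_factor": 2.0},
    {"id": 2, "trades": 300, "expected_payoff": 0.4, "recovery_factor": 1.1},
    {"id": 3, "trades": 250, "expected_payoff": 2.1, "recovery_factor": 0.9},
    {"id": 4, "trades": 40,  "expected_payoff": 5.0, "recovery_factor": 4.0},
    {"id": 5, "trades": 200, "expected_payoff": 1.8, "recovery_factor": 3.0},
    {"id": 6, "trades": 180, "expected_payoff": 0.9, "recovery_factor": 2.5},
]


def stepwise_select(passes, criteria, keep=0.5):
    """Reading 1 (as described in the help): after optimisation, keep the
    best fraction by the first criterion, then filter the survivors by the
    next criterion, and so on."""
    selected = list(passes)
    for key in criteria:
        selected.sort(key=lambda p: p[key], reverse=True)
        n_keep = max(1, int(len(selected) * keep))
        selected = selected[:n_keep]
    return selected


def per_pass_score(p):
    """Reading 2: one number computed for each pass on its own, without
    looking at other passes. The product used here is an assumption."""
    return p["trades"] * p["expected_payoff"] * p["recovery_factor"]


criteria = ["trades", "expected_payoff", "recovery_factor"]
stepwise_best = stepwise_select(passes, criteria)
score_best = max(passes, key=per_pass_score)

print("stepwise selection:", [p["id"] for p in stepwise_best])
print("best per-pass score:", score_best["id"])
```

With these made-up numbers the step-by-step filter ends on pass 3, while the per-pass score prefers pass 5, so the two procedures are not interchangeable.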
 
Maxim Dmitrievsky #:

I didn't immediately see any difference or advantage

A new way to generate tabular data. How much better is it? Or is GMM still unrivalled?

https://github.com/kathrinse/be_great

 
Evgeni Gavrilovi #:

A new way to generate tabular data. How much better is it? Or is GMM still unrivalled?

https://github.com/kathrinse/be_great

I don't know, I don't analyse tabular data
Not good for time series
Some T-gan would probably be better

⚙️ Time-series Transformer Generative Adversarial Networks

GitHub: https://github.com/jsyoon0823/TimeGAN
Paper: https://arxiv.org/abs/2205.11164v1
Stock data: https://finance.yahoo.com/quote/GOOG/history
Energy data: http://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction

@ai_machinelearning_big_data


 
Maxim Dmitrievsky #:
Some T-gan would probably be better

And how do you check the plausibility? Compare the distributions of real and synthetic data separately for each series?
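
One way to do such a per-series comparison is the two-sample Kolmogorov-Smirnov statistic, computed separately for each series. The sketch below is only an illustration: the series names and the Gaussian "real" and "synthetic" samples are made up, and the statistic is written by hand so that the example has no dependencies.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap is always reached at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


random.seed(0)

# Hypothetical data: two "real" series and two synthetic candidates.
real = {
    "series_a": [random.gauss(0, 1) for _ in range(500)],
    "series_b": [random.gauss(0, 2) for _ in range(500)],
}
synthetic = {
    "series_a": [random.gauss(0, 1) for _ in range(500)],    # same distribution
    "series_b": [random.gauss(1.5, 2) for _ in range(500)],  # shifted on purpose
}

ks_by_series = {name: ks_statistic(real[name], synthetic[name]) for name in real}
for name, d in ks_by_series.items():
    print(f"{name}: KS = {d:.3f}")
```

A small statistic means the marginal distributions are close. Note that this checks only the distribution of values; it says nothing about whether the synthetic series reproduces the order or autocorrelation of the real one.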

 
Evgeni Gavrilovi #:

And how do you check the plausibility? Compare the distributions of real and synthetic data separately for each series?

I've seen a visual comparison via PCA somewhere, but I can't remember where offhand. Maybe later.
 
Evgeni Gavrilovi #:

And how do you check the plausibility? Compare the distributions of real and synthetic data separately for each series?

https://hackernoon.com/a-gan-approach-to-synthetic-time-series-data-pe2r33fd

A GAN approach To Synthetic Time-Series Data | HackerNoon
 

What predictors can we come up with for histograms?

I've attached them as files because the images won't insert, probably another bug.

 
Aleksey Vyazmikin #:

What predictors can we come up with for histograms?

)))))))
I'm embarrassed to ask, but what is the difference between a histogram and points, other than visualisation?
 
mytarmailS #:
)))))))
I'm embarrassed to ask, but what is the difference between a histogram and points, other than visualisation?

You can visualise any shape with dots. Visualisation is needed to stimulate abstract thinking, which stimulates the generation of ideas.

In fact, the histogram shows a binary predictor over the sample: the red bars mean that the signal is absent (zero), and their height shows how long the signal "1" was absent from the sample.

I assume that the shape of the frequency distribution of signal occurrences in the sample can be used to decide how the predictor should be used in training. Accordingly, the predictor can either be excluded or recommended only for building the upper (root) splits.

That is why predictors describing the histograms are needed. Predictors can also be built for the TP+FP balance; ideas for describing it, beyond the well-known ones, are interesting too.
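
As a starting point, the bar heights described above are simply the lengths of the zero runs in the binary predictor, so a few summary statistics of those runs can serve as descriptors. The sketch below is only an assumption about what such descriptors could look like; the function names, the feature names and the sample signal are hypothetical.

```python
from itertools import groupby
from statistics import mean, median


def zero_run_lengths(signal):
    """Lengths of consecutive runs of 0 (no signal) in a binary sequence."""
    return [sum(1 for _ in run) for value, run in groupby(signal) if value == 0]


def describe_gaps(signal):
    """Hypothetical descriptors of the gap distribution of one binary predictor."""
    gaps = zero_run_lengths(signal)
    if not gaps:
        return {"n_gaps": 0, "mean_gap": 0.0, "median_gap": 0.0,
                "max_gap": 0, "zero_share": 0.0}
    return {
        "n_gaps": len(gaps),                   # how many times the signal disappears
        "mean_gap": mean(gaps),                # average length of an absence
        "median_gap": median(gaps),            # typical length of an absence
        "max_gap": max(gaps),                  # longest absence
        "zero_share": sum(gaps) / len(signal), # fraction of the sample without signal
    }


# Made-up binary predictor values in sample order.
signal = [1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1]
features = describe_gaps(signal)
print(zero_run_lengths(signal))
print(features)
```

Predictors whose gaps are few and very long would then look different from predictors with many short gaps, which is the kind of distinction the classification idea above relies on.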

 
Aleksey Vyazmikin #:

You can visualise any shape with dots. Visualisation is needed to stimulate abstract thinking, which stimulates the generation of ideas.

In fact, the histogram shows a binary predictor over the sample: the red bars mean that the signal is absent (zero), and their height shows how long the signal "1" was absent from the sample.

I assume that the shape of the frequency distribution of signal occurrences in the sample can be used to decide how the predictor should be used in training. Accordingly, the predictor can either be excluded or recommended only for building the upper (root) splits.

That is why predictors describing the histograms are needed. Predictors can also be built for the TP+FP balance; ideas for describing it, beyond the well-known ones, are interesting too.

This is not a histogram, or not a histogram in the conventional sense, as Pearson invented it.