Machine learning in trading: theory, models, practice and algo-trading - page 2374

 

Prado's trade labeling

4 simple ways to label financial data for Machine Learning ⋆ Quantdare
  • quantdare.com
We have seen in previous posts what machine learning is and even how to create our own framework. Combining machine learning and finance always leads to interesting results. Nevertheless, in supervised learning it is crucial to find a set of appropriate labels to train your model. In today's post, we are going to see 3 ways to transform our...
 
Maxim Dmitrievsky:

Prado's trade labeling.

This stuff is more interesting. I just don't get it: does it only work from the command line? Has anyone looked at it?

SigCWGAN, a new generation GAN architecture for Time Series Generation. ⋆ Quantdare
  • quantdare.com
As a continuation of our last post on Time Series Signatures and our running list of posts regarding GANs and synthetic data, we now want to present the Signature Conditional Wasserstein GAN, shortened as SigCWGAN, a new GAN architecture presented in [1] that is specifically designed to generate time series of arbitrary length and dimensions. 2...
 
Vladimir Perervenko:

This stuff is more interesting.

That's a separate topic, and it's not limited to GANs.

 
Maxim Dmitrievsky:

Prado's trade labeling

The language is unclear and the functions are unfamiliar... and the author is misleading.

In fixed_time_horizon() there is this line:

idx_lower = data[data[name] < -threshold].index

yet above he wrote:

threshold : int
The predefined constant threshold to compute the labels.

And the values in the pictures below are not int (i.e. 0, 1, 2, 3...), but 0.05, 0.01...

With a double it makes sense: it is the same thing I did with TP = SL = some fixed price change.

But then it is unclear why he called the function fixed_time_horizon(): where is the fixed time? It fixes a price change, not a time.
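Judging by that line and the pictures, the logic is roughly the following sketch (my reconstruction, not the article's code; only the names fixed_time_horizon and threshold come from the article):

import pandas as pd

def fixed_time_horizon(close: pd.Series, horizon: int = 1,
                       threshold: float = 0.05) -> pd.Series:
    """Label each bar by the return over the next `horizon` bars:
    +1 above +threshold, -1 below -threshold, 0 in between."""
    fwd_ret = close.pct_change(horizon).shift(-horizon)  # forward return over `horizon` bars
    labels = pd.Series(0, index=close.index)
    labels[fwd_ret > threshold] = 1    # threshold is a double like 0.05, not an int
    labels[fwd_ret < -threshold] = -1
    return labels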

---------

As for the quantized_labelling() method, I understand nothing from the code. I suppose the threshold there is not a fixed value like 0.05, but a quantile that keeps changing with price volatility.
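If my guess is right, the idea would look something like this (pure speculation on my part, not the article's code; the function name and all parameters here are my own):

import pandas as pd

def quantile_threshold_labels(close: pd.Series, horizon: int = 1,
                              q: float = 0.9, window: int = 100) -> pd.Series:
    fwd_ret = close.pct_change(horizon).shift(-horizon)
    # the threshold adapts to volatility: a rolling quantile of absolute returns
    thr = close.pct_change(horizon).abs().rolling(window).quantile(q)
    labels = pd.Series(0, index=close.index)
    labels[fwd_ret > thr] = 1
    labels[fwd_ret < -thr] = -1
    return labels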

 
elibrarius:

The language is unclear and the functions are unfamiliar... and the author is misleading.

In fixed_time_horizon() there is this line:

idx_lower = data[data[name] < -threshold].index

yet above he wrote:

threshold : int
The predefined constant threshold to compute the labels.

And the values in the pictures below are not int (i.e. 0, 1, 2, 3...), but 0.05, 0.01...

With a double it makes sense: it is the same thing I did with TP = SL = some fixed price change.

But then it is unclear why he called the function fixed_time_horizon(): where is the fixed time? It fixes a price change, not a time.

---------

As for the quantized_labelling() method, I understand nothing from the code. I suppose the threshold there is not a fixed value like 0.05, but a quantile that keeps changing with price volatility.

I have not read the code. The main thing is that the labeling is based not on the chart but on the increments. This gives a number of interesting possibilities, such as applying the labeling to a smoothed chart or to particular components of the time series.

There must be a mistake with the int; it was not Prado himself who wrote it, but some other guys.

The fixed horizon probably refers to the chosen lag of the increments.
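For example (my own illustration, the smoothing and the numbers are arbitrary), the same fixed-horizon labeling runs unchanged on a smoothed series, because all it needs is the increments:

import pandas as pd

def label_smoothed(close: pd.Series, span: int = 20, horizon: int = 1,
                   threshold: float = 0.05) -> pd.Series:
    smoothed = close.ewm(span=span).mean()               # EMA-smoothed "chart"
    fwd = smoothed.pct_change(horizon).shift(-horizon)   # increments of the smoothed series
    return fwd.gt(threshold).astype(int) - fwd.lt(-threshold).astype(int)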

 
Maxim Dmitrievsky:

I have not read the code. The main thing is that the labeling is based not on the chart but on the increments. This gives a number of interesting possibilities, such as applying the labeling to a smoothed chart or to particular components of the time series.

There must be a mistake with the int; it was not Prado himself who wrote it, but some other guys.

The fixed horizon probably refers to the chosen lag of the increments.

Someone there got it wrong, either Prado or those guys.

 

On the quantized_labelling() method:

I see little point in training on it. Classification can be learned well in moments of low volatility and worse in high volatility. Then a 40% error at low volatility plus a 51% error at high volatility will bring the system's profitability back to 0: many small gains can be wiped out by a few big losses.
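A back-of-the-envelope check of that (the numbers are made up, and I assume equal numbers of trades in both regimes):

p_low,  move_low  = 0.60, 0.01   # 40% error, small moves at low volatility
p_high, move_high = 0.49, 0.10   # 51% error, large moves at high volatility

ev_low  = (p_low  - (1 - p_low))  * move_low    # +0.002 per quiet trade
ev_high = (p_high - (1 - p_high)) * move_high   # -0.002 per volatile trade

print(ev_low + ev_high)                         # ~0: a few big losses eat the small gains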
 
elibrarius:

Someone there got it wrong, either Prado or those guys.

Everything is great, worth trying, but I will do it differently.

His book describes it a little differently, I think. I'm too lazy to look it up.
 
Maxim Dmitrievsky:

Everything is great, worth trying, but I will do it differently.

His book describes it a little differently, I think. I'm too lazy to look it up.
I tried TP = SL = a fixed value. The result is 50% on new cross-validation data.
I don't see the point in quantiles, see the post above.
 
elibrarius:
I tried TP = SL = a fixed value. The result is 50% on new cross-validation data.
I don't see the point in quantiles, see the post above.

Here it's increments, without SL and TP.

I did it through clustering and labeled the data that way. Overall the curve on the labeled data is not great, but it is more robust on new data.
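Roughly like this, a bare sketch (all names and parameters here are arbitrary, not my actual code): cluster windows of increments and take the sign of each cluster's mean forward increment as its label.

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

def cluster_labels(close: pd.Series, window: int = 10, n_clusters: int = 8):
    ret = close.pct_change().dropna()
    # features: the last `window` increments at each bar
    X = np.column_stack([ret.shift(i) for i in range(window)])
    fwd = ret.shift(-1)                           # the next increment
    ok = ~np.isnan(X).any(axis=1) & fwd.notna().to_numpy()
    X, fwd = X[ok], fwd.to_numpy()[ok]
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    # direction of each cluster = sign of its mean forward increment
    cluster_dir = np.sign(pd.Series(fwd).groupby(clusters).mean())
    return clusters, cluster_dir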