Machine learning in trading: theory, models, practice and algo-trading - page 448

 
mytarmailS:
what is the target function in your classifier?
There is no target function; it works on the principle that the further the predictors, taken together, are from their mean, the faster they should converge back to it, i.e. it works like a Bayesian classifier: it finds weights such that, in aggregate, the predictors give the largest deviation from the mean in each particular case, after which they should converge back. Since we take the predictors in stationary form, the mean is obviously 0. If the output is > 0 we sell, if < 0 we buy.
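As I read this description, the decision rule reduces to the sign of a weighted sum of zero-mean predictors, with the weights supplied by an external optimizer. A minimal sketch of that reading (the predictor values and weights below are purely illustrative, not the poster's actual setup):

```python
# Minimal sketch: predictors assumed stationary with mean 0; an external
# optimizer supplies weights w so that the weighted sum deviates strongly
# from 0, and the sign of that sum is traded against (mean reversion).
import numpy as np

def trade_signal(predictors, weights):
    """predictors: 1-D array of stationary (zero-mean) predictor values,
    weights: 1-D array found by an external optimizer (assumed given)."""
    score = float(np.dot(weights, predictors))
    if score > 0:
        return "sell"   # far above the mean -> expect reversion down
    elif score < 0:
        return "buy"    # far below the mean -> expect reversion up
    return "flat"

# Example with three hypothetical zero-mean predictors and arbitrary weights.
print(trade_signal(np.array([0.8, -0.1, 0.3]), np.array([0.5, 0.2, 0.3])))
```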
 
Maxim Dmitrievsky:

In general I have come to the conclusion that the MLP is an ugly monster, clumsy and unpromising for trading, especially since it copies the working mechanism of real neurons very primitively and not the way it actually happens in the brain :) The only normal and promising NN is the convolutional network for pattern recognition, but those cannot predict, and for prediction an ensemble of simple and fast classifiers is enough.

A Bayesian classifier is better than the MLP, but still worse than RF.

Interestingly, I came to the exact opposite conclusion about the "ugly monsters".)

RF requires selection of predictors, which is a nontrivial task given the requirement that they be at least linearly independent. With an MLP I simply feed in the time series, and the linear-independence requirement is handled by a committee of several NNs whose inputs are decimated (thinned) time series, an analogue of several timeframes. The NN's time delays are, I suppose, insignificant for real trading.

I don't yet know how it will behave in a real TS, but the NN does seem trainable. See a fragment of the output of the trained NN in the chart. I cannot yet say for sure how well it is trained.) But it is learning.)
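A rough sketch of the committee idea described above, as I understand it: each network sees the same series sampled with a different step (a stand-in for different timeframes), and the committee output is the average of the individual outputs. Everything below (window length, decimation steps, targets, the library) is illustrative, not the poster's actual configuration:

```python
# Committee of MLPs on decimated copies of one series (illustrative sketch).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
price = np.cumsum(rng.normal(size=5000))          # synthetic random-walk "price"

def make_dataset(series, step, window=15):
    """Lagged windows of the series taken every `step` bars; target = next value."""
    s = series[::step]
    X = np.array([s[i:i + window] for i in range(len(s) - window)])
    y = s[window:]
    return X, y

committee = []
for step in (1, 4, 16):                           # three "timeframes"
    X, y = make_dataset(price, step)
    net = MLPRegressor(hidden_layer_sizes=(15, 15), max_iter=300, random_state=0)
    net.fit(X, y)
    committee.append((step, net))

def committee_predict(series, window=15):
    """Average the predictions of the networks trained on each decimation step."""
    preds = []
    for step, net in committee:
        s = series[::step]
        preds.append(net.predict(s[-window:].reshape(1, -1))[0])
    return float(np.mean(preds))

print(committee_predict(price))
```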


 
Yuriy Asaulenko:

Interestingly, I came to the exact opposite conclusion about the "ugly monsters".)

RF requires selection of predictors, which is a nontrivial task given the requirement that they be at least linearly independent. With an MLP I simply feed in the time series, and the linear-independence requirement is handled by a committee of several NNs whose inputs are decimated (thinned) time series, an analogue of several timeframes. The NN's time delays are, I suppose, insignificant for real trading.

I don't yet know how it will behave in a real TS, but the NN does seem trainable. See a fragment of the output of the trained NN in the chart.


Just throw the predictors on the chart in the form of oscillators and look: you can see whether they are linearly or non-linearly dependent.) No numbers are needed. The NN can also overfit; it will not invent any super non-linear relationships out of nothing if they are not there to begin with or are inconsistent.

Or you need to use a kernel machine before the NN, as in Jpredictor, which raises the dimension of the inputs with polynomials, then runs them through SVM and some other stuff and keeps the most informative ones; on the other hand, because of those very polynomials it can overfit like hell.
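I don't know Jpredictor's internals; the following is only a generic sketch of the pipeline as described above: blow up the input dimension with polynomial terms, then let a linear SVM keep the most informative ones. With enough polynomial terms this can indeed overfit badly, which is the caveat mentioned. The data and parameters here are made up for illustration:

```python
# Polynomial expansion followed by SVM-based feature selection (illustrative).
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import LinearSVC
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))                                # 5 illustrative predictors
y = (X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=500) > 0).astype(int)

model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),        # raise the dimension
    SelectFromModel(LinearSVC(C=0.1, dual=False)),           # keep informative terms
    LinearSVC(C=1.0, dual=False),                            # final classifier
)
model.fit(X, y)
print("in-sample accuracy:", model.score(X, y))
```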

 
Maxim Dmitrievsky:

Just throw the predictors on the chart in the form of oscillators and look: you can see whether they are linearly or non-linearly dependent.) No numbers are needed. The NN can also overfit; it will not invent any super non-linear relationships out of nothing if they are not there to begin with or are inconsistent.

Not everything is as simple as it seems. SanSanych has been fiddling with predictors for a year now, jumping from one forest to another (from package to package).

Maxim Dmitrievsky:

Or maybe you need to use a kernel machine before the NN, like in Jpredictor, which raises the dimension of the inputs with polynomials and then keeps the most informative ones through SVM and some other stuff.

Linear independence and non-linearity have nothing to do with each other. They are different concepts. See: Linear independence (ru.wikipedia.org).
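A small illustration of the distinction being made (my own example, not from the thread): x and x**2 are nonlinearly related, yet as columns they are linearly independent, while x and 2*x are perfectly linearly dependent.

```python
# Linear independence vs. nonlinearity: two quick rank checks.
import numpy as np

x = np.linspace(-1.0, 1.0, 50)
print(np.linalg.matrix_rank(np.column_stack([x, x ** 2])))   # 2 -> linearly independent
print(np.linalg.matrix_rank(np.column_stack([x, 2 * x])))    # 1 -> linearly dependent
```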
 
Maxim Dmitrievsky:

Just throw the predictors on the chart in the form of oscillators and look: you can see whether they are linearly or non-linearly dependent.)

PS By the way, MLPs, unlike the single-layer perceptron, are inherently nonlinear and are quite capable of generalizing nonlinear features.
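The classic illustration of this point is XOR: a single-layer perceptron cannot separate it (it is not linearly separable), while an MLP with one hidden layer can. A minimal sketch, using scikit-learn purely for illustration:

```python
# Single-layer perceptron vs. MLP on XOR.
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                                   # XOR labels

print(Perceptron(max_iter=1000).fit(X, y).score(X, y))       # around chance level
mlp = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    max_iter=5000, random_state=0)
print(mlp.fit(X, y).score(X, y))                             # usually 1.0
```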
 
Yuriy Asaulenko:
PS By the way, MLPs, unlike the single-layer perceptron, are inherently nonlinear and are quite capable of generalizing nonlinear features.

They can, and so can RF, but they overfit just as badly.
 
Maxim Dmitrievsky:
There is no target function; it works on the principle that the further the predictors, taken together, are from their mean, the faster they should converge back to it, i.e. it works like a Bayesian classifier: it finds weights such that, in aggregate, the predictors give the largest deviation from the mean in each particular case, after which they should converge back. Since we take the predictors in stationary form, the mean is obviously 0. If the output is > 0 we sell, if < 0 we buy.
I don't quite understand: is the training supervised (with a teacher) or unsupervised? If supervised, what is the buy signal for the classifier?
 
mytarmailS:
I don't quite understand: is the training supervised (with a teacher) or unsupervised? If supervised, what is the buy signal for the classifier?
Without a teacher; the weights are selected in the optimizer. This has already been discussed, and there is an article and an example; look in the RNN Reshetov thread.
 
Maxim Dmitrievsky:
In general, the NN has no advantages over RF: it takes a long time to compute and the error is larger... if you want to train fast, then RF + optimizer is the obvious choice.

Regarding the speed of the NN.

Especially for this purpose I ran a speed experiment. For it I took an MLP with the layer structure [15,15,15,8,2]. Training sample size: input 15 x 10378, output 2 x 10378.

Training the MLP on this data for 10 epochs takes about 10 minutes.

Running the data straight through the network: the 15 x 10378 input is processed in less than 3 seconds, i.e. ~0.0003 s/sample.

More than enough for building a TS.)
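The poster's library and training algorithm are not stated, so this is not a reproduction of that setup, just a rough way to sanity-check timings of the same shape with scikit-learn: a 15-input MLP with hidden layers 15-15-8 and 2 output classes on ~10378 samples, measuring training and per-sample inference time on synthetic data:

```python
# Rough timing check for an MLP of comparable size (illustrative, not the poster's code).
import time
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10378, 15))
y = (rng.random(10378) > 0.5).astype(int)

net = MLPClassifier(hidden_layer_sizes=(15, 15, 8), max_iter=10, random_state=0)

t0 = time.perf_counter()
net.fit(X, y)                                    # max_iter=10 ~ "10 epochs" here
t1 = time.perf_counter()
net.predict(X)
t2 = time.perf_counter()

print(f"train: {t1 - t0:.2f} s, inference: {(t2 - t1) / len(X) * 1e3:.4f} ms/sample")
```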

 
Yuriy Asaulenko:

Regarding the speed of the NN.

Especially for this purpose I ran a speed experiment. For it I took an MLP with the layer structure [15,15,15,8,2]. Training sample size: input 15 x 10378, output 2 x 10378.

Training the MLP on this data for 10 epochs takes about 10 minutes.

Running the data straight through the network: the 15 x 10378 input is processed in less than 3 seconds, i.e. ~0.0003 s/sample.

More than enough for building a TS.)

That seems too fast; something like that should take maybe an hour or several hours to train. What algorithm is it, L-BFGS? I also had 15 inputs, but only one hidden layer of 15-20 neurons, and my ALGLIB NN took so long to train that I didn't wait and reduced the size of the input vectors.) With 3 inputs and a 10k vector it trained in 5-10 minutes, and that was with one hidden layer. And that was not slow backpropagation but the fast one, with 1-3 epochs. An i5 processor.

Imagine: even at 10 minutes per run you still don't have a ready strategy, and you have to search through N predictors, vector lengths, numbers of hidden layers, etc. in your optimizer to find one...
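A back-of-the-envelope version of this point, with purely illustrative grid sizes (the counts below are assumptions, not figures from the thread): even a modest sweep multiplies a 10-minute training run into days of optimisation.

```python
# Illustrative search-space arithmetic: assumed grid sizes, 10 minutes per fit.
n_predictor_sets = 20        # assumed
n_vector_lengths = 10        # assumed
n_layer_configs  = 5         # assumed
minutes_per_fit  = 10

total_fits = n_predictor_sets * n_vector_lengths * n_layer_configs
print(total_fits, "fits ->", total_fits * minutes_per_fit / 60 / 24, "days")
```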
