Machine learning in trading: theory, models, practice and algo-trading - page 3081

 
СанСаныч Фоменко #:

This article is a perfect illustration of the advertising-driven promotion of trivial results.

The very name "Causal Effects" rubs our noses in our backwardness: while studying various sines, we never realised that what we got was the result of Causal Effects from feeding input data to sin and obtaining an output.

The author takes RF, feeds it input data and gets an error as the result.

To make everyone realise that we are dealing with a completely new direction in ML, the input data (predictors) are called covariates, the RF algorithm is called a meta learner, and the whole process is called Causal Effects.

The apologists of Causal Effects are unaware that in Russian usage "covariates" sometimes means those predictors that affect not only the target variable but also neighbouring predictors, i.e. the term should be used more precisely to avoid ambiguity.

Calling the RF algorithm a "meta learner" is another publicity stunt of Causal Effects, since this algorithm produces rules and is certainly NOT a "learner". But from an advertising point of view, machine learning must have "learners", with "meta" added for importance, and basta.

The paper justifies the choice of RF as the base algorithm in some detail, specifically stipulating that any (?) ML algorithm can be used instead of RF. As a generalisation of this thought the term "nuisance", i.e. unpleasant, obnoxious, annoying, is used. Going by the text, it should probably be translated as "a function of noise", i.e. the RF algorithm is a "function of noise". But how intricate and beautiful it sounds; and, most importantly, the reader, who previously thought that RF simply produces rules with some error, just enjoys it.

One could continue, but the above is enough to classify all this Causal Effects business as pure advertising - very successful advertising, by the way: outright nonsense was sold, earned a professorship at Stanford University, and gained followers who want to keep up with the new advanced trends.

So who is the author of this supposedly newest cutting-edge trend in ML? Judging by the number of references, a certain Victor Chernozhukov, a man with no specialised education, who graduated from an agricultural institute in the early 90s. I remember that time very well, when millions of Chernozhukovs, with the cries of a consciousness unclouded by education or facts, ran around pushing all kinds of nonsense; many of them became billionaires and top politicians.

Today the whole world lives by the laws of advertising, in every sphere; I thought this cup would pass ML by. Well, no.

Thanks for the analysis, since I didn't read it myself. The video on the same topic was enough.
 
СанСаныч Фоменко #:

...

This is simply the apotheosis of your professional ineptitude, when new information no longer sinks in at all. Or a problem with translation. I can only sympathise :)
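
For the record: the "meta learner" is not RF itself but the scheme built on top of arbitrary base models, and those base outcome models are exactly the "nuisance" part. A minimal T-learner sketch on simulated data (purely illustrative, not the article's code; any regression algorithm could stand in for randomForest):

# T-learner sketch (hypothetical, simulated data; not the article's code).
# Two RFs serve as nuisance (auxiliary) outcome models, one per treatment arm;
# the "meta" step is simply differencing their predictions.
library(randomForest)

set.seed(0)
n <- 1000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))    # covariates (predictors)
w <- rbinom(n, 1, 0.5)                           # treatment indicator
y <- x$x1 + w * (1 + x$x2) + rnorm(n, sd = 0.5)  # true effect is 1 + x2

m1 <- randomForest(x[w == 1, ], y[w == 1])  # nuisance model: treated outcome
m0 <- randomForest(x[w == 0, ], y[w == 0])  # nuisance model: control outcome

cate <- predict(m1, x) - predict(m0, x)     # per-observation effect estimate
mean(cate)                                  # average treatment effect, close to 1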

 
Maxim Dmitrievsky #:

This is simply the apotheosis of your professional ineptitude, when new information no longer sinks in at all. Or a problem with translation. I can only sympathise :)

All the terms are twisted, the basic information is distorted beyond recognition.

Can you convey the undistorted information to the plebs?

 
СанСаныч Фоменко #:

...

The paper justifies the choice of RF as the base algorithm in some detail, specifically stipulating that any (?) ML algorithm can be used instead of RF. As a generalisation of this thought the term "nuisance", i.e. unpleasant, obnoxious, annoying, is used. Going by the text, it should probably be translated as "a function of noise", i.e. the RF algorithm is a "function of noise". But how intricate and beautiful it sounds; and, most importantly, the reader, who previously thought that RF simply produces rules with some error, just enjoys it.

...

I read it looking for a practical application of all this - so you didn't find any?

It seemed to me that the article was supposed to provide a tool for measuring how far a region of the overall sample deviates from the subsample on which training took place. With such a tool one could then detect anomalous parts of the sample. Do you think it is there or not?
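
Roughly, I imagine something like this (a hypothetical sketch using randomForest's proximity output on iris; I am not claiming the article does exactly this):

# Flag observations that sit far from the bulk of their class via RF proximity.
# Hypothetical illustration on iris, not taken from the article.
library(randomForest)

set.seed(7)
rf <- randomForest(Species ~ ., data = iris, proximity = TRUE)

# outlier() turns the proximity matrix into an outlying measure: large values
# mean low proximity to other cases of the same class, i.e. anomalous cases.
out <- outlier(rf$proximity, cls = iris$Species)
head(sort(out, decreasing = TRUE))  # most anomalous observations first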

 
Aleksey Vyazmikin #:

Can you convey the undistorted information to the plebs?

I can sympathise

 
СанСаныч Фоменко #:

...I thought this cup would pass ML by. Well, no.

And I'm of the same opinion. ))

These profound words describe this entire thread.
 
Aleksey Vyazmikin #:

I read it looking for a practical application of all this - so you didn't find any?

It seemed to me that the article was supposed to provide a tool for measuring how far a region of the overall sample deviates from the subsample on which training took place. With such a tool one could then detect anomalous parts of the sample. Do you think it is there or not?

It is not in the article.

It describes the usual fitting with different divisions of the original predictors, including cross-validation. A routine camouflaged with words.
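
The whole "routine" fits in a dozen lines (a sketch of ordinary k-fold cross-validation with randomForest on iris; illustrative only, nothing more hides behind the words):

# Ordinary k-fold cross-validation with RF; an illustrative sketch only.
library(randomForest)

set.seed(1)
k     <- 5
folds <- sample(rep(1:k, length.out = nrow(iris)))  # random fold assignment

errs <- numeric(k)
for (i in 1:k) {
  train   <- iris[folds != i, ]
  test    <- iris[folds == i, ]
  rf      <- randomForest(Species ~ ., data = train)
  pred    <- predict(rf, newdata = test)
  errs[i] <- mean(pred != test$Species)  # misclassification rate on the fold
}
mean(errs)  # cross-validated error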

 
СанСаныч Фоменко #:

It is not in the article.

It describes the usual fitting with different divisions of the original predictors, including cross-validation. A routine camouflaged with words.

Thanks for the expert opinion.

 
Maxim Dmitrievsky #:


and nuisance functions (or parameters) are not noise functions but auxiliary ones: functions that are not the target of the particular task


Can you point to the place in the article where these "auxiliary" functions can be seen?

Meanwhile, the reasons for using RF, which is called a base function and which computes a great deal of information as a result of its work, are described in considerable detail:

An object of class randomForest, which is a list with the following components:

call: the original call to randomForest.

type: one of regression, classification, or unsupervised.

predicted: the predicted values of the input data based on out-of-bag samples.

importance: a matrix with nclass + 2 (for classification) or two (for regression) columns. For classification, the first nclass columns are the class-specific measures computed as mean decrease in accuracy. The nclass + 1st column is the mean decrease in accuracy over all classes. The last column is the mean decrease in Gini index. For regression, the first column is the mean decrease in accuracy and the second the mean decrease in MSE. If importance=FALSE, the last measure is still returned as a vector.

importanceSD: the "standard errors" of the permutation-based importance measure. For classification, a p by nclass + 1 matrix corresponding to the first nclass + 1 columns of the importance matrix. For regression, a length p vector.

localImp: a p by n matrix containing the casewise importance measures, the [i,j] element of which is the importance of the i-th variable on the j-th case. NULL if localImp=FALSE.

ntree: number of trees grown.

mtry: number of predictors sampled for splitting at each node.

forest: a list that contains the entire forest; NULL if randomForest is run in unsupervised mode or if keep.forest=FALSE.

err.rate: (classification only) vector of error rates of the prediction on the input data, the i-th element being the (OOB) error rate for all trees up to the i-th.

confusion: (classification only) the confusion matrix of the prediction (based on OOB data).

votes: (classification only) a matrix with one row for each input data point and one column for each class, giving the fraction or number of (OOB) 'votes' from the random forest.

oob.times: number of times cases are 'out-of-bag' (and thus used in computing the OOB error estimate).

proximity: if proximity=TRUE when randomForest is called, a matrix of proximity measures among the input (based on the frequency that pairs of data points are in the same terminal nodes).

mse: (regression only) vector of mean square errors: sum of squared residuals divided by n.

rsq: (regression only) "pseudo R-squared": 1 - mse / Var(y).

test: if a test set is given (through the xtest or additionally ytest arguments), this component is a list which contains the corresponding predicted, err.rate, confusion, votes (for classification) or predicted, mse and rsq (for regression) for the test set. If proximity=TRUE, there is also a component, proximity, which contains the proximity among the test set as well as proximity between test and training data.


It is not known exactly what the author uses from the above list, but there are simply no other sources for determining classification or regression errors when using RF, nor is there any need for them.

The errors produced by RF will differ for different combinations of input data. This is what the author studies, drawing conclusions about the error variance and about a certain bias whose method of calculation is not disclosed.
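
For reference, the error-related components from that list are read straight off a fitted model (a minimal illustration on iris with the standard randomForest package; not the article's code):

# Where the classification error actually lives in a randomForest object.
library(randomForest)

set.seed(42)
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

tail(rf$err.rate[, "OOB"], 1)  # OOB error rate after all 500 trees
rf$confusion                   # confusion matrix based on OOB data
rf$importance                  # mean decrease in accuracy / Gini per predictor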

 
Maxim Dmitrievsky #:
Are you also a therapist by trade? No, I'm a full-time therapist.

Yeah, I'm looking for clients, would you like to sign up?

In fact, you don't take criticism. You saw something similar to what you do - filtering out inconvenient portions of the sample - which in your mind lent scientific weight to your approach, and now you are defending it. One of your ways of defending is to attack: belittling and insulting your opponent. I admit there has been progress here - you have become more restrained - and I can even praise you for that.

At the same time, my proposal for joint work, i.e. a constructive proposal aimed at enriching our knowledge of the subject under study, you call a distraction from the topic.

What, then, is the theme of this thread - to demonstrate the beauty and uniqueness of the minds of individual participants? In other words, idle chatter rather than a search for the truth, in your opinion?