Machine learning in trading: theory, models, practice and algo-trading - page 1605
What you are doing (testing on a "third" sample) is, in GMDH terms, called the "predictive power criterion".
I see you are a good specialist. Could you explain the essence of GMDH in a few phrases, for non-mathematicians?
It is a regression model built by enumerating features transformed through different kernels (polynomials, splines, it doesn't matter which). The simplest model with the lowest error is preferred. It does not protect against overfitting in the market.
Roughly speaking, it is a brute-force search over models, where the simplest one is chosen according to external criteria
it's like the basics of machine learning )
For example, GMDH regression simply makes a mockery of modern random-forest regression and all kinds of boosting...
Boosting is better in every respect; if you prepare the features the way you would for GMDH, it will do better.
though none of that matters if you don't know what to teach the model
I'm not an expert at all )) unfortunately....
Very simply, roughly and imprecisely: the principle of GMDH is self-organization...
For example, we have a set of features
x1,x2,x3.....x20...
From these features we create a set of candidate models
m1,m2,m3.....m10...
The best of these models are selected, new models are built from the best ones, selection again... etc... and so on, for as long as the error on new data (previously unseen by the algorithm) keeps decreasing
The algorithm changes itself, complicates itself, organizes itself... Something like a genetic algorithm
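That layer-by-layer self-organization can be sketched in a few lines. This is my own toy illustration, not a reference GMDH implementation: the quadratic two-input polynomial, the six survivors per layer, and the synthetic target are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends on the first three features; the rest are noise.
X = rng.normal(size=(400, 8))
y = X[:, 0] + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=400)

# Split: one part for fitting, one as the "external criterion".
Xtr, Xva, ytr, yva = X[:200], X[200:], y[:200], y[200:]

def fit_pair(a_tr, b_tr, y_tr):
    """Least-squares fit of a small polynomial in two inputs."""
    def design(a, b):
        return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])
    w, *_ = np.linalg.lstsq(design(a_tr, b_tr), y_tr, rcond=None)
    return w, design

layer_tr, layer_va = list(Xtr.T), list(Xva.T)
best_err = np.inf
for depth in range(5):
    cands = []
    for i in range(len(layer_tr)):
        for j in range(i + 1, len(layer_tr)):
            w, design = fit_pair(layer_tr[i], layer_tr[j], ytr)
            pred_va = design(layer_va[i], layer_va[j]) @ w
            err = np.mean((yva - pred_va) ** 2)   # external criterion
            pred_tr = design(layer_tr[i], layer_tr[j]) @ w
            cands.append((err, pred_tr, pred_va))
    cands.sort(key=lambda c: c[0])
    top = cands[:6]                               # survivors breed the next layer
    if top[0][0] >= best_err:                     # stop when hold-out error rises
        break
    best_err = top[0][0]
    layer_tr = [c[1] for c in top]
    layer_va = [c[2] for c in top]

print(f"best validation MSE: {best_err:.3f}")
```

The outputs of the surviving models become the inputs of the next layer, and the hold-out error is the external criterion that stops the growth: exactly the "selection, then new models from the best ones" loop described above.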
Then I see nothing new and original in this methodology.
Then I don't see any mathematics here; it's more engineering work, well, and coding. A GA is a trivial thing.
Why then does everyone make such a fuss about GMDH, writing dissertations that are impossible to understand, if inside it is something primitive that has been intuitively obvious since kindergarten?
I disagree...
Let's make a small test, quick, by eye )
Create four variables (ordinary Gaussian noise) of 1000 elements each:
z1 <- rnorm(1000)
z2 <- rnorm(1000)
z3 <- rnorm(1000)
z4 <- rnorm(1000)
Create the target variable y as the sum of all four:
y <- z1+z2+z3+z4
Let's train boosting and GMDH, not even for prediction, just to have them describe (explain) the data.
I split the sample into three parts: one for training, two for testing.
Green is GMDH,
red is Generalized Boosted Regression Modeling (GBM),
gray is the original data.
Remember, the target is simply the sum of all the predictors.
http://prntscr.com/rawx14
As we can see, both algorithms handled the task very well.
Now let's make the task a bit harder:
let's add a cumulative sum, i.e. a trend, to the data
z1 <- cumsum(rnorm(1000))
z2 <- cumsum(rnorm(1000))
z3 <- rnorm(1000)
z4 <- rnorm(1000)
and change the target to look like
y <- z1+z2+z3
So we sum two trending predictors and one ordinary one, while z4 turns out to be pure noise, because it takes no part in the target y.
and so we get the following result
http://prntscr.com/rax81b
Our boosting falls apart completely, while GMDH doesn't even flinch.
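Why the trend kills one and not the other is easy to see: y = z1 + z2 + z3 is itself a linear model, so a least-squares fit recovers it exactly even on out-of-range data, while tree-based boosting cannot extrapolate beyond the value range it saw in training. A numpy sketch of that linear half (my own illustration, not the poster's attached R code):

```python
import numpy as np

rng = np.random.default_rng(1)

# Replicate the setup: two trending predictors, two stationary, z4 is noise.
z1 = np.cumsum(rng.normal(size=1000))
z2 = np.cumsum(rng.normal(size=1000))
z3 = rng.normal(size=1000)
z4 = rng.normal(size=1000)
y = z1 + z2 + z3

# First 700 points for training; the last 300 lie outside the trained range.
X = np.column_stack([z1, z2, z3, z4])
Xtr, ytr = X[:700], y[:700]
Xte, yte = X[700:], y[700:]

# Ordinary least squares: the kind of simple model a GMDH-style search keeps.
w, *_ = np.linalg.lstsq(np.column_stack([np.ones(700), Xtr]), ytr, rcond=None)
pred = np.column_stack([np.ones(300), Xte]) @ w

print("coefficients:", np.round(w, 3))   # ≈ [0, 1, 1, 1, 0] — z4 is dropped
print("test MSE:", np.mean((yte - pred) ** 2))
```

The noise predictor z4 gets a near-zero coefficient, and the out-of-sample error stays near zero despite the trend; a pure tree ensemble on the same data would clip its predictions at the training range.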
I managed to "kill" GMDH only with this wild target:
y <- ((z1*z2)/3)+((z3*2)/z4)
And even then not completely. And how did boosting do? )))
http://prntscr.com/raxdnz
Code to play with is attached.
I don't know, but it describes the data much better; the post is written, the code is posted.
I have no desire to mess around with R (I use Python). Maybe the reason is that GMDH creates its own derived regressors, which is why it fits. If you did the same selection for boosting, there would be no difference.
Here is a GMDH-style enumeration for the random forest:
https://www.mql5.com/ru/code/22915
First, what is this about regressors it creates? Nonsense. And why, then, does GMDH hold up when the problem gets harder?
Second, in my example both GMDH and boosting get exactly the same data.
Third, there's no need to mess around: can't you make a matrix of four random columns in Python and take their cumulative sum, to check your boosting?
2 lines of code ))
I'm curious about the result myself.
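For the record, the "2 lines" really are two lines in Python (numpy assumed; the seed is arbitrary):

```python
import numpy as np

# A 1000x4 matrix of standard-normal values...
z = np.random.default_rng(2).normal(size=(1000, 4))
# ...and its column-wise cumulative sum: four random walks, i.e. "trends".
trended = np.cumsum(z, axis=0)

print(trended.shape)  # (1000, 4)
```

Summing any three of the resulting columns reproduces the trending target from the R experiment above.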