Machine learning in trading: theory, models, practice and algo-trading - page 239

 
Andrey Dik:
Try describing the candle with two numbers, each in the range [-1.0; 1.0]: the positions of O and C relative to H and L.
From your example it would look something like this:
1. [-0.8; 0.8]
2. [-0.2; 0.2]
3. [-0.9; -0.1]
How do you do that?
 
mytarmailS:
How do you do that?
In terms of height, H is 1 and L is -1; express O and C relative to H and L accordingly.
That describes the candle's shape clearly, regardless of its size.
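A minimal R sketch of that mapping, as I read it (the function name and the handling of degenerate candles are my own): L maps to -1, H to +1, and O and C are placed linearly in between.

# Encode Open and Close into [-1, 1] relative to the candle's High/Low range.
encode_candle <- function(open, high, low, close) {
  rng <- high - low
  rng[rng == 0] <- NA                    # a candle with H == L has no defined shape
  o <- 2 * (open  - low) / rng - 1       # L -> -1, H -> +1
  c <- 2 * (close - low) / rng - 1
  data.frame(o = o, c = c)
}

encode_candle(open = 1.12, high = 1.20, low = 1.00, close = 1.18)
#     o   c
# 1 0.2 0.8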

 
Andrey Dik:
In terms of height, H is 1 and L is -1; express O and C relative to H and L accordingly.
That describes the candle's shape clearly, regardless of its size.

This does not take the candle's volatility into account; all the calculations are, as it were, inside the candle, so the ML cannot tell what kind of candle it is, a gap or a small doji.

I think percentage increments are the most sensible, but I am not computing them correctly.
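For reference, a hedged sketch of the usual way percentage increments are computed from closes (the variable name close is illustrative):

# Percentage increment of each close relative to the previous close.
pct_inc <- 100 * diff(close) / head(close, -1)   # simple % change
log_inc <- 100 * diff(log(close))                # log return, nearly the same for small moves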

 
mytarmailS:

This does not take the candle's volatility into account; all the calculations are, as it were, inside the candle, so the ML cannot tell what kind of candle it is, a gap or a small doji.

I think percentage increments are the most sensible, but I am not computing them correctly.

Volatility is exactly what you don't need to take into account. And you should get rid of gaps (shift the candles by the distance of the gap).
 
Andrey Dik:
Volatility is exactly what you don't need to take into account. And you should get rid of gaps (shift the candles by the distance of the gap).
On the contrary, gaps should be remembered and taken into account, since statistically gaps tend to get closed anyway. I once looked for a gap indicator, did not find one, and made my own gap detection through fractals. But I could still use a good indicator.
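A rough sketch of the "shift the candles by the gap distance" idea as I understand it (the function and the data layout are my assumptions, not code from the thread): each jump from the previous close to the current open is accumulated and subtracted from all later candles, so the series becomes continuous.

# Remove gaps by shifting every candle back by the accumulated gap size.
# ohlc is assumed to be a data.frame with columns open, high, low, close.
remove_gaps <- function(ohlc) {
  gap   <- c(0, ohlc$open[-1] - ohlc$close[-nrow(ohlc)])  # jump from previous close to current open
  shift <- cumsum(gap)                                    # total displacement accumulated so far
  ohlc$open  <- ohlc$open  - shift
  ohlc$high  <- ohlc$high  - shift
  ohlc$low   <- ohlc$low   - shift
  ohlc$close <- ohlc$close - shift
  ohlc   # note: this treats every jump as a gap; a real version would apply a minimum gap threshold
}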
 
Guys, there is an indicator called CandleCode; it encodes candles so that identical candles get the same code, taking the scatter into account. Why are you all reinventing the wheel, I do not understand :-(.
 
Vizard_:
The lesson is over)))

Thanks, I think I got it. It seems too simple to believe, but I will check it.

It is also strange that the sign is a separate predictor; I would just make the candle size negative if it is a down candle. That should be tried too.
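If it helps, that "negative size for down candles" idea is a one-liner in R (illustrative vectors, not from the thread):

# Candle size with the direction encoded in the sign: positive for up candles, negative for down ones.
signed_size <- ifelse(close >= open, 1, -1) * (high - low)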

 
Dr.Trader:

Thanks, I think I got it. It seems too simple to believe, but I will check it.

It is also strange that the sign is a separate predictor; I would just make the candle size negative if it is a down candle. That should be tried too.

I don't get it, though.

How was the target made?

Where did the formula come from?

 

I continue to believe that without selecting predictors by their effect on the target variable, everything else is irrelevant. This is the very first step. Either we remove the noise predictors, and then our chances of building a model that is NOT overfitted increase, or the noise predictors remain, which inevitably leads to overfitting. And since the future behavior of an overfitted model is in no way related to its past behavior, such an overfitted model is useless.

Here is another interesting approach to determining the importance of predictors. The usual multitude of significance-testing algorithms is not used here.

Here is the executed code from this post

> n <- 10000
>
> x1 <- runif(n)
> x2 <- runif(n)
> y <- -500 * x1 + 50 * x2 + rnorm(n)
>
> model <- lm(y ~ 0 + x1 + x2)
>
> # 1a. Standardized betas
> summary(model)$coe[,2]
        x1         x2
0.02599082 0.02602010
> betas <- model$coefficients
> betas
        x1         x2
-500.00627   50.00839
> imp <- abs(betas)/sd.betas
Error: object 'sd.betas' not found
> sd.betas <- summary(model)$coe[,2]
> betas <- model$coefficients
> imp <- abs(betas)/sd.betas
> imp <- imp/sum(imp)
> imp
       x1        x2
0.9091711 0.0908289
> imp1 <- abs(model$coefficients[1] * sd(x1)/sd(y))
> imp2 <- abs(model$coefficients[2] * sd(x2)/sd(y))
>
> imp1 / (imp1 + imp2)
       x1
0.9095839
> imp2 / (imp1 + imp2)
       x2
0.0904161
> # 2. Standardized variables
> model2 <- lm(I(scale(y)) ~ 0 + I(scale(x1)) + I(scale(x2)))
> summary(model2)

Call:
lm(formula = I(scale(y)) ~ 0 + I(scale(x1)) + I(scale(x2)))

Residuals:
       Min         1Q     Median         3Q        Max
-0.0236475 -0.0046199  0.0000215  0.0046571  0.0243383

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
I(scale(x1)) -9.932e-01  6.876e-05  -14446   <2e-16 ***
I(scale(x2))  9.873e-02  6.876e-05    1436   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.006874 on 9998 degrees of freedom
Multiple R-squared:      1,     Adjusted R-squared:      1
F-statistic: 1.058e+08 on 2 and 9998 DF,  p-value: < 2.2e-16
> abs(model2$coefficients)/sum(abs(model2$coefficients))
I(scale(x1)) I(scale(x2))
  0.90958355   0.09041645
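
For convenience, the same standardized-coefficient importance can be wrapped into a small helper; this is only a sketch built from the session above (the function name is mine), not part of the original post:

# Relative importance of predictors in a linear model: |beta_i| * sd(x_i) / sd(y), normalized to sum to 1.
lm_importance <- function(model, predictors, response) {
  betas <- coef(model)
  sds   <- sapply(predictors[names(betas)], sd)   # sd of each predictor that appears in the model
  imp   <- abs(betas) * sds / sd(response)
  imp / sum(imp)
}

lm_importance(model, data.frame(x1 = x1, x2 = x2), y)
# should reproduce the ratios above: x1 ~ 0.9096, x2 ~ 0.0904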

How important is that variable?
  • 2016.12.03
  • Andrés Gutiérrez
  • hagutierrezro.blogspot.nl
When modeling any phenomena by including explanatory variables that highly relates the variable of interest, one question arises: which of the auxiliary variables have a higher influence on the response? I am not writing about significance testing or something like this. I am just thinking like a researcher who wants to know the ranking of...