Machine learning in trading: theory, models, practice and algo-trading - page 2630

 
Maxim Dmitrievsky #:
Kind of creative, you don't know in advance.
The trades of a profitable TS should point to a pattern. If it only uses price/time, then selection/approximation seems possible.
Anything can be approximated, but a TS is explicit logic in code, without approximations.
 
mytarmailS #:
Anything can be approximated, but a TS is explicit logic in code, without approximations.
We don't know the exact logic, come to think of it... it's not decompilation. That leaves a fuzzy copy, "in the image and likeness", like counterfeit "Abibas" sneakers.
 
Maxim Dmitrievsky #:
We don't know the exact logic, you know... it's not decompilation. That leaves a fuzzy copy, "in the image and likeness", like counterfeit "Abibas" sneakers.

So, I took the crossover strategy of two MAs and did not give the model a direct feature for the crossover.

It's pretty good, I'm even surprised, but it's a primitive algorithm...

Blue is the original signal, red is the prediction.

Reference
Prediction   0   1
         0 106   4
         1   1  89
                                          
               Accuracy : 0.975           
                 95% CI : (0.9426, 0.9918)
    No Information Rate : 0.535           
    P-Value [Acc > NIR] : <2e-16
                                          
                  Kappa : 0.9496     


And if you don't normalize it...

Prediction   0   1
         0  96   0
         1   0 104
                                     
               Accuracy : 1          
                 95% CI : (0.9817, 1)
    No Information Rate : 0.52       
    P-Value [Acc > NIR] : < 2.2e-16
                                     
                  Kappa : 1          
x <- cumsum(rnorm(10000))          # random-walk "price"

m5  <- TTR::SMA(x, 5)              # fast MA
m15 <- TTR::SMA(x, 15)             # slow MA

# Features: the last 20 values of the MA spread, with no explicit crossover flag
X <- matrix(ncol = 20, nrow = length(x))

for(i in 20:length(x)){
  ii <- (i-19):i
  X[i,] <- m5[ii] - m15[ii]
}

# Target: 1 when the fast MA is above the slow MA
Yn <- (m5 > m15) * 1
Y  <- as.factor(Yn)

tr <- 50:9800        # training indices (skipping the leading NAs)
ts <- 9801:10000     # test indices

library(randomForest)
rf <- randomForest(x = X[tr,], y = Y[tr])
pr <- predict(rf, X[c(tr,ts),])

prN <- as.numeric(as.character(pr))

par(mar = c(2,2,0,0))
layout(1:3, heights = c(10,1,1))   # heights of the three panels

plot(tail(x, 200), t = "l", col = 8)
lines(tail(m5[c(tr,ts)], 200), col = 2, lwd = 1)
lines(tail(m15[c(tr,ts)], 200), col = 4, lwd = 1)
plot(tail(Yn, 200),  t = "h", col = 4, lwd = 2)   # original signal (blue)
plot(tail(prN, 200), t = "h", col = 2, lwd = 2)   # model prediction (red)

caret::confusionMatrix(predict(rf, X[ts,]), Y[ts])
 
mytarmailS #:

So, I took the crossover strategy of two MAs and did not give the model a direct feature for the crossover.

It's pretty good, I'm even surprised, but it's a primitive algorithm...

Blue is the original signal, red is the prediction.


And if you don't normalize it...

OK then, we need to figure out how to parse the reports and try a simple TS on them, like your MAs, yeah. I'm a bit busy at the moment, but it's a fun topic.
 
mytarmailS #:

So, I took the crossover strategy of two MAs and did not give the model a direct feature for the crossover.

It's pretty good, I'm even surprised, but it's a primitive algorithm...

Blue is the original signal, red is the prediction.


And if you don't normalize it...

You cannot know in advance which MAs the Expert Advisor uses and with what periods, or whether it uses any other indicators at all.
Try to train the model not on the MAs (X) but on raw quotes (x), for example on 100 bars (you don't know the MA periods of the black box, you can only guess how many bars might have been used).

Well, Y is the one produced by the black box being examined.
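
For illustration only (this code is not in the thread): a minimal R sketch of that suggestion, reusing x, Yn and ts from the script above and replacing the MA-spread features with raw increments of the last 100 bars. The window of 100 bars and the default randomForest settings are assumptions.

win <- 100                                  # assumed window; the real MA periods are unknown
Xr  <- matrix(ncol = win, nrow = length(x))
for(i in (win + 1):length(x)){
  Xr[i,] <- diff(x[(i - win):i])            # raw increments of the last 100 bars
}

Yr <- as.factor(Yn)                         # Y is still whatever the "black box" produced

tr_raw <- (win + 1):9800                    # skip rows without a full window
library(randomForest)
rf_raw <- randomForest(x = Xr[tr_raw,], y = Yr[tr_raw])
caret::confusionMatrix(predict(rf_raw, Xr[ts,]), Yr[ts])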

 
elibrarius #:

You cannot know in advance which MAs the Expert Advisor uses and with what periods, or whether any other indicators are used.

Don't tell me what I can and can't do; say "I don't know how it can be done". That's more honest.

 
elibrarius #:


Try to train the model on raw quotes (x) instead of the MAs (X)

Not bad on raw quotes either.

 Reference
Prediction   0   1
         0  72   2
         1   5 121
                                          
               Accuracy : 0.965           
                 95% CI : (0.9292, 0.9858)
    No Information Rate : 0.615           
    P-Value [Acc > NIR] : <2e-16
                                          
                  Kappa : 0.9255     
 
mytarmailS #:

Not bad on raw quotes either.

That's more interesting...
 
mytarmailS #:

Not bad on raw quotes either.

Does it really need ML?

 

My results. Whoever can decipher it, well done, I've forgotten what's what.

Another test example: the crossing of an MA and price. The input is the increments of the last few bars, the output is the trade direction (1 = buy, 0 = sell). Parameters of the base network: 1 Dense layer with tanh, 1 epoch, batch = 32. win is the number of inputs, per is the MA period, total is the training sample size. The network is trained for 1 epoch so that no sample is repeated during training. Validation is done on the training sample inverted vertically (multiplied by -1). The test runs on a separate independent sample. All samples have size total. For per <= win the network shows high accuracy, which is what needed to be shown: the network is able to find hidden patterns.
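
A minimal sketch (not the author's code) of what such a network could look like in R with keras. The only details taken from the post are the single Dense layer with tanh, 1 epoch, batch = 32, win inputs of bar increments and the 0/1 direction label; the data generation, optimizer, loss and the label flip for the inverted validation sample are assumptions.

library(keras)

total <- 1e5; per <- 100; win <- 100        # assumed sizes for the sketch

x  <- cumsum(rnorm(total + win + per))      # random-walk "price"
ma <- TTR::SMA(x, per)                      # the black box's MA
y  <- (x > ma) * 1                          # assumed label: price above its MA = buy

idx <- (per + win + 1):(per + win + total)
X <- t(sapply(idx, function(i) diff(x[(i - win):i])))   # increments of the last win bars
Y <- y[idx]

model <- keras_model_sequential() %>%
  layer_dense(units = 1, activation = "tanh", input_shape = win)

model %>% compile(optimizer = "adam", loss = "mse", metrics = "accuracy")  # assumed settings

# Validation = training sample "inverted vertically": inputs * -1, labels flipped (assumption)
model %>% fit(X, Y, epochs = 1, batch_size = 32,
              validation_data = list(-X, 1 - Y))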

For small networks (<1000 neurons) the computation is faster on CPU than on GPU. With batch = 8192 the computation takes the same time on both. This test case computes in the same time with 1 and with 100 hidden neurons. On CPU, double and single precision take the same time and give comparable results. Different activation functions take about the same time and give comparable results. The win size doesn't affect the time much. total = 10^6 at batch = 1 takes 18 minutes. The relationship between batch size and time is linear.

Accuracy vs. sample size. batch = 1, per = 100, win = 100. Columns: sample size (total), time (min.sec), accuracy on test, accuracy on train, accuracy on validation.

1M     18.49   99     98.7   99
100k    1.54   98.5   97.3   98.6
10k     0.11   97.8   88.4   98.1
1k      0.01   71.2   62.1   66.5

Adding noise to the input. total = 10^6, batch = 32, per = 10, win = 10. Columns: noise fraction of the input, accuracy on test, accuracy on train, accuracy on validation.

0.001   99.8   98.1   99.8
0.01    99.6   98.2   99.6
0.1     96.8   96.1   96.8
1       74.9   74.2   75.1
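
A sketch (not the author's code) of how such a noise fraction might be mixed into the inputs, using the feature matrix X from the sketch above; scaling the noise to the inputs' standard deviation is an assumption.

# Mix a fraction `frac` of Gaussian noise into an input matrix
add_noise <- function(X, frac) {
  noise <- matrix(rnorm(length(X), sd = sd(as.vector(X))), nrow = nrow(X))
  X + frac * noise                          # frac = first column of the table above
}

X_noisy <- add_noise(X, 0.1)                # e.g. the 0.1 row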

Number of inputs vs. accuracy. total = 10^6, batch = 32, per = 100. Columns: accuracy on test, accuracy on train, accuracy on validation.

win=150: 99.5   98.7   99.5
win=100: 99.6   98.8   99.6
win=90:  98.9   98.2   98.9
win=80:  97.2   96.6   97.2
win=70:  94.8   94.3   94.8
win=60:  92.0   91.6   91.9
win=50:  88.6   88.2   88.6
win=20:  74.7   74.4   74.7

Graphs of the weights of one neuron: MA(100) with 100 inputs (left), MA(50) with 100 inputs (right).
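
A sketch of how such weight plots could be reproduced, assuming the keras model from the sketch above:

w <- keras::get_weights(model)[[1]]         # kernel of the dense layer, shape (win x units)
plot(w[, 1], type = "l",
     xlab = "input bar", ylab = "weight")   # one weight per input increment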