Machine learning in trading: theory, models, practice and algo-trading - page 3041

 
СанСаныч Фоменко #:

It is not clear how to compare. In theory, upSample should lead to overtraining because it duplicates identical rows, and that overtraining is not immediately detectable.

Why not? Train, test, validate and go.
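
To make the concern concrete, a minimal sketch (assuming caret's upSample; the thread doesn't name the implementation): after duplication, identical rows can land on both sides of a naive random split, so the test score is inflated and the overtraining stays hidden.

library(caret)
set.seed(1)
X  <- data.frame(f1 = rnorm(100), f2 = rnorm(100))
y  <- factor(c(rep("a", 80), rep("b", 20)))      # imbalanced classes
up <- upSample(x = X, y = y)                     # duplicates minority-class rows

idx <- sample(nrow(up), 0.7 * nrow(up))          # naive random train/test split
# test rows identical to a row seen earlier (mostly training rows)
dup <- duplicated(rbind(up[idx, 1:2], up[-idx, 1:2]))[-(1:length(idx))]
sum(dup)  # > 0: the same observations sit in both train and test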

 
mytarmailS #:

Why not? Train, test, validate and go.

Too bad, the avatar changed.

 
СанСаныч Фоменко #:

Too bad, the avatar changed.

Why?

 
Militarism has come to this cute, cuddly thread too.
 
Maxim Dmitrievsky #:
Militarism has come to this cute, cuddly thread too.

So is that a sniper he's got?

 

I am trying to linearise the space, or rather to map a non-linear space into a more linear one. I'm interested in the HLLE (Hessian Locally Linear Embedding) algorithm.


https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction


It looks pretty interesting. It seems to me an ML model will find it easier to recognise a sketch like this than the raw price.

Can anyone tell me why there is such a nasty distortion of colours in the animation when I upload it here?


So this is what the price transformed by the algorithm looks like.


For anyone who wants to play around:

p <- cumsum(rnorm(400, sd = 0.01)) + 100        # random-walk "price"
p <- stats::embed(p, dimension = 20)[, 20:1]    # sliding windows of 20 bars
plot(p[, 20], t = "l", pch = 20)                # last column = current price

library(dimRed)
emb <- embed(p, "HLLE", knn = 15)               # Hessian LLE embedding to 2D

pp <- emb@org.data[, 20]                        # original price series
xx <- emb@data@data                             # embedded 2D coordinates

par(mar = c(2,2,2,2), mfrow = c(1,2))
plot(pp, t = "l", pch = 20)
plot(xx, t = "p", pch = 20)

# animate: highlight each bar on the price and its image in the embedding
for(i in 1:nrow(xx)){
  Sys.sleep(0.05)

  plot(pp, t = "l", pch = 20)
  points(i, pp[i], col = 2, lwd = 6)
  plot(xx, t = "p", lwd = 2, pch = 20)
  points(xx[i, 1], xx[i, 2], col = 2, lwd = 6)
}
Files:
anigif.zip  6455 kb
 

Well, manifold learning has the same problems as PCA: you'll have a hard time fitting a non-stationary series.
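
A quick way to see the problem, reusing the dimRed/HLLE setup from above (my sketch, not the thread's code): HLLE has no fixed mapping for new points, so re-embedding once a few new bars arrive can shift the coordinates of all the old windows.

library(dimRed)
set.seed(4)
p  <- cumsum(rnorm(300, sd = 0.01)) + 100
sw <- stats::embed(p, dimension = 10)[, 10:1]
e1 <- embed(sw[1:200, ], "HLLE", knn = 15)@data@data  # embed first 200 windows
e2 <- embed(sw[1:210, ], "HLLE", knn = 15)@data@data  # same windows + 10 new bars
# coordinates of the shared 200 windows before and after: identical at best up
# to rotation/sign, and often visibly different
plot(e1[, 1], e2[1:200, 1], pch = 20)
abs(cor(e1[, 1], e2[1:200, 1]))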

 
Maxim Dmitrievsky #:

Well, manifold learning has the same problems as PCA: you'll have a hard time fitting a non-stationary series.

What is there to fit? There's nothing to fit: the current pattern is simply mapped into a different space, and that's all.

 

Made a nicer picture:

p <- cumsum(rnorm(300, sd = 0.01)) + 100        # random-walk "price"
n <- 10
p <- stats::embed(p, dimension = n)[, n:1]      # sliding windows of n bars

library(dimRed)
emb <- embed(p, "HLLE", knn = 15)               # Hessian LLE embedding to 2D
pp <- emb@org.data[, n]                         # original price series
xx <- emb@data@data                             # embedded coordinates (HLLE1, HLLE2)


gg <- cbind.data.frame(time = 1:length(pp), xx, pp)
library(patchwork)
library(ggplot2)
# top: price coloured by time; bottom: the same bars in HLLE coordinates
p1 <- ggplot(gg, aes(x = time, y = pp, col = time)) +
  geom_point() +
  scale_color_gradientn(colours = rainbow(4))
p2 <- ggplot(gg, aes(x = HLLE1, y = HLLE2, col = time)) +
  geom_point() +
  scale_color_gradientn(colours = rainbow(4))
p1 + p2 + plot_layout(nrow = 2)


 

Extracting a few "good" rules/strategies from the data...

The full pipeline:

1) data transformation and normalisation

2) model training

3) rule extraction

4) rule filtering

5) visualisation

Ready-made code; just substitute your own data:

close <- cumsum(rnorm(10000, sd = 0.00001)) + 100      # random-walk "price"
par(mar = c(2,2,2,2))
plot(close, t = "l")

sw <- stats::embed(x = close, dimension = 10)[, 10:1]  # sliding-window data
X  <- t(apply(sw, 1, scale))                           # normalise each window (z-score)

dp <- c(diff(close), 0)[10:length(close)]              # next-bar move, aligned to window ends
Y  <- as.factor(ifelse(dp >= 0, 1, -1))                # classification target: up / down

tr <- 1:500                                            # training subset
library(inTrees)  # ?inTrees::getRuleMetric()
library(RRF)

rf   <- RRF(x = X[tr,], y = Y[tr], ntree = 100)        # regularised random forest
rule <- getRuleMetric(unique(extractRules(RF2List(rf), X[tr,])), X[tr,], Y[tr])
rule <- data.frame(rule, stringsAsFactors = FALSE)
for(i in c(1,2,3,5)) rule[,i] <- as.numeric(rule[,i])  # len, freq, err, pred -> numeric
buy_rules <- rule$condition[rule$pred == 1]            # keep only the "buy" rules

# grey: equity curves of every buy rule over the whole sample
plot(x = 1:1000, y = rep(NA, 1000), ylim = c(-0.001, 0.001))
for(i in 1:length(buy_rules)){
   cum_profit <- cumsum( dp[ eval(str2expression(buy_rules[i])) ] )
   lines(cum_profit, col = 8, lwd = 1)
}
# coloured: rules whose equity curve is close to a straight rising line
for(i in 1:length(buy_rules)){
  cum_profit <- cumsum( dp[ eval(str2expression(buy_rules[i])) ] )
  ccor <- cor(cum_profit, 1:length(cum_profit))
  if(ccor >= 0.9)  lines(cum_profit, col = i, lwd = 2)
}
abline(h = 0, col = 2, lty = 2)



The question is: if "working" trading systems can be found even in random data, how can one prove that the systems found on real data are not random as well?

Alexey is working on this here; I wonder, is there a statistical test for tasks like this?
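
One standard approach is a permutation test in the spirit of White's Reality Check: run the same mining procedure on many shuffled return series and see how often pure noise yields a statistic as good as the one found on real data. A hedged sketch, where mine_best is a hypothetical stand-in for the whole RRF/inTrees pipeline above:

set.seed(5)
mine_best <- function(dp, k = 200) {
  lag1 <- c(0, dp[-length(dp)])                 # previous bar's move
  best <- -Inf
  for (j in 1:k) {                              # "mine" k random threshold rules
    profit <- cumsum(dp[lag1 > sample(dp, 1)])
    if (length(profit) > 10 && sd(profit) > 0)
      best <- max(best, cor(profit, seq_along(profit)))
  }
  best
}
dp   <- rnorm(5000, sd = 1e-5)                  # stand-in returns; use your real dp
real <- mine_best(dp)                           # best equity-curve statistic found
null <- replicate(100, mine_best(sample(dp)))   # same mining on shuffled returns
mean(null >= real)                              # permutation p-value

If that fraction is not small, the "working" rules are indistinguishable from what the mining procedure digs out of noise.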
