Machine learning in trading: theory, models, practice and algo-trading - page 1917

 
mytarmailS:

The idea is to transform the features, train the model, and see what happens on new data.


And how do we usually work with models? We just feed in x1, x2, x3, ..., x20 and that's it.


What I did in the last script, the one that gave 0.82:

I took all 31 indicators as predictors,

then I used their predictions as second-order predictors, then predictions of those predictions, and so on, 11 times, for as long as the error kept decreasing. I got 0.82 instead of 0.7.


I don't know where I screwed up (!)

I don't know where the mistake is (?). The idea is absolutely right: it essentially lets you generate new predictors. But there are two subtleties:

1. Reproducibility of results: some kind of additional transformation step will be needed. What do we get after processing, a new table?

2. The patterns need to be checked across the whole sample period, so that classification is correct on different segments of it. None of the training methods I know of implements this, which is very bad.

Since most of my predictors are categorical (essentially quantiles), I want to try generating new predictors by combining, through &&, pre-merged ranges of statistically significant predictors.

About the error: did it decrease on a previously unseen sample, or on the test sample? Maybe the script took the results on the test sample into account when generating new rules?
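To make the question above concrete, here is a minimal Python sketch of a chronological three-way split. It is not the script discussed in this thread; the function name and the default fractions are assumptions for illustration only.

```python
def chronological_split(rows, train_frac=0.6, valid_frac=0.2):
    """Split time-ordered rows into train / validation / test without shuffling.

    Hypothetical helper: the 60/20/20 default is an arbitrary illustration.
    """
    if train_frac <= 0 or valid_frac <= 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(rows)
    train_end = int(n * train_frac)
    valid_end = int(n * (train_frac + valid_frac))
    # Oldest rows go to training, newest rows are held back as the test part.
    return rows[:train_end], rows[train_end:valid_end], rows[valid_end:]


train, valid, test = chronological_split(list(range(100)))
print(len(train), len(valid), len(test))
```

Under this scheme, the decision on how many layers of "predictions of predictions" to stack would be made on the validation part only, and the test part would be scored once at the very end. If the error only falls on data that also guided the rule generation, that says little about new data.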

It's been a long time, what's the bottom line - does the method work or not?

 
mytarmailS:

I decided to see what typical neural-network training data looks like in 3D ))

The data is 31 indicators, the target is a zigzag.

I reduced the dimensionality to three with three algorithms: PCA, t-SNE, and UMAP (the last two are considered the most advanced).


What it is: https://en.wikipedia.org/wiki/Dimensionality_reduction

How it can help: https://ru.wikipedia.org/wiki/%D0%9F%D1%80%D0%BE%D0%BA%D0%BB%D1%8F%D1%82%D0%B8%D0%B5_%D1%80%D0%B0%D0%B7%D0%BC%D0%B5%D1%80%D0%BD%D0%BE%D1%81%D1%82%D0%B8


So: the data is 31 indicators, the target is a zigzag. First, PCA:
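For readers who want to try the PCA step on their own sample, here is a minimal sketch in plain Python. It is not the script used for the pictures in this thread. The function `pca_project` and its parameters are hypothetical, it finds components by power iteration on the covariance matrix, and it does not cover t-SNE or UMAP. In practice a numerical library would normally do this.

```python
import random


def pca_project(rows, n_components=3, n_iter=500, seed=0):
    """Project rows onto their top principal components (power iteration).

    Illustrative sketch; assumes the data has at least n_components
    directions with non-zero variance.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                         for a in range(d))
        components.append(v)
        # Deflate: remove the found direction so the next pass finds the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    projected = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
                 for c in centered]
    return projected, components
```

Each returned row holds three coordinates that can be passed to any 3D scatter plot and colored by the target class. Indicators on very different scales would probably need standardizing first, which this sketch leaves out.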

Can you post the script, so that dummies like me can also look at such nice pictures on their own samples?

 
mytarmailS:

I have an idea: start a thread about target functions. Not so much a discussion as a database of different target types with statistics on them: what works and what doesn't work at all.

What do you think, who needs it?

The idea is good, but apart from ideas it would be useful to have open code in MQL; the result would be a library.

If not code, then at least a description of the algorithm in words, so it can be reproduced.


1. Predict the maximum daily price using intraday data (regression)

Using intraday data from previous days, or from the current day? If the latter, the error will simply shrink toward the end of the day.

And if the prediction is made at one fixed point in time, the sample will be small.


2. Predict which level's breakout will lead to super volatility (regression + classification)

I don't even know what super volatility is. Is it just a trend, or something else?

If there are many levels, there will be many classes and this means a much higher error than in binary classification.

The only way to do that is to use a different model for each level...


2. Predicting the hour of the daily reversal according to intraday data (classification).

It is essentially the same as the forecasting of the maximum price - the same problems.


3. If the first candle was black, will the third be white? (classification)

That one should be trivial to implement.


4. Mark the support and resistance levels and predict, as a probability, from which level there will be a bounce or a breakout (classification).

Same problems as in the second item.


5. Predict the optimal indicator period at any given moment (regression).

This is interesting, of course, turning the purpose of using an indicator inside out.
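As an illustration of how simple some of these targets are to build, here is a minimal Python sketch of target 3 from the list above (first candle black, is the third one white?). The function name and the input layout (parallel lists of open and close prices) are assumptions, and "black" is taken to mean close below open, "white" close above open.

```python
def third_candle_white_target(opens, closes):
    """Return (index, label) pairs for every black first candle.

    label is 1 if the candle two bars later is white (close > open), else 0.
    A doji third candle (close == open) is counted as "not white".
    """
    samples = []
    for i in range(len(opens) - 2):
        first_is_black = closes[i] < opens[i]
        if first_is_black:
            third_is_white = closes[i + 2] > opens[i + 2]
            samples.append((i, 1 if third_is_white else 0))
    return samples


opens = [10, 9, 8, 9, 10]
closes = [9, 8.5, 9, 8, 11]
print(third_candle_white_target(opens, closes))  # [(0, 1), (1, 0)]
```

The returned indices mark the rows where the predictors would be read, so the table for training contains only bars that start with a black candle.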

 
Aleksey Vyazmikin:

It's been a long time, what's the result - does the method work or not?

I haven't figured it out myself yet. I think it does; try to knock together something similar.


There are features x1, x2, x3 and the target "Y".

Make a forecast of each feature one step ahead:

"U_x1[i+1]" ~ x1+x2+x3

"U_x2[i+1]" ~ x1+x2+x3

"U_x3[i+1]" ~ x1+x2+x3

and add the forecasts to the model:

x1, x2, x3, "U_x1", "U_x2", "U_x3" and the target "Y"

Now there are 6 features.
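A minimal Python sketch of the scheme described above: fit one model per feature that forecasts the feature's value on the next bar from all current features, then append those forecasts (the U_x columns) as extra predictors. The thread does not say which model was used. Ordinary least squares is swapped in here purely for illustration, and all function names are hypothetical.

```python
def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + list(r) for r in X]
    d = len(Z[0])
    A = [[sum(z[a] * z[b] for z in Z) for b in range(d)] for a in range(d)]
    rhs = [sum(z[a] * t for z, t in zip(Z, y)) for a in range(d)]
    return solve_linear_system(A, rhs)


def predict_ols(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def add_next_step_forecasts(X, n_train):
    """Append a forecast of every feature's next value to each row.

    Models are fit on the first n_train rows only, then applied to all rows.
    """
    n_features = len(X[0])
    inputs = X[:n_train - 1]                    # feature rows at bar i
    augmented = [list(r) for r in X]
    for j in range(n_features):
        targets = [X[i + 1][j] for i in range(n_train - 1)]  # feature j at bar i+1
        coef = fit_ols(inputs, targets)
        for row, forecast in zip(augmented, predict_ols(coef, X)):
            row.append(forecast)
    return augmented
```

With three input features this returns six columns per row. Fitting only on the first `n_train` rows is a deliberate choice: forecasts on later rows are then out of sample, which matters for the leakage question raised earlier in the thread.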


Aleksey Vyazmikin:

Can you post the script, so that dummies like me can also look at such nice pictures on their own samples?

I've already deleted it. It's fun to visualize and play around, but I don't know whether you need it at all... If you really need it, I can write it again.


Aleksey Vyazmikin:

The idea is good, but apart from ideas it would be useful to have open code in MQL; the result would be a library.

If not code, then at least a description of the algorithm in words, so it can be reproduced.

I listed all these variants just to show that there are millions of ways to make forecasts, while everyone is stuck on one: either binary (up/down) or increments, and that's it! Neither of them works, yet everyone keeps doing it )

 
mytarmailS:

I haven't figured it out myself yet. I think it does; try to knock together something similar.


There are features x1, x2, x3 and the target "Y".

Make a forecast of each feature one step ahead:

"U_x1[i+1]" ~ x1+x2+x3

"U_x2[i+1]" ~ x1+x2+x3

"U_x3[i+1]" ~ x1+x2+x3

and add the forecasts to the model:

x1, x2, x3, "U_x1", "U_x2", "U_x3" and the target "Y"

Now there are six features.

Yes, I will try my methodology, but it still seems like it will be very slow to do it all - lots of combinations.

How long does this take in R, and on what sample size?

mytarmailS:

I've already deleted it. It's fun to visualize and play around, but I don't know whether it's needed at all... If you really need it, I can write it again.

It would be interesting to see, but please describe in more detail which libraries to install and how. And more comments in the script. Also a configurable delimiter, selection of the target column, and a list of columns to exclude :)

By the way, you could drop different groups of columns in a loop and see how these visualizations change, but then saving the pictures would have to be implemented...

mytarmailS:

I listed all these options just to show that there are millions of ways to make predictions, while everyone is stuck on one: either binary (up/down) or increments, and that's it! Neither of them works, yet everyone keeps doing it )

Millions of them...

Try my target: 1 or -1 is set when the Donchian channel boundary opposite to the ZZ vector crosses the point where the vector changes. We enter when the vector changes and trail along the opposite channel boundary; 0 means do not enter the market (flat). If the classification accuracy for classes 1 and -1 is 40%, there will be a profit.

As for simple ones, I want to implement a primitive target that marks a correct entry at a risk of 1 to 3 in points, or at any given risk. An ensemble with different risks may give good results in aggregate, but that is only theory.
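One possible reading of such a target, as a minimal Python sketch: an entry counts as correct if price reaches a profit of three times the risk before it reaches the stop. This interpretation of "1 to 3", the function name, and the bar layout (parallel lists of highs and lows) are all assumptions, not something stated in the thread.

```python
def label_entry(highs, lows, entry_index, entry_price, stop_points,
                reward_ratio=3.0, direction=1):
    """Label an entry: 1 if the take level is hit first, 0 if the stop is hit
    first, None if neither is reached before the data ends.

    direction is 1 for a long entry and -1 for a short one.
    """
    take = entry_price + direction * stop_points * reward_ratio
    stop = entry_price - direction * stop_points
    for i in range(entry_index + 1, len(highs)):
        if direction == 1:
            hit_stop = lows[i] <= stop
            hit_take = highs[i] >= take
        else:
            hit_stop = highs[i] >= stop
            hit_take = lows[i] <= take
        # Pessimistic rule: if both levels fall inside one bar, count the stop.
        if hit_stop:
            return 0
        if hit_take:
            return 1
    return None


highs = [100, 101, 102, 104]
lows = [99, 99.5, 100, 101]
print(label_entry(highs, lows, 0, 100.0, 1.0))  # 1
```

Running the same function with several `stop_points` values would give the set of targets for an ensemble with different risks. With a 1 to 3 ratio and costs ignored, the break-even share of correct entries is 1 / (1 + 3) = 25%.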

And so, of course, you have to think about different targets.

 
Aleksey Vyazmikin:

Yes, I will try my methodology, but it still seems like it will be very slow to do it all - a lot of combinations.

How long does this take in R, and on what sample size?

With vectorization it's very fast. When I wrote a checker that computes one bar at a time it got slower, but it is still quite acceptable.

Aleksey Vyazmikin:

It would be interesting to see, but please describe which libraries to install and how. And more comments in the script. Also a configurable delimiter, selection of the target column, and a list of columns to exclude :)

Ok. Send me a file with your features and the target, in the same format you will feed into the R script, and I'll try to make it work out of the box.

Just not a big file, please; 1000 lines is plenty.

Just keep in mind this is a toy, purely for visualization. If you want to check a model or features, it is 10000% better to simply look at the error on the test set )
 
mytarmailS:

With vectorization the script is very fast. When I wrote a checker that computes one bar at a time it got slower, but it is still quite acceptable.

Hm, interesting. If you can, post the script; maybe it really is useful. Right now I'm building a very large pool of new predictors and need to somehow look for connections between them, for synergy. Maybe most of them will have to be thrown out...

mytarmailS:

Ok. Send me a file with your features and the target, in the same format you will feed into the R script, and I'll try to make it work out of the box.

Just not a big file, please. 1000 lines is enough.

The point is that formats can change, so don't be stingy with comments :)

I'm attaching the more commonly used variant.

Files:
 
mytarmailS:

Do we take time into account or ignore it?

Ignore it, as well as the last columns with the financial result. Time is already among the predictors.

 
Aleksey Vyazmikin:

Ignore it, as well as the last columns with the financial result. Time is already among the predictors.

Yes yes, I've already seen it)

 

As for the target, I think it should be changed only when there is no way to change the input data, as I said in the video: when the data has already been collected and cannot be altered. Otherwise, if you have a variety of input variables, you should take the standard target and dig into the input data.

Don't forget: if the target knowingly contains errors and the model improves its training result, it will reproduce those errors as well. IMHO