Machine learning in trading: theory, models, practice and algo-trading - page 1112

 

For ML, the data is much more important than a good model.

So, in case it is useful to someone, I'm sharing:

Collecting ticks from MetaTrader 5 directly into a MySQL database via libmysql in real time

MT5_ticks_to_MySQL

The history itself can be found here:

http://ticks.alpari.org
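The core of such a collector is just buffering incoming ticks and batch-inserting them. A minimal sketch of that batching logic, with SQLite standing in for MySQL so it runs self-contained (with MySQL you would swap the connection for one from a client library; table and column names here are my own assumptions, not the ones from MT5_ticks_to_MySQL):

```python
import sqlite3

def store_ticks(conn, ticks, batch_size=1000):
    """Insert (symbol, time_msc, bid, ask) tuples in batches."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ticks ("
        "symbol TEXT, time_msc INTEGER, bid REAL, ask REAL)"
    )
    for i in range(0, len(ticks), batch_size):
        conn.executemany(
            "INSERT INTO ticks VALUES (?, ?, ?, ?)",
            ticks[i:i + batch_size],
        )
        conn.commit()  # one commit per batch, not per tick

conn = sqlite3.connect(":memory:")
sample = [("EURUSD", 1514764800000 + i, 1.2000 + i * 1e-5, 1.2001 + i * 1e-5)
          for i in range(2500)]
store_ticks(conn, sample)
count = conn.execute("SELECT COUNT(*) FROM ticks").fetchone()[0]
print(count)  # 2500
```

Committing per batch rather than per tick is what keeps the insert fast enough for a realtime stream.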

 
itslek:

For ML, the data is much more important than a good model.

So, in case it is useful to someone, I'm sharing:

Collecting ticks from MetaTrader 5 directly into a MySQL database via libmysql in real time

MT5_ticks_to_MySQL

The history itself can be found here:

http://ticks.alpari.org

The same task is solved in a plain CSV file with the same success and speed. And besides, you don't have to bother with anything at all.

 
Yuriy Asaulenko:

The same task is solved in a plain CSV file with the same success and speed. And besides, you don't have to bother with anything at all.

You didn't understand, and commented right away)

I agree that for exporting history CSV is more convenient. But for working online with ready-made models...

 
itslek:

You didn't understand, and commented right away)

I agree that for exporting history CSV is more convenient. But for working online with ready-made models...

Online, I don't need that many ticks, and they will fit in memory.

And offline, you don't need much export speed: you import from CSV into the database by hand.
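For the offline case Yuriy describes, a tick history in CSV really is just read straight into memory. A minimal sketch of that route (the column names are my assumption, not a fixed format):

```python
import csv
import io

# An inline stand-in for a tick history file exported to CSV.
csv_text = """time_msc,bid,ask
1514764800000,1.20001,1.20012
1514764800250,1.20003,1.20014
1514764800500,1.20002,1.20013
"""

# Parse every row into typed tuples; for offline research the whole
# history simply lives in memory, no database involved.
ticks = [
    (int(r["time_msc"]), float(r["bid"]), float(r["ask"]))
    for r in csv.DictReader(io.StringIO(csv_text))
]
print(len(ticks), ticks[0])
```

With a real file you would pass `open(path)` to `csv.DictReader` instead of the `StringIO` buffer.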

 
Vizard_:

+1

I agree that the data is more important than the model, but the method of constructing the model also matters. In this case I'd put it at 50/50.

What matters in a model is that, over repeated optimizations, it yields generalized models more than 50% of the time. That is, 5 or more models out of 10 optimization runs should generalize to a sufficient level. Why so? Because when selecting from 10 models, the probability of picking the working one is higher. What good is an algorithm that produces only 2 generalized models out of 10 optimization runs? The probability that the Expert Advisor ends up with the working model is very low, so such an algorithm is of little use either.
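The selection rule described here can be sketched in a few lines: run the optimizer repeatedly, call a model "generalized" if its validation score clears some threshold, and trust the algorithm only if at least half the runs generalize. The scores below are simulated stand-ins, and the 0.55 threshold is my own assumption; in practice both come from out-of-sample testing:

```python
import random

random.seed(42)

def optimization_run():
    # Stand-in for one training/optimization pass; returns a
    # simulated validation accuracy instead of a real one.
    return random.uniform(0.40, 0.70)

scores = [optimization_run() for _ in range(10)]
generalized = [s for s in scores if s > 0.55]   # threshold is an assumption
usable = len(generalized) >= 5                   # "5 or more out of 10"
print(len(generalized), usable)
```

The point of demanding 5 out of 10 rather than 1 out of 10 is exactly what the post says: it raises the odds that the model you happen to pick is a working one.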

It is important that the data actually be a cause of the target function, to the extent of at least 20 percent or more. If a cause is present in the input data, the responsibility for finding it lies with the optimization algorithm.
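One crude way to check whether an input carries any signal about a binary target is to compare its agreement rate with the target against the 50% coin-flip baseline; clearing the baseline by a wide margin is roughly what "being a cause of the target" means here. A sketch on synthetic data (the 70% signal strength is an assumption built into the generator):

```python
import random

random.seed(0)

# Synthetic binary target.
target = [random.randint(0, 1) for _ in range(1000)]
# Informative binary feature: agrees with the target 70% of the time.
feature = [t if random.random() < 0.7 else 1 - t for t in target]

# Agreement rate of the feature with the target, and the edge over a coin flip.
acc = sum(f == t for f, t in zip(feature, target)) / len(target)
edge = acc - 0.5
print(round(acc, 3), round(edge, 3))
```

A feature with no causal or statistical link to the target would hover near `edge = 0`; only inputs with a measurable edge are worth handing to the optimizer.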


I still find it interesting to check this data with other algorithms and to understand what the success comes from: the data itself, or Reshetov's optimizer, which everyone here hates. But to do that we would need to test on a real account, and considering how stingy people here are...

If anyone is still asking this question, I think I can answer it like this: I prepare the data, you train your AI, and after training we test the model out of sample (preferably on a real account). If the EA works and we are convinced that the model works, it means your optimization algorithm works (so I don't need it), and we should therefore focus on the search for input data. If no working model can be obtained from my data, then I will have to improve the algorithm. I think this will be interesting for newbies, and meanwhile I will also find out what my success comes from: the data, or the presence of Reshetov's powerful optimizer.

So... who accepts this challenge?

 
Mihail Marchukajtes:

I agree that the data is more important than the model, but the method of constructing the model also matters. In this case I'd put it at 50/50.

What matters in a model is that, over repeated optimizations, it yields generalized models more than 50% of the time. That is, 5 or more models out of 10 optimization runs should generalize to a sufficient level. Why so? Because when selecting from 10 models, the probability of picking the working one is higher. What good is an algorithm that produces only 2 generalized models out of 10 optimization runs? The probability that the Expert Advisor ends up with the working model is very low, so such an algorithm is of little use either.

It is important that the data actually be a cause of the target function, to the extent of at least 20 percent or more. If a cause is present in the input data, the responsibility for finding it lies with the optimization algorithm.


I still find it interesting to check this data with other algorithms and to understand what the success comes from: the data itself, or Reshetov's optimizer, which everyone here hates. But to do that we would need to test on a real account, and considering how stingy people here are...

If anyone is still asking this question, I think I can answer it like this: I prepare the data, you train your AI, and after training we test the model out of sample (preferably on a real account). If the EA works and we are convinced that the model works, it means your optimization algorithm works (so I don't need it), and we should therefore focus on the search for input data. If no working model can be obtained from my data, then I will have to improve the algorithm. I think this will be interesting for newbies, and meanwhile I will also find out what my success comes from: the data, or the presence of Reshetov's powerful optimizer.

So... who accepts this challenge?

Are the targets in the data already set? What's the metric?)

If you reduce the challenge to a plain machine-learning competition format, you can attract more than just traders)

 
itslek:

Are the targets in the data already set? What's the metric?)

If you reduce the challenge to a plain machine-learning competition format, you can attract more than just traders)

Yes, the target will already be in the data. We are talking about classification models. The problem is elsewhere: how can the resulting models be checked in your systems? Preferably on a real account...

 
Mihail Marchukajtes:

Yes, the target will already be in the data. We are talking about classification models. The problem is elsewhere: how can the resulting models be checked in your systems? Preferably on a real account...

To begin with, check at least on a held-out sample. Say, you are given data for 2012-2016, 2017 is the test set, and 2018 is kept for the final check (so that there is no fitting to it).

Hold off on the real account) don't divide up the skin of the bear before it is killed ;) First you need a model that can at least pass validation, and only then try to roll the elephant out into production. These are two completely different tasks.
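The proposed split is just a partition by year: 2012-2016 for training, 2017 as the test set, 2018 never touched until the final check. A sketch on synthetic monthly rows (the `(date, value)` row shape is an assumption):

```python
from datetime import date

# One synthetic row per month for 2012-2018.
rows = [(date(2012 + y, m, 1), 0.0) for y in range(7) for m in range(1, 13)]

train   = [r for r in rows if r[0].year <= 2016]
test    = [r for r in rows if r[0].year == 2017]
holdout = [r for r in rows if r[0].year == 2018]  # touch only once, at the end

print(len(train), len(test), len(holdout))  # 60 12 12
```

The split is by time, not random, precisely because the point of the holdout year is to catch models that were fitted to the past.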

 
Mihail Marchukajtes:

To begin with, check at least on a held-out sample. Say, you are given data for 2012-2016, 2017 is the test set, and 2018 is kept for the final check (so that there is no fitting to it).

Hold off on the real account) don't divide up the skin of the bear before it is killed ;)

So how do I check the model, if I do the checking in the MT tester? After optimization, will we be able to load the model into MT4?

 
Mihail Marchukajtes:

So how do I check the model, if I do the checking in the MT tester? After optimization, will we be able to load the model into MT4?

My point is this: first, state the problem on its own, with your own metric.


If you want to run it in the tester, with a trailing stop and all the rest:

Provide the data in .csv with the target (as I understand it, you have binary classification). Then a model is trained and the target is predicted; the result is loaded into the same tester as a list of model responses and run. But doing this for every model is just another form of fitting; it is better to think about the metric or the target, and to run only the final variant in the tester.
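The hand-off described here is simply dumping the model's responses as a timestamped list that a tester-side EA can read back and replay as signals. A minimal sketch (the file format and column names are my assumptions, not a standard):

```python
import csv
import io

# Hypothetical model output: (unix_time, predicted_class) pairs
# from a binary classifier.
predictions = [(1514764800, 1), (1514768400, 0), (1514772000, 1)]

# Write the responses as a CSV the tester script could consume.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["time", "signal"])  # 1 = one class, 0 = the other
writer.writerows(predictions)

# Read it back, as the tester-side script would.
rows = list(csv.reader(io.StringIO(buf.getvalue())))
print(rows[0], rows[1])
```

Because the tester only sees a static list of responses, nothing model-specific has to be ported to MQL: any model that can write this file can be backtested this way.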

Realtime is a separate hassle, and not every model can be wrapped into a DLL.