Research in matrix packages - page 7

 
 
zaskok3:
http://quantquant.com/viewtopic.php?f=7&t=1236

What's this link for? A short comment would be helpful.

For example: A set of links to R and Python tutorials.

As for the content of the links: did you just gather everything you could find on the web, or do you have any preferences? Of the Python libraries, pyBrain is the most interesting and the most worth studying and applying: it implements networks that are not available in R packages. This is not meant for discussion and is not a criticism, just a remark in passing.

Good luck

 
Vladimir Perervenko:

1. Yes.

2. Why? You only need the single MT4R.dll, which is the gateway for sending data to R and receiving the results back.

3. To all existing databases. Not only that, both Microsoft and Oracle have integrated R into their databases.

4. R offers various ways of interacting with Matlab, from a simple exchange of Matlab files to calling Matlab functions from R. If you have experience and expertise in this area, you can build an excellent link between Matlab and the MT terminal via R.

5. R has packages for every field of science and technology, and they reflect the latest advances. You can start from here.

6. There is more than one. The most common is ff.
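For illustration only (a minimal sketch, not from the original post): keeping a large data set on disk with the ff package. The file name used here is a placeholder.

    # Out-of-memory data handling with ff ("big_quotes.csv" is a placeholder file name)
    library(ff)

    big <- read.csv.ffdf(file = "big_quotes.csv")   # columns are stored in memory-mapped files on disk
    nrow(big)                                       # size is available without loading the file into RAM
    chunk <- big[1:1000, ]                          # only the indexed rows are materialised as a data.frame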

In general, I am surprised: you are completely in the dark here. Look through articles 1 and 2 on this site; you won't understand everything, but you will get an idea of how the language works.

I am finishing the second article on deep learning, which I hope to send for review tomorrow; the examples will be attached as Expert Advisors... If there is interest, I think I will write several articles at an introductory level (filtering, decomposition, prediction, etc.). And of course, keep digging into deep learning, especially now that Google has opened its TensorFlow library to everyone. There are others that are no less interesting and promising (mxnet, pyBrain).

If a group of enthusiasts comes together, we can organize a branch of R language users.

Good luck

Good articles! Thank you. I'll have to look into it. But I plan to try SVM, GBM and xGBoost instead of neural networks.
 
Alexey Burnakov:
Good articles! Thank you. I'll have to look into it. But I plan to try SVM, GBM and xGBoost instead of neural networks.
SVM, ada, randomForest. All of this after practising with these packages in rattle, and after that the predictor-selection packages.
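For readers who have not tried that route, a minimal sketch (not from the original post; the iris data set is only a stand-in for real trading data):

    # The rattle GUI, plus a quick SVM fitted outside it (iris is a stand-in data set)
    library(rattle)
    rattle()                                   # opens the GUI; SVM, ada, randomForest are a few clicks away

    library(e1071)                             # a widely used SVM implementation in R
    model <- svm(Species ~ ., data = iris)     # RBF-kernel SVM classifier
    table(predict(model, iris), iris$Species)  # confusion matrix on the training set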
 
Alexey Burnakov:
Good articles! Thank you. I'll have to look into it. But I plan to try SVM, GBM and xGBoost instead of neural networks.

Try them all.

My favourite is randomForest in its various modifications (the main advantage is that it does not require any preprocessing of the input data). There is also ada, which gives very high-quality scores. Both have two drawbacks: they take a very long time to train and are highly prone to overfitting.

That doesn't mean you shouldn't use them, just that you need to be mindful of these nuances.
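A minimal sketch of the two models mentioned (not from the original post; iris stands in for real data and is reduced to two classes for ada, which is a binary classifier):

    library(randomForest)
    library(ada)

    # randomForest: no scaling or centering of the inputs is required (iris is a stand-in data set)
    rf <- randomForest(Species ~ ., data = iris, ntree = 500)
    print(rf)                                            # the OOB error gives a first hint of overfitting

    # ada handles two-class problems, so keep only two species here
    iris2 <- droplevels(iris[iris$Species != "setosa", ])
    boost <- ada(Species ~ ., data = iris2, iter = 50)   # discrete AdaBoost, 50 iterations
    print(boost)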

Good luck

 
СанСаныч Фоменко:
SVM, ada, randomForest. All of this after practising with these packages in rattle, and after that the predictor-selection packages.
SanSanych, I've already had quite a lot of practice with these packages, at work too. ))) Only xGBoost I haven't touched yet.
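For completeness, a minimal sketch of an xgboost classifier (not from the original post; it assumes the classic xgboost R interface, and iris reduced to two classes stands in for real data):

    library(xgboost)

    # iris reduced to two classes as a stand-in for a binary trading target
    iris2 <- droplevels(iris[iris$Species != "setosa", ])
    X <- as.matrix(iris2[, 1:4])
    y <- as.integer(iris2$Species) - 1            # labels must be 0/1 for binary:logistic

    bst <- xgboost(data = X, label = y,
                   nrounds = 50, max_depth = 3, eta = 0.1,
                   objective = "binary:logistic", verbose = 0)

    pred <- as.integer(predict(bst, X) > 0.5)     # probabilities -> class labels
    mean(pred == y)                               # training accuracy (optimistic, of course)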
 
Vladimir Perervenko:

Try them all.

My favourite is randomForest in its various modifications (the main advantage is that it does not require any preprocessing of the input data). There is also ada, which gives very high-quality scores. Both have two drawbacks: they take a very long time to train and are highly prone to overfitting.

That doesn't mean you shouldn't use them, just that you need to be mindful of these nuances.

Good luck

I have a question for you on article 1. I see from the trading emulation chart that the algorithm makes trades on every bar, right?

And one more question. When training, did you feed the machine data from each bar as well?

The central point that differentiates time-series problems from most other statistical problems is that in a time series, observations are not mutually independent. Rather a single chance event may affect all later data points. This makes time-series analysis quite different from most other areas of statistics.

Because of this nonindependence, the true patterns underlying time-series data can be extremely difficult to see by visual inspection. Anyone who has looked at a typical newspaper chart of stock-market averages sees trends that seem to go on for weeks or months. But statisticians who have studied the subject agree that such trends occur with essentially the same frequency one would expect by chance, and there is virtually no correlation between one day's stock-market movement and the next day's movement. If there were such a correlation, anybody could make money in the stock market simply by betting that today's trend would continue tomorrow, and it's simply not that easy. In fact, cumulating nearly any series of random numbers will yield a pattern that looks nonrandom.

From: http://node101.psych.cornell.edu/Darlington/series/series1.htm

The point, as I assume you understand, is that a straightforward approach in which all points of the time series are used for training (and testing) produces mutually dependent observations, which at one stroke negates the validity of the conclusions about the "patterns" found. Simply put, the results cannot be trusted, even if everything else is done correctly. Constructing a sample of observations from a time series in a way that does not violate the statistical assumptions is therefore critically important. Very often this step is simply ignored in popular sources, and the consequences are most deplorable: the machine will not learn the patterns.
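A quick illustration of the quoted point, as a sketch in base R (not part of the original post): a cumulated series of pure noise looks "trendy", while its increments are essentially uncorrelated.

    set.seed(42)
    increments <- rnorm(1000)     # independent random "daily moves"
    walk <- cumsum(increments)    # the corresponding "price" series

    plot(walk, type = "l", main = "Cumulated random noise")   # visible pseudo-trends
    acf(increments)               # autocorrelations of the moves are close to zero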

 
Alexey Burnakov:

I have a question for you on article 1. I see from the trading emulation chart that the algorithm makes trades on every bar, right?

And one more question. When training, did you feed the machine with data from each bar as well?

From: http://node101.psych.cornell.edu/Darlington/series/series1.htm

The point, as I assume you understand, is that a straightforward approach in which all points of the time series are used for training (and testing) produces mutually dependent observations, which at one stroke negates the validity of the conclusions about the "patterns" found. Simply put, the results cannot be trusted, even if everything else is done correctly. Constructing a sample of observations from a time series in a way that does not violate the statistical assumptions is therefore critically important. Very often this step is simply ignored in popular sources, and the consequences are most deplorable: the machine will not learn the patterns.

Afternoon.

that the algorithm makes trades on every bar?

No. The algorithm executes trades on the signals received on the last completed bar. Or maybe I don't understand the question?

And one more question. When training, did you feed the machine data from each bar as well?

I don't understand this one. Could you explain what you mean? Then I will try to answer.

Good luck

 
Alexey Burnakov:

I have a question for you on article 1. I see from the trading emulation chart that the algorithm makes trades on every bar, right?

And one more question. When training, did you feed the machine with data from each bar as well?

From: http://node101.psych.cornell.edu/Darlington/series/series1.htm

The point, as I assume you understand, is that a straightforward approach in which all points of the time series are used for training (and testing) produces mutually dependent observations, which at one stroke negates the validity of the conclusions about the "patterns" found. Simply put, the results cannot be trusted, even if everything else is done correctly. Constructing a sample of observations from a time series in a way that does not violate the statistical assumptions is therefore critically important. Very often this step is simply ignored in popular sources, and the consequences are most deplorable: the machine will not learn the patterns.

The article you are referring to is about regression. We are dealing with classification. Those are two big differences...

I still don't understand your question.

Good luck

 
Vladimir Perervenko:

The article you are referring to is about regression. We are dealing with classification. Those are two big differences...

I still don't understand your question.

Good luck

A question in passing to everyone in the discussion: do you work with tick data? I moved away from bar analysis a long time ago; I work solely with DSP methods.