Taking Neural Networks to the next level

 

A year later, after reading an amazing contribution by Chris70, I'm back and have just printed the whole thread. I'm particularly keen on neural networks, and a discussion like this is a great help.

Please check this article too for more on this and other topics. Feel free to join the project and make it a success.

https://www.mql5.com/en/forum/338341

Stay safe.

The Ultimate_AI EA Project
  • 2020.04.23
  • www.mql5.com
Hello everyone. I would like to call upon every worthy programmer and trader to a crucial mission...
 
Chris70:

Okay... it's probably now really time to get back to neural networks... (yeah, I know, the off-topic intermezzo was mostly my own fault...):

In the meantime I finished the code of a multicurrency EA version for neural network price forecasting, so that it's now ready for training, validation and testing.

As I mentioned, the predictions in previous attempts (single currency) were not very reliable on average for any random moment in time, so I wanted to concentrate more on finding high-probability setups only.

I chose a classifier model now instead of forecasting exact prices, because the method that I'll use this time closely follows an approach suggested by Dr. Marcos Lopez De Prado ("Quant of the year" 2019, author of the book "Advances in financial machine learning").

The network has 3 outputs that are labeled based on the "triple barrier method" (see the small labeling sketch right after this list):

 - output 1: an upper price level (n pips, fixed distance) is hit first (=upper "horizontal barrier")

 - output 2: a lower price level (n pips, fixed distance) is hit first (=lower "horizontal barrier")

 - output 3: no price level is hit within a max. number of minutes (="vertical barrier")
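To make the labeling concrete, here is a minimal MQL-style sketch of how such triple-barrier classes could be assigned on M1 data (function name and parameters are my own placeholders, not the actual EA code):

int TripleBarrierLabel(const double &high[],const double &low[],const double &close[],
                       const int start,const double barrierPips,const int maxMinutes,
                       const double pipSize)
  {
   // hypothetical helper: arrays are M1 bars indexed oldest-to-newest, "start" is the entry bar;
   // returns 0 = upper barrier hit first, 1 = lower barrier hit first, 2 = vertical barrier (timeout)
   double upper=close[start]+barrierPips*pipSize;              // upper horizontal barrier
   double lower=close[start]-barrierPips*pipSize;              // lower horizontal barrier
   int    last =MathMin(start+maxMinutes,ArraySize(close)-1);  // vertical barrier in minutes (M1 bars)
   for(int i=start+1;i<=last;i++)
     {
      if(high[i]>=upper) return(0);   // output 1 (checked first if both barriers fall into one bar)
      if(low[i] <=lower) return(1);   // output 2
     }
   return(2);                         // output 3: neither barrier was hit within maxMinutes
  }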

The activation function of the output layer is "softmax", which has the nice quality that all three outputs together add up to 1.0, so that the individual outputs can be seen as probabilities within a distribution.

Because it is a classifier this time, the loss function that we want to minimize during training isn't MSE loss (mean squared error), but cross entropy loss.
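For anyone coding along, both ingredients are only a few lines; a stand-alone sketch with assumed helper names (not the thread's network library):

void Softmax(const double &logits[],double &probs[])
  {
   // turns the raw outputs of the last layer into probabilities that add up to 1.0
   int n=ArraySize(logits);
   ArrayResize(probs,n);
   double maxVal=logits[ArrayMaximum(logits)];   // subtract the maximum for numerical stability
   double sum=0.0;
   for(int i=0;i<n;i++){ probs[i]=MathExp(logits[i]-maxVal); sum+=probs[i]; }
   for(int i=0;i<n;i++) probs[i]/=sum;
  }

double CrossEntropyLoss(const double &probs[],const int trueClass)
  {
   // loss is simply -log of the probability assigned to the correct class
   return(-MathLog(MathMax(probs[trueClass],1e-12)));   // clipped to avoid log(0)
  }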

The network has a normal MLP architecture for now, but I might also give LSTM cells a try and compare.

As I mentioned earlier, MLPs are good for pattern recognition, LSTMs and related types of recurrent networks are better for long time dependencies. So both have advantages. A multilayered fully connected LSTM network combines the advantages of both and this is also the model that I had initially used with the autoencoder. Without the autoencoder (which gets a little complicated with multi-currency trading), computation performance will suffer, which is why I start with a normal MLP; this doesn't mean that it can't have many neurons/layers, but not having to backpropagate on top of that through lots of time-steps is gonna make the training part a lot faster. We'll see.

Nevertheless, we're not done yet with a standard MLP network. Further following the suggestion of Dr. M. Lopez De Prado, I take the outputs and the correct labels, obtain the true positives / true negatives / false positives / false negatives from them, and let a second (!) MLP network learn (after training of the main network) from this "meta labeling", so that I can calculate things like accuracy (validity), precision (reliability), recall and F-score. The objective is to use these values for the selection of high-probability setups only.
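The metric part is straightforward once the four counts are known; a small sketch (hypothetical helper; how the counts are collected from the validation set is up to the implementation):

void ClassifierMetrics(const int tp,const int tn,const int fp,const int fn,
                       double &accuracy,double &precision,double &recall,double &fscore)
  {
   int total=tp+tn+fp+fn;
   accuracy =(total>0) ? (double)(tp+tn)/total : 0.0;                               // validity
   precision=(tp+fp>0) ? (double)tp/(tp+fp)    : 0.0;                               // reliability
   recall   =(tp+fn>0) ? (double)tp/(tp+fn)    : 0.0;
   fscore   =(precision+recall>0.0) ? 2.0*precision*recall/(precision+recall) : 0.0;
  }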

For the inputs of the primary/main network, I'm using n periods of High/Low/Close prices of (1.) the main chart symbol and (2.) additional symbols that are conveniently passed in as an input variable (=comma separated list). Instead of pure prices, I take log returns as a differencing method. The plan is to use at least all major pairs (EURUSD, USDJPY, GBPUSD, USDCAD, USDCHF) plus AUDUSD, as long as MT5 can handle that many price histories simultaneously... It is the job of the neural network to find correlations among the currency pairs by itself and thereby derive possible consequences for the upcoming prices of the main chart symbol.
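The log-return differencing mentioned above could look roughly like this (a sketch of the idea only; the EA's actual input pipeline isn't shown in the thread):

void LogReturns(const double &price[],double &ret[])
  {
   // r_t = ln(P_t / P_(t-1)); the same differencing would be applied to the
   // High/Low/Close series of every symbol in the comma separated list
   int n=ArraySize(price);
   ArrayResize(ret,n-1);
   for(int i=1;i<n;i++)
      ret[i-1]=MathLog(price[i]/price[i-1]);
  }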

I also added the month, the day of the week and the time as input variables.

For those of you who are thinking about developing neural networks yourselves (be it in MQL or e.g. Python), let's think for a moment about how to best feed these variables into a network (and if you don't know it yet, maybe I can show you a neat trick):

Let's take the hour of the day as an example: 23 is followed by 0... does this really make sense? The times 23:59 and 0:00 are direct neighbors, but their numeric values are as far apart as possible. We have no continuity, and the network will have a hard time making something meaningful out of this huge step. So what can we do?

One very common method (in fact the standard method for this purpose) is called "one-hot" encoding, which means we don't take just one input for the hour of the day, but 24 (i.e. 0-23). If, for example, the hour is 15:xx, then input number 15 gets the value 1 and all other 23 of these inputs get the value 0. This method isn't rare at all. Think of image recognition: an RGB sub-pixel is either ON or OFF, so it totally makes sense to encode a picture as "one-hot" encodings of all those megapixels that the image is made of.
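A quick sketch of what that looks like for the hour of the day (the helper name is an assumption):

void EncodeHourOneHot(const datetime when,double &hourOneHot[])
  {
   // 24 inputs, exactly one of them is 1
   ArrayResize(hourOneHot,24);
   ArrayInitialize(hourOneHot,0.0);
   MqlDateTime t;
   TimeToStruct(when,t);
   hourOneHot[t.hour]=1.0;   // e.g. at 15:xx input number 15 becomes 1, the other 23 stay 0
  }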

If we only encode the hour, we need those 24 inputs. If we also encode the minute of the hour we have 60 more. Then 12 for the month... All this is absolutely feasible, but there might be a more elegant way...:

Think of the hour hand of a clock (and let's say this clock has a 24h watchface instead of 12h): instead of taking the value of the hour, we might instead take the angle of the hour hand, then we get a 360 degrees circle. Still, between 359° and 0°, there is this huge gap that we want to avoid. So how do we achieve continuity? The magic trick: the sine and cosine wave function! They are continuous, no gaps between neighbor values. If we put this into code, the declaration of the inputs can then look something like this:
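(The original snippet isn't quoted here; the following is a minimal sketch of the idea, with variable names of my own choosing; only the sin/cos pattern itself comes from the post.)

void EncodeTimeOfDay(const datetime when,double &inpTimeSin,double &inpTimeCos)
  {
   MqlDateTime t;
   TimeToStruct(when,t);
   double secondsOfDay=t.hour*3600+t.min*60+t.sec;   // 0 .. 86399
   inpTimeSin=sin(2*M_PI*secondsOfDay/86400.0);      // continuous: no jump between 23:59:59 and 0:00:00
   inpTimeCos=cos(2*M_PI*secondsOfDay/86400.0);      // second input makes the position on the 24h "clock" unique
  }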

et voilà... we used only 2 inputs for continuous time information that is precise down to the second, instead of 24+60+60=144 inputs with the one-hot encoding method;

sin(2*M_PI*mon/12) and cos(2*M_PI*mon/12)    do the same for the month; this method works for all kinds of such "cyclic" variables.


Okay... now let's see if the multicurrency network version is training without any surprises and I'll come back later with some results...

Hey Chris, I PMed you
 
Neural networks made easy (Part 6): Experimenting with the neural network learning rate
  • www.mql5.com
We have previously considered various types of neural networks along with their implementations. In all cases, the neural networks were trained using the gradient descent method, for which we need to choose a learning rate. In this article, I want to show the importance of a correctly selected rate and its impact on neural network training, using examples.
 
Neural Networks Made Easy
  • www.mql5.com
Artificial intelligence is often associated with something fantastically complex and incomprehensible. At the same time, artificial intelligence is increasingly mentioned in everyday life. News about achievements related to the use of neural networks often appear in different media. The purpose of this article is to show that anyone can easily create a neural network and use the AI achievements in trading.
 
Neural networks made easy (Part 8): Attention mechanisms
  • www.mql5.com
In previous articles, we have already tested various options for organizing neural networks. We also considered convolutional networks borrowed from image processing algorithms. In this article, I suggest considering Attention Mechanisms, the appearance of which gave impetus to the development of language models.
 
Neural networks made easy (Part 9): Documenting the work
  • www.mql5.com
We have already come a long way, and the code in our library is becoming bigger and bigger. This makes it difficult to keep track of all connections and dependencies. Therefore, I suggest creating documentation for the code written so far and keeping it updated with each new step. Properly prepared documentation will help us see our work as an integral whole.
 
Neural networks made easy (Part 10): Multi-Head Attention
  • www.mql5.com
We have previously considered the mechanism of self-attention in neural networks. In practice, modern neural network architectures use several parallel self-attention threads to find various dependencies between the elements of a sequence. Let us consider the implementation of such an approach and evaluate its impact on the overall network performance.
 

Neural networks made easy (Part 11): A take on GPT - MT5

In June 2018, OpenAI presented the GPT neural network model, which immediately showed the best results in a number of language tests. GPT-2 appeared in 2019, and GPT-3 was presented in May 2020. These models demonstrated the ability of neural networks to generate coherent text. Additional experiments concerned generating music and images. The main disadvantage of such models is the computing resources they require: it took a month to train the first GPT on a machine with 8 GPUs. This disadvantage can be partially compensated by using pre-trained models to solve new problems, but considerable resources are still needed just to run a model of this size.

----------------

Neural networks made easy (Part 11): A take on GPT
  • www.mql5.com
Perhaps one of the most advanced models among currently existing language neural networks is GPT-3, the maximal variant of which contains 175 billion parameters. Of course, we are not going to create such a monster on our home PCs. However, we can view which architectural solutions can be used in our work and how we can benefit from them.
 

Machine learning in Grid and Martingale trading systems. Would you bet on it? - MT5


We have been working hard studying various approaches to using machine learning to find patterns in the forex market. You already know how to train models and implement them. But there are a large number of approaches to trading, and almost every one of them can be improved with modern machine learning algorithms. One of the most popular is the grid and/or martingale approach. Before writing this article, I did a little exploratory analysis, searching for relevant information on the Internet. Surprisingly, this approach has little to no coverage on the web. I ran a small survey among community members about the prospects of such a solution, and the majority answered that they did not even know how to approach the topic, but that the idea itself sounded interesting. And yet the idea is quite simple.

Let us conduct a series of experiments with two purposes. First, we will try to prove that this is not as difficult as it might seem at first glance. Second, we will try to find out if this approach is applicable and effective. 


Machine learning in Grid and Martingale trading systems. Would you bet on it?
  • www.mql5.com
This article describes a machine learning technique applied to grid and martingale trading. Surprisingly, this approach has little to no coverage on the web. After reading the article, you will be able to create your own trading bots.