Machine learning in trading: theory, models, practice and algo-trading - page 50
Even this sensible guy talks about it, watch from minute 10:
https://www.youtube.com/watch?v=KUdWTnyeBxo&list=PLDCR37g8W9nFO5bPnL91WF28V5L9F-lJL&index=3
In the video, the point is not to apply a Fourier series specifically, but to apply some function that transfers the data into another space (a kernel transform). The suggestion is to convert pictures to the HSL colour palette and sound to a frequency histogram. All of this changes the data, but it can easily be restored with the inverse function if necessary. The transformation should make some logical sense, so that after it the classes group together more easily in multidimensional space, which simplifies classification. If after some transformation the classes end up even more scattered in space than before, there is nothing good in that action - it will only make things worse.
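To illustrate the idea with a toy example of my own (not from the video): two classes lying on concentric circles are mixed in the raw (x, y) space, but a transform that makes logical sense - here simply the polar radius - separates them with a single threshold, and it is invertible if the angle is kept.

```python
import numpy as np

# Toy example: two classes on concentric circles are mixed up in raw (x, y)
# coordinates, but a meaningful transform (here: polar radius) makes them
# separable with a single threshold. The inverse transform (radius + angle
# back to x, y) restores the original data if needed.
rng = np.random.default_rng(0)
a0, a1 = rng.uniform(0, 2 * np.pi, 200), rng.uniform(0, 2 * np.pi, 200)
class0 = np.column_stack([np.cos(a0), np.sin(a0)])          # radius ~ 1
class1 = np.column_stack([3 * np.cos(a1), 3 * np.sin(a1)])  # radius ~ 3

def to_radius(points):
    return np.sqrt((points ** 2).sum(axis=1))

print(to_radius(class0).mean(), to_radius(class1).mean())   # ~1.0 vs ~3.0
```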
It is not necessary to use Fourier specifically, since it is not advised. Alexey, for example, suggests taking deltas between bars rather than raw values. Yury used various mathematical transformations in libVMR, which is interesting too - see his source code in Java. You can decompose the data with principal component analysis and feed it into a neural network; SanSanych and I have already published some articles and examples about this. What you want to do has hundreds of solutions, and Fourier is only one of them.
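A minimal sketch of the "deltas plus principal components" idea, assuming a synthetic price series and sklearn's PCA (window length and component count are arbitrary choices of mine):

```python
import numpy as np
from sklearn.decomposition import PCA

# Sketch, not anyone's actual code: take bar-to-bar deltas instead of raw
# prices, window them, and compress the windows with PCA before feeding a
# neural network / classifier. Window length and component count are arbitrary.
close = np.cumsum(np.random.randn(1000)) + 100.0   # synthetic close prices
deltas = np.diff(close)                            # deltas between bars

window = 20
X = np.array([deltas[i:i + window] for i in range(len(deltas) - window)])

pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X)                   # inputs for the network
print(X_reduced.shape, pca.explained_variance_ratio_.round(3))
```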
Or you can just take a neural network with dozens of layers, and it will work things out by itself on the raw data, without any kernel transformations. But then there will be a lot of problems with regularization, cross-validation, etc.
Also, none of these methods gets rid of the garbage-on-the-input problem. Garbage in -> garbage after the transformation -> garbage out -> failure on the forward test.
The question was just one: is it possible to measure the similarity between functions through amplitude, phase and frequency, and if so, how is it done?
THAT'S IT!!! I am not interested in anything else ...
Everything else written about Fourier is a consequence of CC's answer and is not relevant to my question.
In the video, the point is not to apply a Fourier series specifically, but to process the data with some function that transfers it into another space (a kernel transformation).
Or you can just take a neural network with dozens of layers, it will sort itself out on the raw data without any kernel transformations.
The kernel transforms in this video are manually selected by visualizing predictor pairs. That is, you need a human specialist who will visually find suitable pairs and select adequate kernel transformations for them.
Or you can just take a neural network with dozens of layers, it can handle the raw data on its own without any kernel transformations.
The video says that not only will it not figure it out, it will get even more confused. Besides, backpropagation barely adjusts the weights in layers far from the output layer - the error signal hardly reaches them, and when it does, it is very weak.
Yury used various mathematical transformations in libVMR, which is interesting too - see his source code in Java.
Not just various ones, but transformations composed algorithmically according to A. G. Ivakhnenko's group method of data handling (GMDH). If you take random ones, i.e. without any structure, you get rubbish.
If the functions are periodic, you can. If they are non-periodic, there will be errors when comparing the edges of the two functions (at the beginning the argument is 0, at the end of the period it is 2*PI).
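For the periodic case, here is one way such a comparison through amplitude, phase and frequency could look (my own illustration using the FFT; the amplitude-weighted phase distance is just one reasonable choice):

```python
import numpy as np

# Illustration only: compare two periodic series through the amplitude and
# phase of their Fourier components. Phase differences are weighted by
# amplitude, because phase is meaningless in bins with near-zero amplitude.
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
f1 = np.sin(3 * t) + 0.5 * np.cos(7 * t)
f2 = np.sin(3 * t + 0.3) + 0.4 * np.cos(7 * t)

F1, F2 = np.fft.rfft(f1), np.fft.rfft(f2)
amp1, amp2 = np.abs(F1), np.abs(F2)
ph_diff = np.angle(F1) - np.angle(F2)

amp_distance = np.linalg.norm(amp1 - amp2)
w = amp1 * amp2
phase_distance = np.sum(w * np.abs(np.angle(np.exp(1j * ph_diff)))) / np.sum(w)
print(amp_distance, phase_distance)
```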
I see, there will still be distortion at the ends, and the closer to the edge the stronger it is. Sorry, that doesn't fit...
Dr.Trader
how is your neural network doing? Has it finished training?
I found another note like this on a site which, for some reason, no longer works.
Maybe someone will be interested:
".... Following your recommendations I built a few robot models, the robots are learning and recognize something on the new data, but the results, alas, are still far from what I expected.
.... First I applied a polynomial-harmonic approximation to low-pass filters to get the first set of secondary features: one filter for the short-term trend and a second one for the long-term trend. As secondary features I took the frequencies, the cosine and sine amplitudes, and the P coefficients. ... The network learned, but had no generalization.
...The next step was a new model: apply a low-pass filter to the closing price (I used a 2nd-order Butterworth filter), apply the polynomial-harmonic approximation, convert A*cos(wx)+B*sin(wx) to the form M*sin(wx+f), and take M and f as secondary features.
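As an aside on the conversion mentioned in the quote: it boils down to M = sqrt(A^2 + B^2) and f = atan2(A, B), which is easy to verify numerically (arbitrary example values):

```python
import numpy as np

# Quick check of the identity used in the quote:
# A*cos(w*x) + B*sin(w*x) == M*sin(w*x + f), M = sqrt(A^2 + B^2), f = atan2(A, B).
A, B, w = 0.7, 1.2, 3.0          # arbitrary example values
M, f = np.hypot(A, B), np.arctan2(A, B)

x = np.linspace(0.0, 10.0, 1000)
assert np.allclose(A * np.cos(w * x) + B * np.sin(w * x), M * np.sin(w * x + f))
print(M, f)                      # the (M, f) pair is what gets used as secondary features
```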
.... And with this model I managed to build a network, which had very good generalization properties: it recognized almost all new data correctly.
Polynomial-harmonic approximation is time-consuming, so I decided to make another model: a set of band-pass filters with equal frequency spacing, applied to the low-pass-filtered closing prices, followed by a Hilbert transform. I also managed to build a network for an artificial market model that successfully recognized new data.
After that, I applied this model to real quotes:
- we filter the closing price with an adaptive low-pass filter;
- we build a set of band-pass filters to pick out the market waves;
- we apply the Hilbert transform;
- the first set of secondary features: the band-pass filter outputs and the instantaneous amplitudes and phases;
- we build a low-pass filter of the close price;
- the second set of secondary features: the relative deviations of the last candle's close and low prices and of the support and resistance levels from the low-pass filter value, plus the bar's volume relative to its average;
- we assemble a training sample.
As a result we have the following: the network trains, but performs poorly on new data. In some places it predicts the minima accurately, and in others it "forgets what it is supposed to do". Compared to what I was doing before, the result seems to be of a qualitatively different level, but that is my subjective opinion. Let me also add that I followed the rules of thumb: the number of network inputs (the secondary features here) < training sample size / 10, and the number of neurons in the hidden layer does not exceed the number of network inputs...". I hope these excerpts from the letter give you an idea of possible approaches to feature search.
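My rough reading of the quoted pipeline as code, with made-up filter parameters and a synthetic series; scipy's Butterworth, band-pass and Hilbert routines stand in for whatever the author actually used:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

# My reading of the quoted pipeline, with made-up parameters and a synthetic
# series: low-pass the close price (Butterworth), run a small bank of band-pass
# filters, and take instantaneous amplitude/phase from the Hilbert transform.
close = np.cumsum(np.random.randn(5000)) + 100.0

b, a = butter(2, 0.05)                       # 2nd-order low-pass, normalized cutoff
lf = filtfilt(b, a, close)

features = []
for low, high in [(0.02, 0.06), (0.06, 0.12), (0.12, 0.25)]:   # hypothetical bands
    bb, ba = butter(2, [low, high], btype="band")
    band = filtfilt(bb, ba, lf)
    analytic = hilbert(band)
    features.append(np.abs(analytic))                # instantaneous amplitude
    features.append(np.unwrap(np.angle(analytic)))   # instantaneous phase

X = np.column_stack(features)                        # secondary features, one row per bar
print(X.shape)
```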
I see, there will still be distortion at the ends, and the closer to the edge the stronger it is. Sorry, that doesn't fit...
The question of whether it will fit or not is trivial to settle. First decompose the function into a Fourier series, then restore it from the resulting series by the inverse transform. Then compare the original function before decomposition with the restored one. If the reconstructed function differs radically from the original, it becomes quite obvious that this method will not work.
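A sketch of that check using the discrete Fourier transform (my own illustration; the truncated case also shows where the edge distortion mentioned above comes from):

```python
import numpy as np

# The suggested check: decompose with the FFT, restore with the inverse FFT,
# compare with the original. With the full spectrum the reconstruction is exact;
# truncating the spectrum shows where the error appears.
x = np.cumsum(np.random.randn(512))          # a non-periodic series, e.g. prices

spectrum = np.fft.rfft(x)
restored = np.fft.irfft(spectrum, n=len(x))
print(np.max(np.abs(x - restored)))          # tiny: exact up to rounding

k = 20                                       # keep only the first k harmonics
truncated = spectrum.copy()
truncated[k:] = 0
approx = np.fft.irfft(truncated, n=len(x))
print(np.abs(x - approx)[[0, len(x) // 2, -1]])   # errors concentrate at the edges
```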
For non-periodic functions, wavelet transforms are recommended instead. I haven't tried them myself. But judging by how an image is first compressed with wavelets and then restored with distortions that are not visually noticeable, compared to those obtained from Fourier transforms, it seems clear that wavelets are more adequate for non-periodic series than Fourier. Since I have no practical experience applying wavelets to price series, I cannot share anything useful about which wavelets are the most adequate here. And there are heaps of them to choose from.
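The same compress-and-restore check with wavelets, sketched with the PyWavelets package ('db4' and the decomposition level are arbitrary choices, not a recommendation):

```python
import numpy as np
import pywt  # PyWavelets

# Same compress-and-restore check, but with a wavelet decomposition instead of
# Fourier. 'db4' and level=4 are arbitrary defaults, not a recommendation.
x = np.cumsum(np.random.randn(512))

coeffs = pywt.wavedec(x, "db4", level=4)
coeffs[-1] = np.zeros_like(coeffs[-1])            # drop the finest detail level
restored = pywt.waverec(coeffs, "db4")[:len(x)]   # may be padded by one sample

print(np.max(np.abs(x - restored)))
```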
How's your neural network doing?
The question of whether it fits or not is trivial to settle. First decompose the function into a Fourier series, then restore it from the resulting series by the inverse transform. Then compare the original function before decomposition with the restored one. If the reconstructed function differs drastically from the original, it is obvious that this method will not work.
A bit of background first...
I started by looking for patterns in history using the following principle:
1) we take the current price situation - say, the last 20 candles
2) we loop through the history and look for a similar situation in the past (proximity was measured by Pearson correlation and Euclidean distance)
3) when we find such a situation, we look at how it ended - with growth or decline
4) once we have found many such analogues, we can collect statistics with some prevailing outcome; for example, we found 10 analogues,
8 of which ended in growth and 2 in decline - there's your forecast of what will happen to the price :) (a rough sketch of this search is given right below)
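A rough sketch of such a search (not the author's code; window, horizon and threshold are arbitrary):

```python
import numpy as np

# Rough sketch of the described search, not the author's code. Window, horizon
# and correlation threshold are arbitrary. The loop excludes the most recent
# bars so the "analogues" do not overlap the current pattern.
close = np.cumsum(np.random.randn(10000)) + 100.0
window, horizon, threshold = 20, 5, 0.93

current = close[-window:]
ups = downs = 0
for i in range(len(close) - 2 * window - horizon):
    past = close[i:i + window]
    if np.corrcoef(current, past)[0, 1] > threshold:
        outcome = close[i + window + horizon - 1]
        if outcome > past[-1]:
            ups += 1
        else:
            downs += 1

print(ups, downs)   # e.g. 8 vs 2 would be read as a bullish vote
```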
That is, I did something similar to Ivakhnenko's "analogue complexing" method (model-free forecasting), but my version is much more primitive.
Well, this approach turned out not to work, for a number of reasons:
1) the more candles in the pattern vector, the fewer analogues can be found in history; in practice, if you take all the OHLC prices and require a Pearson correlation > 0.93, a three-candle vector is about the limit, while a decent forecast requires a much longer vector
2) in practice, there are almost no identical price situations in the market...
3) so the vector cannot be lengthened without a large loss of accuracy, and even if it could, no analogue would be found for a longer vector anyway, since there are no identical situations
The solution has been found.....
It depends on the task and the data. If you slip it a sample of random numbers, it mumbles "garbage in, garbage out". But if you give it a sample of significant predictors, it produces decent generalization values. It doesn't take long to train, at least on a sample of a dozen predictors and a few thousand examples.
A bit of background first...
I started by looking for patterns in history using the following principle:
1) we take the current price situation - say, the last 20 candles
2) we loop through the history and look for a similar situation in the past (proximity was measured by Pearson correlation and Euclidean distance)
Not quite clear: are you taking the OHLC prices as-is for the patterns, or some transformation of them?
The point is that if you take OHLC as-is, a similar pattern located 1,000 points higher or lower than the one being compared will differ in Euclidean distance more than a completely dissimilar pattern located only 10 points higher or lower. Moreover, the difference will be about two orders of magnitude, and the comparison error will accordingly be about two orders of magnitude as well.
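A small illustration of this with made-up numbers: the raw Euclidean distance is dominated by the price level, while normalizing each pattern (z-score here, though returns or deltas work just as well) makes the comparison depend only on shape:

```python
import numpy as np

# Illustration of the point above (made-up numbers): the raw Euclidean distance
# is dominated by the price level, while normalising each pattern (z-score here)
# makes the comparison depend on shape only.
pattern = np.array([100.0, 101.0, 102.5, 101.5, 103.0])
same_shape_1000_higher = pattern + 1000.0
different_shape_10_higher = np.array([100.0, 99.0, 101.0, 98.5, 100.5]) + 10.0

def zscore(p):
    return (p - p.mean()) / p.std()

print(np.linalg.norm(pattern - same_shape_1000_higher))      # huge, same shape
print(np.linalg.norm(pattern - different_shape_10_higher))   # smaller, different shape
print(np.linalg.norm(zscore(pattern) - zscore(same_shape_1000_higher)))     # ~0
print(np.linalg.norm(zscore(pattern) - zscore(different_shape_10_higher)))  # clearly larger
```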