Neuromongers, don't pass by :) need advice

 
Figar0:

Here is a point-by-point discussion:

Thanks Sergey, you got the point.

2) Pre-processing of inputs (the question seems simple yet quite open; we can discuss it once we know what is done here and how).

Nothing extraordinary. There are several levels, each processed separately with a Hodrick-Prescott filter, without look-ahead (no peeking into future data).
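A minimal sketch of one way "without peeking" can be done in practice (my assumption, not the poster's code): the filter is re-run on an expanding window and only the most recent smoothed value is kept, so no future bars influence the current value. It assumes numpy and statsmodels are available; the lambda and window sizes are purely illustrative.

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

def hp_smooth_causal(series, lamb=1600, min_window=50):
    """Causally HP-smooth a 1-D numpy array by refitting on expanding windows."""
    out = np.full(len(series), np.nan)
    for t in range(min_window, len(series) + 1):
        cycle, trend = hpfilter(series[:t], lamb=lamb)  # filter sees only the past
        out[t - 1] = np.asarray(trend)[-1]              # keep only the latest value
    return out
```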

3) Mathematics of NS.

A number of experiments have shown that, within certain limits, the network parameters have a negligible effect on the results. Too small a network leads to undertraining, too large a one to oversaturation.

On the subject of echo networks, I'm ready to chat. I won't post the code yet, I have some plans.

4) 'Organisational' issues of NS operation.

How/when to train/retrain

By the way, I haven't tried changing that.

, periods/intervals

Didn't do any serious research here either. I think there will be an impact, maybe even different periods depending on the instrument.

, the logic of the Expert Advisor that interprets the network output

No serious research here either, but judging by what I have changed, I don't think there will be any significant effect, although... I'll have to check again.

, MM.

I don't see the point of adding it at all. Potential profitability is easy to estimate with FS.

- why "echo"? You may have been steamed in it, tell me about the pros and cons.

Well, for one thing, there's less fiddling with the network parameters. No need to worry that, say, if the hidden layer is smaller than the input layer you are already compressing data and there is a high probability the network won't work, and so on. You don't have to keep track of the pile of little things that other network types require.

In brief, this is how I work with the network now: I just throw in neurons and connections (a certain number of them, of certain types).

I adapt it and use it. I don't really care what happens inside, so I basically get a handy black box.

Virtually any problem solved by MLP is solved by the echo network.


Secondly, I always get the optimal solution for the given topology and input/output correspondence.

Thirdly -- the adaptation time (I deliberately avoid the word "learning") of the network can be predicted very accurately, because the fitting is done by ordinary least squares; there are no convergence issues, etc.


So far I've seen only one disadvantage - the limitation on the fitness function. I.e. in theory I can only look for the solution with the smallest RMS error and the like. Of course, this can be bypassed with genetic learning, but then all the beauty of the echo network is lost.

Although no, there is another one. I'm not certain, but in my opinion (I may be wrong) the training time grows cubically (not so much the training itself as the forming of the derivative matrix), so training a network with, say, 1000 neurons will take considerable time.
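For readers who haven't met echo networks, here is a rough numpy sketch (my illustration, not the implementation discussed): the reservoir weights are generated randomly and left untouched, and only the linear readout is fitted, by ordinary least squares. That is why the adaptation time is predictable, why the result is optimal for the given topology in the least-squares sense, and why the effective fitness function is limited to RMS-type error. All names and sizes below are illustrative.

```python
import numpy as np

def train_esn(inputs, targets, n_res=100, spectral_radius=0.9, seed=0):
    """Fit a minimal echo state network: random reservoir + OLS readout."""
    rng = np.random.default_rng(seed)
    n_in = inputs.shape[1]
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))         # fixed input weights
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))            # fixed reservoir weights
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))  # echo state property
    x = np.zeros(n_res)
    states = []
    for u in inputs:                                       # drive the reservoir
        x = np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    X = np.hstack([np.array(states), inputs])              # reservoir states + direct inputs
    W_out, *_ = np.linalg.lstsq(X, targets, rcond=None)    # readout fitted in one OLS solve
    return W_in, W, W_out
```

The cost of that closed-form solve grows roughly cubically with the number of reservoir neurons, which may well be the cubic growth mentioned above.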


How did you dig it up in the first place?

Thanks to this same forum :) -- I got it from my friend gpwr, whom I want to thank very much :)

The 2nd type of TC is not good at all, imho.

Imho, type 2 is much easier to work with and to analyze results for. For example, the TC of the project under discussion originally fit type 2 completely.

a) Are you really sure that the inputs/outputs cannot be improved?

Of course not.

b) Pre-processing: What does it look like? For example, did you analyze the distribution of the input values?

Normalization is present in some form, but there is no serious analysis of data distribution.

 
renegate:

Have you done any devolatilisation (found in the articles) of the indicators that you feed into the network input?

I've had a look at it -- interesting. Maybe you can share your experience of using it? What are the results, improvements, features, pitfalls?

You could also try making the indicators dimensionless.

Erm, I have doubts here, but I would still like to hear a short characterisation too.
 

Let's conventionally represent the analysed data area (the pattern) with a blue rectangle and the predicted data area with a red rectangle. In the current implementation, the vertical size of the red area depends, through a scale factor, on the size of the blue area (whereas it should depend on the data content of the blue area, not on its size). Here are two examples where we see a discrepancy:

(two screenshots)

We can see that the red rectangle is smaller than the blue one in the first screenshot and larger than it in the second.

The normalization of the signal is relative to the vertical size.

So I think we should normalize not by the size of each pattern but by the size of the whole training sample; the current per-pattern scaling seems to reduce the predictive ability of the network.

There is one inconvenience related to this (it is why I chose the current way of normalizing in the first place), but there seems to be no escaping it: we have to run through the training sample once more to get the maximum and minimum values.

It is clear that in the current implementation the distribution of signals within a pattern is heavily skewed (which is bad) toward the maximum and minimum values, because every pattern contains both the value 1 and the value -1.
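To make the two schemes concrete, here is an illustrative sketch (hypothetical helper names, not from the project): per-pattern min-max scaling forces every pattern to contain both -1 and +1, which is exactly the skew described above, while scaling by the global minimum and maximum of the whole training set avoids it at the cost of one extra pass over the data.

```python
import numpy as np

def normalize_per_pattern(pattern: np.ndarray) -> np.ndarray:
    # current scheme: every pattern is stretched to [-1, 1], so each one
    # necessarily contains the values -1 and +1
    lo, hi = pattern.min(), pattern.max()
    return 2.0 * (pattern - lo) / (hi - lo) - 1.0

def normalize_global(pattern: np.ndarray, train_lo: float, train_hi: float) -> np.ndarray:
    # proposed scheme: scale by the min/max of the whole training sample,
    # obtained in a preliminary pass, e.g.
    #   train_lo, train_hi = training_data.min(), training_data.max()
    return 2.0 * (pattern - train_lo) / (train_hi - train_lo) - 1.0
```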

In my opinion, we should start with these changes.

 
That's not how you explained it to me :). Now I guess I agree.
 
TheXpert:
That's not how you explained it to me :). I guess I agree now.

No, not the other way around. Like I said, it's hard to explain things in words; pictures are easier, for both the speaker and the listener. ;)

PS As for the prediction area when training on profit, that still stands; I'm working on it.

 

I experimented with the price using the following algorithm:

1) Get the series of first differences (FDD) from Close

2) Calculate a moving average of the absolute value of the FDD (I used a period of 25)

3) Divide the FDD by this moving average

We obtain an FDD that is more stationary. You can get back to a pseudo-price series using the cumulative sum.
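A sketch of that recipe (pandas is my assumption; the period of 25 is the one quoted above):

```python
import pandas as pd

def devolatilize(close: pd.Series, period: int = 25) -> pd.Series:
    fdd = close.diff()                       # 1) series of first differences of Close
    vol = fdd.abs().rolling(period).mean()   # 2) moving average of |FDD|
    return fdd / vol                         # 3) divide: a more stationary series

# back to a pseudo-price series via the cumulative sum:
# pseudo_price = devolatilize(close).cumsum()
```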


I see that you are not using the FDD. Are you using trend removal for the price series? Or do you simply normalise the price series to a given range?

 

renegate:

Are you using trend removal for the price series?

More details here as well.

Or do you simply normalise the price series into a given range?

At the moment the normalization is performed within the pattern description.

I will now do normalisation across the whole set of patterns. This shouldn't be too difficult.

I would also like to try adding devolatilization, but that will be more complicated here. I will think about it.

 

To do this, it is necessary to accept the axiom that the price series consists of trend, cyclical, and noise components.

We subtract the trend component from the price series. I can think of 3 ways:

1) Do principal component analysis (PCA) and zero out the first principal component.

2) Subtract a moving average from the price series. Its period can be selected by eye, through optimization, or by spectrum analysis.

3) Fit a linear regression to the whole price series and subtract it from the price.

After that we get a series containing only the cyclical and noise components, which it is convenient to normalise into a given range.
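Here is a rough numpy sketch of options 2 and 3 (my illustration, not from the thread; the moving-average period and the use of a centered average are arbitrary choices, and a causal average would lag instead of looking ahead):

```python
import numpy as np

def detrend_moving_average(price: np.ndarray, period: int = 100) -> np.ndarray:
    kernel = np.ones(period) / period
    ma = np.convolve(price, kernel, mode="same")   # centered moving average as the trend
    return price - ma                              # cyclical + noise remainder

def detrend_linear(price: np.ndarray) -> np.ndarray:
    t = np.arange(len(price))
    slope, intercept = np.polyfit(t, price, 1)     # least-squares trend line
    return price - (slope * t + intercept)
```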

 
That's actually the main component I'm looking for :)
 
renegate:

I experimented with the price using the following algorithm:

1) Get the series of first differences (FDD) from Close

Not a trick question: why this step?