Machine Learning in Trading: Theory, Models, Practice and Algo-Trading - Page 1068

 
FxTrader562:

So the number 100 or 1000 or 500, etc., has to be the same in both places: in CopyClose() and in the declaration, right?

yes
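
For context, a minimal MQL5 sketch of what "the same number in both places" means; BARS_COUNT is a hypothetical name, not from the actual EA:

        #define BARS_COUNT 500                                     // the shared constant

        double close_buf[];
        ArrayResize(close_buf, BARS_COUNT);                        // the declaration/sizing uses the constant
        int copied = CopyClose(_Symbol, PERIOD_CURRENT, 0, BARS_COUNT, close_buf); // and so does CopyClose()
        if(copied != BARS_COUNT)
           Print("CopyClose returned ", copied, " bars instead of ", BARS_COUNT);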

 
Maxim Dmitrievsky:

yes

Ok, but in your current sample code and implementation I am not sure what exactly happens during training, and what the difference is between agents and models :))

I hope you will explain this in your article when you publish it. I mean, what an agent does and what a model does using kernels...

 
FxTrader562:

Ok, but in your current sample code and implementation I am not sure what exactly happens during training, and what the difference is between agents and models :))

I hope you will explain this in your article when you publish it. I mean, what an agent does and what a model does using kernels...

Every RL agent can have unique predictors; then we average the results of all agents.

The number of models is the number of iterations of feature transformations with cos. Forget about it for now, because we are making GMDH.
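
A minimal sketch of the averaging idea, assuming each agent has already produced its own output from its own predictor set; EnsemblePrediction is a hypothetical name, not the actual code:

        // average the outputs of all RL agents into one ensemble signal
        double EnsemblePrediction(const double &agent_outputs[])
          {
           double sum = 0.0;
           int n = ArraySize(agent_outputs);
           for(int i = 0; i < n; i++)
              sum += agent_outputs[i];
           return (n > 0) ? sum/n : 0.0;
          }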

 
Maxim Dmitrievsky:

Every RL agent can have unique predictors; then we average the results of all agents.

The number of models is the number of iterations of feature transformations with cos. Forget about it for now, because we are making GMDH.

Yes, right. You can try to use GMDH and let me know if you make progress or get stuck in the implementation, because in any case we can only draw conclusions about the algo after seeing LIVE results. But looking into the GMDH algo, it seems very promising.

By the way, try to use the natural log in the optimisation and training formulas. In my experience, using MathPow() with exponents seems to converge to a solution rather quickly.

 
FxTrader562:

Yes, right. You can try to use GMDH and let me know if you make progress or get stuck in the implementation, because in any case we can only draw conclusions about the algo after seeing LIVE results. But looking into the GMDH algo, it seems very promising.

By the way, try to use the natural log in the optimisation and training formulas. In my experience, using MathPow() with exponents seems to converge to a solution rather quickly.

We can also use trigonometric polynomials. This will be something like "recursive feature elimination", not actually GMDH... something in between )

because GMDH is a linear-quadratic algorithm, but we use RDF
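
A minimal sketch of what such a trigonometric feature expansion could look like, assuming each raw input is expanded into cos() terms of increasing order; TrigFeatures is a hypothetical name:

        // expand each raw input into cos() terms of order 1..order,
        // i.e. a trigonometric polynomial basis for the RDF predictors
        void TrigFeatures(const double &inputs[], int order, double &features[])
          {
           int n = ArraySize(inputs);
           ArrayResize(features, n*order);
           for(int i = 0; i < n; i++)
              for(int k = 1; k <= order; k++)
                 features[i*order + (k - 1)] = MathCos(k*inputs[i]);
          }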
 
Maxim Dmitrievsky:

We can also use trigonometric polynomials. This will be something like "recursive feature elimination", not actually GMDH... something in between )

I don't know about that... I have to read up to understand :))... In fact, I didn't know anything about GMDH until you told me yesterday, and I just learnt it and wrote the code... I think I am learning fast :)))))

What I am referring to is that when you are approximating a random function to get a solution, using the natural log or the exponent generally converges quickly. Why? Because that is the definition and purpose of the natural log (ln) and the exponent (e).

Here is example code of what I am referring to:

        #include <Math\Stat\Math.mqh>                              // for MathRandomUniform()

        int unierr = 0;                                            // error code filled by MathRandomUniform()
        double x = MathRandomUniform(0, 1, unierr);                // uniform random draw in [0, 1]
        double likelihood = 1.0/(1.0 + MathExp(MathPow(x, 3)));    // logistic-style transform of x^3

I understand GMDH somewhat... but RDF is still not 100% clear. I was just trying to implement Monte Carlo instead of RDF, but if we can do it with RDF, then I don't see the use of Monte Carlo. Which do you think is better, Monte Carlo or RDF?

But I will summarise here what I am expecting from this algo:

1. It will take the indicators or close prices, break them into m small pieces, and create polynomials or approximate functions during training.

2. When we run it in trading, then for every candle it will check the past training data, find which polynomial piece matches our current price, predict what is going to happen next, and iterate.

 
FxTrader562:

I don't know about that... I have to read up to understand :))

What I am referring to is that when you are approximating a random function to get a solution, using the natural log or the exponent generally converges quickly. Why? Because that is the definition and purpose of the natural log (ln) and the exponent (e).

Here is example code of what I am referring to:

        #include <Math\Stat\Math.mqh>                              // for MathRandomUniform()

        int unierr = 0;                                            // error code filled by MathRandomUniform()
        double x = MathRandomUniform(0, 1, unierr);                // uniform random draw in [0, 1]
        double likelihood = 1.0/(1.0 + MathExp(MathPow(x, 3)));    // logistic-style transform of x^3

I understand GMDH somewhat... but RDF is still not 100% clear. I was just trying to implement Monte Carlo instead of RDF, but if we can do it with RDF, then I don't see the use of Monte Carlo. Which do you think is better, Monte Carlo or RDF?

But I will summarise here what I am expecting from this algo:

1. It will take the indicators or close prices, break them into m small pieces, and create polynomials or approximate functions during training.

2. When we run it in trading, then for every candle it will check the past training data, find which polynomial piece matches our current price, predict what is going to happen next, and iterate.

RDF approximates the agent's policy directly; on the other hand, Q-learning with Monte Carlo or TD and Markov chains does it with too many iterations, so it can take much longer.

1, 2: yes, absolutely right.
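
To make points 1 and 2 concrete, here is a minimal sketch of the piecewise idea under simplifying assumptions (a degree-1 polynomial per piece, nearest-neighbour matching by slope); FitSlope, Train and Predict are hypothetical names, not the actual RDF code:

        #define PIECE_LEN 20                                       // bars per piece

        // least-squares slope of one window (a degree-1 "polynomial")
        double FitSlope(const double &p[], int start)
          {
           double sx = 0, sy = 0, sxy = 0, sxx = 0;
           for(int i = 0; i < PIECE_LEN; i++)
             {
              sx += i; sy += p[start + i];
              sxy += i*p[start + i]; sxx += i*i;
             }
           return (PIECE_LEN*sxy - sx*sy)/(PIECE_LEN*sxx - sx*sx);
          }

        // training: break the close series into m pieces, one slope per piece
        void Train(const double &close[], double &slopes[])
          {
           int m = ArraySize(close)/PIECE_LEN;
           ArrayResize(slopes, m);
           for(int j = 0; j < m; j++)
              slopes[j] = FitSlope(close, j*PIECE_LEN);
          }

        // run time: slope of the latest window, matched to the nearest
        // stored piece; that piece's slope is used as the prediction
        double Predict(const double &close[], const double &slopes[])
          {
           double cur = FitSlope(close, ArraySize(close) - PIECE_LEN);
           int best = 0;
           for(int j = 1; j < ArraySize(slopes); j++)
              if(MathAbs(slopes[j] - cur) < MathAbs(slopes[best] - cur))
                 best = j;
           return slopes[best];
          }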

 
Maxim Dmitrievsky:

RDF approximates the agent's policy directly; on the other hand, Q-learning with Monte Carlo or TD and Markov chains does it with too many iterations, so it can take much longer.

1, 2: yes, absolutely right.

So you mean RDF is better and faster than Monte Carlo, which is definitely required for instant trading decisions upon candle close... So we are on the right path towards creating the forex version of "ALPHA ZERO"... let's see :)))))))))

 
FxTrader562:

So you mean RDF is better and faster than Monte Carlo, which is definitely required for instant trading decisions upon candle close... So we are on the right path towards creating the forex version of "ALPHA ZERO"... let's see :)))))))))

Monte Carlo is not deep learning; RDF is deep. AlphaZero uses DQN, as far as I know, with a neural net, but not pure Q-learning. What we have is actually not classical RL; I don't know what we have... hand-made :) but it looks like it works.

 
Maxim Dmitrievsky:

Monte Carlo is not deep learning; RDF is deep. AlphaZero uses DQN, as far as I know, with a neural net, but not pure Q-learning. What we have is actually not classical RL; I don't know what we have... hand-made :) but it looks like it works.

Everything seems okay except the "UpdateReward()" function, where we need to implement profits, NOT the profit/loss trade count. Then we can just create an exact copy of "ALPHA ZERO" based on my algo, which I explained previously using candle simulations, and it will just converge over time after a few million simulations. We can do the same for each currency pair and save it for future use...

But anyway, I will somehow implement profits in the reward function after I have thoroughly understood your RDF implementation... But if you get an idea, you can try to implement it, or just let me know how to feed profits in, so that the agent will check the overall profits, optimise the policy accordingly, and make the next trading decision based on profits and losses, NOT based on just profit or loss counts.

Also, I am not good at matrix implementation, and hence until now I was unable to handle profits to implement such a policy.
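
A minimal sketch of what a profit-based reward could look like, assuming an UpdateReward() hook like the one mentioned above; the signature shown here is hypothetical, not the actual RDF code:

        double total_reward = 0.0;                                 // cumulative profit-based reward

        // reward proportional to the realised profit of the last closed
        // trade relative to the balance, instead of a win/loss count
        void UpdateReward(double last_trade_profit, double account_balance)
          {
           if(account_balance > 0.0)
              total_reward += last_trade_profit/account_balance;
          }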
