Machine learning in trading: theory, models, practice and algo-trading - page 1272

 
Maxim Dmitrievsky:

I'm tired of arguing about obvious things; their article says it all. Let everyone understand it however they want.

With a little more abstraction, it becomes clear why playing against the market is the same thing.

And I suggest we at least discuss it in the terms they themselves use, not in cleverly invented ones. Otherwise the argument is about nothing.

Try to find analogies between trading and gaming, even taking into account the same dynamics of their probability balance, which is influenced by both sides of the process. Let's examine the problem on its merits instead of hiding behind terminology.

 
Aleksey Vyazmikin:

Try to find analogies between trading and gaming, even taking into account the same dynamics of their probability balance, which is influenced by both sides of the process. Let's examine the problem on its merits instead of hiding behind terminology.

For the last time, then I won't write any more.

The RL agent doesn't care what it plays against - the market or another opponent in SC - it doesn't understand the difference, because it's a program. The rest is purely your "know-how".

It doesn't matter whether the opponent is static or dynamic; in either case the agent will learn the optimal policy.

Pull yourself together and you'll figure it out. Someday.

 
Maxim Dmitrievsky:

For the last time, then I won't write any more.

The RL agent doesn't care what it plays against - the market or another opponent in SC - it doesn't understand the difference, because it's a program. The rest is purely your "know-how".

It doesn't matter whether the opponent is static or dynamic; in either case the agent will learn the optimal policy.

Pull yourself together and you'll figure it out. Someday.

The name of the training method is secondary. I've been trying to talk to you about predictors for a long time.

How can you not understand that the neural network can learn to influence the situation and, depending on how effective that influence is, to change the probability of the event's outcome? That is precisely the advantage of such a network - the ability to influence the situation. At each frame a decision is made about what to do to improve the target metric (that very graph); the activity is a multi-step process; and the final probability of victory or defeat is not fixed at the moment the game starts but changes constantly, in part because of the players' actions. This is exactly the biggest difference from trading.

I didn't say it's impossible to devise an RL method to teach trading; I was talking about the effectiveness of a network that influences the situation to reach its goal, instead of one that just passively guesses what the opponent will do (where the price will go).
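
As a sketch of the distinction being drawn here (the formalization and the small-trader assumption are mine, not the posters'): in a game, the next state depends on the agent's action,

$s_{t+1} \sim P(s_{t+1} \mid s_t, a_t)$,

so choosing $a_t$ genuinely shifts the probability of winning. For a small retail trader, by contrast, the price path is approximately independent of the trader's own actions,

$p_{t+1} \sim P(p_{t+1} \mid p_t, p_{t-1}, \dots)$,

and the action $a_t$ changes only the trader's position and equity, never the price itself.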

 
Maxim Dmitrievsky:

Why get emotional? Better to write a reasoned rebuttal to my arguments.

 
Aleksey Vyazmikin:

This is called an OPTIMAL POLICY, or STRATEGY, which takes into account all possible opponent behavior.

Read a book, don't embarrass yourself. I've written to you 100 times already: what you're trying to express fits into a couple of words.

Why are you being so ridiculous?
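
For reference, the term has a standard definition in the RL literature (e.g., in Sutton and Barto): the optimal policy maximizes expected discounted return and satisfies the Bellman optimality equation

$V^*(s) = \max_a \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma V^*(s') \big], \qquad \pi^*(s) = \arg\max_a \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma V^*(s') \big].$

Against a fixed opponent, the opponent's behavior is simply folded into the transition probabilities $P(s' \mid s, a)$; nothing in the equation distinguishes a game opponent from any other source of randomness.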

 
Maxim Dmitrievsky:

This is called an OPTIMAL POLICY, or STRATEGY, which takes into account all possible opponent behavior.

Read a book, don't embarrass yourself. I've written to you 100 times already: what you're trying to express fits into a couple of words.

Why are you being so ridiculous?

I'm not familiar with the term, and the Internet is just as reluctant to say anything about it - give me a link and I'll see whether it's really called that.

And if my description does fit under some term, then I don't understand your objection on the merits. It's not about the terms, but about influencing the situation to achieve the long-term goal - winning the game - through a chain of actions that may vary depending on the opponent's actions.

The point is the difference in the environment in which decision-making takes place: in one you can interact with the environment, and in the other you can't - you just observe through the glass.

How do you know that I don't read foreign books... I have looked through them, and yes, they are more advanced than the Russian-language Internet.

 
Maxim Dmitrievsky:

If you throw out more than half of the superfluous words and keep the chain of actions (Markov chains) with probabilistic transitions, it's already better.

All the other nonsense can be dropped.

The agent/environment split is always there; there is no "observing through the glass". Again, this is a level of abstraction that is not accessible to everyone. And here again you are running into a wall, because you are making things up instead of studying them.

I'm writing this for the last time and ending this bacchanalia: there is no "influence", there are transition probabilities and policy approximations.
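
As a minimal sketch of what "transition probabilities and policy approximation" look like in code (an illustration, not either poster's code; the action labels are hypothetical), one-step tabular Q-learning consumes sampled transitions and never asks who produced them:

import random
from collections import defaultdict

ACTIONS = [0, 1, 2]  # hypothetical labels, e.g. buy / sell / hold
alpha, gamma, eps = 0.1, 0.99, 0.1
Q = defaultdict(lambda: [0.0] * len(ACTIONS))

def choose(s):
    # epsilon-greedy over the current value estimates
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[s][a])

def update(s, a, r, s_next):
    # move Q(s, a) toward r + gamma * max_a' Q(s_next, a')
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

Whether the opponent is static or adaptive changes only the distribution the transitions (s, a, r, s_next) are sampled from, not the update rule - which is the sense in which the agent "doesn't care" what it plays against.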

I don't know how you think, but terms only distort the essence of a thought, unless we are talking about axioms established long ago and not subject to verification.

You couldn't provide an analogy; appealing to terminology is unproductive.

 
Aleksey Vyazmikin:

How do you know that I don't read foreign books... I have looked through them, and yes, they are more advanced than the Russian-language Internet.

Well, there's nothing else to read anyway. Sutton and Barto, "Reinforcement Learning" - there's a translation on the Internet; it's an old book, but still useful.