Machine learning in trading: theory, models, practice and algo-trading - page 2755

 
Valeriy Yastremskiy #:

I am looking for an idea for determining behaviour through a grid of levels (either uniform, or built from historical extrema). It seems both approaches need to be applied. While I work on the problem: what should be computed from the series to set the correct grid step, and how often does the grid need to be changed?

Maybe it would be better to just take some situation on the market with a level, then find similar situations, combine them into a group, and then try to separate this group from everything else...

 
mytarmailS #:

Maybe it's better to just take some situation on the market with a level, then find similar situations, group them together and then try to separate this group from everything else.

Maybe it is better. But that is a different job. Initially you could take a chart of news (at least somehow formalised) and events (which would need to be formalised) and chase the same patterns, but this has already been done and is being done. The question is how to formalise: what we have today does not give profitable results.

It is more profitable to use the same time interval.

I am still fascinated by the idea of correctly defining the state. Sanych liked the conclusion that a change of behaviour quite often drains the deposit without warning. My conclusion: the price changes behaviour quite often, over a wide enough range, and apparently with different patterns before each change of behaviour. I cannot yet formulate more precisely what this means. In mathematics, it seems an infinite number of patterns is needed to describe a random walk. For the market, we still do not have enough models to describe it accurately enough.

In general, what should be computed on the series to correctly calculate the grid step, and how often should the step be changed?

Why a grid? Because it is one of the simplest ways to track state: everything between levels is ignored, only transitions from level to level are recorded (trend, return, i.e. an extremum)... There are problems with it, of course, but it is at least a crude, workable formalisation of the state.
 
JeeyCi #:

Yeah, I like that word (something so native to event-driven programming) -

In the variant voiced by SanSanych Fomenko, something like this seems to be implemented: outlier -> entry (or exit) signal... I mixed together a mess of dim_reduction and multidimensional classification methods (LDA, clustering) above, but the essence of Mahalanobis has probably always been primarily outlier/novelty detection in multidimensional space anyway. So the option of trading on outliers looks very nice (only the feature_engineering should be done correctly, not a dumb set of initial data to run fs over)...

But the "sliding window" is still confusing (although the usual autoregressive model is standard for time-series trading): there can be a mess inside the window too. I assume the window boundaries mark the market entries of the smart money, who incidentally use Mean-Variance Optimisation in their portfolio management. We do not know their portfolio, of course (only roughly from SoTs, retrospectively), but within this variance they probably rebalance their portfolio while taking profit on, or loading up against, retail...

Trading on outliers is certainly an interesting option, but knowing who is on the other side (OTF, or DTF, i.e. day traders) is also important for interpreting the outlier correctly...

p.s..

Well, or take not Mahalanobis outliers but the extreme deciles of the prediction distribution, for risk-acceptance (vs. risk-avoidance) behaviour of, e.g., the actor (with corresponding switching of its state according to environment parameters).
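As an aside, the Mahalanobis outlier/novelty idea discussed above can be sketched in a few lines. This is a toy illustration of my own, not anyone's trading code: the function name, the mean, the covariance, and the threshold are all invented for the example; squared distances above a chi-square quantile are flagged as outliers.

```cpp
#include <array>
#include <cmath>

// Squared Mahalanobis distance of a 2-D point x from a distribution with
// mean mu and covariance cov = {{a, b}, {b, c}} (inverted in closed form).
// Points whose squared distance exceeds a chi-square quantile (e.g. 9.21,
// the 99% quantile with 2 degrees of freedom) can be flagged as outliers.
double mahalanobis2(const std::array<double, 2>& x,
                    const std::array<double, 2>& mu,
                    const std::array<std::array<double, 2>, 2>& cov) {
    const double a = cov[0][0], b = cov[0][1], c = cov[1][1];
    const double det = a * c - b * b;   // covariance assumed non-singular
    const double inv00 = c / det, inv01 = -b / det, inv11 = a / det;
    const double dx = x[0] - mu[0], dy = x[1] - mu[1];
    return dx * (inv00 * dx + inv01 * dy) + dy * (inv01 * dx + inv11 * dy);
}
// usage: mahalanobis2({3.0, 3.0}, {0.0, 0.0}, {{{1.0, 0.0}, {0.0, 1.0}}})
// gives 18.0 with the identity covariance, well past the 9.21 threshold.
```

In practice mu and cov would be estimated on some training window, which brings back the sliding-window question raised in the thread.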

You are still moving in a sliding window, no matter what shape and properties it has or what attributes it contains. It would be foolish to deny the obvious.

Until the peeps learn the basics, it will be impossible to communicate

Reinforcement learning presupposes that the agent affects the environment, except in simple tabular examples. And there are finitely many states, or the policies change slowly rather than in leaps. For this reason it is impossible to implement an agent in Forex: you will always end up with ordinary classification.

In form it will look impressive: the actor does something in a certain environment. In essence the condition is not fulfilled; it is just classifying instances. RL is better suited to robotics or games.

Without this condition being met, RL can be seen as just one optimisation method among others...
 
Maxim Dmitrievsky #:
Reinforcement learning is tied to the fact that the agent affects the environment, except for simple tabular examples. Based on this it is impossible to implement an agent in Forex, you will always get a regular classification.

In form it will look like wow, the actor does something in a certain environment. And in essence it is not fulfilled, just categorising cases. For robotics or games it's more suitable.

Without this condition being met, RL can be seen as an optimisation.

Agreed... So I tend to think that the agent is the OTF (large caps, the smart money), identified by ultra-high volume (as in standard VSA), with both constraining and breaching volumes to be formalised. This is where I see the main purpose of ML in data preparation: determining what counts as ultra-high volume on a particular asset. And it is the smart money that should be treated as the actor, not the day traders... although, of course, formalisation is still needed.

 

By the way, I am trying to look through your link on causal inference. The link, incidentally, describes the meaning of the phenomenon rather clumsily, but the links from it to the book and other sources suggest that we are talking about studies of partial correlation coefficients, i.e. with all factors other than the one under study held fixed. That is, a partial correlation coefficient is found (the statistical formulas are in any standard statistics textbook), its significance is assessed against the null hypothesis (H0), and a conclusion is drawn about influence or non-influence (i.e. dependence or independence of the selected variables). For example:

with the quality of seed material, mineral fertilisers and financial outlays held fixed, crop productivity does not depend on capital equipment

i.e. capital equipment and crop productivity are NOT linearly dependent... (in terms of interventions)...

And it is possible to prove this with the partial correlation coefficient from statistics. This is exactly the situation of proving that correlation in a multifactor space does not necessarily imply causality of the observed phenomenon. In essence, standard Design of Experiments is meant to solve exactly such problems: to keep logical conclusions about results from being littered with insignificant factor-result correlations that are unjustifiably elevated to the rank of causal phenomena.
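To make the partial-correlation point concrete, here is a small sketch (my own illustration, with made-up correlation values): the first-order partial correlation of x and y controlling for z is computed from the three pairwise Pearson correlations, and a pair that looks correlated can become nearly uncorrelated once the common factor is fixed.

```cpp
#include <cmath>

// First-order partial correlation of x and y with z held fixed, computed
// from the three pairwise Pearson correlations. Its significance is then
// judged with a t-test on n - 3 degrees of freedom against H0: r = 0.
double partial_corr(double rxy, double rxz, double ryz) {
    return (rxy - rxz * ryz) /
           std::sqrt((1.0 - rxz * rxz) * (1.0 - ryz * ryz));
}
// usage: with rxy = 0.6, rxz = 0.8, ryz = 0.7 (made-up numbers), the raw
// 0.6 correlation shrinks to about 0.09 once the common factor z is fixed.
```

That shrinkage is the "crop productivity vs. capital equipment" situation above: the apparent x-y link is carried almost entirely by the fixed factor z.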

I did not read the rest (about confounders, etc.), but articles for a mass (not in the best sense) readership and articles for PR (promoting new words for old truths) are very much out of line, semantically and logically, with what counts as a reasonable scientific argument: cause-and-effect relationships of particular dependencies proven(!) by the proper planning and running of a series of experiments...

... For retail (due to incomplete information about the market), causal inference is incomprehensible in principle; for research institutes, unprovable (for the same reason, various elements of commercial secrecy); a state-level statistical committee may have some representative samples, but statistical data on finance always runs into commercial secrecy in any case. For marketers it is just a toy, "to look into someone else's wallet at any cost". That is why I think your link is more interesting as a "toy for marketers"; in essence it rests on classical statistical methodology. It is just that marketers are not always interested in classical scientific research, theory and matrix statistical methods, and they do not always have access to all the information needed for full-fledged modelling, much like psychologists, sociologists, etc., owing to the "ethics of experiments"...

>> So they cite "toy examples" in their articles and books, over which a great many brains dry up, simply because in real life such experiments cannot be performed, such data cannot be obtained, funding for such research cannot be found, and the empty textbook logic about swimming pools is of no use, while business wants explanations of when and how demand is born. So they write mass-market articles for hapless marketers who dream of riding on the back of inflation, for a long time and at a profit, and brains, drying up, try to graft these toy examples onto real life...

A real, scientifically grounded, logically adequate experiment on this or that dependency implies proper design of the experiment to gather an evidence base for the corresponding logical conclusions, statistically confirmed... not the populism of statistics dressed up in English terminology...

That is how it looked to me from that clumsy link (( so do not be upset that the article cannot be grafted onto market analysis right away...

--- this is my Review

Speed vs. Accuracy: When is Correlation Enough? When Do You Need Causation?
  • adam kelleher
  • medium.com
Often, we need fast answers with limited resources. We have to make judgements in a world full of uncertainty. We can’t measure everything. We can’t run all the experiments we’d like. You may not have the resources to model a product or the impact of a decision. How do you find a balance between finding fast answers and finding correct answers...
 
This is some kind of introductory or promotional article, I think there was a link to the library itself.
And there was something about inference for time series further on; I don't have time to look into it yet. Maybe it's an overhyped topic, I can't rule that out.

I think it makes sense to search for "time series classification causal inference" and things like that
 
Maxim Dmitrievsky #:
This is some kind of introductory or promotional article, I think there was a link to the library itself.
And something about inference for time series was further on,

And it really was promotional, and not in the best traditions of the art of advertising...

in the timeseries part of the link there's a reference to SUTVA, which is from the book.

The assumption of no interference was labelled "no interaction between units" by Cox (1958),

and is included in the "stable-unit-treatment-value assumption (SUTVA)" described by Rubin (1980).

VanderWeele (2009) formalised this point as the assumption of "treatment variation irrelevance," i.e., the assumption that multiple versions of treatment may exist but they all result in the same outcome.

In the presence of interference, the counterfactual for an individual i is not well defined because an individual's outcome depends on other individuals' treatment values.

Like the assumption of no interference, the assumption of no multiple versions of treatment is included in the "stable-unit-treatment-value assumption (SUTVA)" described by Rubin (1980).

Interference exists when treating one individual affects the outcome of others in the population.

And further down the book there are a few lines about "transportability of causal inferences across populations", though at the same time "each method's estimate is based on different modelling assumptions", which is common in modelling. Then come the marketers' hopes of, so to speak, "spreading rumours" through populations while dividing individuals by suggestibility (those who believe, disbelieve, resist, complain, give in), i.e. a new division for classification. At this rate we will clearly move from modelling price movement to modelling social masses, while dreaming that the market is moved by the same sentiment, which supposedly does not need to be modelled, only gathered into a pile and pointed in some direction (crowd psychology), at least towards the moon. That is what marketers are playing at, dreaming of equating themselves with political scientists...

And there are many technologies of PR and information dissemination in society, i.e. manipulation of public opinion. Everyone "trains" at spreading their speculations, dreaming of influencing someone's decision-making. But decisions made on the basis of the crowd's views are rarely long-term, and there are always "stop levels" beyond which the crowd will not be allowed to go (by whom? by the Regulator, if it performs its functions). I think it is futile to study the flat (who is against whom and by what means, even via SUTVA) if the goal is to join the smart money and "follow their coat-tails".

The book, by the way, is called What If (harvard.edu). I haven't finished reading it; even reading it diagonally already makes me smile.

p.s.

No, really: whatever they invent and call high-tech computation, the economy will not distribute its benefits among people; instead the toy is handed to marketers, who distribute advertising (a surrogate for a real product: fiction, hope, dream, self-deception) and are happy to have such a "good" and "promising" job, marketing... I am still closer to normal science-based logic, even for crowd behaviour. But I do not make trading decisions based on crowd behaviour, and I do not advise anyone to.

So causal inference analysis (who shoved whom in the crowd and how the fight broke out) is modelling of the fight, not of the pricing process. There is always an arbiter (and it is not marketing, it is economics! there is no equals sign between them), and it is impossible to take everything into account (modelling always involves one set of Assumptions or another when the modelled object cannot be studied exhaustively).

p.p.s..

I do not accept arguments about the equivalence of economic and political technologies... A "socio-economic field" resembles earthly laws more than linguistic abstractions. Here on Earth there is a "good" (be it a shovel or something else), "needed/not needed", and a "marginal utility" of this good, which should be maximised subject to constraints (budget and other resources): that is business. But in life there is also a regulator; it forces aggregate demand and supply to expand or contract according to circumstances. Conclusion: "follow their coat-tails", but do not mix marketing in its worst traditions with the economy and the lives of real people (including the pension funds, hedge funds and investment funds servicing people's money: the liquidity market).

 
... That's it; after the counter-advertising I'm leaving... it was just the harvard.edu address that threw me a little, as did the link to SUTVA in the timeseries section.
 
Fine, fuck it, I won't go into it myself then, there are other interesting topics to discuss

I like your expression and analysis, I believe you 😀

I thought there were some ready-made tools in the library so you could deduce something without much effort.
 
Aleksey Nikolayev #:

I have a function that searches for a pattern/sequence in another, larger string; if found it returns TRUE, otherwise FALSE.

illustration



I have this function in R, but it is slow for large computations; in fact there are even several versions written in different styles...

This is what the input data looks like; by the way, both vectors can be of arbitrary length.

dat <- sample(letters[1:20],size = 30,replace = T)
pat <- c('a',"c","g")

dat
 [1] "h" "c" "q" "a" "s" "a" "d" "b" "c" "n" "a" "t" "e" "q" "s" "k" "j" "t" "l" "j" "n" "t" "r" "m" "h"
[26] "b" "o" "e" "g" "h"
pat
[1] "a" "c" "g"

Here is the first function, s1, written in a standard style: clear but cumbersome.

s1 <- function(pat, dat){
  lv <- rep(FALSE, length(pat))       # lv[k] is TRUE once pat[k] has been matched
  k <- 1                              # index of the next pattern element to find
  for(i in 1:length(dat)){
    if(dat[i] == pat[k])
    {
      lv[k] <- TRUE
      k <- k + 1                      # move on to the next pattern element
    }
    if(k == length(pat) + 1) break    # whole pattern matched, stop early
  }
  return(all(lv))                     # TRUE only if every element was matched
}

or the more elegant s2.

# note: assumes the elements of pat contain no regex metacharacters
s2 <- function(pat , dat) grepl(paste(pat, collapse=".*"), paste(dat, collapse=""))

both do the same thing.

s1(pat = pat,dat = dat)
[1] TRUE
s2(pat = pat,dat = dat)
[1] TRUE

I have a question/request: could you write this simple function in Rcpp for me?

Reason: