Machine learning in trading: theory, models, practice and algo-trading - page 1266
I'm a guest in this thread; I just came by to share an article.
And that, by the way, is spot on.
Considering the number of morons on the forum (including your favorite magician Mudo), I don't think it's worth supporting this topic any longer, because I got no benefit out of it for myself.
Maxim, you are wrong! There is a benefit for you: a benefit in the very formulation and presentation of the tasks. However, I'm not trying to persuade you.
Well, you can see that clinical morons live here; it's not a forum persona, these are real clinical cases. Give them one word and they give you two back, with every message.
I found some obscure code in the Alglib forests. Here is the full code of the cross-entropy calculation function from dataanalysis.mqh:
//+------------------------------------------------------------------+
//| Average cross-entropy (in bits per element) on the test set |
//| INPUT PARAMETERS: |
//| DF - decision forest model |
//| XY - test set |
//| NPoints - test set size |
//| RESULT: |
//| CrossEntropy/(NPoints*LN(2)). |
//| Zero if model solves regression task. |
//+------------------------------------------------------------------+
static double CDForest::DFAvgCE(CDecisionForest &df,CMatrixDouble &xy,
                                const int npoints)
  {
//--- create variables
   double result=0;
   int    i=0;
   int    j=0;
   int    k=0;
   int    tmpi=0;
   int    i_=0;
//--- creating arrays
   double x[];
   double y[];
//--- allocation
   ArrayResizeAL(x,df.m_nvars);
   ArrayResizeAL(y,df.m_nclasses);
//--- initialization
   result=0;
   for(i=0;i<=npoints-1;i++)
     {
      for(i_=0;i_<=df.m_nvars-1;i_++)
         x[i_]=xy[i][i_];
      //--- function call
      DFProcess(df,x,y);
      //--- check
      if(df.m_nclasses>1)
        {
         //--- classification-specific code
         k=(int)MathRound(xy[i][df.m_nvars]);
         //--- note: tmpi computed in the loop below is never used afterwards
         tmpi=0;
         for(j=1;j<=df.m_nclasses-1;j++)
           {
            //--- check
            if(y[j]>(double)(y[tmpi]))
               tmpi=j;
           }
         //--- check
         if(y[k]!=0.0)
            result=result-MathLog(y[k]);
         else
            result=result-MathLog(CMath::m_minrealnumber);
        }
     }
//--- return result
   return(result/npoints);
  }
The highlighted fragment (the loop that computes tmpi) calculates something that is never used anywhere further down in the code. Why is it there, then?
According to Wikipedia it should be H(p,q) = -Σ_x p(x)·log q(x). Either something is missing, or the code has not been completely cleaned up.
I started digging into this function because I wanted to examine a single tree. When I set the number of trees in the forest to 1, I saw that all the other errors stay between 0 and 1, while this one comes out anywhere from 100 to 300+.
Does anyone here understand cross-entropy? Is the code even correct, or is something unfinished?
In general, the value can go off to infinity when calculating logloss if zero probability is predicted for the correct class, because all the other classes enter the formula with a zero coefficient. It looks like they tried to work around that glitch somehow: the tmpi loop finds the class with the highest predicted probability for the sample, perhaps meaning to add it to the formula, but apparently never thought it through :)
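For what it's worth, here is a minimal sketch (my own, not library code) of how the average cross-entropy in bits could be computed with the zero-probability clamp made explicit, matching what the header comment of DFAvgCE promises (division by npoints*LN(2)); the function name and the flat probs[] layout are assumptions for illustration:

double AvgCrossEntropyBits(const double &probs[],const int &labels[],
                           const int npoints,const int nclasses)
  {
//--- probs[i*nclasses+j] = predicted probability of class j for sample i,
//--- labels[i]           = true class index of sample i
   double sum=0.0;
   for(int i=0;i<npoints;i++)
     {
      double p=probs[i*nclasses+labels[i]];
      //--- clamp to avoid log(0) blowing up to infinity
      if(p<DBL_MIN)
         p=DBL_MIN;
      sum-=MathLog(p);
     }
//--- convert nats to bits, as the header comment states
   return(sum/(npoints*MathLog(2.0)));
  }

The library itself falls back on CMath::m_minrealnumber, and -MathLog of that tiny constant is on the order of several hundred nats; with a single tree, where predicted probabilities are often exactly 0 or 1, a fraction of misclassified samples is enough to drive the average into the hundreds, which would explain the 100 to 300+ values mentioned above.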
In total, one function has tmpi and uses it; two more have it but don't use it.
Either way, it doesn't affect how the code works.
tmpi is used in only 1 of the 5 error functions. Apparently it served as a template for the other functions, and they forgot to remove it from the rest.
I'm basically saying that a good error formula could take into account the probability distribution across all classes, not just the single correct one.
That said, if the correct class gets zero probability for even one of the samples, everything flies off to infinity.
Apparently that's why I prefer regression with a quadratic error :)
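As an illustration of that point, here is a sketch (again my own, hypothetical names and array layout) of a quadratic, Brier-style error that compares the whole predicted distribution with the one-hot true label, so it uses every class and stays bounded even when the correct class gets zero probability:

double AvgQuadraticError(const double &probs[],const int &labels[],
                         const int npoints,const int nclasses)
  {
//--- same probs[]/labels[] layout as in the sketch above
   double sum=0.0;
   for(int i=0;i<npoints;i++)
     {
      for(int j=0;j<nclasses;j++)
        {
         //--- one-hot target: 1 for the true class, 0 for the rest
         double target=(j==labels[i] ? 1.0 : 0.0);
         double diff=probs[i*nclasses+j]-target;
         sum+=diff*diff;
        }
     }
   return(sum/npoints);
  }

Each sample's contribution is bounded by 2, so a single bad prediction cannot dominate the average the way a zero probability does in logloss.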
Now let Kesha (SanSanYch's grandson) and Alyosha, who was punished by investors, run this thread. That would be fair.
It makes more sense to drop this topic and start a new, more adequate one, with other related subjects.
By the way, I found a normal distribution in prices. I have already written in Tip that all the non-normality comes from the "wrong" data processing; we introduce it ourselves).
I'll post it in the Python thread in the next day or so.
Alas, without people of the level of Matemat, Northwind, and Prival on the forum, all these topics have no future. IMHO.