It is a little sad that nobody seems to want to discuss genetic algorithms at the moment. But I guess that even though nobody has answered yet, some people will read this, or they will come back to the topic later once they experiment with genetic algorithms themselves.
Meanwhile, I applied some changes to my rules. I added a ranking operator for the fitness of each specimen (= combination of genes or parameters), which simply performs pairwise comparisons and swaps places until the order is correct. This method takes a bit of computation, but it needs to be called only rarely, because a new ranking is only necessary before the natural-selection operator runs.
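To make the pairwise-swap ranking concrete, here is a minimal sketch in plain C++ (names and data layout are my assumptions, not the original EA code): a ranking table of specimen IDs is bubble-sorted by descending fitness, so the best specimen ends up at rank 0.

```cpp
#include <utility>
#include <vector>

// Sketch of the ranking operator described above: repeatedly compare
// neighbouring rank places and swap them until the population is ordered
// by descending fitness (a simple bubble sort over specimen IDs).
// 'ranking' holds specimen IDs; 'fitness' holds one score per specimen.
void UpdateRanking(std::vector<int>& ranking, const std::vector<double>& fitness)
{
    bool swapped = true;
    while (swapped)                       // repeat passes until no swap occurs
    {
        swapped = false;
        for (int i = 0; i + 1 < (int)ranking.size(); ++i)
        {
            if (fitness[ranking[i]] < fitness[ranking[i + 1]])
            {
                std::swap(ranking[i], ranking[i + 1]);  // fitter specimen moves up
                swapped = true;
            }
        }
    }
}
```

Bubble sort is quadratic, but as noted above that hardly matters when the ranking is refreshed only right before natural selection.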
With this ranking in place, I can select parent specimens for crossbreeding from any quota of the "elite" population, and I also have full flexibility over the quota of the "hitlist", i.e. those specimens that don't meet the required fitness to survive into the next round.
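A hedged sketch of what these two quotas could mean in code (all names assumed; the real selection functions are listed further below): a parent is drawn at random from the top `elite_quota` fraction of the ranking, while any rank position in the bottom `hitlist_quota` fraction is marked for replacement.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Pick a random parent from the top 'elite_quota' fraction of a ranking
// table that is sorted by descending fitness.
int SpecimenEliteSelect(const std::vector<int>& ranking, double elite_quota,
                        std::mt19937& rng)
{
    int elite_count = std::max(1, (int)(elite_quota * ranking.size()));
    std::uniform_int_distribution<int> pick(0, elite_count - 1);
    return ranking[pick(rng)];            // random member of the elite
}

// True if a rank position falls into the bottom 'hitlist_quota' fraction,
// i.e. the specimen is due to be replaced at the next selection step.
bool OnHitlist(int rank_position, int population, double hitlist_quota)
{
    int kill_count = (int)(hitlist_quota * population);
    return rank_position >= population - kill_count;   // bottom of the table
}
```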
With these new functions I can now freely choose how often I want "natural selection" to do its work. I found it has some advantages to rate a specimen instantly, directly after it has made a trade, and thereby decide in real time whether it stays in the population or not. On the other hand, this way I'm judging relatively inexperienced specimens, and an individual result can be far from representative, which risks certain properties dying out of the population even though they might have been beneficial. That's why I chose to perform natural selection only occasionally and thereby maintain higher population diversity for longer.
By the way, the method I'm using differs a lot from the way the built-in genetic optimizer works; the latter has a much higher risk of overfitting. Because I always choose the specimen for the next trade randomly, every training pass is different and takes completely different trades. Therefore a population that was profitable before can have unlucky runs. It is a little like merging genetic self-learning with the Monte Carlo method. Sometimes I also observe that a population that is highly profitable at times will finally converge towards preferring not to trade at all. This way I can prove that the edge is an illusion. If I just took the best result (with reasonable profit factor, expectancy, drawdown and Sharpe ratio), as one would after a standard Metatrader backtest, only later losses in live trading would reveal that the edge was an illusion. So I think that keeping a whole population of "fit" parameter combinations instead of a single (probably more overfitted) choice, and performing genetic retests constantly, will ultimately result in higher returns.
Anybody who has an opinion on the topic please feel free to answer.
The question of good hyperparameters still remains: What is a good population size? Perform natural selection after how many trades per specimen (on average)? "Kill" what hitlist quota per natural selection? Take what "elite" population quota as parents of genetic offspring? What mutation probability? What average proximity of mutated values to their previous values? ...
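For reference, the open hyperparameters collected in one place, plus a sketch of how mutation probability and proximity could interact. All numeric values and names here are assumptions for illustration, not recommendations:

```cpp
#include <random>

// Hypothetical container for the hyperparameters listed above.
// Every default value is a placeholder assumption, not a tuned setting.
struct GAHyperParams
{
    int    population_size      = 100;   // specimens kept alive in parallel
    int    trades_per_selection = 10;    // avg. trades per specimen between selections
    double hitlist_quota        = 0.2;   // worst fraction replaced per selection
    double elite_quota          = 0.3;   // best fraction eligible as parents
    double mutation_risk        = 0.05;  // per-gene mutation probability
    double mutation_proximity   = 0.1;   // relative distance of mutated values
};

// Mutation sketch: with probability 'risk', nudge a gene by a random step
// whose size is bounded by 'proximity' relative to the gene's value range.
double MutateGene(double gene, double risk, double proximity,
                  double lo, double hi, std::mt19937& rng)
{
    std::uniform_real_distribution<double> u(0.0, 1.0);
    if (u(rng) >= risk) return gene;                  // usually unchanged
    double step = (2.0 * u(rng) - 1.0) * proximity * (hi - lo);
    double mutated = gene + step;
    if (mutated < lo) mutated = lo;                   // clamp to valid range
    if (mutated > hi) mutated = hi;
    return mutated;
}
```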
______________
Here is a list of the operators that I implemented; I think the names speak for themselves:
void Initialize(int population_size_inp,int genes_per_specimen_inp,string filename_inp);
void DeInitialize();
void CreatePopulation(void);
bool LoadPopulation(void);
void SavePopulation(void);
void KillAndReplaceRandom(int childID,ushort fitness_function);
void KillAndReplaceAlpha(int childID,ushort fitness_function);
void KillAndReplaceElite(int childID,double quota,ushort fitness_function);
int SpecimenRandomSelect(void);
int SpecimenOmegaSelect(ushort fitness_function);
int SpecimenAlphaSelect(ushort fitness_function);
int SpecimenEliteSelect(double quota,ushort fitness_function);
float fitness(int specimenID,ushort fitness_function);
bool CrossBreed(int fatherID,int motherID,int childID);
void AlphaBreed(int childID); //=crossbreed best and second best specimen
void Mutate(int specimenID,double probability,double proximity);
void UpdateRanking(ushort fitness_function);
void PrintRanking(double elite_quota,double hitlist_quota,ushort max_lines=30);
void NaturalSelection(ushort fitness_function,double hitlist_quota,double elite_quota,double mutation_risk,double mutation_proximity);
float update_active_days(int specimenID);
float days_since_selection(int specimenID);
float rnd(); //=for random double within range 0-1
int transcribe_int(int specimenID,int gene_location,int lower_limit=INT_MIN,int upper_limit=INT_MAX,string descript="");
uint transcribe_uint(int specimenID,int gene_location,uint lower_limit=0,uint upper_limit=UINT_MAX,string descript="");
short transcribe_short(int specimenID,int gene_location,short lower_limit=SHORT_MIN,short upper_limit=SHORT_MAX,string descript="");
ushort transcribe_ushort(int specimenID,int gene_location,ushort lower_limit=0,ushort upper_limit=USHORT_MAX,string descript="");
long transcribe_long(int specimenID,int gene_location,long lower_limit=LONG_MIN,long upper_limit=LONG_MAX,string descript="");
ulong transcribe_ulong(int specimenID,int gene_location,ulong lower_limit=0,ulong upper_limit=ULONG_MAX,string descript="");
double transcribe_double(int specimenID,int gene_location,double lower_limit=DBL_MIN,double upper_limit=DBL_MAX,string descript="");
float transcribe_float(int specimenID,int gene_location,float lower_limit,float upper_limit,string descript);
bool transcribe_bool(int specimenID,int gene_location,string descript="");
ENUM_TIMEFRAMES transcribe_timefr(int specimenID,int gene_location,ENUM_TIMEFRAMES lower_limit=PERIOD_M1,ENUM_TIMEFRAMES upper_limit=PERIOD_MN1,string descript="");
ENUM_TIMEFRAMES ShortToTimeframe(short tf_short);
short TimeframeToShort(ENUM_TIMEFRAMES period_tf);
And here is a list of the fitness functions that I work with (usually formulas 1, 3 and 8 turn out okay):
float profit=specimen_profit[specimenID];
float loss=specimen_loss[specimenID];
float result=profit-loss;
float risk=specimen_risk[specimenID];
float actdays=active_days[specimenID]; //=exposure time in the market
int winners=specimen_winners[specimenID];
int losers=specimen_losers[specimenID];
float avgwinner=0; if(winners!=0){avgwinner=profit/winners;}
float avgloser=0;  if(losers!=0){avgloser=loss/losers;}
float winrate=0;   if(losers!=0){winrate=((float)winners)/(winners+losers);}
switch(fitness_function)
  {//----- case | zero division protection | fitness formula | action in zero division case -----
   case 0:                              return profit;
   case 1:                              return result;
   case 2:  if(risk!=0)                {return profit/risk;}           else {return 0;}
   case 3:  if(risk!=0)                {return result/risk;}           else {return 0;}
   case 4:  if(risk*actdays!=0)        {return profit/(risk*actdays);} else {return 0;}
   case 5:  if(risk*actdays!=0)        {return result/(risk*actdays);} else {return 0;}
   case 6:  if(loss!=0)                {return profit/loss;}           else {return 1;} //=profit factor
   case 7:  if(loss*actdays!=0)        {return profit/(loss*actdays);} else {return 1;} //=profit factor with time penalty
   case 8:  if(avgloser*(1-winrate)!=0)         {return (avgwinner*winrate)/(avgloser*(1-winrate));}         //=expected value
            else                       {return 1;}
   case 9:  if(avgloser*(1-winrate)*actdays!=0) {return (avgwinner*winrate)/(avgloser*(1-winrate)*actdays);} //=expected value with time penalty
            else if(actdays!=0)        {return 1/actdays;}             else {return 0;}
   case 10:                             return winrate;
   case 11: if((risk+loss)*actdays!=0) {return profit/((risk+loss)*actdays);}
            else if(actdays!=0)        {return 1/actdays;}
   default:                             return 0;
  }//------------------------------------------------------------------------------------------------------
Let me also point out the main difference between the standard Metatrader genetic optimization and my approach:
METATRADER:
- objective: find a single "fit" parameter combination that historically performed best
- method: randomize a single parameter combination and cycle through the whole backtest period with it
- natural selection timing: after a series of completed backtest runs, i.e. only at the end of each full pass
- training and trading are two separate processes
MY APPROACH:
- objective: evolution of a population of "fit" parameter combinations
- method: randomization happening before every trade
- natural selection timing: repeatedly during training period and live trading
- training doesn't stop during trading
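The points above can be sketched as one continuous loop (all names assumed, and the trade itself stubbed out with a random outcome so only the control flow is shown): before every trade a random specimen is drawn, its statistics are updated afterwards, and natural selection runs only every few cycles.

```cpp
#include <random>
#include <vector>

// Per-specimen running statistics (simplified).
struct Specimen { double profit = 0, loss = 0; int trades = 0; };

// Rough sketch of the continuous training/trading loop described above.
void RunTrainingLoop(std::vector<Specimen>& pop, int total_trades,
                     int trades_per_selection_cycle, std::mt19937& rng)
{
    std::uniform_int_distribution<int> pick(0, (int)pop.size() - 1);
    std::uniform_real_distribution<double> outcome(-1.0, 1.0);
    for (int t = 0; t < total_trades; ++t)
    {
        Specimen& s = pop[pick(rng)];         // random specimen takes this trade
        double r = outcome(rng);              // stub for the real trade result
        if (r >= 0) s.profit += r; else s.loss += -r;
        s.trades++;
        if ((t + 1) % (trades_per_selection_cycle * (int)pop.size()) == 0)
        {
            // here the real system would call NaturalSelection(...):
            // rank by fitness, replace the hitlist quota with mutated
            // offspring of elite parents, then keep trading with the
            // updated population - training never stops
        }
    }
}
```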
It is an interesting topic but I personally don't have time currently.
Over recent years, several very good articles have been written on this website about genetic algorithms. We all know about their huge potential from Metatrader backtesting experience. There is only one problem: apart from insights through additional forward testing, we can never know how future-proof or random the best backtest results actually are, because they are not constantly retested. The results end where the backtest ends. It therefore makes a lot of sense to implement continuous self-improvement via genetic algorithms into live trading. I walked that path and I'm happy with it, and I'm not reinventing the wheel here: many people have done it before and there are some good code examples available.
What I can't find, on the other hand, is a discussion of hyperparameters for fine-tuning. As the main components of a genetic algorithm are usually much the same, there are certainly some hyperparameters that most such genetic EAs have in common.
I will describe some specific rules of a real life example of what such an algorithm can look like.
A genetic algorithm has many biological analogies, and its purpose is to constantly improve a set of properties (here: EA parameters) by the principle of "survival of the fittest".
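As a minimal toy illustration of that principle (not the EA code discussed here): a population of one-gene specimens evolves toward the maximum of a simple fitness function f(x) = -(x - 3)^2 via selection, crossover and mutation.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Toy genetic algorithm: keep the fitter half of the population, replace
// the weaker half with averaged ("crossbred") and slightly mutated
// offspring of the survivors, repeat, and return the best specimen found.
double Evolve(int population, int generations, std::mt19937& rng)
{
    std::uniform_real_distribution<double> init(-10.0, 10.0), u(0.0, 1.0);
    std::vector<double> pop(population);
    for (double& x : pop) x = init(rng);                   // random initial genes
    auto fitness = [](double x) { return -(x - 3.0) * (x - 3.0); };
    for (int g = 0; g < generations; ++g)
    {
        std::sort(pop.begin(), pop.end(),                  // rank by fitness
                  [&](double a, double b) { return fitness(a) > fitness(b); });
        for (size_t i = pop.size() / 2; i < pop.size(); ++i)  // replace worst half
        {
            double mother = pop[(size_t)(u(rng) * (pop.size() / 2))];
            double father = pop[(size_t)(u(rng) * (pop.size() / 2))];
            pop[i] = 0.5 * (mother + father)               // crossover
                   + (u(rng) - 0.5) * 0.2;                 // small mutation
        }
    }
    return pop[0];  // best specimen after the final ranking
}
```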
So here come some specific definitions and rules, as promised (please discuss with me what you would do differently):
Now after these definitions let's apply some rules:
I am not trying to develop a new algorithm here. My code is complete and working, and it is amazing how it finds great solutions in real time. The cool thing is: all that annoying backtesting becomes obsolete. I have, for example, a very complex EA whose number of adjustable parameters added up to 56 over the course of development. But once I apply genetic self-learning, I don't have to care about any of those parameters. If the system has an edge, the algorithm will find it.
However, any system can be tweaked… So, to any of you who have your own experience with genetic learning: are there things you did differently? Why? And what is your solution for the hyperparameters printed in bold letters in this text?
I know this discussion goes into details that are not for everybody, but it might be extremely interesting if you have dealt with the same stuff.