Data Science and Machine Learning (Part 16): A Refreshing Look at Decision Trees

MetaTrader 5 / Trading | 13 December 2023, 13:12
Omega J Msigwa

Quick Recap

Earlier in this article series, I wrote an article on decision trees that explained what decision trees are all about, and we built an algorithm to classify weather data. However, the code and explanations in that article weren't concise enough. Since I keep getting requests for a better approach to building decision trees, I believe a second article with better decision tree code is in order. Clarifying decision trees will also make it easier to understand the random forest algorithm, which an upcoming article will cover shortly.

What is a Decision Tree?

A Decision Tree is a flowchart-like tree structure where each internal Node represents a test on an attribute (or feature), each branch represents the outcome of the test, and each leaf Node represents a class label or a continuous value. The topmost Node in a decision tree is known as the "root," and the leaves are the outcomes or predictions.

What is a Node?

In a decision tree, a node is a fundamental component representing a decision point based on a particular feature or attribute. There are two main types of nodes in a decision tree: internal nodes and leaf nodes.

Internal Node

  • An internal node is a decision point in the tree where a test is performed on a specific feature. The test relies on a particular condition, such as whether a feature value is greater than a threshold or belongs to a particular category.
  • Internal nodes have branches (edges) leading to child nodes. The outcome of the test determines which branch to follow.
  • In our implementation, each internal node holds pointers to two child nodes, left and right, which are members of the Node class.

Leaf Node (or Terminal Node)
  • A leaf node marks a terminal point in the tree where it makes a final decision or prediction. It denotes the class label in a classification task or the predicted value in a regression task.
  • Leaf nodes have no outgoing branches; they are the endpoints of the decision process.
  • We will code this as a double variable.
    class Node
    {
      public:
        // for decision node
           
        uint feature_index;
        double threshold;
        double info_gain;
         
        // for leaf node
         
        double leaf_value;   
          
        Node *left_child;  //left child Node
        Node *right_child; //right child Node
    
        Node() : left_child(NULL), right_child(NULL) {} // default constructor
    
        Node(uint feature_index_, double threshold_=NULL, Node *left_=NULL, Node *right_=NULL, double info_gain_=NULL, double value_=NULL)
            : left_child(left_), right_child(right_)
        {
            this.feature_index = feature_index_;
            this.threshold = threshold_;
            this.info_gain = info_gain_;
            this.leaf_value = value_;
        }
        
       void Print()
        {
          printf("feature_index: %d \nthreshold: %f \ninfo_gain: %f \nleaf_value: %f",feature_index,threshold, info_gain, value);
        }    
    };

    Unlike some ML algorithms we have coded from scratch in this series, the decision tree can be tricky and confusing to code, as it requires recursive classes and functions to implement well, which, in my experience, is harder to do in languages other than Python.

    Components of a Node:

    A node in a decision tree typically contains the following information:

    01. Test condition

    Internal nodes have a test condition based on a specific feature and a threshold or category. This condition determines how the data is split into child nodes.

    You don't see the test condition in this Node class, but we will implement it inside the build_tree function, which works hand in hand with the Node class since it returns a Node instance.
    Node *build_tree(matrix &data, uint curr_depth=0);

    02. Feature and Threshold

    Indicates which feature is being tested at the node and the threshold or category used for the split.

    uint feature_index;
    double threshold;

    03. Class Label or Value

    A leaf node stores the predicted class label (for classification) or value (for regression)

    double leaf_value;   

    04. Child Nodes

    Internal nodes have child nodes corresponding to the different outcomes of the test condition. Each child node represents a subset of the data that satisfies the condition.

    Node *left_child;  //left child Node
    Node *right_child; //right child Node

    Example:

    Consider a simple decision tree for classifying whether a fruit is an apple or an orange based on its color:

    [Node]
    Feature: color
    Test condition: is the color red?
    If true, go to the left child; if false, go to the right child.

    [Leaf Node - Apple]
    Class label: Apple

    [Leaf Node - Orange]
    Class label: Orange
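    To make this concrete, below is a minimal sketch of how such a tree could be assembled by hand using the Node class above. The numeric encoding is hypothetical (feature 0 holds the color, 0 for red and 1 for not red; class labels 1.0 for Apple and 2.0 for Orange), chosen so that it matches the split rule we implement later, where values less than or equal to the threshold go to the left child.

    // Hypothetical encoding: feature 0 = color (0 = red, 1 = not red); labels: 1.0 = Apple, 2.0 = Orange
    Node *apple_leaf  = new Node();  apple_leaf.leaf_value  = 1.0;  // red     -> Apple
    Node *orange_leaf = new Node();  orange_leaf.leaf_value = 2.0;  // not red -> Orange
    
    // Root decision node: "is color <= 0.5?" i.e. is the color red? Values <= threshold go left
    Node *root = new Node(0, 0.5, apple_leaf, orange_leaf);
    root.Print();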

    Types of Decision Trees:

    CART (Classification and Regression Trees): Used for both classification and regression tasks. Splits the data based on the Gini impurity for classification and mean squared error for regression.

    ID3 (Iterative Dichotomiser 3): Primarily used for classification tasks. Employs the concept of entropy and information gain to make decisions.

    C4.5: An improved version of ID3, used for classification. It employs a gain ratio to address bias toward attributes with more levels.

    Since we will be using the decision tree for classification, we will build the ID3 algorithm, which is characterized by information gain, impurity calculation, and categorical features:


    ID3 (Iterative Dichotomiser 3)

    ID3 uses Information Gain to decide which feature to split on at each internal node. Information gain measures the reduction in entropy or uncertainty after a dataset is split.
    double CDecisionTree::information_gain(vector &parent, vector &left_child, vector &right_child)
     {  
        double weight_left = left_child.Size() / (double)parent.Size(),
               weight_right = right_child.Size() / (double)parent.Size();
        
        double gain =0;    
        switch(m_mode)
          {
           case  MODE_GINI:
             gain = gini_index(parent) - ( (weight_left*gini_index(left_child)) + (weight_right*gini_index(right_child)) );
             break;
           case MODE_ENTROPY:
             gain = entropy(parent) - ( (weight_left*entropy(left_child)) + (weight_right*entropy(right_child)) );
             break;
          }
        
       return gain;
     }

    Entropy is a measure of uncertainty or disorder in a dataset. In ID3, the algorithm seeks to reduce entropy by choosing feature splits that result in subsets with more homogenous class labels.

    double CDecisionTree::entropy(vector &y)
     {    
       vector class_labels = matrix_utils.Unique_count(y);
         
       vector p_cls = class_labels / double(y.Size());
      
       vector entropy = (-1 * p_cls) * log2(p_cls);
      
      return entropy.Sum();
     }

    To give more flexibility, one can choose between entropy and the Gini index, which is also a function often used in decision trees that does the same work as the entropy function. They both evaluate the impurity or disorder in the dataset.

    double CDecisionTree::gini_index(vector &y)
     {
       vector unique = matrix_utils.Unique_count(y);
       
       vector probabilities = unique / (double)y.Size();
       
       return 1.0 - MathPow(probabilities, 2).Sum();
     }

    These two measures are given by the following formulas, where p_i is the proportion of samples belonging to class i:

    Entropy(y) = -Σ p_i * log2(p_i)

    Gini(y) = 1 - Σ p_i^2
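    As a quick sanity check, consider a label vector with 10 samples, 6 of class 1 and 4 of class 2. The class proportions are 0.6 and 0.4, so Entropy = -(0.6*log2(0.6) + 0.4*log2(0.4)) ≈ 0.971 and Gini = 1 - (0.36 + 0.16) = 0.48; a perfectly pure subset would score 0 on both measures.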


    ID3 is particularly suitable for Categorical Features, and the selection of features and thresholds is based on the entropy reduction for categorical splits. We'll see this in action on the decision tree algorithm below.


    Decision Tree Algorithm

    01. Splitting Criteria

    For classification, standard splitting criteria are Gini impurity and entropy, while mean squared error is often used for regression. Let's delve into the decision tree algorithm's splitting functions, which begin with the structure to retain information for the data undergoing a split.

    //A struct containing split data information
    struct split_info
      {
       uint feature_index;
       double threshold;
       matrix dataset_left,
              dataset_right;
       double info_gain;
      };

    Using the threshold, we send rows whose feature value is less than or equal to the threshold to the matrix dataset_left and the rest to the matrix dataset_right. Finally, the split_info structure instance is returned.

    split_info CDecisionTree::split_data(const matrix &data, uint feature_index, double threshold=0.5)
     {
       int left_size=0, right_size =0;
       vector row = {};
       
       split_info split;
       
       ulong cols = data.Cols();
       
       split.dataset_left.Resize(0, cols);
       split.dataset_right.Resize(0, cols);
          
        for (ulong i=0; i<data.Rows(); i++)
         {       
           row = data.Row(i);
           
           if (row[feature_index] <= threshold)
            {
              left_size++;
              split.dataset_left.Resize(left_size, cols);
              split.dataset_left.Row(row, left_size-1); 
            }
           else
            {
             right_size++;
             split.dataset_right.Resize(right_size, cols);
             split.dataset_right.Row(row, right_size-1);         
            }
         }
       return split;
     }

    Out of many possible splits, the algorithm needs to figure out the best one: the split with the maximum information gain.

    split_info CDecisionTree::get_best_split(matrix &data, uint num_features)
      {
      
       double max_info_gain = -DBL_MAX;
       vector feature_values = {};
       vector left_v={}, right_v={}, y_v={};
       
    //---
       
       split_info best_split;
       split_info split;
       
       for (uint i=0; i<num_features; i++)
         {
           feature_values = data.Col(i);
           vector possible_thresholds = matrix_utils.Unique(feature_values); //Find unique values in the feature, representing possible thresholds for splitting.
                      
             for (uint j=0; j<possible_thresholds.Size(); j++)
                {              
                  split = this.split_data(data, i, possible_thresholds[j]);
                  
                  if (split.dataset_left.Rows()>0 && split.dataset_right.Rows() > 0)
                    {
                      y_v = data.Col(data.Cols()-1);
                      right_v = split.dataset_right.Col(split.dataset_right.Cols()-1);
                      left_v = split.dataset_left.Col(split.dataset_left.Cols()-1);
                      
                      double curr_info_gain = this.information_gain(y_v, left_v, right_v);
                                        
                      if (curr_info_gain > max_info_gain) // Check if the current information gain is greater than the maximum observed so far.
                        {             
                          #ifdef DEBUG_MODE
                            printf("split left: [%dx%d] split right: [%dx%d] curr_info_gain: %f max_info_gain: %f",split.dataset_left.Rows(),split.dataset_left.Cols(),split.dataset_right.Rows(),split.dataset_right.Cols(),curr_info_gain,max_info_gain);
                          #endif 
                          
                          best_split.feature_index = i;
                          best_split.threshold = possible_thresholds[j];
                          best_split.dataset_left = split.dataset_left;
                          best_split.dataset_right = split.dataset_right;
                          best_split.info_gain = curr_info_gain;
                          
                          max_info_gain = curr_info_gain;
                        }
                    }
                }    
         }
         
        return best_split;
      }

    This function searches over all features and possible thresholds to find the split that maximizes information gain. The result is a split_info structure containing the feature, threshold, and data subsets associated with the best split.



    02. Building the Tree

    Decision Trees are constructed by recursively splitting the dataset based on features until a stopping condition is met (e.g., reaching a certain depth or minimum samples).

    Node *CDecisionTree::build_tree(matrix &data, uint curr_depth=0)
     {
        matrix X;
        vector Y;
          
        matrix_utils.XandYSplitMatrices(data,X,Y); //Split the input matrix into feature matrix X and target vector Y.
        
        ulong samples = X.Rows(), features = X.Cols(); //Get the number of samples and features in the dataset.
            
        Node *node= NULL; // Initialize node pointer
                
        if (samples >= m_min_samples_split && curr_depth<=m_max_depth)
          {
             split_info best_split = this.get_best_split(data, (uint)features);
             
             #ifdef DEBUG_MODE
              Print("best_split left: [",best_split.dataset_left.Rows(),"x",best_split.dataset_left.Cols(),"]\nbest_split right: [",best_split.dataset_right.Rows(),"x",best_split.dataset_right.Cols(),"]\nfeature_index: ",best_split.feature_index,"\nInfo gain: ",best_split.info_gain,"\nThreshold: ",best_split.threshold);
             #endif 
                      
             if (best_split.info_gain > 0)
               {
                 Node *left_child = this.build_tree(best_split.dataset_left, curr_depth+1);
                 Node *right_child = this.build_tree(best_split.dataset_right, curr_depth+1);
                              
                 node = new Node(best_split.feature_index,best_split.threshold,left_child,right_child,best_split.info_gain);
                 return node;
               }
          }      
         
         node = new Node();
         node.leaf_value = this.calculate_leaf_value(Y);
              
         return node;
     }

    if (best_split.info_gain > 0)

    The above line checks whether the best split yields any information gain.

    Inside this block:

    Node *left_child = this.build_tree(best_split.dataset_left, curr_depth+1);

    Recursively build the left child node.

    Node *right_child = this.build_tree(best_split.dataset_right, curr_depth+1);

    Recursively build the right child node.

    node = new Node(best_split.feature_index, best_split.threshold, left_child, right_child, best_split.info_gain);
    

    Create a decision node with the information from the best split.

    node = new Node();      

    If no further split is needed, create a new leaf node.

    node.leaf_value = this.calculate_leaf_value(Y);

    Set the value of the leaf node using the calculate_leaf_value function.

    return node;

    Return the node representing the current split or leaf.

    To make the functions convenient and user-friendly, the build_tree function can be kept inside the fit function, which is commonly used in Python machine-learning modules.

    void CDecisionTree::fit(matrix &x, vector &y)
     {
       matrix data = matrix_utils.concatenate(x, y, 1);
       
       this.root = this.build_tree(data);
     }

    Making Predictions on Training and Testing of the Model

    vector CDecisionTree::predict(matrix &x)
     {
        vector ret(x.Rows());
        
        for (ulong i=0; i<x.Rows(); i++)
           ret[i] = this.predict(x.Row(i));
           
       return ret;
     }

    Making Predictions in Real-Time

    double CDecisionTree::predict(vector &x)
     {     
       return this.make_predictions(x, this.root);
     }

    The make_predictions function is where all the dirty work gets done:

    double CDecisionTree::make_predictions(vector &x, const Node &tree)
     {
        if (tree.leaf_value != NULL) // This is a leaf node
          return tree.leaf_value;
          
        double feature_value = x[tree.feature_index];
        double pred = 0;
        
        #ifdef DEBUG_MODE
          printf("Tree.threshold %f tree.feature_index %d leaf_value %f",tree.threshold,tree.feature_index,tree.leaf_value);
        #endif 
        
        if (feature_value <= tree.threshold)
          {
           pred = this.make_predictions(x, tree.left_child);  
          }
        else
         {
           pred = this.make_predictions(x, tree.right_child);
         }
         
       return pred;
     }

    More details on this function:

    Check if the feature value is less than or equal to the threshold of the current node.
    if (feature_value <= tree.threshold): 

    Inside this block:

    Recursively call make_predictions for the left child node.

    pred = this.make_predictions(x, tree.left_child);

    Else, If the feature value is greater than the threshold:

    Recursively call the make_predictions function for the right child node.

    pred = this.make_predictions(x, tree.right_child);
    return pred;

     Return the prediction.

    Leaf Value Calculations

    The function below calculates the leaf value:

    double CDecisionTree::calculate_leaf_value(vector &Y)
     {   
       vector uniques = matrix_utils.Unique_count(Y);
       vector classes = matrix_utils.Unique(Y);
       
       return classes[uniques.ArgMax()];
     }

    This function returns the element from Y with the highest count, effectively finding the most common element in the list.
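    For example, if Y = [1, 1, 2, 3, 3, 3], the unique classes are [1, 2, 3] with counts [2, 1, 3], so the function returns 3.0, the most common class in that subset (assuming Unique and Unique_count list the classes in the same order).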


    Wrapping it all up in a CDecisionTree class

    enum mode {MODE_ENTROPY, MODE_GINI};
    
    class CDecisionTree
      {
    CMatrixutils   matrix_utils;
    
    protected:
          
       Node *build_tree(matrix &data, uint curr_depth=0);
       double  calculate_leaf_value(vector &Y);
       
    //---
       
       uint m_max_depth;
       uint m_min_samples_split;
       
       mode m_mode;
       
       double  gini_index(vector &y);
       double  entropy(vector &y);
       double  information_gain(vector &parent, vector &left_child, vector &right_child);
       
       
       split_info  get_best_split(matrix &data, uint num_features);
       split_info  split_data(const matrix &data, uint feature_index, double threshold=0.5);
       
       double make_predictions(vector &x, const Node &tree);
       void delete_tree(Node* node);
       
    public:
                         Node *root;
                         
                         CDecisionTree(uint min_samples_split=2, uint max_depth=2, mode mode_=MODE_GINI);
                        ~CDecisionTree(void);
                        
                         void fit(matrix &x, vector &y);
                         void print_tree(Node *tree, string indent=" ",string padl="");
                         double predict(vector &x);
                         vector predict(matrix &x);
      };

    Having built the class, let's observe how everything works in action: how to build the tree and how to use it to make predictions during training and testing, not to mention during real-time trading. We will use the popular Iris CSV dataset to verify that it works.

    Suppose we train the decision tree model on every EA initialization, starting by loading the training data from a CSV file:

    int OnInit()
      {
      matrix dataset = matrix_utils.ReadCsv("iris.csv"); //loading iris-data
      
      decision_tree = new CDecisionTree(3,3, MODE_GINI); //Initializing the decision tree
      
      matrix x; vector y;
      matrix_utils.XandYSplitMatrices(dataset,x,y); //split the data into x and y matrix and vector respectively
      
      
      decision_tree.fit(x, y);  //Building the tree
      decision_tree.print_tree(decision_tree.root); //Printing the tree
      
      vector preds = decision_tree.predict(x); //making the predictions on a training data
      
      Print("Train Acc = ",metrics.confusion_matrix(y, preds)); //Measuring the accuracy
      
       return(INIT_SUCCEEDED);
      }

    This is how the dataset matrix appears when printed. The last column has been encoded: one (1) stands for Setosa, two (2) for Versicolor, and three (3) for Virginica.

    Print("iris-csv\n",dataset);
    MS      0       08:54:40.958    DecisionTree Test (EURUSD,H1)   iris-csv
    PH      0       08:54:40.958    DecisionTree Test (EURUSD,H1)   [[5.1,3.5,1.4,0.2,1]
    CO      0       08:54:40.958    DecisionTree Test (EURUSD,H1)    [4.9,3,1.4,0.2,1]
    ...
    ...
    
    NS      0       08:54:40.959    DecisionTree Test (EURUSD,H1)    [5.6,2.7,4.2,1.3,2]
    JK      0       08:54:40.959    DecisionTree Test (EURUSD,H1)    [5.7,3,4.2,1.2,2]
    ...
    ...
    
    NQ      0       08:54:40.959    DecisionTree Test (EURUSD,H1)    [6.2,3.4,5.4,2.3,3]
    PD      0       08:54:40.959    DecisionTree Test (EURUSD,H1)    [5.9,3,5.1,1.8,3]]

    Printing the Tree

    If you look at the code, you may have noticed the function print_tree, which takes the tree root as one of its arguments. This function prints the overall tree structure; let's take a closer look below.

    void CDecisionTree::print_tree(Node *tree, string indent=" ",string padl="")
      {
         if (tree.leaf_value != NULL)
            Print((padl+indent+": "),tree.leaf_value); 
         else //if we haven't reached the leaf node, keep printing child trees
           {
             padl += " ";
             
             Print((padl+indent)+": X_",tree.feature_index, "<=", tree.threshold, "?", tree.info_gain);
             
             print_tree(tree.left_child, "left","--->"+padl);
             
             print_tree(tree.right_child, "right","--->"+padl);
           }
      }  

    More details on this function:

    Node Structure:

    The function assumes that the tree is represented by the Node class. Each node is either a decision node or a leaf node: decision nodes carry a feature_index, a threshold, and an info_gain, while leaf nodes carry a leaf_value.

    Print Decision Node:

    If the current Node is not a leaf node (i.e., tree.leaf_value is NULL), it prints information about the decision node. It prints the condition for the split, such as "X_2 <= 1.9 ? 0.33" and the indentation level.

    Print Leaf Node:

    If the current Node is a leaf node (i.e., tree.leaf_value is not NULL), it prints the leaf value along with the indentation level. For example, "---> left: 1.0".

    Recursion:

    The function then recursively calls itself for the left and right children of the current Node. The padl argument adds indentation to the printed output, making the tree structure more readable.

    The output of the print_tree for the decision tree built inside the OnInit function is:

    CR      0       09:26:39.990    DecisionTree Test (EURUSD,H1)     : X_2<=1.9?0.3333333333333334
    HO      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   ---> left: 1.0
    RH      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->  right: X_3<=1.7?0.38969404186795487
    HP      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->   left: X_2<=4.9?0.08239026063100136
    KO      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->    left: X_3<=1.6?0.04079861111111116
    DH      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    left: 2.0
    HM      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: 3.0
    HS      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->    right: X_3<=1.5?0.2222222222222222
    IH      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    left: 3.0
    QM      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: 2.0
    KP      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->   right: X_2<=4.8?0.013547574039067499
    PH      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->    left: X_0<=5.9?0.4444444444444444
    PE      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    left: 2.0
    DP      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: 3.0
    EE      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   --->--->--->   right: 3.0

    Impressive.

    Below is the accuracy of our trained Model:

      vector preds = decision_tree.predict(x); //making the predictions on a training data
      
      Print("Train Acc = ",metrics.confusion_matrix(y, preds)); //Measuring the accuracy

    Outputs:

    PM      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   Confusion Matrix
    CE      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   [[50,0,0]
    HR      0       09:26:39.990    DecisionTree Test (EURUSD,H1)    [0,50,0]
    ND      0       09:26:39.990    DecisionTree Test (EURUSD,H1)    [0,1,49]]
    GS      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   
    KF      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   Classification Report
    IR      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   
    MD      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   _    Precision  Recall  Specificity  F1 score  Support
    EQ      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   1.0    50.00     50.00     100.00       50.00     50.0
    HR      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   2.0    51.00     50.00     100.00       50.50     50.0
    PO      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   3.0    49.00     50.00     100.00       49.49     50.0
    EH      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   
    PR      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   Accuracy                                   0.99
    HQ      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   Average   50.00    50.00    100.00      50.00    150.0
    DJ      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   W Avg     50.00    50.00    100.00      50.00    150.0
    LG      0       09:26:39.990    DecisionTree Test (EURUSD,H1)   Train Acc = 0.993

    We achieved a 99.3% accuracy, indicating the successful implementation of our decision tree. This accuracy aligns with what you would expect from Scikit-Learn models when dealing with a simple dataset problem.

    Let's proceed to retrain the model and test it on out-of-sample data.

      matrix train_x, test_x;
      vector train_y, test_y;
      
      matrix_utils.TrainTestSplitMatrices(dataset, train_x, train_y, test_x, test_y, 0.8, 42); //split the data into training and testing samples
    
      
      decision_tree.fit(train_x, train_y);  //Building the tree
      decision_tree.print_tree(decision_tree.root); //Printing the tree
      
      vector preds = decision_tree.predict(train_x); //making the predictions on a training data
      
      Print("Train Acc = ",metrics.confusion_matrix(train_y, preds)); //Measuring the accuracy
      
    //---
    
      preds = decision_tree.predict(test_x); //making the predictions on a test data
      
      Print("Test Acc = ",metrics.confusion_matrix(test_y, preds)); //Measuring the accuracy

    Outputs:

    QD      0       14:56:03.860    DecisionTree Test (EURUSD,H1)     : X_2<=1.7?0.34125
    LL      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   ---> left: 1.0
    QK      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->  right: X_3<=1.6?0.42857142857142855
    GS      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->   left: X_2<=4.9?0.09693877551020412
    IL      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->--->   left: 2.0
    MD      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->--->    right: X_3<=1.5?0.375
    IS      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->--->--->    left: 3.0
    QR      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: 2.0
    RH      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   --->--->  right: 3.0
    HP      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   Confusion Matrix
    FG      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   [[42,0,0]
    EO      0       14:56:03.860    DecisionTree Test (EURUSD,H1)    [0,39,0]
    HK      0       14:56:03.860    DecisionTree Test (EURUSD,H1)    [0,0,39]]
    OL      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   
    KE      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   Classification Report
    QO      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   
    MQ      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   _    Precision  Recall  Specificity  F1 score  Support
    OQ      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   1.0    42.00     42.00     78.00       42.00     42.0
    ML      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   3.0    39.00     39.00     81.00       39.00     39.0
    HK      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   2.0    39.00     39.00     81.00       39.00     39.0
    OE      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   
    EO      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   Accuracy                                   1.00
    CG      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   Average   40.00    40.00    80.00      40.00    120.0
    LF      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   W Avg     40.05    40.05    79.95      40.05    120.0
    PR      0       14:56:03.860    DecisionTree Test (EURUSD,H1)   Train Acc = 1.0
    CD      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   Confusion Matrix
    FO      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   [[9,2,0]
    RK      0       14:56:03.861    DecisionTree Test (EURUSD,H1)    [1,10,0]
    CL      0       14:56:03.861    DecisionTree Test (EURUSD,H1)    [2,0,6]]
    HK      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   
    DQ      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   Classification Report
    JJ      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   
    FM      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   _    Precision  Recall  Specificity  F1 score  Support
    QM      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   2.0    12.00     11.00     19.00       11.48     11.0
    PH      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   3.0    12.00     11.00     19.00       11.48     11.0
    KD      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   1.0    6.00     8.00     22.00       6.86     8.0
    PP      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   
    LJ      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   Accuracy                                   0.83
    NJ      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   Average   10.00    10.00    20.00      9.94    30.0
    JR      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   W Avg     10.40    10.20    19.80      10.25    30.0
    HP      0       14:56:03.861    DecisionTree Test (EURUSD,H1)   Test Acc = 0.833

    The model is 100% accurate on the training data while being 83.3% accurate on out-of-sample data.


    Decision Tree AI in Trading 

    All this amounts to nothing if we don't explore the trading aspect using the decision tree models. To use this Model in trading, let us formulate a problem we want to solve.

    The Problem To Solve:

    We want to use the decision tree AI model to make predictions on the current bar to possibly tell us where the market is heading, either up or down.

    As with any model, we want to give the model a dataset to learn from. Let's say we decide to use two oscillator-type indicators, the RSI indicator and the Stochastic oscillator; basically, we want the model to learn the patterns between these two indicators and how they affect the price movement of the current bar.

    Data structure:

    Once collected for train-test purposes, the data gets stored in the structure below. The same applies to data used for making real-time predictions.

    struct data{
       vector stoch_buff, 
              signal_buff, 
              rsi_buff, 
              target;
    } data_struct;
    


    Collecting Data, Training and Testing the decision Tree

    void TrainTree()
     {
      matrix dataset(train_bars, 4);
      vector v;
    
    //--- Collecting indicator buffers
    
      data_struct.rsi_buff.CopyIndicatorBuffer(rsi_handle, 0, 1, train_bars);
      data_struct.stoch_buff.CopyIndicatorBuffer(stoch_handle, 0, 1, train_bars);
      data_struct.signal_buff.CopyIndicatorBuffer(stoch_handle, 1, 1, train_bars);
      
    //--- Preparing the target variable
       
      MqlRates rates[];
      ArraySetAsSeries(rates, true); 
      int size = CopyRates(Symbol(), PERIOD_CURRENT, 1,train_bars, rates);
      
      data_struct.target.Resize(size); //Resize the target vector
      
      for (int i=0; i<size; i++)
        {
          if (rates[i].close > rates[i].open)
            data_struct.target[i] = 1;
          else 
            data_struct.target[i] = -1;
        }
      
      dataset.Col(data_struct.rsi_buff, 0);
      dataset.Col(data_struct.stoch_buff, 1);
      dataset.Col(data_struct.signal_buff, 2);
      dataset.Col(data_struct.target, 3);
       
      decision_tree = new CDecisionTree(min_sample,max_depth_, tree_mode); //Initializing the decision tree
      
    
      matrix train_x, test_x;
      vector train_y, test_y;
      
      matrix_utils.TrainTestSplitMatrices(dataset, train_x, train_y, test_x, test_y, 0.8, 42); //split the data into training and testing samples
    
      
      decision_tree.fit(train_x, train_y);  //Building the tree
      decision_tree.print_tree(decision_tree.root); //Printing the tree
      
      vector preds = decision_tree.predict(train_x); //making the predictions on a training data
      
      Print("Train Acc = ",metrics.confusion_matrix(train_y, preds)); //Measuring the accuracy
      
    //---
    
      preds = decision_tree.predict(test_x); //making the predictions on a test data
      
      Print("Test Acc = ",metrics.confusion_matrix(test_y, preds)); //Measuring the accuracy
     }

    Min-sample was set to 3 while the max-depth was set to 5.

    Outputs:

    KR      0       16:26:53.028    DecisionTree Test (EURUSD,H1)     : X_0<=65.88930872549261?0.0058610536710859695
    CN      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->  left: X_0<=29.19882857713344?0.003187469522387243
    FK      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->   left: X_1<=26.851851851853503?0.030198175526895188
    RI      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->    left: X_2<=7.319205739522295?0.040050858232676456
    KG      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->     left: X_0<=23.08345903222593?0.04347468770545693
    JF      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      left: X_0<=21.6795921184317?0.09375
    PF      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    ER      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: -1.0
    QF      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      right: X_2<=3.223853479489069?0.09876543209876543
    LH      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    FJ      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    MM      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: -1.0
    MG      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->   right: 1.0
    HH      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->   right: X_0<=65.4606831930956?0.0030639039663222234
    JR      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->    left: X_0<=31.628407983040333?0.00271101025966336
    PS      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->     left: X_0<=31.20436037455599?0.0944903581267218
    DO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      left: X_2<=14.629981942657205?0.11111111111111116
    EO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: 1.0
    IG      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: -1.0
    EI      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->     right: 1.0
    LO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->     right: X_0<=32.4469112469684?0.003164795835173595
    RO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      left: X_1<=76.9736842105244?0.21875
    RO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    PG      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    MO      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      right: X_0<=61.82001028403415?0.0024932856070305487
    LQ      0       16:26:53.028    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    EQ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    LE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->    right: X_2<=84.68660541575225?0.09375
    ED      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->    left: -1.0
    LM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: -1.0
    NE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->  right: X_0<=85.28191275702572?0.024468404842877933
    DK      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->   left: X_1<=25.913621262458935?0.01603292204455742
    LE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->    left: X_0<=72.18709160232456?0.2222222222222222
    ED      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->     left: X_1<=15.458937198072245?0.4444444444444444
    QQ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->     left: 1.0
    CS      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->     right: -1.0
    JE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: -1.0
    QM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->    right: X_0<=69.83504428897093?0.012164425148527835
    HP      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->     left: X_0<=68.39798826749553?0.07844460227272732
    DL      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      left: X_1<=90.68322981366397?0.06611570247933873
    DO      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: 1.0
    OE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    LI      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      right: X_1<=88.05704099821516?0.11523809523809525
    DE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: 1.0
    DM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: -1.0
    LG      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->     right: X_0<=70.41747488780877?0.015360959832756427
    OI      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->     left: 1.0
    PI      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      right: X_0<=70.56490391752676?0.02275277028755862
    CF      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    MO      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    EG      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->   right: X_1<=97.0643939393936?0.10888888888888892
    CJ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->   left: 1.0
    GN      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->    right: X_0<=90.20261550045987?0.07901234567901233
    CP      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->     left: X_0<=85.94461490761033?0.21333333333333332
    HN      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->     left: -1.0
    GE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->      right: X_1<=99.66856060606052?0.4444444444444444
    GK      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      left: -1.0
    IK      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->--->--->      right: 1.0
    JM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   --->--->--->--->    right: -1.0
    KE      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Confusion Matrix
    DO      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   [[122,271]
    QF      0       16:26:53.029    DecisionTree Test (EURUSD,H1)    [51,356]]
    HS      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    LF      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Classification Report
    JR      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    ND      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   _    Precision  Recall  Specificity  F1 score  Support
    GQ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   1.0    173.00     393.00     407.00       240.24     393.0
    HQ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   -1.0    627.00     407.00     393.00       493.60     407.0
    PM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    OG      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Accuracy                                   0.60
    EO      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Average   400.00    400.00    400.00      366.92    800.0
    GN      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   W Avg     403.97    400.12    399.88      369.14    800.0
    LM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Train Acc = 0.598
    GK      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Confusion Matrix
    CQ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   [[75,13]
    CK      0       16:26:53.029    DecisionTree Test (EURUSD,H1)    [86,26]]
    NI      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    RP      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Classification Report
    HH      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    LR      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   _    Precision  Recall  Specificity  F1 score  Support
    EM      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   -1.0    161.00     88.00     112.00       113.80     88.0
    NJ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   1.0    39.00     112.00     88.00       57.85     112.0
    LJ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   
    EL      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Accuracy                                   0.51
    RG      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Average   100.00    100.00    100.00      85.83    200.0
    ID      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   W Avg     92.68    101.44    98.56      82.47    200.0
    JJ      0       16:26:53.029    DecisionTree Test (EURUSD,H1)   Test Acc = 0.505

    The model was correct 60% of the time during training while being only 50.5% accurate during testing; not good. There could be many reasons for this, including the quality of the data used to build the model or weak predictors. The most likely reason is that we have not tuned the model's parameters well.

    To fix this, you may need to tweak the parameters to determine what works best for your needs; a rough tuning sketch is shown below.
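    As a rough illustration, here is a minimal sketch of such a sweep over max_depth and min_samples_split, reusing the train/test matrices prepared in TrainTree and the same metrics object. The helper name TuneTree and the parameter ranges are hypothetical and not part of the library.

    void TuneTree(matrix &train_x, vector &train_y, matrix &test_x, vector &test_y)
     {
       double best_acc = 0;
       
       for (uint depth=2; depth<=8; depth++)
         for (uint min_samples=2; min_samples<=10; min_samples+=2)
           {
             CDecisionTree *tree = new CDecisionTree(min_samples, depth, MODE_GINI); //candidate parameters
             
             tree.fit(train_x, train_y);           //build the tree on the training split
             
             vector preds = tree.predict(test_x);  //evaluate on the out-of-sample split
             double acc = metrics.confusion_matrix(test_y, preds); //returns the accuracy, prints the report
             
             if (acc > best_acc) //keep the parameters with the best out-of-sample accuracy so far
               {
                 best_acc = acc;
                 printf("best so far | max_depth=%d min_samples_split=%d | test acc=%.3f", depth, min_samples, acc);
               }
             
             delete tree;
           }
     }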

    Now, let's code a function to make real-time predictions.

    int desisionTreeSignal()
     {
    //--- Copy the current bar information only
    
       data_struct.rsi_buff.CopyIndicatorBuffer(rsi_handle, 0, 0, 1);
       data_struct.stoch_buff.CopyIndicatorBuffer(stoch_handle, 0, 0, 1);
       data_struct.signal_buff.CopyIndicatorBuffer(stoch_handle, 1, 0, 1);
       
       x_vars[0] = data_struct.rsi_buff[0];
       x_vars[1] = data_struct.stoch_buff[0];
       x_vars[2] = data_struct.signal_buff[0];
       
       return int(decision_tree.predict(x_vars));
     }
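    For context, here is a sketch of the inputs and globals that this function and TrainTree rely on. The names appear in the snippets above, but the exact declarations live in the attached DecisionTree Test.mq5, so treat the types and default values shown here as assumptions.

    //--- hypothetical input and global declarations (see the attached EA for the exact ones)
    input uint train_bars = 1000;        //bars used to train the tree (hypothetical default)
    input uint min_sample = 3;           //minimum samples required to split a node
    input uint max_depth_ = 5;           //maximum depth of the tree
    input mode tree_mode  = MODE_GINI;   //splitting criterion
    
    int rsi_handle, stoch_handle;        //indicator handles, e.g. from iRSI and iStochastic in OnInit
    vector x_vars(3);                    //current-bar features: RSI, Stochastic main line, Stochastic signal line
    CDecisionTree *decision_tree;        //the tree trained once by TrainTree()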

    Now, let us make a simple trading logic:

    If the decision tree predicts -1, meaning the candle will close down, we open a sell trade; if it predicts the class of 1, indicating the candle will close higher than where it opened, we want to place a buy trade.

    void OnTick()
      {
    //---
        if (!train_once)              // You want to train once during EA lifetime 
          TrainTree();
        train_once = true;
        
        if (isnewBar(PERIOD_CURRENT)) // We want to trade on the bar opening 
          {
            int signal = desisionTreeSignal();
            double min_lot = SymbolInfoDouble(Symbol(), SYMBOL_VOLUME_MIN);
            SymbolInfoTick(Symbol(), ticks);
            
             if (signal == -1)
               {
                  if (!PosExists(MAGICNUMBER, POSITION_TYPE_SELL)) // If a sell trade doesnt exist
                    m_trade.Sell(min_lot, Symbol(), ticks.bid, ticks.bid+stoploss*Point(), ticks.bid - takeprofit*Point());
               }
             else
               {
                 if (!PosExists(MAGICNUMBER, POSITION_TYPE_BUY))  // If a buy trade doesnt exist
                   m_trade.Buy(min_lot, Symbol(), ticks.ask, ticks.ask-stoploss*Point(), ticks.ask + takeprofit*Point());
               }
          }
      }

    I ran a test on a single month 2023.01.01 - 2023.02.01 on Open Prices just to see if everything works out.


    FAQs on Decision Trees in Trading:

    Is normalization of input data important for decision trees?
    No, normalization is generally not crucial for decision trees. Decision trees make splits based on feature thresholds, and the scale of features doesn't affect the tree structure. However, it's a good practice to check the impact of normalization on model performance.

    How do decision trees handle categorical variables in trading data?
    Decision trees can handle categorical variables naturally. They perform binary splits based on whether a condition is met, including conditions on categorical variables. The tree will determine the optimal split points for categorical features.

    Can decision trees be used for time-series forecasting in trading?
    While decision trees can be used for time-series forecasting in trading, they may not capture complex temporal patterns as effectively as models such as recurrent neural networks (RNNs). Ensemble methods like Random Forests could offer greater robustness.

    Do decision trees suffer from overfitting?
    Decision trees, particularly deep ones, can be prone to overfitting by capturing noise in the training data. Techniques such as pruning and limiting tree depth can be employed to mitigate overfitting in trading applications.

    Are decision trees suitable for feature importance analysis in trading models?
    Yes, decision trees provide a natural way to assess feature importance. Features that contribute more to the splitting decisions near the top of the tree are generally more important. This analysis can offer insights into the factors driving trading decisions.

    How sensitive are decision trees to outliers in trading data?
    Decision trees can be sensitive to outliers, especially when the tree is deep. Outliers may lead to specific splits that capture noise. Preprocessing steps, such as outlier detection and removal, can be applied to mitigate this sensitivity.

    Are there specific hyperparameters to tune for decision trees in trading models?
    Yes, key hyperparameters to tune include:

    • the tree depth,
    • the minimum samples per leaf, and
    • the criterion for splitting (e.g., Gini impurity or entropy).

    Cross-validation can be used to find optimal hyperparameter values for a given dataset.

    Can decision trees be part of an ensemble approach?
    Yes, decision trees can be part of ensemble methods like Random Forests, which combine multiple trees to improve overall predictive performance. Ensemble methods are often robust and effective in trading applications.


    Advantages of Decision Trees:

    Interpretability:

    • Decision trees are easy to understand and interpret. The graphical representation of the tree structure allows for clear visualization of decision-making processes.

    Handling Non-Linearity:

    • Decision trees can capture non-linear relationships in data, making them suitable for problems where the decision boundaries are not linear.

    Handling Mixed Data Types:

    • Decision trees can take both numerical and categorical data without the need for extensive preprocessing.

    Feature Importance:

    • Decision trees provide a natural way to assess the importance of features, helping identify critical factors influencing the target variable.

    No Assumptions about Data Distribution:

    • Decision trees make no assumptions about data distribution, making them versatile and applicable to various datasets.

    Robustness to Outliers:

    • Decision trees are relatively robust to outliers since splits are based on relative comparisons and unaffected by absolute values.

    Automatic Variable Selection:

    • The tree-building process includes automatic variable selection, reducing the need for manual feature engineering.

    Can Handle Missing Values:

    • Decision trees can handle missing values in features without requiring imputation, as splits are made based on available data.


    Disadvantages of Decision Trees:

    Overfitting:

    • Decision trees are prone to overfitting, especially when they are deep and capture noise in the training data. Techniques like pruning are used to address this issue.

    Instability:

    • Small changes in the data can lead to significant changes in the tree structure, making decision trees somewhat unstable.

    Bias Toward Dominant Classes:

    • In datasets with imbalanced classes, decision trees can be biased toward the dominant class, leading to suboptimal performance for minority classes.

    Global Optimum vs. Local Optima:

    • Decision trees focus on finding local optimum splits at each node, which may not necessarily lead to a globally optimal solution.

    Limited Expressiveness:

    • Decision trees might struggle to express complex relationships in data compared to more sophisticated models like neural networks.

    Not Suitable for Continuous Output:

    • While decision trees are adequate for classification tasks, they may not be as suitable for tasks requiring a continuous output.

    Sensitive to Noisy Data:

    • Decision trees can be sensitive to noisy data, and outliers may lead to specific splits that capture noise rather than meaningful patterns.

    Biased Toward Dominant Features:

    • Features with more levels or categories may appear more important simply because of how splits are made, potentially introducing bias. Techniques such as the gain ratio used by C4.5 help address this.

    That's it, folks. Thanks for reading.

    Track the development and contribute to the decision tree algorithm and many more AI models on my GitHub repo: https://github.com/MegaJoctan/MALE5/tree/master

    Attachments:

    tree.mqh - The main include file. Contains the decision tree code we mainly discussed above.
    metrics.mqh - Contains functions and code to measure the performance of ML models.
    matrix_utils.mqh - Contains additional functions for matrix manipulations.
    preprocessing.mqh - The library for pre-processing raw input data to make it suitable for machine learning model usage.
    DecisionTree Test.mq5 (EA) - The main file: an Expert Advisor for running the decision tree.




    Attached files: Code.zip (20.2 KB)