5.3.2.3 File operations

We continue working on our GPT model implementation. We have already implemented the core functionality of this model in the methods of our CNeuronGPT class. In the previous sections, we discussed the object initialization methods and created the feed-forward and backpropagation passes. This functionality is sufficient to build a test model, and it is even possible to run a series of tests to assess how the model performs.

However, we have already discussed the importance of file handling methods for the practical use of any neural network model. Their importance stems from the cost of model training, which requires both time and computational resources, and these costs are often quite high. Therefore, there is a strong incentive to train a model once and then put it to work as quickly and as intensively as possible.

The unpredictable and highly volatile nature of financial markets leaves little hope that a model trained once can be used indefinitely. Even so, when the environment changes, retraining the model on new conditions requires far fewer resources and less time than training it from scratch with random weights.

Therefore, let's continue our work and implement methods for working with files. As always, let's start with the CNeuronGPT::Save method that saves data to a file.

When starting to work on the data saving method, as usual, we take a critical look at the structure of our class and evaluate the necessity of saving the data for each object.

class CNeuronGPT    :  public CNeuronBase
  {
protected:
   CArrayLayers      m_cQuerys;
   CArrayLayers      m_cKeys;
   CArrayLayers      m_cValues;
   CArrayLayers      m_cScores;
   CArrayLayers      m_cAttentionOut;
   CArrayLayers      m_cW0;
   CArrayLayers      m_cFF1;
   CArrayLayers      m_cFF2;
   //---
   int               m_iLayers;
   int               m_iWindow;
   int               m_iUnits;
   int               m_iKeysSize;
   int               m_iHeads;
   CBufferType       m_dStd[];
   int               m_iCurrentPosition;
   int               m_iScoreTemp;
 
   virtual bool      NormlizeBuffer(CBufferType *buffer, CBufferType *std,
                                                               uint std_shift);
   virtual bool      NormlizeBufferGradient(CBufferType *output,
                       CBufferType *gradient, CBufferType *std, uint std_shift);
public:
                     CNeuronGPT(void);
                    ~CNeuronGPT(void);
   //---
   virtual bool      Init(const CLayerDescription *desc) override;
   virtual bool      SetOpenCL(CMyOpenCL *opencl) override;
   virtual bool      FeedForward(CNeuronBase *prevLayer) override;
   virtual bool      CalcHiddenGradient(CNeuronBase *prevLayer) override;
   virtual bool      CalcDeltaWeights(CNeuronBase *prevLayer, bool read) override;
   virtual bool      UpdateWeights(int batch_size, TYPE learningRate,
                                           VECTOR &Beta, VECTOR &Lambda) override;
   //---
   virtual int       GetUnits(void) const  { return m_iUnits;   }
   virtual int       GetLayers(void) const { return m_iLayers; }
   //--- methods for working with files
   virtual bool      Save(const int file_handle) override;
   virtual bool      Load(const int file_handle) override;
   //--- object identification method
   virtual int       Type(void) override const { return(defNeuronGPT);  }
  };

At this point, we realize that besides constants, our class contains only collections of objects. The resources required to recreate the collection objects with a complete description of their structure will be much higher than the potential savings in disk space resources. Therefore, we organize the saving of all collections in a data file for model recovery.

In the parameters, this method receives a file handle to save the data. To avoid duplicate controls and reduce the total amount of program code, we do not check the received handle. Instead, we call the similar method of the parent class, to which we pass the received handle. The advantages of this approach are obvious. With a single command, we check the received handle and save the data of objects inherited from the parent class. By checking the result of the parent class method, we control the entire specified process.

bool CNeuronGPT::Save(const int file_handle)
  {
//--- calling a method of a parent class
   if(!CNeuronBase::Save(file_handle))
      return false;

After the successful execution of the parent class method, we save the following constants of our class to the file:

  • m_iLayers — the number of nested neural layers of the GPT block
  • m_iWindow — the size of the source data window (the size of the description vector of one element of the source data sequence)
  • m_iKeysSize — the size of the description vector of one element of the Keys key tensor
  • m_iHeads — the number of attention heads used
  • m_iUnits — the number of elements in the sequence
  • m_iCurrentPosition — the position of the currently analyzed element

//--- save the constants
   if(FileWriteInteger(file_handle, m_iLayers) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iWindow) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iKeysSize) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iHeads) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iUnits) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iCurrentPosition) <= 0)
      return false;

Saving the position of the currently analyzed element is necessary for the proper functioning of the Key and Value stacks. However, in real-world use, I would recommend that before relying on the model, you sequentially feed it enough data to completely fill these stacks. This lets you control the process of loading data into the model and eliminates the risk of gaps that could degrade the model's accuracy in the first steps after loading. Of course, the model's behavior will level out once the stack is completely filled, but until then the risk of losses is higher.

Next, we sequentially check the pointers to objects in all our collections and call their data-saving methods.

//--- call the method for all collections of inner layers
   if(!m_cQuerys.Save(file_handle))
      return false;
   if(!m_cKeys.Save(file_handle))
      return false;
   if(!m_cValues.Save(file_handle))
      return false;
   if(!m_cScores.Save(file_handle))
      return false;
   if(!m_cAttentionOut.Save(file_handle))
      return false;
   if(!m_cW0.Save(file_handle))
      return false;
   if(!m_cFF1.Save(file_handle))
      return false;
   if(!m_cFF2.Save(file_handle))
      return false;
//---
   return true;
  }

Then we exit the data saving method.

We have created a method for saving an object of our class. Now we can move on to work on the method of recovering an object from the data written to the file. As a reminder, the primary requirement for methods restoring the functionality of objects from saved data is to read the data in strict accordance with the sequence of their recording.

Similar to the file-writing method, our data-loading method CNeuronGPT::Load receives in its parameters the handle of the file containing the data to be read. Just as when writing, we first call the analogous method of the parent class. First, this respects the rule of reading data in strict accordance with the writing sequence. Second, following the reasoning from the save method, we rely on the controls implemented in the parent class method and avoid duplicating them. Of course, before proceeding further, we check the result of the parent method.

bool CNeuronGPT::Load(const int file_handle)
  {
//--- call the method of a parent class
   if(!CNeuronBase::Load(file_handle))
      return false;

After the successful execution of the parent class method, we read the constants defining our block's operating parameters. Their values are read in the order in which they were written. After reading them, we must resize the dynamic array that stores the standard deviations used to normalize the results of our block. The array must be large enough to hold the data of all nested neural layers; otherwise, we risk a critical out-of-range error during program execution.

//--- read constants from a file
   m_iLayers = FileReadInteger(file_handle);
   m_iWindow = FileReadInteger(file_handle);
   m_iKeysSize = FileReadInteger(file_handle);
   m_iHeads = FileReadInteger(file_handle);
   m_iUnits = FileReadInteger(file_handle);
   m_iCurrentPosition = FileReadInteger(file_handle);
   if(ArrayResize(m_dStd, m_iLayers) <= 0)
      return false;
   for(int i = 0; i < m_iLayers; i++)
      if(!m_dStd[i].BufferInit(1, 2, 1))
         return false;

Then all we have to do is load the data of our object collections. However, before calling a collection's data loading method, we need to make sure its object pointer is valid and, if necessary, create a new instance of the collection object. Only then can we call the data loading method. Of course, do not forget that objects are loaded in strict accordance with the order in which they were written. We also check the result of every loading operation.

//--- call the method for all collections of inner layers
   if(!m_cQuerys.Load(file_handle))
      return false;
   if(!m_cKeys.Load(file_handle))
      return false;
   if(!m_cValues.Load(file_handle))
      return false;
   if(!m_cScores.Load(file_handle))
      return false;
   if(!m_cAttentionOut.Load(file_handle))
      return false;
   if(!m_cW0.Load(file_handle))
      return false;
   if(!m_cFF1.Load(file_handle))
      return false;
   if(!m_cFF2.Load(file_handle))
      return false;

After loading all objects, we create another loop and reformat the result buffers of all created objects. Here we do not validate the object pointers as in the previous iterations: all these objects have just loaded their data from the file, which means they were created and verified.

//--- reformat the result matrices
   for(int i = 0; i < m_iLayers; i++)
     {
      CNeuronBase *temp = m_cKeys.At(i);
      if(!temp.GetOutputs().Reshape(m_iUnits, m_iKeysSize * m_iHeads))
         return false;
      temp = m_cValues.At(i);
      if(!temp.GetOutputs().Reshape(m_iUnits, m_iKeysSize * m_iHeads))
         return false;
      temp = m_cScores.At(i);
      if(!temp.GetOutputs().Reshape(m_iHeads, m_iUnits))
         return false;
      temp = m_cAttentionOut.At(i);
      if(!temp.GetOutputs().Reshape(m_iHeads, m_iKeysSize))
         return false;
     }

At the end of the method, we replace the buffers and terminate its work.

//--- replace data buffers to avoid excessive copying
   CNeuronBase *last = m_cFF2.At(m_cFF2.Total() - 1);
   if(m_cOutputs)
      delete m_cOutputs;
   m_cOutputs = last.GetOutputs();
   if(m_cGradients)
      delete m_cGradients;
   m_cGradients = last.GetGradients();
//---
   return true;
  }

Now that the file handling methods have been created, we can proceed further. Next, our plan involves creating the capability to perform parallel mathematical operations using OpenCL.