5.3.2.3 File operations
We continue working on our GPT model implementation. We have already implemented the core functionality of this model in the methods of our CNeuronGPT class. In the previous sections, we discussed the object initialization methods and built the feed-forward and backpropagation passes. This functionality is sufficient to create a test model, and it is even possible to run a series of tests to assess how the model performs.
However, we have already discussed how important file handling methods are for the practical operation of any neural network model. Their significance stems from the cost of training: the process requires both time and resources, and these costs are often quite high. Hence, there is a strong incentive to train a model once and then put it to work as intensively as possible, as quickly as possible.
The unpredictable and highly volatile nature of financial markets leaves no hope that a model trained once can be used indefinitely. Even so, when the environment shifts, retraining the model on new conditions requires far fewer resources and less time than training it from scratch with random weights.
Therefore, let's continue our work and implement methods for working with files. As always, let's start with the CNeuronGPT::Save method that saves data to a file.
When starting to work on the data saving method, as usual, we take a critical look at the structure of our class and evaluate the necessity of saving the data for each object.
class CNeuronGPT : public CNeuronBase
|
At this point, we realize that, apart from constants, our class contains only collections of objects. Recreating the collection objects from a full description of their structure would cost far more than we could save in disk space. Therefore, we simply save all the collections to the data file so the model can be restored.
In its parameters, this method receives a handle of the file to save the data to. To avoid duplicating checks and to reduce the total amount of program code, we do not validate the received handle ourselves. Instead, we call the same-named method of the parent class, passing it the received handle. The advantages of this approach are obvious: with a single call, we both check the received handle and save the data of the objects inherited from the parent class. By checking the result of the parent class method, we control the entire process.
bool CNeuronGPT::Save(const int file_handle)
|
After the successful execution of the parent class method, we save the following constants of our class to the file:
- m_iLayers: the number of nested neural layers in the GPT block
- m_iWindow: the size of the source data window (the length of the description vector of one element of the source data sequence)
- m_iKeysSize: the length of the description vector of one element of the Keys tensor
- m_iHeads: the number of attention heads used
- m_iUnits: the number of elements in the sequence
- m_iCurrentPosition: the position of the currently analyzed element
//--- save the constants
|
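The shape of this part of the Save method can be sketched in plain C++ (a standard-library analogue of the MQL5 file API; the base class shown here and all default values are hypothetical stand-ins, not the book's actual code):

```cpp
#include <cassert>
#include <cstdio>

// Minimal stand-in for CNeuronBase: the parent's Save validates the handle
// and would also write the inherited state.
struct NeuronBase {
    virtual bool Save(FILE *f) { return f != nullptr; }
    virtual ~NeuronBase() = default;
};

// Illustrative defaults; real values come from the model's initialization.
struct NeuronGPT : NeuronBase {
    int m_iLayers = 2, m_iWindow = 8, m_iKeysSize = 4,
        m_iHeads = 2, m_iUnits = 16, m_iCurrentPosition = 0;

    bool Save(FILE *f) override {
        // One call both validates the handle and saves the inherited state.
        if (!NeuronBase::Save(f))
            return false;
        // Write the six block constants in a fixed, documented order.
        const int vals[6] = {m_iLayers, m_iWindow, m_iKeysSize,
                             m_iHeads, m_iUnits, m_iCurrentPosition};
        return fwrite(vals, sizeof(int), 6, f) == 6;
    }
};
```

The fixed write order matters: the loading method will read these six values back in exactly this sequence.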
Saving the position of the currently analyzed element is necessary for the proper functioning of the Key and Value stacks. Nevertheless, in real usage I would recommend that, before relying on the model, you sequentially feed it enough data to fill the stacks completely. This lets you control the process of loading data into the model and eliminates the risk of gaps that could degrade the model's accuracy in the first steps after loading. Of course, the model's output will stabilize once the stack is completely filled, but until that point the risk of losses is elevated.
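The stack mechanics behind this recommendation can be illustrated with a toy ring buffer in C++ (all names here are illustrative; `units` mirrors m_iUnits and `pos` mirrors m_iCurrentPosition):

```cpp
#include <cassert>
#include <vector>

// Toy ring buffer standing in for the Key/Value stack.
struct KVStack {
    int units, pos = 0, filled = 0;
    std::vector<double> data;
    explicit KVStack(int u) : units(u), data(u, 0.0) {}

    void Push(double v) {
        data[pos] = v;
        pos = (pos + 1) % units;          // wrap the current position
        if (filled < units) ++filled;
    }
    // Only after `units` inputs does every slot hold real data.
    bool Ready() const { return filled == units; }
};
```

A warm-up pass would then feed `units` historical inputs and discard the outputs until `Ready()` returns true, after which the model's results can be trusted.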
Next, we sequentially check the pointers to objects in all our collections and call their data-saving methods.
//--- call the method for all collections of inner layers
|
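The collection pass can be sketched as follows (a hedged C++ sketch with a minimal hypothetical Layer type; the real code iterates the class's actual collections):

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Minimal nested layer with its own Save method (illustrative).
struct Layer {
    int id;
    bool Save(FILE *f) const { return fwrite(&id, sizeof(int), 1, f) == 1; }
};

// Check every pointer before delegating; any failure aborts the whole save.
bool SaveCollection(const std::vector<Layer*> &collection, FILE *f) {
    for (const Layer *l : collection)
        if (l == nullptr || !l->Save(f))
            return false;
    return true;
}
```

Propagating the first failure upward is what lets the caller treat the entire save as a single controlled operation.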
Then we exit the data saving method.
We have created a method for saving an object of our class. Now we can move on to work on the method of recovering an object from the data written to the file. As a reminder, the primary requirement for methods restoring the functionality of objects from saved data is to read the data in strict accordance with the sequence of their recording.
Similar to the file-writing method, our data-loading method CNeuronGPT::Load receives in its parameters the handle of the file containing the data to be read. As with writing, we first call the analogous method of the parent class. This serves two purposes: first, the data is read in strict accordance with the writing sequence; second, following the idea voiced for the write method, we reuse the checks implemented in the parent class method and avoid duplicating them. Of course, before proceeding further, we check the result of the parent method.
bool CNeuronGPT::Load(const int file_handle)
|
After the successful execution of the parent class method, we read the constants of our block's operating parameters. Their values are read in the order in which they were written. After reading the constant values, we should adjust the size of the dynamic array used to store the standard deviations involved in normalizing the results of our block. The array must be large enough to hold data for all nested neural layers. Otherwise, we risk a critical out-of-range error during program execution.
//--- read constants from a file
|
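This read-then-resize step can be sketched in plain C++ (the member names, including `std_dev`, are hypothetical stand-ins for the class's actual fields):

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Stand-in for the block's parameters; `std_dev` plays the role of the
// dynamic array of standard deviations.
struct GPTParams {
    int layers = 0, window = 0, keys = 0, heads = 0, units = 0, pos = 0;
    std::vector<double> std_dev;

    bool Load(FILE *f) {
        // Read the six constants back in exactly the order they were written.
        int v[6];
        if (f == nullptr || fread(v, sizeof(int), 6, f) != 6)
            return false;
        layers = v[0]; window = v[1]; keys = v[2];
        heads  = v[3]; units  = v[4]; pos  = v[5];
        // Size the buffer for all nested layers to avoid out-of-range errors.
        std_dev.resize(layers);
        return true;
    }
};
```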
Then all we have to do is load the data of our object collections. However, before calling the data-loading method of a collection object, we must verify that the pointer to it is valid and, if necessary, create a new instance of the object. Only then can we call the data-loading method. Of course, we must not forget that objects are loaded in strict accordance with the order in which they were written. We also control the loading process at each iteration.
//--- call the method for all collections of inner layers
|
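The lazy-creation pattern described above can be sketched like this (a hedged C++ analogue using smart pointers; the real class manages its own collection objects):

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <vector>

struct Layer {
    int id = 0;
    bool Load(FILE *f) { return fread(&id, sizeof(int), 1, f) == 1; }
};

// Recreate any missing object before loading into it; objects are read in
// the same order they were written, and every iteration is checked.
bool LoadCollection(std::vector<std::unique_ptr<Layer>> &collection, FILE *f) {
    for (auto &l : collection) {
        if (!l)                    // pointer is invalid: create a fresh object
            l = std::make_unique<Layer>();
        if (!l->Load(f))
            return false;
    }
    return true;
}
```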
After loading all the objects, we create another loop in which we reformat the result buffers of all the created objects. Here we do not validate the object pointers as in the previous iterations: all these objects have just loaded their data from the file, which means they were created and verified.
//--- reformat the result matrices
|
At the end of the method, we replace the buffers and terminate its work.
//--- replace data buffers to avoid excessive copying
|
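The idea of swapping buffers instead of copying can be illustrated in standard C++ (the Buffer type and function name here are illustrative only):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for a result buffer. Swapping vectors exchanges their internal
// pointers, so adopting the last layer's results costs O(1), not a copy.
struct Buffer {
    std::vector<double> values;
};

void AdoptResults(Buffer &block_output, Buffer &last_layer_output) {
    std::swap(block_output.values, last_layer_output.values);
}
```

After the swap, the block exposes the inner layer's results directly, and no element-by-element copying takes place on any pass.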
Now that the file handling methods have been created, we can proceed further. Next, our plan involves creating the capability to perform parallel mathematical operations using OpenCL.