As options...
1. Make your own cache. In that case you manage all the contents yourself.
2. Use file mapping. Windows itself will cache what it needs, so it won't keep re-reading the disk (a minimal sketch below).
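For what it's worth, here is a minimal sketch of the mapping variant in Win32 terms. The file name and the 256 MB window size are assumptions; the point is that the OS page cache keeps the hot parts of the file in RAM for you:

```cpp
#include <windows.h>

int main()
{
    // Open the big file read-only; "data.txt" is a placeholder name.
    HANDLE file = CreateFileA("data.txt", GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) return 1;

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);
    ULONGLONG total = (ULONGLONG)size.QuadPart;

    // One mapping object for the whole file; views are created per window.
    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mapping == NULL) { CloseHandle(file); return 1; }

    // A 32-bit process cannot map 20 GB at once, so slide a 256 MB view.
    // View offsets must be a multiple of the allocation granularity (64 KB),
    // which 256 MB steps satisfy automatically.
    const ULONGLONG WINDOW = 256ull * 1024 * 1024;
    for (ULONGLONG off = 0; off < total; off += WINDOW)
    {
        SIZE_T len = (SIZE_T)(total - off < WINDOW ? total - off : WINDOW);
        const char *view = (const char *)MapViewOfFile(
            mapping, FILE_MAP_READ,
            (DWORD)(off >> 32), (DWORD)(off & 0xFFFFFFFFu), len);
        if (view == NULL) break;

        // ... scan the sequences inside view[0 .. len) here ...

        UnmapViewOfFile(view);
    }

    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```

The windowed approach is also what sidesteps the 32-bit address-space limit: you never map more than one window at a time.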
1. That is a cache... Or do I misunderstand what you mean? My variant of constantly reading the necessary chunks?
2. Can you elaborate a bit? What would mapping do, and how should I approach it?
Oh, shit...
32-bit architectures (Intel 386, ARM 9) cannot create mappings longer than 4 GB.
Same thing from a different angle. Reading might speed up, but it doesn't solve the problem globally.
Another idea is to move everything into a database (MySQL?) and work with that. The idea is that databases are designed for exactly these volumes and this constant digging (a sketch below).
Are there any experts here? Does anyone have something to say?
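A sketch of the database route using the MySQL C API, with a hypothetical connection and table layout (one row per sequence, keyed by its number), just to show the shape of it:

```cpp
#include <mysql.h>

int main()
{
    MYSQL *db = mysql_init(NULL);
    // Hypothetical credentials and database name.
    if (!mysql_real_connect(db, "localhost", "user", "pass", "seqdb", 0, NULL, 0))
        return 1;

    // One-time load: an indexed table instead of a 20 GB flat file.
    mysql_query(db, "CREATE TABLE IF NOT EXISTS seq ("
                    "  id BIGINT PRIMARY KEY,"
                    "  payload MEDIUMBLOB)");

    // Each pass becomes an indexed range scan, not a full 20 GB read.
    if (mysql_query(db, "SELECT id, payload FROM seq "
                        "WHERE id BETWEEN 1000 AND 2000") == 0)
    {
        MYSQL_RES *res = mysql_store_result(db);
        if (res != NULL)
        {
            for (MYSQL_ROW row = mysql_fetch_row(res); row != NULL;
                 row = mysql_fetch_row(res))
            {
                // ... compute on row[1] (the sequence payload) here ...
            }
            mysql_free_result(res);
        }
    }
    mysql_close(db);
    return 0;
}
```

The win is that once the data is loaded and indexed, each pass touches only the rows it actually needs instead of the whole file.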
1) Is there any way to rework the algorithm? Load a block (2 GB), process it, save the (much shorter) result, release the memory, load the next block...
and at the end process all the saved results in one more pass (a sketch of this follows below).
2) When there is a lot of work with memory, the usual associations are hash-based solutions, B-trees (and their modifications), and offloading to a database.
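Point 1 could look something like this; a minimal sketch where the file names, the block size, and the placeholder process_block() are all assumptions:

```cpp
#include <fstream>
#include <vector>

// Placeholder for the real per-block computation; returns a compact result.
static double process_block(const std::vector<char> &block)
{
    return (double)block.size();
}

int main()
{
    // 512 MB blocks fit comfortably inside a 32-bit process heap.
    const std::size_t BLOCK = 512u * 1024 * 1024;
    std::ifstream in("data.txt", std::ios::binary);
    std::ofstream results("partial.bin", std::ios::binary);

    std::vector<char> buf(BLOCK);
    while (in.read(buf.data(), (std::streamsize)buf.size()) || in.gcount() > 0)
    {
        buf.resize((std::size_t)in.gcount());   // last block may be short
        double r = process_block(buf);          // memory is reused, not grown
        results.write((const char *)&r, sizeof r);
        buf.resize(BLOCK);
    }
    // Second pass: partial.bin is only a few MB, so it can be processed
    // entirely in memory to combine the per-block results.
    return 0;
}
```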
1. I wrote about that above - it can be done, but the problem is that the data would have to be processed many times over. It would be very slow.
2. I'll google it myself tomorrow; I'd be grateful for a short description.
I remembered a site where a similar problem and some C++ approaches to solving it were discussed:
- www.fulcrumweb.com.ua
1. Naturally, use an x64 system.
2. Rent a more powerful machine in the Amazon EC2 cloud and run the calculations there.
3. Use compressed data and decompress it in memory on the fly. Real data compresses better if you split it into streams (sign/mantissa/exponent); you can even use a 12-bit float, at the expense of accuracy (see the sketch after this list).
4. Do the calculation outside the Expert Advisor, with something that can handle big data (Matlab/R/etc.).
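On point 3, a minimal sketch of the stream-splitting idea for IEEE 754 floats. The 12-bit variant simply keeps the top 12 bits (stored in 16 bits here for simplicity); the separated streams then compress much better with any general-purpose compressor:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Streams of the three IEEE 754 components; each stream is far more
// regular than the raw floats, so compressors do better on them.
struct FloatStreams {
    std::vector<uint8_t>  sign;      // 1 significant bit per value
    std::vector<uint8_t>  exponent;  // 8 bits per value
    std::vector<uint32_t> mantissa;  // 23 bits per value
};

FloatStreams split(const std::vector<float> &data)
{
    FloatStreams s;
    for (float f : data)
    {
        uint32_t bits;
        std::memcpy(&bits, &f, sizeof bits);          // safe type pun
        s.sign.push_back((uint8_t)(bits >> 31));
        s.exponent.push_back((uint8_t)((bits >> 23) & 0xFFu));
        s.mantissa.push_back(bits & 0x7FFFFFu);
    }
    return s;
}

// Lossy 12-bit float: sign + 8-bit exponent + top 3 mantissa bits,
// i.e. a relative error of up to roughly 12%.
uint16_t to_float12(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 20);                    // keep the top 12 bits
}

float from_float12(uint16_t packed)
{
    uint32_t bits = (uint32_t)packed << 20;           // lost bits become zero
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```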
There is a large amount of information (about 20 GB in a text file).
The information consists of the same kind of sequences, about a million of them.
It is necessary to go through all the sequences repeatedly and make some calculations.
The first thing that comes to mind is to read the entire contents of the file, fill an array of structures with it, and work with them in memory.
But that fails: on the next array resize, MT complains "Memory handler: cannot allocate 5610000 bytes of memory".
Task Manager shows that terminal.exe is using 3.5 GB of RAM (out of 16 GB physical), and I assume this is because a 32-bit process can only get 4 GB.
EA says "Not enough memory(4007 Mb used, 88 Mb available, 4095 Mb total)!!!".
And that is only 15.3% of the required amount (and I would like to increase the data volume further, too).
Option 2: re-read the file every time. Find the needed piece, save it into the structure, read the next piece, compare the result, overwrite the structure.
If I only had to go through these sequences once, that is exactly what I would do. But they have to be traversed many times, shifting forward a little each time.
So the file would have to be read over and over, which is painfully slow.
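To spell out what that option 2 amounts to, a sketch with a hypothetical fixed record size of ~20 KB (roughly 20 GB divided by a million sequences):

```cpp
#include <fstream>

// Hypothetical fixed-size record: ~20 GB / ~1 million sequences ~ 20 KB each.
struct Sequence { char raw[20000]; };

static bool read_sequence(std::ifstream &in, long long index, Sequence &out)
{
    in.clear();                                        // clear EOF from the previous pass
    in.seekg((std::streamoff)index * (std::streamoff)sizeof(Sequence));
    return (bool)in.read(out.raw, sizeof out.raw);
}

int main()
{
    std::ifstream in("data.txt", std::ios::binary);
    Sequence cur;                                      // one structure, overwritten in place

    for (long long shift = 0; shift < 100; ++shift)    // 100 passes as a stand-in;
    {                                                  // each starts a bit further on
        for (long long i = shift; read_sequence(in, i, cur); ++i)
        {
            // ... compare cur against the saved result and overwrite it ...
        }
    }
    return 0;
}
```

The inner loop re-reads the file from disk on every pass, which is exactly where all the time goes.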
The sheer amount of data is frustrating, too... If it were 10 gigs, I would move it to a RAM disk (effectively into memory) and read from it as much as I liked. Right?
That's all I can think of.
Should I try to repackage these sequences, so that there are many, many pieces, but each contains only the information needed at a given moment?
Should I try to compress the data further (I have already converted to float and char types everywhere I can)? But that would give me 10-20% more at most, and I need to reduce the volume by an order of magnitude...
Any advice, friends? I'll figure it out )