Hello,

I have been experimenting with Cap'n Proto for some time. My aim is to read small chunks out of a large serialized message so that total memory consumption stays low. For that purpose, I used memory-mapped reading and wrote a simple example to test memory usage.

In the tests, I found that even when I read only a small field (address, which contains just the string "Address"), the total memory usage of the test program below is 512 MB on my machine (the capnp database is 2.1 GB). I am wondering what I am doing wrong.

Note: I run the program only in "read" mode. I ran "write" once to create the capnp database.

If you have any thoughts, I would be very happy to hear them.

Proto file
----------------------------------------------------------------------------------------------
@0xa5af5d9c9e54c04a;
struct Person {
  name @0 :Text;
  id @1 :UInt32;
  email @2 :Text;
  address @3 :Text;
}
struct AddressBook {
  people @0 :List(Person);
}
----------------------------------------------------------------------------------------------
Source code of example
----------------------------------------------------------------------------------------------
#include "test.capnp.h"
#include <capnp/message.h>
#include <capnp/serialize-packed.h>
#include <capnp/serialize.h>
#include <cstring>
#include <iostream>
#include <stdexcept>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>

void writeAddressBook(int fd)
{
    constexpr size_t NodeNumber = 1024 * 8;
    ::capnp::MallocMessageBuilder message;
    AddressBook::Builder addressBook = message.initRoot<AddressBook>();
    ::capnp::List<Person>::Builder people = addressBook.initPeople(NodeNumber);

    // Each string will be 128KB.
    constexpr size_t size = 1024 * 128;
    const std::string payload(size, 'A');
    for (unsigned int i = 0; i < NodeNumber; i++)
    {
        Person::Builder person = people[i];
        person.setId(i);
        person.setName(payload.c_str());
        person.setEmail(payload.c_str());
        person.setAddress("Address");
    }

    // Serialize into memory, then write the whole buffer, retrying on short writes.
    kj::VectorOutputStream output;
    capnp::writeMessage(output, message);
    auto serializedData = output.getArray();
    const char *dataPtr = reinterpret_cast<const char *>(serializedData.begin());
    size_t dataSize = serializedData.size();
    size_t totalBytesWritten = 0;
    while (totalBytesWritten < dataSize)
    {
        auto numberOfBytesWritten = write(fd, dataPtr + totalBytesWritten, dataSize - totalBytesWritten);
        if (numberOfBytesWritten == -1)
        {
            throw std::runtime_error{"Error during creating capnp database"};
        }
        totalBytesWritten += numberOfBytesWritten;
    }
}

void readAddressBook(int fd)
{
    struct stat st;
    if (fstat(fd, &st) == -1)
    {
        throw std::runtime_error{"Error during reading capnp database size"};
    }
    size_t fileSize = st.st_size;

    void *mapping = mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, fd, 0);
    if (mapping == MAP_FAILED)
    {
        throw std::runtime_error{"Error during mapping capnp database"};
    }
    char *mappedData = static_cast<char *>(mapping);

    {
        // The reader points into the mapped memory, so it must go out of scope before munmap.
        capnp::FlatArrayMessageReader reader(kj::ArrayPtr<const capnp::word>(reinterpret_cast<const capnp::word *>(mappedData), fileSize / sizeof(capnp::word)));
        AddressBook::Reader addressBook = reader.getRoot<AddressBook>();
        for (Person::Reader person : addressBook.getPeople())
        {
            person.getId();
        }
    }
    munmap(mappedData, fileSize);
    close(fd);
}

int main(int argc, char **argv)
{
    if (argc < 2)
    {
        std::cerr << "Usage: " << argv[0] << " --write | --read" << std::endl;
        return 1;
    }
    int fd = open("./data.bin", O_RDWR);
    if (fd == -1)
    {
        std::cerr << "Cannot open ./data.bin" << std::endl;
        return 1;
    }
    if (!std::strcmp(argv[1], "--write"))
    {
        writeAddressBook(fd);
    }
    if (!std::strcmp(argv[1], "--read"))
    {
        readAddressBook(fd);
    }
    return 0;
}
----------------------------------------------------------------------------------------------
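
One way to see how much of the mapping the read loop actually pulls into memory is to ask the kernel which pages are resident. This is a diagnostic sketch, not an explanation of the 512 MB figure. The helper name `countResidentPages` is hypothetical and not part of the program above, and the code assumes Linux, where `mincore` reports per-page residency. As far as I understand, for a file-backed mapping this reflects pages held in the page cache, so it can include pages left over from an earlier run.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts how many pages of [addr, addr + length)
// are currently resident in RAM. Returns -1 if mincore fails.
// addr must be page-aligned, which holds for any pointer returned by mmap.
long countResidentPages(void* addr, std::size_t length) {
    const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    assert(reinterpret_cast<std::uintptr_t>(addr) % pageSize == 0);

    // mincore fills one byte per page, rounding the length up to whole pages.
    const std::size_t pageCount = (length + pageSize - 1) / pageSize;
    std::vector<unsigned char> residency(pageCount);
    if (mincore(addr, length, residency.data()) != 0) {
        return -1;
    }

    long resident = 0;
    for (unsigned char flags : residency) {
        resident += flags & 1;  // the lowest bit is set when the page is resident
    }
    return resident;
}
```

In `readAddressBook`, calling `countResidentPages(mappedData, fileSize)` just before `munmap` and multiplying the result by the page size would give a byte figure to compare against the reported usage.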
To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/a3192b90-a8bf-4151-84e8-0b8516d8f71bn%40googlegroups.com.
Hi, thanks for your reply.
I really appreciate your work on this library.
I used the Linux /bin/time utility, and I saw the same result with another memory analyzer.
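
For reference, the "maximum resident set size" that GNU time prints comes, as far as I know, from the same `ru_maxrss` counter that `getrusage` exposes, so the figure can also be sampled from inside the program. A minimal sketch, assuming Linux (where `ru_maxrss` is in kilobytes); the function name `peakRssKb` is hypothetical:

```cpp
#include <sys/resource.h>

// Hypothetical helper: peak resident set size of the calling process so far.
// On Linux the unit of ru_maxrss is kilobytes. Returns -1 if getrusage fails.
long peakRssKb() {
    struct rusage usage{};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}
```

Sampling this before `mmap` and again after the read loop would show how much of the peak is attributable to touching the mapping, since touched pages of a file-backed mapping count toward the resident set.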
As I mentioned, the capnp database can be very large, so my aim is to reduce memory usage when reading from it. When I read small portions of the database, I do not want my program to consume much memory. The documentation suggests using mmap to achieve this. Do you think the approach I implemented in my code is wrong for that purpose?
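
One variation that may be worth trying, purely as an assumption and not a confirmed cause, is to tell the kernel that the mapping will be accessed sparsely, since read-ahead around each touched page can bring in more of the file than the program reads. The sketch below uses `madvise` with `MADV_RANDOM`, which is only a hint. `MappedFile` and `mapFileForSparseReads` are hypothetical names introduced for illustration.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical helper type: a read-only view of a mapped file.
struct MappedFile {
    const char* data = nullptr;  // stays nullptr when mapping fails
    std::size_t size = 0;
};

// Maps the whole file read-only and hints that access will be random.
MappedFile mapFileForSparseReads(const char* path) {
    MappedFile result;
    const int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return result;
    }

    struct stat st{};
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        const std::size_t size = static_cast<std::size_t>(st.st_size);
        void* mapped = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (mapped != MAP_FAILED) {
            // Advisory only: the kernel may ignore it, so failure is not treated as an error.
            (void)madvise(mapped, size, MADV_RANDOM);
            result.data = static_cast<const char*>(mapped);
            result.size = size;
        }
    }

    close(fd);  // the mapping remains valid after the descriptor is closed
    return result;
}
```

The returned pointer can be handed to `capnp::FlatArrayMessageReader` the same way `mappedData` is in the original program, and the caller is responsible for calling `munmap` on it afterwards.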
Thanks
To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/e4783119-6a58-47d9-954b-5a5ba205b671n%40googlegroups.com.