capnp::MallocMessageBuilder messageBuilder;
// What is a good size for our words? As long as it's smaller?
capnp::word scratch[1024];
kj::ArrayPtr<capnp::word> scratchSpace(scratch);
kj::FdInputStream stream(fd);
kj::BufferedInputStreamWrapper buff(stream);
unsigned long count = 0;
while (buff.tryGetReadBuffer().size() != 0) {
  capnp::InputStreamMessageReader message(buff, capnp::ReaderOptions(), scratchSpace);
  Message::Reader chunk = message.getRoot<Message>();
  count++;
}
I tried the helper methods first, but they seem too slow, I think because they don't use the BufferedInputStreamWrapper.
I don't do much C++, so I appreciate any help :)
Btw, is there an IRC channel or something?
// Map the whole file read-only.
struct stat stats;
KJ_SYSCALL(fstat(fd, &stats));
size_t size = stats.st_size;
void* data = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
if (data == MAP_FAILED) {
  KJ_FAIL_SYSCALL("mmap", errno);
}
KJ_DEFER(KJ_SYSCALL(munmap(data, size)) { break; });
KJ_SYSCALL(madvise(data, size, MADV_SEQUENTIAL));

// Read messages back to back directly out of the mapping.
kj::ArrayPtr<const capnp::word> words(
    reinterpret_cast<const capnp::word*>(data), size / sizeof(capnp::word));
while (words.size() > 0) {
  capnp::FlatArrayMessageReader message(words);
  Message::Reader chunk = message.getRoot<Message>();
  count++;
  // getEnd() points just past this message, i.e. at the start of the next one.
  words = kj::arrayPtr(message.getEnd(), words.end());
}
--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/capnproto.
Is mmap the only way to seek to an arbitrary offset in the file?
I can't seem to find a way to do that with kj::FdInputStream.
I'm trying to create an index of the elements in the file.
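For the index itself, one possible approach is to walk the mapped bytes and record where each message begins. This is only a sketch and not something the capnp library provides: it assumes the file holds unpacked messages in the standard stream framing (a little-endian segment table followed by the segments), and `indexMessages` and `readLe32` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads a little-endian uint32 from possibly unaligned bytes.
static uint32_t readLe32(const unsigned char* p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Hypothetical helper: returns the byte offset at which each message starts.
// Assumes unpacked messages written back to back in the stream framing:
//   4 bytes      number of segments minus one
//   4 bytes each size of every segment, in 8-byte words
//   padding      up to the next 8-byte boundary
//   segments     the segment contents, in order
std::vector<size_t> indexMessages(const unsigned char* data, size_t size) {
  assert(data != nullptr || size == 0);
  std::vector<size_t> offsets;
  size_t pos = 0;
  while (pos < size) {
    if (size - pos < 8) throw std::runtime_error("truncated segment table");
    uint64_t segmentCount = uint64_t(readLe32(data + pos)) + 1;
    // Round the segment table up to a multiple of 8 bytes.
    uint64_t headerBytes = (4 + 4 * segmentCount + 7) / 8 * 8;
    if (headerBytes > size - pos) throw std::runtime_error("truncated segment table");
    uint64_t bodyWords = 0;
    for (uint64_t i = 0; i < segmentCount; i++) {
      bodyWords += readLe32(data + pos + 4 + 4 * i);
    }
    uint64_t totalBytes = headerBytes + bodyWords * 8;
    if (totalBytes > size - pos) throw std::runtime_error("truncated message");
    offsets.push_back(pos);
    pos += static_cast<size_t>(totalBytes);
  }
  return offsets;
}
```

Each recorded offset is a multiple of 8, so the words starting at `data + offset` could later be handed to a capnp::FlatArrayMessageReader to read just that one message.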
while (buff.tryGetReadBuffer().size() != 0) {
  capnp::InputStreamMessageReader message(buff, capnp::ReaderOptions()); //, scratch);
  Message::Reader chunk = message.getRoot<Message>();
  capnp::MallocMessageBuilder messageBuilder;
  messageBuilder.setRoot(chunk);
  messages->at(index++) = messageBuilder.getRoot<Message>();
}
One more question =)
I need to copy the root from a FdStream to a vector.
Do I need to copy it into a MallocMessageBuilder?
auto messages = std::make_unique<std::deque<Message::Reader *> >(10);
while (words.size() > 0) {
  capnp::FlatArrayMessageReader * reader = new capnp::FlatArrayMessageReader(words);
  Message::Reader message = reader->getRoot<Message>();
  words = kj::arrayPtr(message->getEnd(), words.end());
  messages->at(index++) = & message;
}
All the items in my message array always seem to point to the last item read.
I'm not sure what I'm doing wrong here.
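A likely cause is that `& message` takes the address of a variable local to the loop body. That variable is re-created on every iteration, typically at the same stack address, so every stored pointer refers to the same slot and dangles once the loop ends. A minimal sketch of the ownership pattern that avoids this is below; `FakeMessageReader` and `readAll` are hypothetical stand-ins so the example compiles without capnp.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for capnp::FlatArrayMessageReader; it only records
// which record it was built from.
struct FakeMessageReader {
  explicit FakeMessageReader(int id) : id(id) {}
  int id;
};

// Allocates one reader per record on the heap and lets the vector own it,
// so every element is a distinct object that stays valid for as long as
// the vector lives. No pointer to a loop-local variable is ever stored.
std::vector<std::unique_ptr<FakeMessageReader>> readAll(int recordCount) {
  std::vector<std::unique_ptr<FakeMessageReader>> readers;
  for (int i = 0; i < recordCount; i++) {
    readers.push_back(std::make_unique<FakeMessageReader>(i));
  }
  return readers;
}
```

With the real types, the same idea would be to keep the heap-allocated FlatArrayMessageReader objects in the container and, if the roots are needed too, store each Message::Reader by value rather than by pointer. Note also that a deque constructed with 10 elements makes at() throw once index reaches 10, so push_back is probably the safer choice.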
On Thursday, July 20, 2017 at 4:35:29 PM UTC-7, Kenton Varda wrote:

On Thu, Jul 20, 2017 at 3:40 PM, Farid Zakaria <farid.m...@gmail.com> wrote:
Is MMAP the only way to randomly seek to an offset in the file? I can't seem to find a way to do that with kj::FdInputStream ? I'm trying to create an index of the elements in the file.

kj::InputStream doesn't assume the stream is seekable and doesn't track the current location. You could create a custom wrapper around InputStream or around BufferedInputStream that remembers how many bytes have been read. You can also lseek() the underlying fd directly, though of course you'll have to discard any buffers after that.

But indeed, if you use mmap() this will all be a lot easier, and faster. I highly recommend using mmap() here.

On Thu, Jul 20, 2017 at 4:14 PM, Farid Zakaria <farid.m...@gmail.com> wrote:
One more question =) I need to copy the root from a FdStream to a vector. Do I need to copy it into a MallocMessageBuilder ?

With InputStreamMessageReader, yes. You have to destroy the InputStreamMessageReader before you can read the next message, and that invalidates the root Reader and all other Readers pointing into it.

However, with the mmap strategy, you don't need to delete the FlatArrayMessageReader before reading the next message. So, you can allocate them on the heap and put them into your vector, and then all the Readers pointing into them remain valid, as long as the FlatArrayMessageReaders exist and the memory is still mapped. (In this case you should remove the madvise() line since you plan to go back and randomly access the data later.)

Again, I *highly* recommend this strategy instead of using a stream. With the mmap strategy, not only do you avoid copying into a builder, but you avoid copying the underlying data when you read it. The operating system causes the memory addresses to point directly at its in-memory cache of the file data. If multiple programs mmap() the same file, they share the memory, rather than creating their own copies. Moreover, the operating system is free to evict the data from memory and then load it again later on-demand. There are tons of advantages to this approach and it is exactly what Cap'n Proto is designed to enable.

-Kenton
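The byte-counting wrapper mentioned above might look roughly like this. To stay self-contained, this sketch wraps a raw file descriptor with POSIX read() and lseek() rather than kj::InputStream; `CountingFdReader` is a hypothetical name, not a kj class, and it does no buffering of its own.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unistd.h>

// Hypothetical wrapper that remembers how many bytes have been consumed,
// so the caller can record the offset at which each message starts.
class CountingFdReader {
public:
  explicit CountingFdReader(int fd) : fd_(fd) { assert(fd >= 0); }

  // Reads up to `size` bytes into `buffer`; returns the number of bytes
  // actually read (0 at end of file) and advances the tracked offset.
  size_t read(void* buffer, size_t size) {
    for (;;) {
      ssize_t n = ::read(fd_, buffer, size);
      if (n >= 0) {
        offset_ += static_cast<uint64_t>(n);
        return static_cast<size_t>(n);
      }
      if (errno != EINTR) throw std::runtime_error("read failed");
    }
  }

  // Byte offset of the next unread byte, relative to where reading began
  // (or to the start of the file after a seek()).
  uint64_t offset() const { return offset_; }

  // Jumps to an absolute offset; only works on seekable descriptors.
  void seek(uint64_t pos) {
    if (::lseek(fd_, static_cast<off_t>(pos), SEEK_SET) == static_cast<off_t>(-1)) {
      throw std::runtime_error("lseek failed");
    }
    offset_ = pos;
  }

private:
  int fd_;
  uint64_t offset_ = 0;
};
```

Any buffering layer placed on top of such a wrapper would have to be discarded after a seek(), as noted above, since its buffered bytes would no longer match the file position.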
Finally (sorry I keep sending separate messages): the reason I was looking for an FdInputStream solution is that it seems to be much faster than the mmap solution.
Although my file is quite large (10 GB), memory is not much of a concern.
How does one copy from an InputStreamMessageReader into a MallocMessageBuilder?