Importing blk*.dat files


Nishil Shah

Feb 18, 2017, 11:51:50 PM
to bitcoinj
Is there any way to import blk?????.dat files individually and parse them into Block objects? I'm not trying to use any network connections. I want to read everything from disk, if possible, to make my processing as fast as possible, because I want to parse many files in parallel. I've read about a "block importer" tool but can't really see how to use it in my situation. Thanks.

Jameson Lopp

Feb 19, 2017, 1:33:49 AM
to bitc...@googlegroups.com
I think what you want is the BlockFileLoader, as seen here in the BlockImporter: https://github.com/bitcoinj/bitcoinj/blob/master/tools/src/main/java/org/bitcoinj/tools/BlockImporter.java#L67

- Jameson


--
You received this message because you are subscribed to the Google Groups "bitcoinj" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoinj+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nishil Shah

Feb 19, 2017, 1:54:17 AM
to bitc...@googlegroups.com
Yes, I mentioned that in my post. However, which options would I use in this case? And can I invoke this from Java code?

Thanks again.


Nishil Shah

Feb 19, 2017, 1:58:38 AM
to bitc...@googlegroups.com
What exactly is getReferenceClientList?


Nishil Shah

Feb 19, 2017, 2:00:40 AM
to bitc...@googlegroups.com
It doesn't take in a file name?


Jameson Lopp

Feb 19, 2017, 4:25:24 AM
to bitc...@googlegroups.com
Right, getReferenceClientList just tries to be smart and figure out the location of your block files automatically. If it doesn't work for your system, you can model its logic to create your own function that builds a list of your block files.

For your purposes, I'm assuming you're analyzing Bitcoin on mainnet, so you'd pass MainNetParams.get(). The only other thing for you to figure out then would be your data storage, which would be your "chain" object or whatever function to which you pass the blocks to be processed.

- Jameson
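
For readers who need to build the file list themselves, here is a minimal sketch of the idea described above. It assumes the block files are numbered consecutively (blk00000.dat, blk00001.dat, ...) in a single local directory and stops at the first number that is missing. The class name, method name, and directory argument are hypothetical, and it uses only the JDK; the resulting List<File> is what would be passed to the loader together with MainNetParams.get().

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of bitcoinj.
class BlockFileLister {
    /** Collects blkNNNNN.dat files from blocksDir, stopping at the first gap. */
    static List<File> listBlockFiles(File blocksDir) {
        List<File> files = new ArrayList<>();
        for (int i = 0; ; i++) {
            // Locale.US keeps the digits ASCII regardless of the default locale.
            File f = new File(blocksDir, String.format(Locale.US, "blk%05d.dat", i));
            if (!f.exists()) {
                break; // assumption: no files exist past the first missing number
            }
            files.add(f);
        }
        return files;
    }
}
```

An empty result here means the directory is wrong or not visible to java.io.File, which would also explain a loader that has nothing to iterate over.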


Nishil Shah

Feb 20, 2017, 4:19:43 PM
to bitc...@googlegroups.com
I see, thanks. One last question: shouldn't I be able to simply build a List<File> and add the paths to my block files, e.g. "/path/to/blk00000.dat"? I tried this, and when I call blockFileLoader.next() I get an error, so it doesn't seem to work.

Nishil Shah

Feb 20, 2017, 4:20:07 PM
to bitc...@googlegroups.com
I would use that List to initialize the block file loader.

Jameson Lopp

Feb 21, 2017, 11:31:53 AM
to bitc...@googlegroups.com
You should definitely be able to build your own List<File> and pass it to the loader - I've done this myself in the past.

What error are you seeing?

Nishil Shah

Feb 22, 2017, 1:53:29 PM
to bitc...@googlegroups.com
next() just complains that there is no next item in the iterator. Does this mean the I/O failed? According to the docs, the BlockFileLoader swallows I/O exceptions, so I can't see the I/O error.

Nishil Shah

Feb 22, 2017, 3:14:56 PM
to bitc...@googlegroups.com
Ah, I can't check right now, but I remember I was using new MainNetParams() as opposed to MainNetParams.get(). Could that be the difference?

Jameson Lopp

Feb 22, 2017, 5:49:08 PM
to bitc...@googlegroups.com
That's quite odd; it sounds like you didn't actually add the File(s) to the List. Did you do a .exists() check on them like it does here? https://github.com/bitcoinj/bitcoinj/blob/3177bd52a2bfa491c5902e95b8840030e1a31159/core/src/main/java/org/bitcoinj/utils/BlockFileLoader.java#L67

I don't think the different styles of getting the params should matter. Feel free to share your code.
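
One way to narrow this down is to check the list before handing it to the loader. The sketch below uses only the JDK; the class and method names are hypothetical. It returns every entry that is not a readable regular file. Note that java.io.File only sees the local filesystem, so a path that lives on some other storage layer will be reported here even if the data exists there.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of bitcoinj.
class BlockFileCheck {
    /** Returns the entries that are not readable regular files on the local filesystem. */
    static List<File> unreadable(List<File> files) {
        List<File> bad = new ArrayList<>();
        for (File f : files) {
            // isFile() is false for missing paths and for directories.
            if (!f.isFile() || !f.canRead()) {
                bad.add(f);
            }
        }
        return bad;
    }
}
```

If this returns a non-empty list, the problem is with the paths rather than with the loader or the network parameters.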

Nishil Shah

Feb 25, 2017, 5:20:46 PM
to bitc...@googlegroups.com
Figured it out. I'm using Hadoop and the files are stored on the distributed file system, so I need to find a workaround. If I read all the bytes from a file, can I construct a block from them by passing them to the Block constructor? What exactly is the "payloadBytes" parameter?

Thanks.

Jameson Lopp

Feb 25, 2017, 5:34:29 PM
to bitc...@googlegroups.com
Aha, if you can somehow get access to the bytes then you should be able to duplicate the byte logic here: https://github.com/bitcoinj/bitcoinj/blob/3177bd52a2bfa491c5902e95b8840030e1a31159/core/src/main/java/org/bitcoinj/utils/BlockFileLoader.java#L129

Basically, you need to do some low level byte parsing so that you can eventually read in what the size of the next Block is, then you can take the next N bytes and pass them to the deserializer in order to actually construct the Block object: https://github.com/bitcoinj/bitcoinj/blob/3177bd52a2bfa491c5902e95b8840030e1a31159/core/src/main/java/org/bitcoinj/utils/BlockFileLoader.java#L152

- Jameson
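
A minimal sketch of that framing step, working on a byte[] instead of a stream. It assumes the usual blk file record layout: 4 network magic bytes (f9 be b4 d9 on mainnet), a 4-byte little-endian payload length, then the serialized block. The class and method names are hypothetical, and the sketch deliberately stops short of calling bitcoinj: each returned array is the payload that would then be handed to the block deserializer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of bitcoinj.
class BlkFrames {
    // Bitcoin mainnet message-start bytes that precede each record.
    static final byte[] MAINNET_MAGIC = {(byte) 0xF9, (byte) 0xBE, (byte) 0xB4, (byte) 0xD9};

    /** Splits the raw contents of one blk file into serialized block payloads. */
    static List<byte[]> split(byte[] data) {
        List<byte[]> blocks = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 0;
        while (pos + 8 <= data.length) {
            if (!magicAt(data, pos)) {
                pos++; // skip padding until the next magic sequence
                continue;
            }
            // Absolute read of the length field, widened to an unsigned value.
            long size = Integer.toUnsignedLong(buf.getInt(pos + 4));
            long end = pos + 8L + size;
            if (end > data.length) {
                break; // truncated record at the end of the file
            }
            blocks.add(Arrays.copyOfRange(data, pos + 8, (int) end));
            pos = (int) end;
        }
        return blocks;
    }

    private static boolean magicAt(byte[] data, int pos) {
        for (int k = 0; k < MAINNET_MAGIC.length; k++) {
            if (data[pos + k] != MAINNET_MAGIC[k]) {
                return false;
            }
        }
        return true;
    }
}
```

Comparing byte against byte for the magic and letting ByteBuffer decode the length avoids hand-written shifts on signed bytes, which is an easy place to introduce a negative size.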

Nishil Shah

Feb 26, 2017, 3:00:17 PM
to bitc...@googlegroups.com
Oh, that's perfect. I'm very, very close. Do you mind taking a look at a short snippet of my code that I put into this test file:


So, I replicated most of the logic found in BlockFileLoader.java. Rather than reading from an input stream, I am trying to keep an index and iterate through the actual bytes themselves.

Line 66 yields the error: NegativeArraySizeException at org.bitcoinj.core.Message.readBytes. I'm guessing that my index is wrong at some point. Can't spot the bug, however.

Keep in mind, I want to build from the bytes array, because in another situation I do not want to use the FileInputStream.

Any ideas? Thank you so much. 

Jameson Lopp

Feb 26, 2017, 4:41:54 PM
to bitc...@googlegroups.com
The logic looks correct to me; it blows up on the very first block? Have you tried any other .dat files?

I hope that this import tool hasn't become obsolete due to any changes in how bitcoind stores blocks on disk...

Nishil Shah

Feb 28, 2017, 8:57:35 PM
to bitc...@googlegroups.com
Got it! I had to apply a mask of 0xff to each byte: inputStream.read() returns an int in the range 0 to 255, whereas the values I was reading from the byte[] array are signed, so any byte of 0x80 or above came out negative.
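
A small sketch of why the mask matters when decoding the 4-byte little-endian length from a byte[]. The class and method names are hypothetical, and the unmasked variant is included only to show the failure mode.

```java
// Hypothetical demonstration class, not part of bitcoinj.
class LittleEndian {
    /** Buggy: a byte of 0x80 or above is sign-extended to a negative int before the OR. */
    static int readUint32Unmasked(byte[] b, int off) {
        return b[off] | (b[off + 1] << 8) | (b[off + 2] << 16) | (b[off + 3] << 24);
    }

    /** Correct: masking with 0xff turns each signed byte into its unsigned 0..255 value. */
    static long readUint32(byte[] b, int off) {
        return (b[off] & 0xffL)
                | ((b[off + 1] & 0xffL) << 8)
                | ((b[off + 2] & 0xffL) << 16)
                | ((b[off + 3] & 0xffL) << 24);
    }
}
```

For the bytes e8 03 00 00 (1000 in little-endian), readUint32Unmasked returns -24 and readUint32 returns 1000. A negative size like the former, used as an array length, is the kind of value that produces a NegativeArraySizeException.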

Nishil Shah

Feb 28, 2017, 8:57:41 PM
to bitc...@googlegroups.com
Thank you for all your help.