Ruby byte access to disk sectors like dd does

Gary Hasson

Nov 26, 2009, 4:07:20 PM

I have been unable to find any reference to Ruby methods that provide
raw disk access the way dd does. I frequently use the Linux utility
program, dd, and would like to be able to do the same type of byte-level
and sector-level access to the hard drive with Ruby. (What I am looking
for is not file access.) Are there Ruby methods to provide this
capability (other than having Ruby access dd)?

Thanks,
Gary
--
Posted via http://www.ruby-forum.com/.

Bill Kelly

Nov 26, 2009, 10:28:23 PM

From: "Gary Hasson" <ga...@oax.net>

>
> I have been unable to find any reference to Ruby methods that provide
> raw disk access the way dd does. I frequently use the Linux utility
> program, dd, and would like to be able to do the same type of byte-level
> and sector-level access to the hard drive with Ruby. (What I am looking
> for is not file access.) Are there Ruby methods to provide this
> capability (other than having Ruby access dd)?

[i think my previous post didn't make it to the list]

dd if=/dev/hdb bs=512 count=1 | hexdump -C

...should be equivalent to...

ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' | hexdump -C

Regards,

Bill

Gary Wright

Nov 26, 2009, 10:46:13 PM

On Nov 26, 2009, at 4:07 PM, Gary Hasson wrote:

> I have been unable to find any reference to Ruby methods that provide
> raw disk access the way dd does. I frequently use the Linux utility
> program, dd, and would like to be able to do the same type of byte-level
> and sector-level access to the hard drive with Ruby. (What I am looking
> for is not file access.) Are there Ruby methods to provide this
> capability (other than having Ruby access dd)?

dd doesn't ever have 'raw disk access' or 'sector' level access to a disk drive so I'm not sure how to respond.

A raw disk device just presents the underlying drive as a file. You use seek, read, and write to accomplish I/O, but it is generally a mistake to think that you are accessing particular 'sectors' when you do this. Modern disk drives have complex geometries that aren't visible unless you are working at the device driver level.

To utilize 'raw' I/O with Ruby look at IO#sysread, IO#syswrite, and IO#sysseek. This would be analogous to a C program utilizing the similarly named system calls. You'll also want to open the device file in binary mode and also be sure that you don't use any of the buffered I/O methods (i.e. don't mix and match).

Gary Wright
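
A minimal sketch of the sysseek/sysread approach described above. The method name read_sectors and the 512-byte default are assumptions made for illustration, not something from the thread, and the "sector" here is only a byte offset into the device file, as the post points out. It is safest to try it on a disk image or an ordinary file first.

```ruby
# Read `count` blocks of `sector_size` bytes starting at block `first_sector`,
# using only the unbuffered sys* methods (no mixing with buffered reads).
# The name and the 512-byte default are illustrative assumptions.
def read_sectors(path, first_sector, count = 1, sector_size = 512)
  File.open(path, "rb") do |io|
    # Position the file descriptor at an absolute byte offset.
    io.sysseek(first_sector * sector_size, IO::SEEK_SET)
    # sysread may return fewer bytes than requested and raises
    # EOFError when called at end of file.
    io.sysread(count * sector_size)
  end
end
```

Called as read_sectors("/dev/sdb", 0), this would roughly correspond to dd if=/dev/sdb bs=512 count=1, given read permission on the device.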

Eleanor McHugh

Nov 27, 2009, 6:42:38 AM

For a gentle intro to the more general topic of using Ruby for systems programming on a Unix box, take a look at the Ruby Plumber's Guide presentations linked from my signature. Also buy yourself a copy of Marc Rochkind's "Advanced Unix Programming" which is as light a read as is possible given the nature of the subject matter ;)

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
----
raise ArgumentError unless @reality.responds_to? :reason

Robert Klemme

Nov 27, 2009, 9:41:25 AM

2009/11/27 Bill Kelly <bi...@cts.com>:

>
> From: "Gary Hasson" <ga...@oax.net>
>>
>> I have been unable to find any reference to Ruby methods that provide
>> raw disk access the way dd does. I frequently use the Linux utility
>> program, dd, and would like to be able to do the same type of byte-level
>> and sector-level access to the hard drive with Ruby. (What I am looking
>> for is not file access.) Are there Ruby methods to provide this
>> capability (other than having Ruby access dd)?
>
> [i think my previous post didn't make it to the list]
>
> dd if=/dev/hdb bs=512 count=1 | hexdump -C
>
> ...should be equivalent to...
>
> ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' | hexdump -C

Just a small remark: when using #read for reading I would prefer to
also use #write for writing. It is more symmetric, and it may actually
be that #print does a bit more in light of i18n support in Ruby 1.9.
So that would be

ruby -e 'File.open("/dev/hdb", "rb") {|io| $stdout.write(io.read(512))}' | hexdump -C

The more fun solution would of course be to code the hex dumping in
Ruby also. :-)

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Marc Heiler

Nov 27, 2009, 5:05:30 PM

> The more fun solution would of course be to code the hex dumping in
> Ruby also. :-)

Hex Dumping in Ruby! I can hear a code puzzle coming up already!

Bill Kelly

Nov 27, 2009, 7:26:36 PM

From: "Robert Klemme" <short...@googlemail.com>
>
> 2009/11/27 Bill Kelly <bi...@cts.com>:

>>
>> ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' | hexdump -C
>
> Just a small remark: when using #read for reading I would prefer to
> also use #write for writing. It is more symmetric, and it may actually
> be that #print does a bit more in light of i18n support in Ruby 1.9.
> So that would be
>
> ruby -e 'File.open("/dev/hdb", "rb") {|io|
> $stdout.write(io.read(512))}' | hexdump -C

Hmm. From my point of view, symmetry-for-symmetry's sake can't
generally apply here, because, for example, I would expect the
following to be bad form:

ruby -e 'File.open("/dev/hdb", "rb") {|io| $stdout.syswrite(io.sysread(512))}' | hexdump -C

I.e., since 'io' and stdout are separate streams, I may choose
to call *only* 'sys' methods on 'io', but I can't in turn
reasonably expect to call corresponding 'sys' methods on the
(presumably buffered) $stdout ...

However, your observation did make me curious as to how different
'print' might be from 'write' vis-a-vis m17n on ruby 1.9.x.

In the 1.9.2dev sources, it appears 'write' and 'print' are
nearly identical (rb_io_print calls rb_io_write), except that
print appends the "output record separator" string ( $\ ) after
its last argument, if it is non-nil.

So if we wish to avoid the possibility of $\ contaminating our
output, we indeed should call write instead of print.
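
A small sketch of that difference, using StringIO as a stand-in for $stdout. The helper name emit_with is made up for this illustration. It sets $\ temporarily and restores it afterwards, since the variable is global.

```ruby
require "stringio"

# Send `data` to a fresh StringIO via the given method (:write or :print)
# while the output record separator $\ is set to `separator`.
# emit_with is a hypothetical helper, not part of Ruby.
def emit_with(method_name, data, separator)
  saved = $\
  out = StringIO.new
  $\ = separator
  out.public_send(method_name, data)
  out.string
ensure
  # Always restore the global so later output is unaffected.
  $\ = saved
end
```

With a separator of "!", emit_with(:write, "AB", "!") yields the two bytes unchanged, while emit_with(:print, "AB", "!") has the separator appended, which is exactly the contamination to avoid when piping binary data.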

> The more fun solution would of course be to code the hex dumping in
> Ruby also. :-)

:) Here's a stab at it... (it does work properly if the last
read returns fewer than 16 bytes.)

ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' |
ruby -e 'i=0; while(x=ARGF.read(16)); puts("%08x %-32s %s" % [i, x.unpack("H*"), x.tr("^\040-\176",".")]); i+=16; end'

Regards,

Bill

Robert Klemme

Nov 28, 2009, 11:04:04 AM

On 11/28/2009 01:26 AM, Bill Kelly wrote:

>> The more fun solution would of course be to code the hex dumping in
>> Ruby also. :-)
>
> :) Here's a stab at it... (it does work properly if the last
> read returns fewer than 16 bytes.)

What makes it stop working properly if the file size is a multiple of 16?

> ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' |
> ruby -e 'i=0; while(x=ARGF.read(16)); puts("%08x %-32s %s" % [i, x.unpack("H*"), x.tr("^\040-\176",".")]); i+=16; end'

Why do you separate this in two processes? I've munged it a bit...

ruby -e 'File.open("/dev/hdb", "rb"){|io|i=0;while(l=io.read(16));
printf("%06x %-32s %s\n",i,l.unpack("H*").first,l.tr("^\040-\176","."));
i+=l.bytesize end}'

Cheers
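
For readers who find the one-liners hard to follow, here is the same idea spelled out as a method. This is only a sketch: the name hexdump, the out parameter, and the column layout are assumptions for illustration and do not match "hexdump -C" exactly.

```ruby
# Print `io` as lines of: offset, hex bytes, printable-ASCII rendering.
# Returns the total number of bytes dumped. Names and layout are illustrative.
def hexdump(io, out = $stdout, width = 16)
  offset = 0
  # IO#read(n) returns nil at end of file, which ends the loop.
  while (chunk = io.read(width))
    hex = chunk.unpack("H*").first            # bytes as one hex string
    text = chunk.tr("^\x20-\x7e", ".")        # non-printable bytes become "."
    out.write(format("%08x  %-#{width * 2}s  %s\n", offset, hex, text))
    offset += chunk.bytesize                  # last chunk may be short
  end
  offset
end
```

Usage would be along the lines of File.open("/dev/hdb", "rb") { |io| hexdump(io) }, or hexdump($stdin) at the end of a pipe.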

Bill Kelly

Nov 28, 2009, 1:12:48 PM

From: "Robert Klemme" <short...@googlemail.com>

> On 11/28/2009 01:26 AM, Bill Kelly wrote:
>>
>> :) Here's a stab at it... (it does work properly if the last
>> read returns fewer than 16 bytes.)
>
> What makes it stop working properly if the file size is a multiple of 16?

It works OK in either case, is what I was trying to say. :)

>> ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' |
>> ruby -e 'i=0; while(x=ARGF.read(16)); puts("%08x %-32s %s" % [i, x.unpack("H*"), x.tr("^\040-\176",".")]); i+=16; end'
>
> Why do you separate this in two processes? I've munged it a bit...

Same reason dd and hexdump are separate programs.

http://www.faqs.org/docs/artu/ch01s06.html

> ruby -e 'File.open("/dev/hdb", "rb"){|io|i=0;while(l=io.read(16));
> printf("%06x %-32s %s\n",i,l.unpack("H*").first,l.tr("^\040-\176","."));
> i+=l.bytesize end}'
>
>
> Cheers
>
> robert

Regards,

Bill

Gary Hasson

Nov 28, 2009, 7:25:26 PM

Thank you all for your guidance. I am now able to do with Ruby what I
previously needed dd for.

However, I am not able to cleanly test for the end of file due to the
following error:

until file_1.eof?
file_1_byte = file_1.sysread( 1 )
end

ERROR MSG:
./Ruby_file_test.rb:23:in `sysread': sysread for buffered IO (IOError)
        from ./Ruby_file_test.rb:23

All of my file accesses are sysread, syswrite, and sysseek, which are
not compatible with IO#eof.

Using the file size to prevent hitting the EOF does not work for
/dev/sdb.

So, I just let it generate an EOF error and then rescue it.

It seems like there ought to be a working EOF function for IO#sys...
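
The rescue-on-EOF approach described above might look something like this sketch. The method name each_raw_chunk and the 4096-byte default are assumptions for illustration. It never calls eof? (which goes through the buffered layer) and instead treats the EOFError raised by sysread as the end-of-data signal.

```ruby
# Yield successive chunks of a file or device using only sysread.
# each_raw_chunk and its default chunk size are illustrative choices.
def each_raw_chunk(path, chunk_size = 4096)
  File.open(path, "rb") do |io|
    loop do
      begin
        chunk = io.sysread(chunk_size)
      rescue EOFError
        # sysread signals end of file by raising, not by returning nil.
        break
      end
      yield chunk
    end
  end
end
```

For example, each_raw_chunk("/dev/sdb") { |chunk| ... } would walk the device without ever needing to know its size in advance.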

Daniel Berger

Nov 28, 2009, 7:48:23 PM

On Nov 26, 2:07 pm, Gary Hasson <g...@oax.net> wrote:
> I have been unable to find any reference to Ruby methods that provide
> raw disk access the way dd does. I frequently use the Linux utility
> program, dd, and would like to be able to do the same type of byte-level
> and sector-level access to the hard drive with Ruby. (What I am looking
> for is not file access.) Are there Ruby methods to provide this
> capability (other than having Ruby access dd)?

gem install sys-filesystem

Regards,

Dan

Robert Klemme

Nov 29, 2009, 7:07:18 AM

The question is why you use #sysread in the first place. As far as I can
see, there is no advantage to using #sysread here.

Also, reading a single byte at a time might be easier for the
implementation, but it is almost inevitably dramatically slower than
reading in chunks. *If* you use #sysread you should definitely read
in chunks, because #sysread does not buffer IIRC.

Kind regards

Robert Klemme

Nov 29, 2009, 7:57:13 AM

On 28.11.2009 19:12, Bill Kelly wrote:
> From: "Robert Klemme" <short...@googlemail.com>
>> On 11/28/2009 01:26 AM, Bill Kelly wrote:
>>> :) Here's a stab at it... (it does work properly if the last
>>> read returns fewer than 16 bytes.)
>> What makes it stop working properly if the file size is a multiple of 16?
>
> It works OK in either case, is what I was trying to say. :)

Ah, OK. Thanks for the clarification!

>>> ruby -e 'File.open("/dev/hdb", "rb") {|io| print(io.read(512))}' |
>>> ruby -e 'i=0; while(x=ARGF.read(16)); puts("%08x %-32s %s" % [i, x.unpack("H*"), x.tr("^\040-\176",".")]); i+=16; end'
>> Why do you separate this in two processes? I've munged it a bit...
>
> Same reason dd and hexdump are separate programs.
>
> http://www.faqs.org/docs/artu/ch01s06.html

Well, but in light of that, neither a Ruby replacement for dd nor one for
hexdump is needed, as they exist already and do their job properly - and
likely even more efficiently than Ruby. :-)

<quiz-idea>
Write a program that reads binary files and outputs them in a similar
format as "hexdump -C" integrating features of "dd": The program needs
to be able to understand a series of paired options "--start X --length
Y" and "--start X --end Z" (X, Y, Z of course being integer numbers with
obvious properties) which denote the regions of the stream to read. The
program must be capable of reading from a pipe as well! Bonus points
for accepting parameter "--block-size B" which determines the size of
chunks read in one read operation. You may also invent any number of
additional options which control formatting of the output. Source code
must be shorter than 20 lines of code per additional option. ;-)
</quiz-idea>

Kind regards
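
One piece of the quiz idea above, reading a "--start X --length Y" region from a stream that may be a pipe, could be sketched as follows. This is not a quiz solution: the name read_region and its parameters are assumptions for illustration, and option parsing and output formatting are left out. Because a pipe cannot seek, the sketch skips to the start offset by reading and discarding.

```ruby
# Return up to `length` bytes beginning `start` bytes into `io`.
# Works on non-seekable streams by reading and discarding the prefix.
# read_region and block_size are illustrative names.
def read_region(io, start, length, block_size = 4096)
  remaining = start
  while remaining > 0
    skipped = io.read([remaining, block_size].min)
    return "".b if skipped.nil?          # stream ended before the region
    remaining -= skipped.bytesize
  end

  data = "".b
  while data.bytesize < length
    chunk = io.read([length - data.bytesize, block_size].min)
    break if chunk.nil?                  # stream ended inside the region
    data << chunk
  end
  data
end
```

A "--start X --end Z" pair would reduce to the same call with a length of Z - X, assuming Z is an exclusive end offset.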

Rob Biedenharn

Nov 29, 2009, 4:29:02 PM

That sounds like quiz #171 - hexdump - from early this year.

http://rubyquiz.strd6.com/quizzes/171

-Rob

Rob Biedenharn http://agileconsultingllc.com
R...@AgileConsultingLLC.com

Robert Klemme

Nov 30, 2009, 3:00:14 AM

2009/11/29 Rob Biedenharn <R...@agileconsultingllc.com>:

>
> On Nov 29, 2009, at 8:00 AM, Robert Klemme wrote:
>> <quiz-idea>
>> Write a program that reads binary files and outputs them in a similar
>> format as "hexdump -C" integrating features of "dd": The program needs to be
>> able to understand a series of paired options "--start X --length Y" and
>> "--start X --end Z" (X, Y, Z of course being integer numbers with obvious
>> properties) which denote the regions of the stream to read. The program
>> must be capable of reading from a pipe as well! Bonus points for accepting
>> parameter "--block-size B" which determines the size of chunks read in one
>> read operation. You may also invent any number of additional options which
>> control formatting of the output. Source code must be shorter than 20 lines
>> of code per additional option. ;-)
>> </quiz-idea>

> That sounds like quiz #171 - hexdump - from early this year.
>
> http://rubyquiz.strd6.com/quizzes/171

I swear I wasn't aware of that. But the idea isn't too esoteric so it
comes as no surprise that several people have had it. There is at
least a small difference in that I included the "dd" part which does
not seem to be present in #171.

Gary Hasson

Nov 30, 2009, 12:16:15 PM

My reason for wanting to use Ruby in place of dd is that I wanted to be
able to do whatever I wanted to the hard drive from a wxRuby GUI. I have
several Bash scripts that do things that I would prefer to use a GUI
for. Plus, I wanted to see just how well Ruby could do low-level tasks.

I used sysread and syswrite because, as the manual states: "syswrite is
a low-level, unbuffered, nontranscoding version of write". If I am going
to modify the partition table on the hard drive, I want to be sure that
reads and writes are doing what I tell them to do.

As for writing only one byte in my example, it was simply a test
example. If I could correctly write one byte to the unpartitioned area
of my hard drive, then I knew that Ruby could do what I wanted.
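
A hedged sketch of the kind of test write described above, using only sysseek and syswrite. The method name write_bytes_at is made up for this illustration. It is best tried on a scratch file or a disk image, since pointing it at a real device overwrites data in place.

```ruby
# Overwrite bytes at an absolute offset without truncating the target.
# write_bytes_at is an illustrative name; returns the number of bytes written.
def write_bytes_at(path, offset, data)
  bytes = data.b                          # treat the payload as raw bytes
  File.open(path, "r+b") do |io|          # "r+" keeps existing contents
    io.sysseek(offset, IO::SEEK_SET)
    written = 0
    # syswrite may write fewer bytes than asked, so loop until done.
    while written < bytes.bytesize
      written += io.syswrite(bytes.byteslice(written, bytes.bytesize - written))
    end
    io.fsync                              # ask the OS to flush to the device
    written
  end
end
```

For example, write_bytes_at("disk.img", 3, "AB") would replace two bytes starting at offset 3 of a hypothetical image file and leave the rest untouched.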