File > 2G on a dvd


Rhialto

Nov 28, 2004, 6:34:35 PM
to curren...@netbsd.org
I wanted to burn a file to dvd that is slightly over 2 gigabytes in
size. The cdrecord (cdrtools) in pkgsrc (version 2.00.3) refused to put
the file in the image; it claimed the file was too large.

mkisofs: Value too large to be stored in data type. File FOO is too large - ignoring

After googling for the message I followed a suggestion (
http://lantech.geekvenue.net/chucktips/jason/chuck/1077301682/index_html
) and retried with cdrtools 2.01 (after adapting the patches from pkgsrc
a bit). This indeed made an image without complaint, however the file
does not look good (NetBSD/alpha 1.6.2):

-r--r--r-- 1 root wheel 18446744071564173312 Oct 30 13:23 FOO

Also, when trying to look at the data, the file appears empty.

But when I put the DVD in my laptop (NetBSD/i386 2.0_BETA) it is OK:

-r--r--r-- 1 root wheel 2149588992 Oct 30 13:23 FOO

Note that the low 32 bits of 18446744071564173312 form 2149588992.
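
The relation between the two sizes can be checked with a little bit arithmetic. A minimal sketch in C follows; the helper names low32 and high32 are made up for illustration. The low half of the value seen on the Alpha equals the size seen on the i386, and the high half is all ones, which is the pattern that sign extension of a negative 32-bit number would leave behind.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit size. */
uint32_t low32(uint64_t v)
{
    return (uint32_t)(v & 0xffffffffULL);
}

/* Keep only the high 32 bits of a 64-bit size. */
uint32_t high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

For example, low32(18446744071564173312ULL) gives 2149588992 and high32() of the same value gives 0xffffffff.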

According to diff this file is identical to the original.

Mounting the dvd on the Alpha and nfs-mounting it on the i386 gives the
same behaviour as seen directly on the Alpha.

I see several things that could be going on here:

1a) some bug in cd9660 in 1.6.2 that was fixed in 2.0
1b) some bug in cd9660 that manifests itself only on 64-bit systems

and/or

2) some bug in mkisofs that still creates corrupt cds, perhaps only when
run on a 64-bit system

Does anybody know, at least about 1a or 1b?

-Olaf.
--
-- Ceterum censeo "authored[1]" delendum esse.
___ Olaf 'Rhialto' Seibert -- [1] Ugly English neologism[2].
\X/ rhialto/at/xs4all.nl -- [2] For lawyers whose English/Latin is below par.

Rhialto

Nov 28, 2004, 7:21:14 PM
to curren...@netbsd.org
On Mon 29 Nov 2004 at 00:08:07 +0100, Rhialto wrote:
> I see several things that could be going on here:
>
> 1a) some bug in cd9660 in 1.6.2 that was fixed in 2.0
> 1b) some bug in cd9660 that manifests itself only on 64-bit systems
>
> and/or
>
> 2) some bug in mkisofs that still creates corrupt cds, perhaps only when
> run on a 64-bit system
>
> Does anybody know, at least about 1a or 1b?

I tried running 'isoinfo' on the dvd (from the cdrecord pkg)

on Alpha:
-r--r--r-- 1 0 0 2149588992 Oct 30 2004 [1225476 00] FOO
on i386:
-r--r--r-- 1 0 0 -2145378304Oct 30 2004 [1225476] FOO

Since the Alpha does get the size right here, it seems the on-disk
value is probably OK, unless isoinfo itself always truncates the size to
the lowest 32 bits.

Frederick Bruckman

Nov 29, 2004, 2:47:39 AM
to Rhialto, curren...@netbsd.org
In article <20041129002...@azenomei.knuffel.net>,

rhi...@azenomei.knuffel.net (Rhialto) writes:
>
> on Alpha:
> -r--r--r-- 1 0 0 2149588992 Oct 30 2004 [1225476 00] FOO
> on i386:
> -r--r--r-- 1 0 0 -2145378304Oct 30 2004 [1225476] FOO
>
> since here the Alpha does get the size right it seems like the on-disk
> value is probably ok. Unless isoinfo itself always truncates the size to
> the lowest 32 bits.

ISO 9660 only permits a 32-bit number there. The field is indeed 64 bits
wide, but that's only because it's supposed to contain both the little-endian
and big-endian representations of the 32-bit data length, back-to-back
(or face-to-face, perhaps). If you put something else there, it's not an
ISO 9660 file system, and it's probably a bug that it works at all.
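
To make the layout concrete, here is a rough sketch in C of decoding such a both-endian field. It assumes the little-endian copy comes first and the big-endian copy second; the function name decode_both_endian32 and its error convention are invented for this example and are not taken from any existing implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode an 8-byte both-endian 32-bit field: bytes 0..3 hold the value
 * little-endian, bytes 4..7 hold the same value big-endian.
 * Returns 0 and stores the value if both halves agree, -1 otherwise.
 */
int decode_both_endian32(const unsigned char p[8], uint32_t *out)
{
    assert(p != NULL && out != NULL);

    uint32_t le = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                  ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    uint32_t be = ((uint32_t)p[4] << 24) | ((uint32_t)p[5] << 16) |
                  ((uint32_t)p[6] << 8) | (uint32_t)p[7];

    if (le != be)
        return -1;      /* the two copies disagree: not a valid field */
    *out = le;
    return 0;
}
```

A size of 2149588992 (0x80202000) would be stored as the bytes 00 20 20 80 80 20 20 00. Returning the value through an unsigned 32-bit type keeps sizes above 2 GB positive.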

Now, ISO 9660 (level 3) does have a mechanism for larger files, but it's
not widely supported. Simply put, consecutive directory entries with the
same name are to be taken for "file sections" of the same file. I recently
posted a patch to the cdrecord list to do the level 3 thing. Unfortunately,
NetBSD can't read such a file system correctly, but it just works with my
DVD-Video/MPEG4 hardware player, and it also happens to work with "isoinfo"
with a slight change (included):

http://lists.berlios.de/pipermail/cdrecord-developers/2004-November/002997.html


Frederick

Martin Husemann

Nov 29, 2004, 4:30:15 AM
to Rhialto, curren...@netbsd.org
On Mon, Nov 29, 2004 at 12:08:07AM +0100, Rhialto wrote:
> 1a) some bug in cd9660 in 1.6.2 that was fixed in 2.0

Yes, we changed the interpretation of the 32 bits for size from signed
to unsigned.

I think I even patched up one or two mkisofs/growisofs variants in pkgsrc.

Martin

Wolfgang Solfrank

Nov 29, 2004, 9:27:38 AM
to Frederick Bruckman, curren...@netbsd.org
Hi,

> Now, ISO 9660 (level 3) does have a mechanism for larger files, but it's
> not widely supported. Simply put, consecutive directory entries with the
> same name are to be taken for "file sections" of the same file. I recently
> posted a patch to the cdrecord list to do the level 3 thing. Unfortunately,

> NetBSD can't read such a file system correctly, ...

Well, I once was thinking about support for this in NetBSD's cd9660, but
discovered that there is no easy support for this, as the "file sections"
can be of arbitrary lengths, e.g. the first 3 sections could be 3 bytes each.
Which means that you cannot easily convert a file offset to a block number
and vice versa. However, the fs independent code in NetBSD assumes in various
places that it _can_ do this conversion without the help of fs dependent code.
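
A small sketch in C of why the conversion stops being a plain division: with sections of arbitrary length, finding the data for a file offset means walking the section list. The struct file_section, the function map_offset and the 2048-byte block size are assumptions made for this illustration, not NetBSD code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISO_BLOCK_SIZE 2048u

/* Hypothetical in-memory description of one "file section". */
struct file_section {
    uint32_t start_block;   /* first logical block of the section on disk */
    uint32_t length;        /* length in bytes, not necessarily a block multiple */
};

/*
 * Map a byte offset within the file to a byte offset on the disk by
 * walking the sections in directory order.
 * Returns 0 on success, -1 if the offset lies past the end of the file.
 */
int map_offset(const struct file_section *s, size_t n,
               uint64_t off, uint64_t *disk_off)
{
    assert(s != NULL && disk_off != NULL);

    for (size_t i = 0; i < n; i++) {
        if (off < s[i].length) {
            *disk_off = (uint64_t)s[i].start_block * ISO_BLOCK_SIZE + off;
            return 0;
        }
        off -= s[i].length;     /* skip this section entirely */
    }
    return -1;
}
```

With three sections of 3 bytes each, file offset 4 lies in the second section, whereas the naive calculation 4 / 2048 = 0 would look for it in the first block of the first section.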

Ciao,
Wolfgang
--
w...@TooLs.DE Wolfgang Solfrank, TooLs GmbH

Frederick Bruckman

Nov 29, 2004, 9:56:28 AM
to Wolfgang Solfrank, curren...@netbsd.org

I would think it would be much like appending a file, where the
subsequent bits may end up in a different cylinder group, but I don't
know my way around the file system code. [The level 3 file sections
can be anywhere on the disk, but they must be listed in order in the
directory, consecutively, and all but the last have the
"multi-section" bit set, all to make it practical for an
implementation to assemble a file from them, no doubt.]


Frederick

Wolfgang Solfrank

Nov 29, 2004, 11:43:54 AM
to Frederick Bruckman, curren...@netbsd.org
Hi,

>> Well, I once was thinking about support for this in NetBSD's cd9660, but
>> discovered that there is no easy support for this, as the "file sections"
>> can be of arbitrary lengths, e.g. the first 3 sections could be 3 bytes each.
>> Which means that you cannot easily convert a file offset to a block number
>> and vice versa. However, the fs independent code in NetBSD assumes in various
>> places that it _can_ do this conversion without the help of fs dependent code.
>
>
> I would think it would be much like appending a file, where the
> subsequent bits may end up in a different cylinder group, but I don't
> know my way around the file system code. [The level 3 file sections can
> be anywhere on the disk, but they must be listed in order in the
> directory, consecutively, and all but the last have the "multi-section"
> bit set, all to make it practical for an implementation to assemble a
> file from them, no doubt.]

No, this is not the same as files split across different cylinder groups,
as in that case, all parts are multiples of the block size. I.e., in that
case the conversion of file offsets to relative block numbers in the file
is trivial. See above about arbitrary lengths of the sections on ISO 9660.

Rhialto

Nov 29, 2004, 12:39:07 PM
to Wolfgang Solfrank, Frederick Bruckman, curren...@netbsd.org
On Mon 29 Nov 2004 at 14:56:54 +0100, Wolfgang Solfrank wrote:
> Well, I once was thinking about support for this in NetBSD's cd9660,

as was I, when I read about this "level 3" thing,

> but discovered that there is no easy support for this, as the
> "file sections" can be of arbitrary lengths, e.g. the first 3 sections
> could be 3 bytes each.
> Which means that you cannot easily convert a file offset to a block
> number and vice versa. However, the fs independent code in NetBSD
> assumes in various places that it _can_ do this conversion without the
> help of fs dependent code.

Not knowing the ins and outs of the filesystem code, I can't think why
any fs independent code would need a byte offset <-> relative block
number conversion. Any place where that is seriously used would be
inside the specific filesystem code itself? Perhaps it won't much matter
if the naively calculated relative block numbers are only imaginary.
And perhaps the issues are even smaller because cd9660 is a read-only
filesystem (although it would be nice if it could be implemented r/w,
but I'm not sure the specification allows that easily).

> Wolfgang

Wolfgang Solfrank

Nov 29, 2004, 1:30:35 PM
to Rhialto, curren...@netbsd.org
Hi,

> Not knowing the ins and outs of the filesystem code, I can't think why
> any fs independent code would need a byte offset <-> relative block
> number conversion. Any place where that is seriously used would be
> inside the specific filesystem code itself? Perhaps it won't much matter
> if the naively calculated relative block numbers are only imaginary.
> And perhaps the issues are even smaller because cd9660 is a read-only
> filesystem (although it would be nice if it could be implemented r/w,
> but I'm not sure the specification allows that easily).

Well, there are various things that come into play here. There actually
_is_ no interface where fs independent code could ask the filesystem what
was the relative block number of some byte offset. But the buffer cache
is only buffering full blocks, and it is caching those as file relative
blocks, not filesystem relative ones (maybe this has changed with ubc?
probably not much, but I'd have to look), so it does determine the file
relative block number by division of the file offset.

One could probably work around this limitation in the filesystem code,
but that'd need quite a bit of hacking...

Rhialto

Nov 29, 2004, 5:22:46 PM
to Martin Husemann, Rhialto, curren...@netbsd.org
On Mon 29 Nov 2004 at 09:56:41 +0100, Martin Husemann wrote:
> On Mon, Nov 29, 2004 at 12:08:07AM +0100, Rhialto wrote:
> > 1a) some bug in cd9660 in 1.6.2 that was fixed in 2.0
>
> Yes, we changed the interpretation of the 32bits for size from signed
> to unsigned.

Yes, I see this diff that seems to be the relevant one:

diff -u /mnt/usr/src/sys/isofs/cd9660/cd9660_node.h /usr/src/sys/fs/cd9660/cd9660_node.h
--- /mnt/usr/src/sys/isofs/cd9660/cd9660_node.h 2001-09-15 22:36:36.000000000 +0200
+++ /usr/src/sys/fs/cd9660/cd9660_node.h 2004-06-22 11:02:45.000000000 +0200
@@ -90,9 +86,9 @@
     doff_t i_offset; /* offset of free space in directory */
     ino_t i_ino; /* inode number of found directory */
 
-    long iso_extent; /* extent of file */
-    long i_size;
-    long iso_start; /* actual start of data of file (may be different */
+    unsigned long iso_extent; /* extent of file */
+    unsigned long i_size;
+    unsigned long iso_start; /* actual start of data of file (may be different */
     /* from iso_extent, if file has extended attributes) */
     ISO_RRIP_INODE inode;
 };

The i_size is set like this in cd9660_vfsops.c:

ip->i_size = isonum_733(isodir->size);

iso.h defines isonum_733() so:

/* 7.3.3: unsigned both-endian (little, then big) 32-bit value */
static __inline int
#if __STDC__
isonum_733(u_char *p)
#else
isonum_733(p)
u_char *p;
#endif
{
...

Unfortunately here the comment does not match the code. I expect that
there is still going to be sign extension here, and the Alpha will still
see the file size as negative (unsigned long is bigger than signed int,
so sign extension will occur).

(one kernel build later)

Indeed. I need to change isonum_733() to unsigned int (maybe even
u_int32_t, and maybe also all other isonum_7xx() functions).
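
The effect of the return type can be reproduced outside the kernel. The following is only a sketch: the function names are hypothetical stand-ins (not the real isonum_733()), uint64_t plays the role of the Alpha's 64-bit unsigned long, and int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble the little-endian half of a both-endian size field. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Stand-in for a decoder declared to return int. Converting a value
 * above INT_MAX to int is implementation-defined; on two's-complement
 * machines such as Alpha and i386 it wraps to a negative number.
 */
int size_returning_int(const unsigned char *p)
{
    return (int)read_le32(p);
}

/* Stand-in for a decoder declared to return an unsigned 32-bit type. */
uint32_t size_returning_u32(const unsigned char *p)
{
    return read_le32(p);
}

/* Storing the int result in a 64-bit unsigned field sign-extends it. */
uint64_t i_size_from_int(const unsigned char *p)
{
    return (uint64_t)size_returning_int(p);
}

/* Storing the unsigned result zero-extends it. */
uint64_t i_size_from_u32(const unsigned char *p)
{
    return (uint64_t)size_returning_u32(p);
}
```

Fed the bytes 00 20 20 80, the int variant yields 18446744071564173312 and the unsigned variant yields 2149588992, matching the two ls listings.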

(one kernel build later)

Yes, the file looks ok now on the Alpha too:

-r--r--r-- 1 root wheel 2149588992 Oct 30 13:23 FOO

Since I read that 2.0 is now final, this looks like it's going to be for
2.1.

Frederick Bruckman

Dec 2, 2004, 1:10:37 AM
to Wolfgang Solfrank, curren...@netbsd.org
In article <41AB69AE...@tools.de>,

w...@tools.de (Wolfgang Solfrank) writes:
>
>> Not knowing the ins and outs of the filesystem code, I can't think why
>> any fs independent code would need a byte offset <-> relative block
>> number conversion. Any place where that is seriously used would be
>> inside the specific filesystem code itself?
>
> Well, there are various things that come into play here. There actually
> _is_ no interface where fs independent code could ask the filesystem what
> was the relative block number of some byte offset. But the buffer cache
> is only buffering full blocks, and it is caching those as file relative
> blocks, not filesystem relative ones (maybe this has changed with ubc?
> probably not much, but I'd have to look), so it does determine the file
> relative block number by division of the file offset.

Suppose we just assume that partial blocks won't occur in the middle of a
file? OSTA's UDF 2.50 has this to say...

2.3.6.4 Uint64 InformationLength
Only the last extent of the file body may have an extent length that is
not a multiple of the block size, see ECMA 167 4/12.1 and 4/14.14.1.1.

so recalling that we're only invoking "level 3" to permit the full size of
a file in the UDF part to be expressed in the ISO 9660 part, a more
permissive implementation in the kernel would have no practical value. The
worst consequence of ignoring the general case is that such a strange file
would be read as having a few bytes of garbage in the middle of it.
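
The UDF rule quoted above amounts to a simple check over the section lengths. A minimal sketch in C, assuming 2048-byte blocks and a hypothetical helper name; it is not taken from any existing reader:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISO_BLOCK_SIZE 2048u

/*
 * Return 1 if every section except the last is a whole number of
 * blocks, 0 if some earlier section ends in a partial block.
 */
int only_last_section_partial(const uint32_t *len, size_t n)
{
    assert(len != NULL || n == 0);

    for (size_t i = 0; i + 1 < n; i++) {
        if (len[i] % ISO_BLOCK_SIZE != 0)
            return 0;
    }
    return 1;
}
```

A file whose sections pass this check can be mapped by plain division of the file offset; one that fails is the "strange file" case.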


Frederick

Rhialto

Dec 7, 2004, 12:27:24 PM
to Frederick Bruckman, Wolfgang Solfrank, curren...@netbsd.org
On Thu 02 Dec 2004 at 00:02:15 -0600, Frederick Bruckman wrote:
> Suppose we just assume that partial blocks won't occur in the middle of a
> file?

That sounds like an excellent idea. So, when cd9660 encounters the
multiple directory entries belonging to the same file, it could/should
check for internal partial blocks and round up (so the total size
matches with the added internal junk).

Alternatively, if there are internal partial blocks, it could handle the
file as it is now (whatever that happens to be, show only a partial file
maybe), or even more alternatively it could even show multiple files
with some derived name for each. Depending on the application this
might even be not completely useless.

Of course one might also implement UDF, but a quick Google session does
not yet reveal some source code to drop in. There is
http://sourceforge.net/projects/linux-udf/ though.

> Frederick

Frederick Bruckman

Dec 7, 2004, 9:06:15 PM
to Rhialto, Wolfgang Solfrank, curren...@netbsd.org
On Tue, 7 Dec 2004, Rhialto wrote:

> On Thu 02 Dec 2004 at 00:02:15 -0600, Frederick Bruckman wrote:
>> Suppose we just assume that partial blocks won't occur in the middle of a
>> file?
>
> That sounds like an excellent idea. So, when cd9660 encounters the
> multiple directory entries belonging to the same file, it could/should
> check for internal partial blocks and round up (so the total size
> matches with the added internal junk).

Yes, that's what I was thinking, rounding up to 2K blocks, leaving
garbage in the middle where necessary. It's not like there are many level
3 images to be found in the wild. The only reason I brought it up is
that it's a way to be able to represent arbitrarily large files on ISO
9660 images.
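
As a rough illustration of the rounding idea, the size such a reader would report could be computed like this. The function name and the 2048-byte block size are assumptions for the sketch; nothing here is existing cd9660 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISO_BLOCK_SIZE 2048u

/*
 * Total file size when every section except the last is rounded up to
 * a whole block, so that file offsets map to blocks by plain division.
 * Any padding shows up as garbage bytes in the middle of the file.
 */
uint64_t padded_file_size(const uint32_t *len, size_t n)
{
    assert(len != NULL || n == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t l = len[i];
        if (i + 1 < n)  /* internal section: round up to a block multiple */
            l = (l + ISO_BLOCK_SIZE - 1) / ISO_BLOCK_SIZE * ISO_BLOCK_SIZE;
        total += l;
    }
    return total;
}
```

For three sections of 3 bytes each this reports 2048 + 2048 + 3 = 4099 bytes instead of 9, while a file whose internal sections are already block multiples keeps its exact size.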

The ideas presented in the ISO 9660 spec were ahead of their time. You
were supposed to be able to create different files from different cuts
of the same data -- think "deleted scenes" or "director's cut". Level
3 was never widely implemented, and so the DVD-Video folks
evidently decided to do the same thing at a higher level. Now, while
the spec apparently also allows for partial blocks in the middle of a
file, I can't see any purpose to creating such a file, and I believe
it must be a defect in the spec. What we would have then, would be
technically a partial implementation of ISO 9660 Level 3, but so what?

> Alternatively, if there are internal partial blocks, it could handle the
> file as it is now (whatever that happens to be, show only a partial file
> maybe), or even more alternatively it could even show multiple files
> with some derived name for each. Depending on the application this
> might even be not completely useless.

I think, now, that that's just too complicated. You can already see
the "raw" directory entries with a tool such as "isoinfo", and knowing
that file sections must be contiguous extents, you could extract them
with "dd" (or "isoinfo"). There's no need to expose internals to the
file system.

What NetBSD does *now* -- listing only the tail end of the file -- is
maximally dumb. If only it got the start right, but the length wrong,
that would be better. ("mplayer" would just work, as it appears to
ignore the file length in favor of the information in the VOB.)

> Of course one might also implement UDF, but a quick Google session does
> not yet reveal some source code to drop in. There is
> http://sourceforge.net/projects/linux-udf/ though.

I'd be surprised if that were useful to NetBSD. UDF is a more evolved
version of ISO 9660 (with new directory structures), and the specs are
already public. Besides, as you might expect, linux-udf is GPL'd.


Frederick
