
EOFError: marshal data too short -- causes?


Glenn Linderman

Dec 29, 2015, 1:58:34 AM
Here's a sanitized stack trace off my web server:

  File ".../cgihelpers.py", line 10, in <module>
    import cgitb
  File ".../py34/lib/python3.4/cgitb.py", line 24, in <module>
    import inspect
  File ".../py34/lib/python3.4/inspect.py", line 54, in <module>
    from dis import COMPILER_FLAG_NAMES as _flag_names
  File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
  File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1129, in _exec
  File "<frozen importlib._bootstrap>", line 1467, in exec_module
  File "<frozen importlib._bootstrap>", line 1570, in get_code
  File "<frozen importlib._bootstrap>", line 656, in _compile_bytecode
EOFError: marshal data too short


It worked this morning, and does this now. I hadn't changed anything.

The only reference I find on Google seems to be related to
out-of-disk-space, so I cleaned up some files, but I'm not sure of how
the web server mounts things; df didn't show particularly high usage in
any of the disks it reported on.

The files I cleaned up are ever-growing log files that I need to clean
up now and then manually, but cleaning them up didn't help... and df
wasn't complaining anyway. So maybe out-of-disk-space is a red herring,
or only one cause of this symptom.

When I log in to the web server with Putty, I can run Python at the
command prompt, and "import dis" suffices to reproduce the problem.

Are there other causes of this symptom than out-of-disk-space? If it is
out-of-disk-space, how could one determine which disk?

Maybe if not really out of space, it is a bad permission, disallowing
creation of a temporary file? But how could one determine which
temporary file, and where it would be written?


Terry Reedy

Dec 29, 2015, 2:20:24 AM
On 12/29/2015 1:50 AM, Glenn Linderman wrote:
> Here's a sanitized stack trace off my web server:
>
>   File ".../cgihelpers.py", line 10, in <module>
>     import cgitb
>   File ".../py34/lib/python3.4/cgitb.py", line 24, in <module>
>     import inspect
>   File ".../py34/lib/python3.4/inspect.py", line 54, in <module>
>     from dis import COMPILER_FLAG_NAMES as _flag_names
>   File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
>   File "<frozen importlib._bootstrap>", line 1129, in _exec
>   File "<frozen importlib._bootstrap>", line 1467, in exec_module
>   File "<frozen importlib._bootstrap>", line 1570, in get_code
>   File "<frozen importlib._bootstrap>", line 656, in _compile_bytecode
> EOFError: marshal data too short
>
>
> It worked this morning, and does this now. I hadn't changed anything.

Since it crashes trying to unmarshal compiled dis bytecode, I would
assume that the .pyc file is corrupted and remove it. Based on the
above, it should be in
.../py34/lib/python3.4/__pycache__/dis.*.pyc
Python will then recompile dis and write a new .pyc file.
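
The removal step can also be done from Python. This is only a minimal sketch: the helper name remove_cached_bytecode is made up for illustration, and it relies on importlib.util.cache_from_source (available since Python 3.4) to map a source file to its __pycache__ entry.

```python
import importlib.util
import os


def remove_cached_bytecode(source_path):
    """Delete the cached .pyc for a source file, if one exists.

    Returns the path that was removed, or None if nothing was cached.
    (Hypothetical helper, not part of the standard library.)
    """
    # cache_from_source maps .../dis.py to .../__pycache__/dis.<tag>.pyc
    pyc_path = importlib.util.cache_from_source(source_path)
    try:
        os.remove(pyc_path)
    except FileNotFoundError:
        return None
    return pyc_path
```

Calling it with the path to dis.py should clear the cached file, and the next "import dis" would then write a fresh one, provided the directory is writable.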

--
Terry Jan Reedy

Glenn Linderman

Dec 29, 2015, 3:08:18 AM
Thank you, thank you, thank you, thank you, thank you, thank you, thank
you, thank you, thank you, thank you, thank you, thank you, thank you,
thank you, thank you, thank you, thank you, thank you, thank you, thank
you, thank you, thank you, thank you, thank you, thank you, thank you,
thank you, thank you, thank you, thank you, thank you, thank you, thank
you, thank you, thank you, thank you, thank you, thank you, thank you,
thank you!

The site is working now, again. Thank you, again.

Because of the File "<frozen importlib._bootstrap>" lines, and the fact that there were recent
comments about frozen import stuff on python-dev, I was thinking that
the corruption was at a lower level, and never thought to zap a .pyc.

OK, so I actually renamed it instead of zapping it. Them, actually,
there were both .pyc and .pyo.

Now the __pycache__ directory is full of .pyc and .pyo files (from the
install? On April 19th, the date on most of the files?). But:

arel...@areliabledomain.com [~/py34/lib/python3.4/__pycache__]# ll dis*
-rw-r--r-- 1 areliabl areliabl 14588 Dec 29 00:27 dis.cpython-34.pyc
-rw-r--r-- 1 areliabl areliabl 8192 Dec 28 19:16 dis.cpython-34.pyc-xxx
-rw-r--r-- 1 areliabl areliabl 14588 Apr 19 2015 dis.cpython-34.pyo-xxx

(I renamed the existing .py* files by appending -xxx).

So we can see that somehow, today at 19:16 (probably UTC) the dis.*.pyc
file got chopped to 8192 bytes. That's a suspicious number, being a
power of 2... but... I haven't updated Python since originally
installing it on the web server, on April 19th. So why would _that_
.pyc file have today's date? Because of UTC, the replacement has
tomorrow's date :) But it seems like it should have had Apr 19 2015
like all the rest.

Except, all the rest don't have Apr 19 2015... most of them do, but
there are a fair number from today, and a couple from Dec 15 (listed
below). I don't quickly see any others that are suspiciously exactly a
power of 2!

But, under the assumption that the install created all these files in
the first place, why would _any_ of them have newer dates? I haven't
gone around deleting any .pyc files since April. And if they are
already there, why would Python rebuild them? Isn't the point of the
.pyc files to not need to recompile? And even if the original build
didn't build them, since I haven't touched the python sources on this web
server for months, shouldn't all the files be months old, at least?

And isn't it rather suspicious that of the ones that are rebuilt, that
all of them have exactly the same timestamp, rather than being sprinkled
around with different dates? Well, the two from Dec 15 have the same
time, and all the ones from today have the same time. But that doesn't
seem like some sort of "random error or access conflict accessing file
causing it to be rebuilt"....

Should I accuse my web host of playing with these files? Are they
backing up/restoring? Are they simply touching the files? Is their
infrastructure flaky such that whole groups of files get deleted now and
then (and either rebuilt or restored with a different date)?


arel...@areliabledomain.com [~/py34/lib/python3.4/__pycache__]# find -mtime -240 -ls
113654924 20 drwxr-xr-x 2 areliabl areliabl 20480 Dec 29 00:27 .
113639463 76 -rw-r--r-- 1 areliabl areliabl 76650 Dec 28 19:16 ./inspect.cpython-34.pyc
113639477 16 -rw-r--r-- 1 areliabl areliabl 14588 Dec 29 00:27 ./dis.cpython-34.pyc
113639458 16 -rw-r--r-- 1 areliabl areliabl 12361 Dec 28 19:16 ./ast.cpython-34.pyc
113639451 36 -rw-r--r-- 1 areliabl areliabl 32773 Dec 28 19:16 ./shutil.cpython-34.pyc
113639468 12 -rw-r--r-- 1 areliabl areliabl 10371 Dec 28 19:16 ./contextlib.cpython-34.pyc
113639469 16 -rw-r--r-- 1 areliabl areliabl 15916 Dec 28 19:16 ./lzma.cpython-34.pyc
113639475 16 -rw-r--r-- 1 areliabl areliabl 15130 Dec 28 19:16 ./bz2.cpython-34.pyc
113639486 68 -rw-r--r-- 1 areliabl areliabl 67747 Dec 28 19:16 ./tarfile.cpython-34.pyc
113639479 68 -rw-r--r-- 1 areliabl areliabl 65875 Dec 28 19:16 ./argparse.cpython-34.pyc
113639459 48 -rw-r--r-- 1 areliabl areliabl 45794 Dec 28 19:16 ./zipfile.cpython-34.pyc
113639484 32 -rw-r--r-- 1 areliabl areliabl 31374 Dec 28 19:16 ./platform.cpython-34.pyc
113639450 8 -rw-r--r-- 1 areliabl areliabl 4608 Dec 15 03:01 ./copyreg.cpython-34.pyc
113639474 16 -rw-r--r-- 1 areliabl areliabl 13665 Dec 28 19:16 ./textwrap.cpython-34.pyc
113639454 44 -rw-r--r-- 1 areliabl areliabl 43428 Dec 28 19:16 ./subprocess.cpython-34.pyc
113639452 40 -rw-r--r-- 1 areliabl areliabl 39051 Dec 15 03:01 ./threading.cpython-34.pyc
113639472 16 -rw-r--r-- 1 areliabl areliabl 12721 Dec 28 19:16 ./gettext.cpython-34.pyc
113639470 8 -rw-r--r-- 1 areliabl areliabl 8060 Dec 28 19:16 ./copy.cpython-34.pyc
113639467 8 -rw-r--r-- 1 areliabl areliabl 6370 Dec 28 19:16 ./hashlib.cpython-34.pyc
113639483 12 -rw-r--r-- 1 areliabl areliabl 11461 Dec 28 19:16 ./pprint.cpython-34.pyc
113639460 8 -rw-r--r-- 1 areliabl areliabl 7871 Dec 28 19:16 ./string.cpython-34.pyc
113639478 32 -rw-r--r-- 1 areliabl areliabl 29697 Dec 28 19:16 ./cgi.cpython-34.pyc
113639464 20 -rw-r--r-- 1 areliabl areliabl 17601 Dec 28 19:16 ./pkgutil.cpython-34.pyc
113639453 8 -rw-r--r-- 1 areliabl areliabl 6437 Dec 28 19:16 ./quopri.cpython-34.pyc
113639476 8 -rw-r--r-- 1 areliabl areliabl 8192 Dec 28 19:16 ./dis.cpython-34.pyc-xxx
113639466 92 -rw-r--r-- 1 areliabl areliabl 90383 Dec 28 19:16 ./pydoc.cpython-34.pyc
113639461 12 -rw-r--r-- 1 areliabl areliabl 8466 Dec 28 19:16 ./_weakrefset.cpython-34.pyc
113639465 20 -rw-r--r-- 1 areliabl areliabl 19060 Dec 28 19:16 ./random.cpython-34.pyc
arel...@areliabledomain.com [~/py34/lib/python3.4/__pycache__]#
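
The same survey (files modified in the last N days, with power-of-2 sizes flagged) could be scripted rather than eyeballed. A rough sketch, assuming nothing beyond the standard library; the function names recent_files and is_power_of_two are invented here, and a power-of-2 size is only a hint of truncation, not proof.

```python
import os
import time


def is_power_of_two(n):
    """True for 1, 2, 4, ..., 4096, 8192, ... (the 'suspicious' sizes)."""
    return n > 0 and n & (n - 1) == 0


def recent_files(directory, days):
    """Return sorted (name, size, suspicious) tuples for regular files
    in `directory` modified within the last `days` days.

    Roughly what "find -mtime -<days> -ls" shows, minus the directories.
    """
    cutoff = time.time() - days * 86400
    found = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            info = entry.stat()
            if info.st_mtime >= cutoff:
                found.append(
                    (entry.name, info.st_size, is_power_of_two(info.st_size))
                )
    return sorted(found)
```

Run against the __pycache__ directory with days=240, this would list the same files as the find output above and mark only the 8192-byte one.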

D'Arcy J.M. Cain

Dec 29, 2015, 8:57:04 AM
On Tue, 29 Dec 2015 00:01:00 -0800
Glenn Linderman <v+py...@g.nevcal.com> wrote:
> OK, so I actually renamed it instead of zapping it. Them, actually,

Really, just zap them. They are object code. Even if you zap a
perfectly good .pyc file a perfectly good one will be re-created as
soon as you import it. No need to clutter up your file system.

--
D'Arcy J.M. Cain
Vybe Networks Inc.
http://www.VybeNetworks.com/
IM:da...@Vex.Net VoIP: sip:da...@VybeNetworks.com

Terry Reedy

Dec 29, 2015, 4:01:54 PM
On 12/29/2015 3:01 AM, Glenn Linderman wrote:

> Now the __pycache__ directory is full of .pyc and .pyo files (from the
> install?

The installer optionally runs compileall on /Lib and recursively on its
subpackages. The option defaults to 'yes', at least for 'install for
everyone', as writing files within the default Program Files location
requires Admin permission.

> On April 19th? (the date on most of the files). But:
>
> arel...@areliabledomain.com [~/py34/lib/python3.4/__pycache__]# ll dis*
> -rw-r--r-- 1 areliabl areliabl 14588 Dec 29 00:27 dis.cpython-34.pyc
> -rw-r--r-- 1 areliabl areliabl 8192 Dec 28 19:16 dis.cpython-34.pyc-xxx
> -rw-r--r-- 1 areliabl areliabl 14588 Apr 19 2015 dis.cpython-34.pyo-xxx
>
> (I renamed the existing .py* files by appending -xxx).
>
> So we can see that somehow, today at 19:16 (probably UTC) the dis.*.pyc
> file got chopped to 8192 bytes. That's a suspicious number, being a
> power of 2... but... I haven't updated Python since originally
> installing it on the web server, on April 19th. So why would _that_
> .pyc file have today's date? Because of UTC, the replacement has
> tomorrow's date :) But it seems like it should have had Apr 19 2015
> like all the rest.

I updated to 2.7.11, 3.4.4, and 3.5.1 a couple of weeks ago, so the
timestamps are all fresh. So I don't know what happened with 3.4.3
timestamps from last April and whether Windows itself touches the files.
I just tried importing a few and Python did not.

> And isn't it rather suspicious that of the ones that are rebuilt, that
> all of them have exactly the same timestamp, rather than being sprinkled
> around with different dates? Well, the two from Dec 15 have the same
> time, and all the ones from today have the same time. But that doesn't
> seem like some sort of "random error or access conflict accessing file
> causing it to be rebuilt"....
>
> Should I accuse my web host of playing with these files? Are they
> backing up/restoring? Are they simply touching the files? Is their
> infrastructure flaky such that whole groups of files get deleted now and
> then (and either rebuilt or restored with a different date)?

You could ask, without 'accusing'. Or you could re-run compileall
yourself. Or you could upgrade to 3.4.4 (I recommend this) and let the
installer do so, or not.
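
For the "re-run compileall yourself" option, something along these lines should work. It is a sketch only: rebuild_bytecode is a made-up wrapper name, while compileall.compile_dir and its force and quiet parameters are the documented standard-library API.

```python
import compileall


def rebuild_bytecode(lib_dir):
    """Recompile every .py under lib_dir, overwriting existing .pyc files.

    Returns True if every file compiled without error.
    """
    # force=True rebuilds even when the cached file looks up to date;
    # quiet=1 prints only error messages.
    return bool(compileall.compile_dir(lib_dir, force=True, quiet=1))
```

The command-line form "python -m compileall -f <directory>" does the same job from a shell.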

--
Terry Jan Reedy

Glenn Linderman

Dec 30, 2015, 3:42:28 PM
On 12/29/2015 5:56 AM, D'Arcy J.M. Cain wrote:
> On Tue, 29 Dec 2015 00:01:00 -0800
> Glenn Linderman <v+py...@g.nevcal.com> wrote:
>> OK, so I actually renamed it instead of zapping it. Them, actually,
> Really, just zap them. They are object code. Even if you zap a
> perfectly good .pyc file a perfectly good one will be re-created as
> soon as you import it. No need to clutter up your file system.
>
Yes, the only value would be if the type of corruption could be
determined from the content.

Glenn Linderman

Dec 30, 2015, 3:50:22 PM
On 12/29/2015 1:00 PM, Terry Reedy wrote:
> I updated to 2.7.11, 3.4.4, and 3.5.1 a couple of weeks ago, so the
> timestamps are all fresh. So I don't know what happened with 3.4.3
> timestamps from last April and whether Windows itself touches the
> files. I just tried importing a few and Python did not.

I'm a Windows user, too, generally, but the web host runs Linux.

I suppose, since the install does the compileall, that I could set all
the __pycache__ files to read-only, even for "owner". Like you said,
those files _can't_ be updated without Admin/root permission when it is
a root install... so there would be no need, once compileall has been
done, for the files to be updated until patches would be applied. This
isn't a root install, though, but a "user" install.

Level1 support at the web host claims they never touch user files unless
the user calls and asks them to help with something that requires it.
And maybe Level1 support religiously follows that policy, but other
files have changed, so that policy doesn't appear to be universally
applied for all personnel there... so the answer isn't really responsive
to the question, but the tech I talked to was as much a parrot as a tech...

Glenn

Glenn Linderman

Jan 7, 2016, 2:02:55 PM
So this morning the problem happens again. 48 files or directories
modified at 5:47am, while I'm sound asleep so the web site is down for
over 3 hours until I wake up and notice (both because my bootup process
always checks, and because I had several emails about it).

Two of the files had the suspicious EOF at 4096 bytes, and deleting the first
caused it to be rebuilt and the second to be noticed, and deleting the
second caused it to be rebuilt and cured the site.
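
Finding the broken files one failed import at a time could be avoided by checking every .pyc in one pass. The following is a sketch under stated assumptions, not a tested production tool: the names pyc_status and remove_corrupt_pycs are invented, the header is taken to be 16 bytes on Python 3.7+ and 12 bytes on 3.3 through 3.6, it must be run with the same interpreter that wrote the files, and it ignores .pyo files.

```python
import importlib.util
import marshal
import os
import sys

# Bytes of header before the marshalled code object (assumption stated above).
HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12


def pyc_status(path):
    """Classify a .pyc file as 'ok', 'corrupt', or 'foreign'."""
    with open(path, "rb") as f:
        data = f.read()
    # A different magic number means another Python version wrote it.
    if len(data) >= 4 and data[:4] != importlib.util.MAGIC_NUMBER:
        return "foreign"
    if len(data) <= HEADER_SIZE:
        return "corrupt"
    try:
        # A truncated file fails here, typically with
        # "EOFError: marshal data too short".
        marshal.loads(data[HEADER_SIZE:])
    except (EOFError, ValueError, TypeError):
        return "corrupt"
    return "ok"


def remove_corrupt_pycs(root):
    """Delete every corrupt .pyc under root; return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pyc"):
                continue
            path = os.path.join(dirpath, name)
            if pyc_status(path) == "corrupt":
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Run from cron or at the start of a CGI wrapper, this might shorten an outage, though it treats the symptom and says nothing about why the files get truncated. Note that marshal should only be fed files you trust, which is the case for your own __pycache__.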

But all the touched files are .pyc files (and the directories
__pycache__ directories). None of the source files were modified. So
why would any .pyc files ever be updated if the source files are not?
Are there _any_ Python-specific reasons?

My only speculation is a problem accessing the .pyc file on first
attempt, which would be a file system problem, not a Python problem?

Are there other speculative reasons?

And then why would a short file be built? Conflict with multiple
processes trying to rebuild it at the same time? Or another file system
problem? Or???

This-could-be-annoying-if-it-keeps-happening-ly yours,
Glenn

Glenn Linderman

Jan 8, 2016, 3:12:31 PM
On 1/7/2016 7:44 PM, Dennis Lee Bieber wrote:
> On Thu, 7 Jan 2016 10:55:38 -0800, Glenn Linderman <v+py...@g.nevcal.com>
> declaimed the following:
>
>> But all the touched files are .pyc files (and the directories
>> __pycache__ directories). None of the source files were modified. So
>> why would any .pyc files ever be updated if the source files are not?
>> Are there _any_ Python-specific reasons?
>>
> Two different versions of Python accessing the same modules? That might
> induce one to do a rebuild of the .pyc from the .py... Especially if the
> variant is something being run (directly or indirectly) from some scheduled
> task.

Thanks for the idea, but there is only one version of Python installed
in that directory structure (and all of my "shared host" portion of the
server).

> Some conflict with a backup process corrupting files?

This seems more possible, simply because I don't understand their backup
process. Still, if Python only reads shared, and there'd be no reason
for a backup process to do more than read shared, there shouldn't be a
conflict...

But it seems clear that there _is_ some conflict or some mysterious bug
that is triggering something.

The installation happened really fast, when I did it, many many files
seem to have the same timestamp. Maybe some of the sources are really
close in touch time to the .pyc? But still, it seems that would have
been resolved long ago... But I could go "touch" all the .pyc to give
them nice new timestamps?

Maybe I should mark all the current .pyc read-only even to myself? That
would prevent them from being rebuilt by Python, and maybe when whatever
goes wrong goes wrong, an error would return to the user, but maybe the
next access would work.... that'd be better than letting whatever goes
wrong modify/corrupt the .pyc, so that _every future_ access fails?
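
Marking the files read-only could look roughly like this. It is a sketch with an invented function name (make_pycs_read_only), and it changes only the files: on POSIX systems a file in a writable directory can still be replaced by a rename, so this alone may not stop a .pyc from being swapped out.

```python
import os
import stat

# Write permission for owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_pycs_read_only(root):
    """Clear all write bits on every .pyc under root.

    Returns the sorted list of paths whose mode was changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pyc"):
                continue
            path = os.path.join(dirpath, name)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            if mode & WRITE_BITS:
                os.chmod(path, mode & ~WRITE_BITS)
                changed.append(path)
    return sorted(changed)
```

A different way to get a similar effect is to stop Python from writing bytecode at all, by setting the PYTHONDONTWRITEBYTECODE environment variable or sys.dont_write_bytecode = True, after a one-time compileall run.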

Maybe before launching Python I should do a "find ~/py34 -name '*.pyc' >
/dev/null" (or some log file) to cache the directory structure before
Python looks at it, and give the mysterious bug a chance to happen with
an irrelevant activity?
