Handling large files with Emacs

Fab

Oct 23, 2012, 3:33:07 PM
Dear All

I know this subject might have been discussed a lot already.

I had to work with a vast BibTeX file today, and Emacs almost died and
got super slow, which made working with it impossible. So I switched
to Vim. However, I have been using Emacs for some years now, and the only
thing that really bothers me is this issue. I do not need to work with
large files a lot, but when I do, Emacs is not my sidekick anymore. I
tried some additional packages, but none of them really satisfied me.
Why is Vim still so fast and smooth when it deals with large files? How
do you handle this issue with Emacs? I would love to see Emacs get over
this issue some day.

Thanks for your feedback!

Fab

Eli Zaretskii

Oct 23, 2012, 3:52:20 PM
to help-gn...@gnu.org
> From: Fab <fab...@gmail.com>
> Date: Tue, 23 Oct 2012 21:33:07 +0200
>
> I had to work with a vast bibTeX file today and Emacs almost died and
> got super slow, which made working with Emacs impossible. So I switched
> to Vim. However, I use Emacs since some years now and the only thing
> that really bothers me is this issue. I do not need to work with large
> files a lot, but when I do, Emacs is not my side kick anymore. I tried
> some additional packages, but none of them really satisfied me. Why is
> Vim still so fast and smooth when it deals with large files? How do you
> handle this issue with Emacs? I would love to see Emacs get over this
> issue some day.

How large is "large"? Which version of Emacs was that? On what
platform (CPU and OS)?

It's impossible to give you any meaningful answer without knowing at
least that much.

Fab

Oct 23, 2012, 4:09:40 PM
Eli Zaretskii <el...@gnu.org> writes:

>> From: Fab <fab...@gmail.com>
>> Date: Tue, 23 Oct 2012 21:33:07 +0200
>>
>> I had to work with a vast bibTeX file today and Emacs almost died and
>> got super slow, which made working with Emacs impossible. So I switched
>> to Vim. However, I use Emacs since some years now and the only thing
>> that really bothers me is this issue. I do not need to work with large
>> files a lot, but when I do, Emacs is not my side kick anymore. I tried
>> some additional packages, but none of them really satisfied me. Why is
>> Vim still so fast and smooth when it deals with large files? How do you
>> handle this issue with Emacs? I would love to see Emacs get over this
>> issue some day.
>
> How large is "large"? Which version of Emacs was that? On what
> platform (CPU and OS)?
Actually "only" ~6 MB with 120,000 lines. The Emacs version is GNU Emacs
24.2.50.1 (x86_64-unknown-linux-gnu, X toolkit, Xaw scroll bars) on a
64-bit AMD quad core with Debian 6.0.4.

> It's impossible to give you any meaningful answer without knowing at
> least that much.
Thanks
Fab

Eli Zaretskii

Oct 23, 2012, 4:40:02 PM
to help-gn...@gnu.org
> From: Fab <fab...@gmail.com>
> Date: Tue, 23 Oct 2012 22:09:40 +0200
>
> > How large is "large"? Which version of Emacs was that? On what
> > platform (CPU and OS)?
> Actually "only" ~6mb with 120000 lines

That's ridiculously small. I routinely edit files approaching 500MB
without any problems, and that's on a 32-bit machine, whereas yours is
a 64-bit one. So something is seriously wrong with your system, or
maybe with the Emacs binary (but I can hardly believe it).

Fab

Oct 23, 2012, 4:49:57 PM
I will go further into that! Thanks for the feedback

Tom

Oct 24, 2012, 3:52:58 AM
to help-gn...@gnu.org
Eli Zaretskii <eliz <at> gnu.org> writes:

>
> That's ridiculously small. I routinely edit files approaching 500MB
> without any problems, and that's on a 32-bit machine, whereas yours is
> a 64-bit one. So something is seriously wrong with your system, or
> maybe with the Emacs binary (but I can hardly believe it).
>
>

I think it's because of syntax highlighting or something. When I try opening
SQL dumps (say, 200MB), Emacs grinds to a halt for a minute or
so, and moving in the file is very slow even after that.

If I don't use an .sql extension, so that the file is not opened
in SQL mode then it's much better. So opening big files in fundamental
mode works well, but if it has its own mode with syntax highlighting
then it's pretty much unusable.

I think Fab has a similar problem, because he opens the big file
in BibTeX mode, which also does its own stuff, parsing the buffer
with regexps for syntax highlighting, or something like this.
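
The fundamental-mode workaround described above could be automated along these lines. This is only a sketch: the names `my-large-file-threshold' and `my-large-file-fundamental-mode' are made up for illustration, and the 50MB threshold is an arbitrary assumption, not a recommended value.

```elisp
;; Hypothetical names; the threshold is an arbitrary illustration.
(defvar my-large-file-threshold (* 50 1024 1024)
  "Buffer size in characters above which to fall back to `fundamental-mode'.")

(defun my-large-file-fundamental-mode ()
  "Switch the current buffer to `fundamental-mode' if it is very large."
  (when (> (buffer-size) my-large-file-threshold)
    (fundamental-mode)))

;; `find-file-hook' runs after the file has been read and its usual
;; major mode has been set up, so this replaces that mode afterwards.
(add-hook 'find-file-hook #'my-large-file-fundamental-mode)
```

For a one-off case, M-x find-file-literally also visits a file without choosing a major mode for it.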


Jambunathan K

Oct 24, 2012, 5:22:07 AM
to Tom, help-gn...@gnu.org
Tom <adatg...@gmail.com> writes:

> Eli Zaretskii <eliz <at> gnu.org> writes:
>
>>
>> That's ridiculously small. I routinely edit files approaching 500MB
>> without any problems, and that's on a 32-bit machine, whereas yours is
>> a 64-bit one. So something is seriously wrong with your system, or
>> maybe with the Emacs binary (but I can hardly believe it).
>>
>>
>
> I think it's because syntax highlight or something. When I try opening
> SQL dumps (say, 200MB) then Emacs grinds to a halt for a minute or
> so, and moving in the file is very slow even after that.

Have you experimented with `font-lock-maximum-size' together with
`font-lock-support-mode'?

,----[ C-h v font-lock-maximum-size RET ]
| font-lock-maximum-size is a variable defined in `font-lock.el'.
| Its value is 256000
|
| This variable is obsolete since 24.1.
|
| Documentation:
| Maximum buffer size for unsupported buffer fontification.
| When `font-lock-support-mode' is nil, only buffers smaller than
| this are fontified. This variable has no effect if a Font Lock
| support mode (usually `jit-lock-mode') is enabled.
|
| If nil, means size is irrelevant.
| If a list, each element should be a cons pair of the form (MAJOR-MODE . SIZE),
| where MAJOR-MODE is a symbol or t (meaning the default). For example:
| ((c-mode . 256000) (c++-mode . 256000) (rmail-mode . 1048576))
| means that the maximum size is 250K for buffers in C or C++ modes, one megabyte
| for buffers in Rmail mode, and size is irrelevant otherwise.
|
| You can customize this variable.
|
| [back]
`----

>
> If I don't use an .sql extension, so that the file is not opened
> in SQL mode then it's much better. So opening big files in fundamental
> mode works well, but if it has its own mode with syntax highlighting
> then it's pretty much unusable.
>
> I think Fab has a similar problem, because he opens the big file
> in Bibtex mode which also does its own stuff, parsing the buffer
> with regexps for syntax highlighting, or something like this.
>
>
>

--

Tom

Oct 24, 2012, 5:34:41 AM
to help-gn...@gnu.org
Jambunathan K <kjambunathan <at> gmail.com> writes:
> >
> > I think it's because syntax highlight or something. When I try opening
> > SQL dumps (say, 200MB) then Emacs grinds to a halt for a minute or
> > so, and moving in the file is very slow even after that.
>
> Have you experimented with `font-lock-maximum-size' together with
> `font-lock-support-mode'?
>

Not yet, because I rarely need this and if I do then it's simpler to open
the file in fundamental mode.

But the real question is: if font locking is really the culprit, then why
do we need to resort to such special settings? We have fast enough computers
and AFAIK JIT font-lock is the default, so it should not be a problem.
Even if Lisp performance cannot be improved much, font lock should be
clever enough to stay in the background and do stealth fontification
on remote parts of the large buffer only if the user is idle and only
in chunks, so it can yield to user input. Isn't this what JIT font-lock
is supposed to do in the first place?


Tom

Oct 24, 2012, 5:40:18 AM
to help-gn...@gnu.org
Jambunathan K <kjambunathan <at> gmail.com> writes:

>
> Have you experimented with `font-lock-maximum-size' together with
> `font-lock-support-mode'?
>
> ,----[ C-h v font-lock-maximum-size RET ]
> | font-lock-maximum-size is a variable defined in `font-lock.el'.
> | Its value is 256000
> |
> | This variable is obsolete since 24.1.
> |

BTW, why is it obsolete? Does it mean it's unnecessary and the
size of the buffer should not be a problem?

I have this default setting like above and font lock is turned
on in big buffers nevertheless.


Jambunathan K

Oct 24, 2012, 9:51:32 AM
to Tom, help-gn...@gnu.org
Read the docstring again. It kicks in only for specific settings of
`font-lock-support-mode'. (I was confused as well, before I
re-read the docstring.)

I am just another user. There is no overhead to experimenting with
other settings and seeing whether there are differences in the
behaviour.
--

Stefan Monnier

Oct 24, 2012, 11:29:26 AM
> I had to work with a vast bibTeX file today and Emacs almost died and
> got super slow, which made working with Emacs impossible.

Sounds like a performance bug in bibtex-mode. Please M-x
report-emacs-bug, ideally with a sample large bibtex file and a recipe
showing exactly which operation(s) get slow.


Stefan

Fab

Oct 24, 2012, 1:19:29 PM
Thank you all for the excellent feedback!

The first thing that came to my mind was font-lock. However, disabling
font-lock did not solve my problem, so I decided to ask here. After Eli's
response from yesterday, I started to debug my init file. It actually
was all my mistake: I was not aware that linum.el caused all the
trouble! In fact, I do not really need this minor mode, but I never
removed it from my init file. After doing so, I have absolutely
no problem editing the vast BibTeX file, even with font-lock. I
have also created a huge ASCII file of approx. 500MB, and it is no problem
to edit it.

Thank you!
Fab

Eli Zaretskii

Oct 24, 2012, 2:16:12 PM
to help-gn...@gnu.org
> From: Tom <adatg...@gmail.com>
> Date: Wed, 24 Oct 2012 09:34:41 +0000 (UTC)
>
> Jambunathan K <kjambunathan <at> gmail.com> writes:
> > >
> > > I think it's because syntax highlight or something. When I try opening
> > > SQL dumps (say, 200MB) then Emacs grinds to a halt for a minute or
> > > so, and moving in the file is very slow even after that.
> >
> > Have you experimented with `font-lock-maximum-size' together with
> > `font-lock-support-mode'?
> >
>
> Not yet, because I rarely need this and if I do then it's simpler to open
> the file in fundamental mode.
>
> But the real question is if font locking is really the culprit then why
> do we need to resort such special settings? We have fast enough computers
> and AFAIK jit font-lock is the default, so it should not be a problem.

And it isn't a problem, indeed, unless the mode in question does
something pathological with its definition of font-lock-keywords etc.

> Even if lisp performance cannot be improved much, font lock should be
> clever enough to stay in the background and do stealth fontification
> on remote parts of the large buffer only if the user is idle and only
> in chunks, so it can yield to user input. Isn't this what JIT font-lock
> is supposed to do in the first place?

It is, and it does. But whenever you scroll to another portion in the
buffer, JIT font-lock fontifies the displayed portion before
displaying it, which could slow down redisplay, regardless of stealth.
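
For anyone who wants to experiment with the trade-off described above, here is a minimal sketch using two existing jit-lock options. The values are illustrative assumptions, not recommendations, and whether they help depends on the major mode in question.

```elisp
;; Defer fontification of newly displayed text until Emacs has been
;; idle for this many seconds, so that scrolling is not held up by
;; font-lock.  The default is nil (no deferral).
(setq jit-lock-defer-time 0.05)

;; Begin stealth (background) fontification of the rest of the buffer
;; after this many seconds of idle time.  The default is nil, which
;; disables stealth fontification.
(setq jit-lock-stealth-time 16)
```

With deferral enabled, text may briefly appear unfontified right after a scroll; that is the price of keeping redisplay responsive.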

Eli Zaretskii

Oct 24, 2012, 2:16:43 PM
to help-gn...@gnu.org
> From: Tom <adatg...@gmail.com>
> Date: Wed, 24 Oct 2012 09:40:18 +0000 (UTC)
>
> Jambunathan K <kjambunathan <at> gmail.com> writes:
>
> >
> > Have you experimented with `font-lock-maximum-size' together with
> > `font-lock-support-mode'?
> >
> > ,----[ C-h v font-lock-maximum-size RET ]
> > | font-lock-maximum-size is a variable defined in `font-lock.el'.
> > | Its value is 256000
> > |
> > | This variable is obsolete since 24.1.
> > |
>
> BTW, why is it obsolete? Does it mean it's unnecessary and the
> size of the buffer should not be a problem?

Yes.

Tom

Oct 24, 2012, 3:09:20 PM
to help-gn...@gnu.org
Eli Zaretskii <eliz <at> gnu.org> writes:
> >
> > BTW, why is it obsolete? Does it mean it's unnecessary and the
> > size of the buffer should not be a problem?
>
> Yes.
>

My other tip is that long lines cause problems. SQL dumps usually have
really long lines (the one below has lines with 1 million characters).

As a test case, here's a dump from Wikipedia, which is a 16MB download
and extracts to a 64MB SQL file:

http://dumps.wikimedia.org/enwiki/20110901/enwiki-20110901-category.sql.gz

If I extract it, open it in Emacs and try to scroll around in the file,
Emacs freezes for a minute or longer.

Actually, with the above file I had to kill Emacs, because it did
not respond to anything after I did some random scrolling in the
file, jumping to the end and then scrolling backwards, etc.

I tried it with -Q (Emacs 24.1.1 on Windows).


Stefan Monnier

Oct 24, 2012, 3:32:30 PM
> was all my mistake and I was not aware the linum.el caused all the
> trouble! In fact, I do not really need this minor-mode but I never

You might like to try nlinum.el (from GNU ELPA).
It provides similar functionality, but with a slightly different
implementation and I've just updated it to version 1.1 (which should
appear in GNU ELPA within a few days) which speeds it up significantly
on large files.


Stefan
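
A minimal sketch of trying nlinum as suggested above, assuming the package has been installed from GNU ELPA. Hooking it into `bibtex-mode-hook' is only an illustration tied to the original question; any mode hook would do.

```elisp
;; Install once, interactively:  M-x package-install RET nlinum RET
;; Then, in the init file, enable it only where line numbers are wanted.
;; The choice of `bibtex-mode-hook' is just an illustration.
(add-hook 'bibtex-mode-hook #'nlinum-mode)
```

If the init file also turns on linum (for example via `global-linum-mode'), that setting would need to be removed so the two modes do not both run.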

Stefan Monnier

Oct 24, 2012, 3:47:57 PM
> My other tip is long lines cause problems.

Yes, we have various performance problems with very long lines.
Some of those can be worked around, e.g. using fundamental-mode or
turning off things like line-number-mode.


Stefan
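
The two workarounds mentioned above could be bundled into one command, roughly as follows. The command name `my-long-lines-workaround' is hypothetical, and this is a sketch rather than a complete fix for the long-line problems.

```elisp
;; Hypothetical command name, for illustration only.
(defun my-long-lines-workaround ()
  "Apply the workarounds mentioned above for buffers with very long lines."
  (interactive)
  ;; Drop the major mode and its font-lock rules for this buffer.
  (fundamental-mode)
  ;; `line-number-mode' is a global minor mode, so this turns off the
  ;; mode-line line number in every buffer, not only the current one.
  (line-number-mode -1))
```

Run it with M-x my-long-lines-workaround in the affected buffer; M-x line-number-mode turns the mode-line line number back on afterwards.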

Ludwig, Mark

Oct 24, 2012, 4:12:11 PM
to Tom, help-gn...@gnu.org
> From: Tom
> Sent: Wednesday, October 24, 2012 2:09 PM
> To: help-gn...@gnu.org
> Subject: Re: Handling large files with Emacs
Sure, I reported this as bug 9589 (http://debbugs.gnu.org/cgi/bugreport.cgi?bug=9589) against 23.1 on Windows. You may wish to look at the bug report for the partial information I determined about which commands are fast and which are slow with very long lines.

Cheers,
Mark


Drew Adams

Oct 24, 2012, 4:19:29 PM
to Fab, help-gn...@gnu.org
> It actually was all my mistake and I was not aware the
> linum.el caused all the trouble!

Maybe linum should (by default) let users know when they might incur a
performance penalty because the buffer is large. Maybe we should have a user
option that specifies a size limit beyond which linum would alert you to the
potential slowdown, or even (optionally) turn itself off.

That's what we did for font-locking, for instance (option
`font-lock-maximum-size'), before just-in-time font-locking was available.

IOW, maybe we should not be too quick to just ascribe this to user error. Maybe
linum could be more helpful to users.
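
Until something like that exists in linum itself, the idea can be approximated in an init file. This is a sketch only: `my-linum-size-limit' and `my-linum-maybe-disable' are hypothetical names that are not part of linum.el, and the default limit is an arbitrary assumption.

```elisp
;; Hypothetical option and function names; not part of linum.el.
(defcustom my-linum-size-limit 1000000
  "Buffer size in characters above which `linum-mode' is turned off."
  :type 'integer
  :group 'linum)

(defun my-linum-maybe-disable ()
  "Turn off `linum-mode' here if the buffer exceeds `my-linum-size-limit'."
  (when (and (bound-and-true-p linum-mode)
             (> (buffer-size) my-linum-size-limit))
    (linum-mode -1)
    (message "linum-mode disabled: buffer exceeds %d characters"
             my-linum-size-limit)))

;; The hook runs whenever `linum-mode' is toggled; when the function
;; above turns the mode off, the second run finds it disabled and
;; does nothing.
(add-hook 'linum-mode-hook #'my-linum-maybe-disable)
```

The message in the echo area covers the "alert" half of the suggestion; the automatic switch-off covers the optional half.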

