Improving vim startup time for very large files


Ron Aaron

Jul 17, 2013, 1:58:45 AM
to vim...@googlegroups.com
I (and my colleagues) often need to view extremely large log files (> 1G). From force of habit we use vim; but vim takes a very long time to open huge files.

Even turning off the swap file etc. only partially mitigates the load time.
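(For reference, this is roughly the invocation we use - huge.log is a placeholder, and -n is what disables the swap file:

vim -u NONE -N -n -c 'syntax off' huge.log

-u NONE also skips the vimrc and plugins, -N keeps vim out of vi-compatible mode, and 'syntax off' drops highlighting.)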

I would like to suggest that perhaps vim could be modified to open files in 'stages'. That is, the first relatively small chunk could be read in and its lines processed, and the file shown, while vim continued to load the rest of the file.

This would give a better user experience since the load time would be very fast even on large files. The downside of course is that nothing requiring line-numbers could be done until the entire file had been read.

Comments?

Ingo Karkat

Jul 17, 2013, 2:20:15 AM
to vim...@googlegroups.com
I'd say the biggest downside is that such an implementation would
require big and fundamental changes to how a buffer is internally
represented, and because the implementation is in C, it is quite hard
to introduce the proxy objects that would stand in for the unloaded
parts. So there's little chance that this will happen. (Not that my
potentially wrong assessment should discourage anyone from attempting
to implement this - I would love to see such a patch.)

The best I can think of is the LargeFile plugin; please try it out:
http://www.vim.org/scripts/script.php?script_id=1506

-- regards, ingo

Ben Fritz

Jul 17, 2013, 11:22:41 AM
to vim...@googlegroups.com
On Wednesday, July 17, 2013 1:20:15 AM UTC-5, Ingo Karkat wrote:
> On 17-Jul-2013 07:58 +0200, Ron Aaron wrote:
>
> > I (and my colleagues) often need to view extremely large log files (>
> > 1G). From force of habit we use vim; but vim takes a very long time
> > to open huge files.
>
> The best I can think of is the LargeFile plugin; please try it out:
> http://www.vim.org/scripts/script.php?script_id=1506

I've personally found, even on a quad-core Windows 7 system with my entire Vim config disabled and the settings from LargeFile applied manually, that just reading or writing files of several hundred megabytes is unbelievably slow in Vim.

The "head" and "tail" (and I think there is another) command in Unix-like systems allows extracting parts of a file, and I'm sure there's a way to put it back together later. I'm not sure of the equivalent on Windows. I don't actually ever need to edit a file that big, so I haven't bothered looking for a plugin to do that splicing for me.

Other than that...you'll probably just need to use a different program for this.

Mike Williams

Jul 17, 2013, 11:49:51 AM
to vim...@googlegroups.com, Ben Fritz
Does anyone have hard numbers? I have just loaded an ~900MB PDF file
in ~7s (Win7 x64, 8GB, Core2Duo 2.3GHz) with my normal VIM config
(although I do have 'maxmem' always set to maximum). The first load
took an age (>40s) because the file was coming off disk; once it was
in the OS file cache, restarting vim and re-reading it was quick (I'd
expect some delay with such a large file). Is this the sort of
pattern you are seeing?
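For anyone wanting comparable numbers, on Linux something like this
separates cold- and warm-cache timings (big.pdf is a placeholder; I
don't know a Windows equivalent for dropping the cache):

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches   # drop the OS file cache
time vim -c 'q' big.pdf                              # cold-cache load time
time vim -c 'q' big.pdf                              # warm-cache load time

The first timing includes reading from disk; the second should mostly
be vim's own line processing.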

Mike
--
EXPERIENCE - experience is a wonderful thing. It enables you to
recognise a mistake when you make it again.

Benjamin Fritz

Jul 17, 2013, 12:13:22 PM
to Mike Williams, vim_dev
On Wed, Jul 17, 2013 at 10:49 AM, Mike Williams
<mike.w...@globalgraphics.com> wrote:
>
> Does anyone have hard numbers? I have just loaded an ~900MB PDF file in ~7s
> (Win7 x64, 8GB, Core2Duo 2.3GHz), my normal VIM config (although I do have
> maxmem always set to maximum).

Now try writing it. I suppose if Vim is only being used as a viewer
this might be a non-issue, but I discovered the problem when trying to
create a file with a huge number of lines to test how Vim responded
to...something. I don't exactly remember what I was trying to test,
only that I gave up on having Vim create the file and instead did it
using command-line tools (a huge pain on Windows), and then eventually
gave up on testing in general because Vim was taking so long to
manipulate the file.
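(These days, on a Unix-ish shell, generating that kind of test file is a one-liner - sizes and names here are arbitrary:

seq 1 10000000 > numbered.txt                 # ten million short numbered lines
yes '' | head -n 10000000 > blank-lines.txt   # or ten million empty lines

On Windows it was far more painful.)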

> First time to load the file took an age
> (>40s) due to loading it off disk

That sounds about right. Or maybe longer.

> - once it is in the OS file cache
> restarting vim to read the file was quick (I'd expect some delay with such a
> large file). Is this the sort of pattern you are seeing?
>

I don't recall.

Mike Williams

Jul 17, 2013, 12:54:59 PM
to Benjamin Fritz, vim_dev
On 17/07/2013 17:13, Benjamin Fritz wrote:
> On Wed, Jul 17, 2013 at 10:49 AM, Mike Williams
> <mike.w...@globalgraphics.com> wrote:
>>
>> Does anyone have hard numbers? I have just loaded an ~900MB PDF file in ~7s
>> (Win7 x64, 8GB, Core2Duo 2.3GHz), my normal VIM config (although I do have
>> maxmem always set to maximum).
>
> Now try writing it. I suppose if Vim is only being used as a viewer
> this might be a non-issue, but I discovered the problem when trying to
> create a file with a huge number of lines to test how Vim responded
> to...something. I don't exactly remember what I was trying to test,
> only that I gave up on having Vim create the file and instead did it
> using command-line tools (a huge pain on Windows), and then eventually
> gave up on testing in general because Vim was taking so long to
> manipulate the file.

Elapsed time is ~30s. Putting a profiler on VIM while it was writing
the file showed around ~5s of CPU time driving the write to disk; the
rest was spent waiting for file IO to complete. So both reading and
writing of large files are (not too surprisingly) IO bound and depend
on OS behaviour and current system usage (available memory for the
file cache, paging other apps and data out as required, etc.).

The solution for handling this would be concurrent reading/writing
threads, with some new and interesting problems in dealing with edits
made before the concurrent activities have finished. Given how
atypical this issue is, I would think there is not a great push to
improve things.

>> First time to load the file took an age
>> (>40s) due to loading it off disk
>
> That sounds about right. Or maybe longer.

Cold file caches are a pain in the a**e! ;-)

>> - once it is in the OS file cache
>> restarting vim to read the file was quick (I'd expect some delay with such a
>> large file). Is this the sort of pattern you are seeing?
>>
>
> I don't recall.
>


Ben Fritz

Jul 17, 2013, 1:18:46 PM
to vim...@googlegroups.com, Benjamin Fritz
On Wednesday, July 17, 2013 11:54:59 AM UTC-5, Mike Williams wrote:
> On 17/07/2013 17:13, Benjamin Fritz wrote:
> > On Wed, Jul 17, 2013 at 10:49 AM, Mike Williams
> > <mike.w...@globalgraphics.com> wrote:
> >>
> >> Does anyone have hard numbers? I have just loaded an ~900MB PDF file in ~7s
> >> (Win7 x64, 8GB, Core2Duo 2.3GHz), my normal VIM config (although I do have
> >> maxmem always set to maximum).
> >
> > Now try writing it. I suppose if Vim is only being used as a viewer
> > this might be a non-issue, but I discovered the problem when trying to
> > create a file with a huge number of lines to test how Vim responded
> > to...something. I don't exactly remember what I was trying to test,
> > only that I gave up on having Vim create the file and instead did it
> > using command-line tools (a huge pain on Windows), and then eventually
> > gave up on testing in general because Vim was taking so long to
> > manipulate the file.
>
> Elapsed time is ~30s.

That's not what I saw. I let Vim run for several minutes after doing :w, and then force-killed it.

My file was millions of very small lines (maybe empty, I don't remember). I don't plan on trying again for now.

Ernie Rael

Jul 17, 2013, 6:13:20 PM
to vim...@googlegroups.com
On 7/17/2013 9:54 AM, Mike Williams wrote:
> ...
> Elapsed time is ~30s. Putting a profiler on VIM while it was writing
> the file it reported around ~5s CPU time driving the write to disk -
> the rest of it is waiting for file IO to complete. So both reading
> and writing of large files is (not too surprisingly) IO bound and
> dependent on OS behaviour and current system usage (available memory
> for file cache, paging other apps and data out as required, etc.)

On some systems it also depends on how the file is written: is it a
single huge write, lots of small ones, or something in between?

When it's writing, vim copies all the characters, doesn't it? I wonder
how well it would perform if the bytes were copied into a memory-mapped
file instead.

-ernie

John Little

Jul 18, 2013, 11:29:59 PM
to vim...@googlegroups.com
On Wednesday, July 17, 2013 5:58:45 PM UTC+12, Ron Aaron wrote:
> I (and my colleagues) often need to view extremely large log files (> 1G). From force of habit we use vim;

Vim's syntax colouring is brilliant for viewing log files; it's simple enough to set up on-the-fly highlighting of the stuff one is interested in.

> but vim takes a very long time to open huge files.

A "very long time"? My vim on Kubuntu 13.04, on a 7 year old dual core Athlon, 4 GiB RAM, takes 30 s to load 1 GiB of C source, after clearing the OS disc cache. I mention this because there have been reports of slowness of vim running on "network shares", and maybe there's some such slowing you down.

If you're just looking at the end of the file,

tail -n 100000 large_file | vim -

is quick and can be useful. (Shame vim won't open a loop device.)
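A middle-of-the-file variant of the same trick (the line numbers are arbitrary, and the trailing q makes sed stop reading once it is past the range):

sed -n '5000000,5100000p; 5100000q' large_file | vim -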

Regards, John Little
