[vim/vim] Optimize file read performance for large files (PR #19612)

mattn

unread,

Mar 8, 2026, 3:28:35 AM (4 days ago) Mar 8

to vim/vim, Subscribed

Use word-at-a-time technique to skip ASCII bytes during UTF-8 validation in fileio.c
Replace byte-by-byte newline/NUL scanning with memchr() to leverage SIMD-optimized libc implementations
Skip unnecessary netbeans/channel notifications and may_invoke_listeners() during initial file read (ML_APPEND_NEW) in memline.c

Performance

$ TERM=dumb ./vim -u NONE -N --not-a-term -c 'q' /tmp/large_test_6000000.txt

(6,000,000 lines, 657MB, 3-run average)

	real	user	sys
Before	2.01s	1.07s	0.94s
After	1.18s	0.30s	0.87s
Improvement	-41%	-72%	-7%

Total 40% performance improvement. The main gain is in user-space CPU time, reduced by ~72% through avoiding per-byte iteration.

You can view, comment on, or merge this pull request online at:

https://github.com/vim/vim/pull/19612

Commit Summary

97d20ff Optimize file read performance for large files

File Changes

(2 files)

M src/fileio.c (164)
M src/memline.c (18)

Patch Links:

—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 8, 2026, 6:49:18 AM (4 days ago) Mar 8

to vim/vim, Push

@mattn pushed 1 commit.

a429e48 Fix UTF-8 validation loop after removing ++p from for-header

—
View it on GitHub or unsubscribe.
You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 8, 2026, 6:59:19 AM (4 days ago) Mar 8

to vim/vim, Push

@mattn pushed 1 commit.

d88a68f Fix UTF-8 validation loop after removing ++p from for-header

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 8, 2026, 7:15:50 AM (4 days ago) Mar 8

to vim/vim, Push

@mattn pushed 1 commit.

c650ad8 fix(fileio): fix NONASCII_MASK for Win64 where long_u != long

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 8, 2026, 9:20:35 AM (4 days ago) Mar 8

to vim/vim, Push

@mattn pushed 1 commit.

0aa7029 docs(fileio): update NONASCII_MASK comment to describe the macro's purpose

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

Christian Brabandt

unread,

Mar 9, 2026, 3:21:20 PM (3 days ago) Mar 9

to vim/vim, Subscribed

chrisbra left a comment (vim/vim#19612)

Thanks. I reviewed the changes, but I am not sure I missed anything so I am a bit cautious to include this. Let me double check this over the next day

—
Reply to this email directly, view it on GitHub.

You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 9, 2026, 7:50:41 PM (3 days ago) Mar 9

to vim/vim, Subscribed

mattn left a comment (vim/vim#19612)

@chrisbra Thank you.

Below is a review result by Claude Opus 4.6

Code Review: PR #19612 — Optimize file read performance for large files

Overview

This PR optimizes Vim's file reading path in three ways:

Word-at-a-time ASCII skipping during UTF-8 validation (fileio.c)
memchr()-based newline/NUL scanning replacing byte-by-byte iteration (fileio.c)
Skip unnecessary notifications (netbeans, channel, listeners) during initial file read (memline.c)

Claims ~40% wall-time improvement on a 657MB file.

Analysis

1. Word-at-a-time ASCII skip (fileio.c, NONASCII_MASK)

Correctness concern — alignment and strict aliasing:

memcpy(&word, p, sizeof(long_u)) is correct for avoiding strict-aliasing violations — good.
No alignment requirement since memcpy is used instead of pointer cast.

Potential issue — reading past buffer bounds:

while (p + (int)sizeof(long_u) <= ascii_end)

ascii_end = ptr + size. The check ensures we don't read past ptr + size. This looks correct.

Minor style note: The cast (int)sizeof(long_u) is fine on practical platforms but sizeof returns size_t (unsigned). On platforms where int is smaller, this could theoretically be an issue, but sizeof(long_u) is 4 or 8, so it's safe in practice.

2. Loop refactor for `++p` (UTF-8 validation loop)

The original loop had for (p = ptr; ; ++p) with p += l - 1 at the end (because ++p in the for-loop adds 1). The new code removes ++p from the for-statement and uses explicit p += l and ++p in the branches.

Correctness concern — BAD_DROP case:

if (bad_char_behavior == BAD_DROP)
{
    mch_memmove(p, p + 1, todo - 1);
    --size;
}
else
{
    if (bad_char_behavior != BAD_KEEP)
        *p = bad_char_behavior;
    ++p;
}

In the BAD_DROP case, p is not incremented — correct, because the byte was removed and the next byte is now at p. Good.

3. memchr()-based newline/NUL scanning

Correctness: The restructuring replaces a single byte-by-byte loop with memchr() for NL, then a nested memchr() for NUL within each segment. The logic is equivalent but worth careful review:

NUL→NL replacement before newline: The inner loop memchr(scan, NUL, nl - scan) correctly scans only up to the found newline.
Tail handling (no more newlines): memchr(ptr, NUL, end - ptr) for remaining data — correct.
size = -1 at end: This replaces the old --size >= 0 loop exit. The linerest calculation uses ptr - line_start which is now based on end, so size is no longer needed. However...

Concern with size = -1: After the while (ptr < end) loop, size is set to -1. Looking at the code after this block, linerest = (long)(ptr - line_start) uses ptr directly, so size isn't used for that. But I'd verify that size isn't read elsewhere after this point in the same iteration. The old loop naturally left size as -1 due to --size >= 0 termination, so this should be equivalent.

Edge case — break on error: When ml_append fails or read_count hits 0, the loop breaks with line_start = ptr. The old code had line_start = ptr set inside the inner else-branch. The new code sets line_start = ptr + 1 after the skip/append block, then ++ptr. On break, line_start is explicitly set to ptr (the NUL-terminated newline position). This matches the old behavior.

4. memline.c — skip notifications during ML_APPEND_NEW

Correctness: ML_APPEND_NEW is the flag passed during initial file read. Skipping netbeans notifications, channel writes, and may_invoke_listeners() during initial read is logically sound — no one is listening for line-by-line changes during file load.

Risk: If any plugin or integration relies on per-line notifications during :edit file load, this would be a behavior change. However, the buffer isn't fully set up during initial read, so this seems safe. The existing readfile() flow already batches other operations.

Suggestions

NONASCII_MASK portability: The macro assumes long_u is a power-of-2 size and that the division trick works. This is a well-known idiom (Linux kernel's has_zero() / has_byte()), so it's fine, but a brief comment noting the origin might help future maintainers.
Consider CHAR_BIT guard: The mask assumes 8-bit bytes. Vim already assumes this everywhere, so this is a non-issue in practice.
Test coverage: The PR doesn't include new tests. The UTF-8 validation refactor changes control flow — especially the BAD_DROP / BAD_KEEP branches. Existing tests in test_utf8.vim or test_fileio.vim should cover these paths, but it would be good to confirm the existing test suite passes with these changes, particularly:
- Files with mixed ASCII/non-ASCII content
- Files with invalid UTF-8 sequences ('fileencoding' conversion with ++bad=)
- Files with embedded NUL bytes
- DOS-format files (CR-LF detection and the file_rewind / goto retry path)
The size = -1 assignment is a bit opaque. A comment explaining why (to match the contract expected by linerest calculation) would help readability.

Summary

Overall: This is a well-structured performance optimization. The memchr() and word-at-a-time techniques are proven approaches. The memline.c changes are a clean, low-risk optimization. The main risk area is the control-flow refactor in the UTF-8 validation loop and the newline scanning loop — both are behavioral equivalents of the originals but need thorough testing with edge cases (bad UTF-8, NUL bytes, DOS line endings, read_count limits, skip_count). If the existing test suite passes, this looks good to merge.

—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 9, 2026, 7:57:03 PM (3 days ago) Mar 9

to vim/vim, Subscribed

mattn left a comment (vim/vim#19612)

Below are the function call dwell times when loading a 600MB file before and after applying this patch.

Before

readfile       75.02%
ml_append_int  6.03%
memset         4.05%
ml_updatechunk 2.47%
ml_find_line   2.27%
memmove        1.43%

After

ml_append_int  18.36%
readfile       14.59%
memset         12.32%
memchr         9.10%
ml_updatechunk 7.50%
ml_find_line   6.89%

—
Reply to this email directly, view it on GitHub.

You are receiving this because you are subscribed to this thread.

NRK

unread,

Mar 10, 2026, 11:16:23 AM (2 days ago) Mar 10

to vim/vim, Subscribed

@N-R-K requested changes on this pull request.

In src/fileio.c:

>  		    int	 l;
 
+		    // Skip ASCII bytes quickly using word-at-a-time check.
+		    {
+			char_u *ascii_end = ptr + size;
+			while (p + (int)sizeof(long_u) <= ascii_end)

This pattern of computing pointer and then checking if it's in bound is common, but UB. From the standard:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

The proper way to check this is to see how much space is remaining by subtraction and then comparing it against the amount you want to add:

⬇️ Suggested change

-			while (p + (int)sizeof(long_u) <= ascii_end)
+			while (ascii_end - p >= sizeof(long_u))

—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 11, 2026, 10:19:54 AM (yesterday) Mar 11

to vim/vim, Push

@mattn pushed 1 commit.

1f5ac30 Update src/fileio.c

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 11, 2026, 10:22:55 AM (yesterday) Mar 11

to vim/vim, Push

@mattn pushed 2 commits.

7195801 Merge https://github.com/vim/vim into optimize-file-read-performance
4533670 Merge branch 'optimize-file-read-performance' of github.com:mattn/vim into optimize-file-read-performance

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

mattn

unread,

Mar 11, 2026, 10:29:06 AM (yesterday) Mar 11

to vim/vim, Push

@mattn pushed 1 commit.

80d8bcd Fix sign-compare warning in fileio.c

—
View it on GitHub or unsubscribe.

You are receiving this because you are subscribed to this thread.

Christian Brabandt

unread,

Mar 11, 2026, 4:41:39 PM (20 hours ago) Mar 11

to vim/vim, Subscribed

chrisbra left a comment (vim/vim#19612)

Alright thanks. Nice optimization. 🙏

—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.

Christian Brabandt

unread,

Mar 11, 2026, 4:41:48 PM (20 hours ago) Mar 11

to vim/vim, Subscribed

Closed #19612 via 2ca96b0.

—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.

Reply all

Reply to author

Forward

[vim/vim] Optimize file read performance for large files (PR #19612)

mattn

You can view, comment on, or merge this pull request online at:

Commit Summary

File Changes

Patch Links:

mattn

mattn

mattn

mattn

Christian Brabandt

mattn

Code Review: PR #19612 — Optimize file read performance for large files

Overview

Analysis

1. Word-at-a-time ASCII skip (fileio.c, NONASCII_MASK)

2. Loop refactor for ++p (UTF-8 validation loop)

3. memchr()-based newline/NUL scanning

4. memline.c — skip notifications during ML_APPEND_NEW

Suggestions

Summary

mattn

Before

After

NRK

mattn

mattn

mattn

Christian Brabandt

Christian Brabandt

2. Loop refactor for `++p` (UTF-8 validation loop)