Replace per-byte ga_append() loop with a single ga_grow() and mch_memmove() for each source blob. This eliminates N grow-checks and function call overhead for blob concatenation.
Here is benchmark result of 16MB + 16MB blob concatenation × 50 iterations
Before: 7.6107s
After: 1.4694s
let s:target_size = 10000000 let s:iterations = 50 let s:b1 = 0zDEADBEEF while len(s:b1) < s:target_size let s:b1 = s:b1 + s:b1 endwhile let s:b2 = copy(s:b1) echo printf('blob len: %d + %d bytes', len(s:b1), len(s:b2)) " Warm up for i in range(3) let _ = s:b1 + s:b2 endfor let s:start = reltime() for i in range(s:iterations) let _ = s:b1 + s:b2 endfor let s:elapsed = reltimefloat(reltime(s:start)) echo printf('blob len: %d + %d bytes, iterations: %d, total: %.4f s, per-op: %.6f s', \ len(s:b1), len(s:b2), s:iterations, s:elapsed, s:elapsed / s:iterations)
https://github.com/vim/vim/pull/19494
(1 file)
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.![]()
@chrisbra commented on this pull request.
In src/eval.c:
>
if (b == NULL)
return;
- for (i = 0; i < blob_len(b1); i++)
- ga_append(&b->bv_ga, blob_get(b1, i));
- for (i = 0; i < blob_len(b2); i++)
- ga_append(&b->bv_ga, blob_get(b2, i));
+ len1 = (int)blob_len(b1);
+ len2 = (int)blob_len(b2);
+
+ if (len1 + len2 > 0 && ga_grow(&b->bv_ga, len1 + len2) == OK)
again, len1 + len2 can overflow
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.![]()
@Copilot commented on this pull request.
Improves performance of blob concatenation in eval_addblob() by replacing per-byte appends with a single pre-allocation and bulk memory copies.
Changes:
mch_memmove() and set final ga_len.💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
In src/eval.c:
> + len1 = (int)blob_len(b1);
+ len2 = (int)blob_len(b2);
+
+ if (len1 + len2 > 0 && ga_grow(&b->bv_ga, len1 + len2) == OK)
+ {
+ if (len1 > 0)
+ mch_memmove((char_u *)b->bv_ga.ga_data,
+ (char_u *)b1->bv_ga.ga_data, (size_t)len1);
+ if (len2 > 0)
+ mch_memmove((char_u *)b->bv_ga.ga_data + len1,
+ (char_u *)b2->bv_ga.ga_data, (size_t)len2);
+ b->bv_ga.ga_len = len1 + len2;
+ }
There’s a potential signed integer overflow in len1 + len2 (used in the > 0 check, passed to ga_grow(), and assigned to ga_len). If the sum overflows, ga_grow() may under-allocate and the subsequent mch_memmove() writes can go out of bounds. Consider computing the total in size_t (or the project’s preferred unsigned size type), explicitly checking len1 <= INT_MAX - len2 (or equivalent limit for ga_len/ga_grow), and only then converting to int for ga_grow()/ga_len.
In src/eval.c:
> + mch_memmove((char_u *)b->bv_ga.ga_data, + (char_u *)b1->bv_ga.ga_data, (size_t)len1); + if (len2 > 0) + mch_memmove((char_u *)b->bv_ga.ga_data + len1, + (char_u *)b2->bv_ga.ga_data, (size_t)len2);
This change reaches into b1->bv_ga.ga_data / b2->bv_ga.ga_data directly, which couples eval_addblob() to the internal blob storage representation (and to ga_itemsize == 1). To make this more robust, consider adding a brief comment (or an assertion, if the codebase uses them here) documenting the assumption that blob data is stored contiguously as bytes in bv_ga.ga_data.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.![]()
@mattn pushed 2 commits.
—
View it on GitHub or unsubscribe.
You are receiving this because you are subscribed to this thread.![]()
@mattn commented on this pull request.
In src/eval.c:
>
if (b == NULL)
return;
- for (i = 0; i < blob_len(b1); i++)
- ga_append(&b->bv_ga, blob_get(b1, i));
- for (i = 0; i < blob_len(b2); i++)
- ga_append(&b->bv_ga, blob_get(b2, i));
+ len1 = (int)blob_len(b1);
+ len2 = (int)blob_len(b2);
+
+ if (len1 + len2 > 0 && ga_grow(&b->bv_ga, len1 + len2) == OK)
Thank you. Fixed
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.![]()