[vim/vim] Use the file header to distinguish between .tbz files (Issue #16761)


Jim Zhou

Feb 28, 2025, 1:53:38 PM
to vim/vim, Subscribed

Currently, .tbz files are treated as .tar.bz2 archives, so bunzip2 is used to decompress them. However, with the growing popularity of bzip3, some .tbz files may be compressed using bzip3 instead.

By using the file header to distinguish between these two formats, we can make the handling of .tbz files better.

Here are some references on file format I found:


Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.Message ID: <vim/vim/issues/16761@github.com>


Christian Brabandt

Mar 1, 2025, 11:18:01 AM
to vim/vim, Subscribed

> Currently, .tbz files are treated as .tar.bz2 archives, so bunzip2 is used to decompress them. However, with the growing popularity of bzip3, some .tbz files may be compressed using bzip3 instead.

That doesn't seem correct. tar.vim uses file to determine the subtype, and file correctly recognizes both bzip3- and bzip2-compressed archives, so tar.vim will use either bzip3 or bzip2 as appropriate to decode the archive. See here

But I noticed we introduced a syntax error when reading from tar files. I fixed that in 8ac975d




Christian Brabandt

Mar 1, 2025, 11:18:03 AM
to vim/vim, Subscribed

Closed #16761 as completed.



Christian Brabandt

Mar 1, 2025, 1:18:21 PM
to vim/vim, Subscribed

Reopened #16761.



Jim Zhou

Mar 1, 2025, 1:19:06 PM
to vim/vim, Subscribed

Maybe we can split the tarfile =~# '\.\(bz2\|tbz\|tb2\)$' case into two cases. For example:

elseif tarfile =~# '\.\(bz2\|tb2\)$'
  exe "sil! r! bzip2 -d -c -- ".shellescape(tarfile,1)." | ".g:tar_cmd." -".g:tar_browseoptions." - "
elseif tarfile =~# '\.tbz$'
  " Read the first three bytes of the file to determine whether it was
  " compressed by bzip2 or bzip3.
  " bzip2 files start with 'BZh'; bzip3 files start with 'BZ3'.
  " readfile() returns a list, so take the first item before slicing.
  let header = get(readfile(tarfile, 'b', 1), 0, '')[0:2]
  if header ==# 'BZh'
    exe "sil! r! bzip2 -d -c -- ".shellescape(tarfile,1)." | ".g:tar_cmd." -".g:tar_browseoptions." - "
  elseif header ==# 'BZ3'
    exe "sil! r! bzip3 -d -c -- ".shellescape(tarfile,1)." | ".g:tar_cmd." -".g:tar_browseoptions." - "
  endif
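
The header check above could also be factored into a small helper, so the browse logic only has to branch on a result string. This is only a rough sketch: the function name TbzCompressionType is hypothetical and not part of tar.vim, and it assumes the first three bytes are enough to tell the two formats apart ('BZh' for bzip2, 'BZ3' for bzip3).

```vim
" Hypothetical helper (not part of tar.vim): classify a file by its
" leading magic bytes. Returns 'bzip2', 'bzip3', or '' if unknown.
function! TbzCompressionType(fname) abort
  if !filereadable(a:fname)
    return ''
  endif
  " 'b' reads in binary mode; a max of 1 stops after the first line.
  " get() guards against an empty list for an empty file.
  let magic = get(readfile(a:fname, 'b', 1), 0, '')[0:2]
  if magic ==# 'BZh'
    return 'bzip2'
  elseif magic ==# 'BZ3'
    return 'bzip3'
  endif
  return ''
endfunction
```

The caller could then pick the decompressor from the returned string and fall back to the existing bzip2 behaviour when the result is empty.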




Jim Zhou

Mar 2, 2025, 9:52:39 PM
to vim/vim, Subscribed

Closed #16761 as completed.


