[Critical] Parallel extraction silently corrupts files

伊斯塔凛

May 19, 2024, 11:14:09 AM
to Bandizip for Windows forum
When parallel extraction is enabled, Bandizip performs unexpected file writes that corrupt files. The extraction was canceled at the overwrite prompt for an existing file, but the actual write was already in progress, so the file was overwritten and damaged. The damage is very subtle and leads to permanent data loss.

伊斯塔凛

May 19, 2024, 11:28:37 AM
to Bandizip for Windows forum
After discovering the problem, I realized that it had already occurred many times. Because Bandizip writes the original file's meta-information to the extracted file, there is nothing to check except the file size, so assessing the extent of the data loss is impossible.

seyo IM

May 19, 2024, 9:34:39 PM
to Bandizip for Windows forum
Hello, this is Bandisoft.

To fix the issue you have described, we need to be able to reproduce it, but we could not. Here is what we tried:

1. Compress a.jpg, b.jpg, and c.jpg into test.zip.
2. Create another b.jpg in the c:\out folder.
3. Decompress test.zip into c:\out.
4. A dialog box appears asking whether to overwrite.

In this process, the b.jpg file was not overwritten. Please let us know if there is any other way to replicate your problem.

On Monday, May 20, 2024 at 12:28:37 AM UTC+9, 伊斯塔凛 wrote:

伊斯塔凛

May 20, 2024, 1:45:44 AM
to Bandizip for Windows forum
Your test was incomplete. The file is not overwritten while the dialog is shown, but after the extraction is canceled, the file is overwritten at random.

As I said at the beginning, the problematic operation is canceling at the overwrite prompt.

伊斯塔凛

May 20, 2024, 2:24:33 AM
to Bandizip for Windows forum
This problem exists in all versions after Bandizip 7.22. The file damage that may have accumulated over more than two years is difficult to estimate, and even now there is no way to track down the damaged files.

seyo IM

May 20, 2024, 7:38:24 PM
to Bandizip for Windows forum
Thanks to your report, we found that the issue you described here does exist in the app. We greatly appreciate your help.

We are aware that the issue is a very serious one, and we are now looking for ways to fix it as soon as possible. We will let you know if we make any progress.

Thank you very much once again, and our deepest apologies for any damage or inconvenience caused by this.

Feel free to contact us again if you have any other questions or issues.

On Monday, May 20, 2024 at 3:24:33 PM UTC+9, 伊斯塔凛 wrote:

伊斯塔凛

May 20, 2024, 10:49:49 PM
to Bandizip for Windows forum
I have doubts about Bandizip's extraction process. The software sets the original creation and modification times before each file has been completely extracted. This is the biggest obstacle to inspecting files afterwards, and it seems very unreasonable.
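
To illustrate the ordering concern, here is a minimal Python sketch (not Bandizip's actual code). It applies the archived modification time only after every byte has been written, so an interrupted file keeps a fresh timestamp and stands out. The name `write_then_stamp` and its parameters are hypothetical.

```python
import os

def write_then_stamp(dest_path, chunks, archived_mtime):
    """Write all chunks, then apply the archived modification time."""
    with open(dest_path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
        out.flush()
        os.fsync(out.fileno())
    # Reached only if the write loop finished without an exception,
    # so a partially written file never carries the archived timestamp.
    os.utime(dest_path, (archived_mtime, archived_mtime))
```

With this ordering, a file whose "Date modified" matches the archive is at least known to have been written to the end.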

In addition, I would like the developers' advice on how to track down corrupted files. As I said, nothing distinguishes a corrupted file from an intact one.
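
One possible check, assuming the original archives are still available: a zip archive stores the size and CRC-32 of every entry, so the extracted files can be compared against those values. The sketch below uses Python's standard `zipfile` and `zlib` modules; the function name `find_mismatched_files` is hypothetical.

```python
import os
import zipfile
import zlib

def find_mismatched_files(archive_path, extracted_dir):
    """Return entry names whose extracted file differs in size or CRC-32."""
    mismatched = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            on_disk = os.path.join(extracted_dir, *info.filename.split("/"))
            if not os.path.isfile(on_disk):
                continue  # never extracted, so there is nothing to compare
            crc, size = 0, 0
            with open(on_disk, "rb") as f:
                for block in iter(lambda: f.read(1 << 20), b""):
                    crc = zlib.crc32(block, crc)
                    size += len(block)
            if size != info.file_size or crc != info.CRC:
                mismatched.append(info.filename)
    return mismatched
```

A mismatch does not prove corruption. It also appears when the file on disk is a different version that was deliberately kept, so each reported file still needs a manual look.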

seyo IM

May 21, 2024, 1:02:05 AM
to Bandizip for Windows forum
Hello, this is Bandisoft.

We have now fixed the issue of unintended overwriting, and the fix has been applied to Bandizip 7.34, the app's latest version. (Press F1 to update the app.) Thank you for your report once again.

As for the "Date modified" issue you have described, we are going to fix it in the next update.

On Tuesday, May 21, 2024 at 11:49:49 AM UTC+9, 伊斯塔凛 wrote:

伊斯塔凛

May 21, 2024, 1:38:09 AM
to Bandizip for Windows forum
Bandizip 7.34 does not fix the problem; it aggravates it. If the user chooses to overwrite some (not all) of the files, corrupted data is written. I recommend that the developers test the software more rigorously, because this concerns the safety of users' data.

seyo IM

May 21, 2024, 2:58:10 AM
to Bandizip for Windows forum
We have just released Bandizip 7.35, in which Parallel Extraction is temporarily disabled. We plan to take enough time to diagnose the problem more accurately and fix it.

Our deepest apologies once again.

On Tuesday, May 21, 2024 at 2:38:09 PM UTC+9, 伊斯塔凛 wrote:

seyo IM

May 22, 2024, 1:48:10 AM
to Bandizip for Windows forum
Hello, this is Bandisoft.

First, the timestamp issue that occurs during Parallel Extraction has now been fixed.

Second, regarding the overwriting, the app works as follows:

0. Unzip files with Parallel Extraction enabled.
1. Bandizip asks whether to overwrite the first file.
2. Bandizip starts extracting the first file.
3. Bandizip asks whether to overwrite the second file.
4. Bandizip pauses the first extraction.
5. The user selects "Cancel" in the dialog box.
6. Bandizip cancels the first extraction.

This behavior is not a bug, but it is unintuitive for users. So we have modified the app: when Parallel Extraction is enabled, the dialog box that asks how to resolve a file conflict now turns on "Apply to all files" by default.

This fix has been applied to the app's beta version, not an official full version. Please visit the link below to download the beta version.

Download Bandizip Beta Version: https://www.bandisoft.com/bandizip/beta/

On Tuesday, May 21, 2024 at 3:58:10 PM UTC+9, seyo IM wrote:

伊斯塔凛

May 22, 2024, 3:57:31 AM
to Bandizip for Windows forum
I am dismayed to see that the Bandizip developers still do not understand what the problem is. Suppose there are three files, a, b, and c. If I agree to overwrite a and then cancel at b, random corruption still appears in all files (including the ones I agreed to overwrite), and changing one of the defaults in the beta does not help in the slightest.

I really doubt that Bandizip has been tested rigorously. If the developers can't guarantee data integrity, they might as well remove such an unreliable feature.

伊斯塔凛

May 22, 2024, 4:28:07 AM
to Bandizip for Windows forum
The developers' ability to handle concurrent programming is worrying. Essentially, the extraction process has two phases: unzip and copy. File conflicts are resolved in the copy phase: copy if the user agrees to overwrite, abort if the user cancels. However, Bandizip currently mixes the two phases, so the dialog box ends up controlling concurrency incorrectly. The correct process should handle file conflicts both before extraction and before copying, which would completely eliminate accidental overwriting.
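
A minimal Python sketch of the "before extraction" half of that idea (an illustration under stated assumptions, not Bandizip's code): every conflict is resolved before any thread starts writing, so a cancel at this stage cannot leave a half-written file. The function `plan_extraction` and the `ask` callback with its answers "overwrite", "skip", and "cancel" are hypothetical.

```python
import os

def plan_extraction(names, dest_dir, ask):
    """Resolve all conflicts up front; return names to extract, or None if canceled."""
    approved = []
    for name in names:
        target = os.path.join(dest_dir, name)
        if os.path.exists(target):
            answer = ask(name)
            if answer == "cancel":
                return None          # nothing has been written yet
            if answer != "overwrite":
                continue             # "skip" or anything unrecognized keeps the old file
        approved.append(name)
    return approved
```

Only the returned names would then be handed to the worker threads. A second check just before each file is put in place would cover files that appear in the target folder while the extraction is running.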

伊斯塔凛

May 22, 2024, 4:39:14 AM
to Bandizip for Windows forum
Bandizip's concurrency code produces a bizarre result. The dialog box did pause decompression and copying, but could not interrupt them correctly, and data was even overwritten by mistake.

KH Park

May 22, 2024, 4:40:57 AM
to Bandizip for Windows forum

Hello. I'm a developer of Bandizip.

First of all, thank you for your bug report. I was able to fix the bug thanks to it.


Parallel extraction means there are several extraction threads, and if you cancel one of them, all threads are stopped.

Let's check the timeline.

* Step 1 - Thread1 (a.jpg): Ask overwrite -> User chooses "yes" -> Extracting a.jpg
* Step 2 - Thread2 (b.jpg): Ask overwrite -> User chooses "cancel" -> Thread2 is stopped
* Step 3 - Thread1 (a.jpg): Thread1 is stopped while extracting.

Step 3 is the issue: a.jpg looks broken because you chose "overwrite" for a.jpg, but its extraction had not yet finished when you chose "cancel" for b.jpg.

So, it's the intended behavior.
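
The timeline above can be reproduced with a small Python simulation. It is only a sketch of the described behavior, not Bandizip's code: `extract_in_place` is a hypothetical worker that writes directly into the target file and stops as soon as a shared cancel flag is set.

```python
import threading

def extract_in_place(dest_path, chunks, cancel):
    """Overwrite dest_path chunk by chunk; stop early once cancel is set."""
    with open(dest_path, "wb") as out:   # the old content is discarded here
        for chunk in chunks:
            if cancel.is_set():
                return False             # the file is left partially written
            out.write(chunk)
    return True

def simulate(dest_path):
    """Thread1 is writing a.jpg when the user cancels at b.jpg."""
    cancel = threading.Event()

    def chunks():
        yield b"first half,"
        cancel.set()                     # "Cancel" is pressed for b.jpg at this moment
        yield b"second half"

    return extract_in_place(dest_path, chunks(), cancel)
```

After `simulate` returns `False`, the target holds only the first chunk: neither the old file nor the complete new one.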


If you find an error that does not fall under this case, please describe the steps in more detail.



Best Regards, 
Park, KH




伊斯塔凛

May 22, 2024, 5:01:10 AM
to Bandizip for Windows forum
Your summary is basically right, but the software is doing it wrong, big time.

To start at the very root: the software should not overwrite a file whenever it wants to. In other words, it should assume that the user does not want the file to be overwritten.

We know very well that a normal extraction decompresses the file to a cache path and then copies it to the target path. The safe way to do that copy is to copy the data to a temporary file at the target path, delete the target file, and then rename the copied file. Done this way, it is actually very difficult to lose data, even if any of these steps is interrupted.
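
A minimal Python sketch of this write-then-rename approach, under the assumption that the temporary file lives in the same folder as the target (so the rename stays on one volume). `os.replace` performs the "delete the target and rename" step in a single call. The function name `write_via_temp` and the `.part` suffix are illustrative choices.

```python
import os
import tempfile

def write_via_temp(dest_path, chunks):
    """Write to a temporary file next to dest_path, then swap it into place."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())
        os.replace(tmp_path, dest_path)  # the old file survives until this call
    except BaseException:
        # Interrupted or failed: drop the partial copy, keep the old file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write is interrupted at any point before `os.replace`, the existing file at `dest_path` is untouched.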

The problem with Bandizip is that it uses neither safe file handling nor fine-grained concurrency control. In other words, the software is not designed to respect the safety of the user's data.

伊斯塔凛

May 22, 2024, 5:16:34 AM
to Bandizip for Windows forum
Or, to put it even more simply: it is pretty ridiculous to just interrupt all processes when canceling. Why not treat cancel as "skip all" instead?

seyo IM

May 22, 2024, 8:04:59 PM
to Bandizip for Windows forum
- ...copy it (a file) to the cache path > delete the target file > rename the copied file...

This would hurt extraction performance. As far as we know, no existing archiver works this way. (If there is one, please let us know; we would use it as a reference for improving the app.)

- ...interrupt all processes when canceling...

If the app did not work this way, then to cancel a parallel extraction of several files you would have to press the cancel button once for each file being extracted.

Please contact us again if you have any other questions or issues.

On Wednesday, May 22, 2024 at 6:16:34 PM UTC+9, 伊斯塔凛 wrote:

伊斯塔凛

May 22, 2024, 10:12:15 PM
to Bandizip for Windows forum
1. I mean writing the data as a new file at the target path and then deciding whether to remove the old file. This involves no additional reads or writes, and the overhead of deleting and renaming is almost unnoticeable under normal circumstances.

2. As I said, the developers are not good at concurrent programming. Simply interrupting on “cancel” makes sense in a single thread, but that approach cannot be carried over to concurrent code, and interrupting a thread that is working normally is unthinkable.

Take a single extraction task: a zip file contains a, b, and c. I agree to overwrite a, and then cancel when the extraction reaches b. Interrupting all processes at that point is very stupid. A solution would be to treat all remaining files as “skipped”, instead of Bandizip's practice of corrupting all files.

伊斯塔凛

May 22, 2024, 10:30:27 PM
to Bandizip for Windows forum
Interrupting all processes is reasonable only if the user has not agreed to overwrite any file. Once the user has agreed to overwrite some of the files, the rest should be skipped instead. This has nothing to do with whether overwriting all files is the default.
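
A minimal Python sketch of such a "soft cancel", using the standard `concurrent.futures` module. It illustrates the proposal and is not Bandizip's code: writes that were already approved run to completion, and everything from the canceled file onward is skipped. The function `extract_with_soft_cancel`, the `ask` and `write_file` callbacks, and the `(name, has_conflict)` entry format are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_with_soft_cancel(entries, ask, write_file, workers=4):
    """entries: list of (name, has_conflict). Returns (written, skipped)."""
    written, skipped, futures = [], [], []
    canceled = False
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name, has_conflict in entries:
            if canceled:
                skipped.append(name)     # nothing new starts after a cancel
                continue
            answer = ask(name) if has_conflict else "overwrite"
            if answer == "cancel":
                canceled = True
                skipped.append(name)
            elif answer == "overwrite":
                futures.append((name, pool.submit(write_file, name)))
            else:
                skipped.append(name)
        # Leaving the with-block waits for every submitted write to finish.
    for name, future in futures:
        future.result()                  # re-raise any write error
        written.append(name)
    return written, skipped
```

With the a, b, c example (overwrite a, cancel at b), this returns `(["a"], ["b", "c"])`, and the write of a is never interrupted.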

伊斯塔凛

May 22, 2024, 10:58:35 PM
to Bandizip for Windows forum
"Cancel" can be implemented in several different ways, but the software should respect the safety of the user's data, so implementing it by interrupting all processes is inappropriate.

If the user really needs to terminate the extraction immediately, that should be done with "Interrupt" or with the Task Manager. Do users really understand that in Bandizip "Cancel" is equivalent to "Interrupt", and that the data is therefore not completely written?

伊斯塔凛

May 22, 2024, 11:21:51 PM
to Bandizip for Windows forum
As for the reference software you asked about: it seems that no compression software works this way, but download software does. Download software renames the file only after the download has finished, which lets users easily tell whether the data is complete, without any performance loss.

伊斯塔凛

May 23, 2024, 12:48:47 AM
to Bandizip for Windows forum
Here is a summary of Bandizip's problems. Bandizip was designed with a simple extraction process that writes data directly over the corresponding file. “Cancel” means “break”, so Bandizip leaves an incomplete file behind. This is not safe enough, because the user never needs corrupted data.

The developers then added a concurrency feature to Bandizip, but unfortunately they copied the logic from the single-threaded version and magnified the problem through sloppy thread control. The result is a disaster: no one knows which files will be corrupted.

To solve this problem properly, I suggest the following:
1. Write files by copying and then replacing them. The cost is almost undetectable under normal circumstances, but it maximizes data protection.
2. Change the “Cancel” operation so that it respects the integrity of the data, as described above: in multi-threaded operation, let the processes that are required continue and skip the others.
3. (Optional) The “Cancel” operation can be accompanied by a purge that removes the unfinished extracted files, so that the user does not need to delete them manually. Note that the developers should test this rigorously to avoid deleting files by mistake.
4. (Optional) If the developers feel that immediate interruption of extraction is necessary, they can add an “Interrupt” option and warn the user that incomplete data will be left behind.
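
For point 3, one way to avoid deleting the wrong files is to purge only the temporary files that the current run itself created. The Python sketch below shows that idea; the class `TempFileTracker` and its methods are hypothetical names, not part of any real archiver.

```python
import os

class TempFileTracker:
    """Remembers the temporary files created by one extraction run."""

    def __init__(self):
        self._paths = set()

    def register(self, path):
        """Call right after creating a temporary file."""
        self._paths.add(os.path.abspath(path))

    def mark_finished(self, path):
        """Call once the file is complete and must no longer be purged."""
        self._paths.discard(os.path.abspath(path))

    def purge(self):
        """Delete only unfinished files this run created; return what was removed."""
        removed = []
        for path in sorted(self._paths):
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
        self._paths.clear()
        return removed
```

Because the purge works from an explicit list instead of a filename pattern, files that the user already had in the target folder are never candidates for deletion.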

伊斯塔凛

Jun 26, 2024, 12:43:25 AM
to Bandizip for Windows forum
I cannot agree with this perfunctory fix. I can only recommend that users disable this unreliable feature.