Thanks to Greg Troxel for reviewing the existing docs and suggesting
improvements.
Signed-off-by: Rob Browning <r...@defaultvalue.org>
Tested-by: Rob Browning <r...@defaultvalue.org>
---
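One way to read the revised `--threshold` wording in bup-gc.1.md is
sketched below in Python. This is only an illustration of the
documented rule, not bup's implementation: the function name and its
parameters are hypothetical, and treating "over N percent" as a strict
comparison is an assumption.

```python
def should_rewrite(garbage_percent, has_unreachable_trees_or_commits,
                   threshold=10):
    """Hypothetical helper: decide whether a packfile gets rewritten.

    Mirrors the documented rule only; this is not code from bup.
    """
    # A packfile containing unreachable trees or commits is rewritten
    # regardless of how much garbage it holds.
    if has_unreachable_trees_or_commits:
        return True
    # Otherwise, rewrite only when the garbage share is over the
    # threshold (assumed here to be a strict comparison).
    return garbage_percent > threshold


if __name__ == "__main__":
    for pct, dead in [(5, False), (25, False), (5, True)]:
        print(pct, dead, should_rewrite(pct, dead))
```

Under this reading, the threshold only matters for packfiles whose
garbage consists solely of unreachable blobs.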
Documentation/bup-fsck.1.md                  | 36 ++++++++++++++------
Documentation/bup-gc.1.md                    | 34 +++++++++++++-----
Documentation/bup-rm.1.md                    | 15 +++++---
Documentation/bup-validate-object-links.1.md | 15 ++++----
Documentation/bup-validate-refs.1.md         | 32 ++++++++++++-----
5 files changed, 94 insertions(+), 38 deletions(-)
diff --git a/Documentation/bup-fsck.1.md b/Documentation/bup-fsck.1.md
index 9e47b6dc..7cd56ee2 100644
--- a/Documentation/bup-fsck.1.md
+++ b/Documentation/bup-fsck.1.md
@@ -13,16 +13,29 @@ bup fsck [-r] [-g] [-v] [\--quick] [-j *jobs*] [\--par2-ok]
# DESCRIPTION
-`bup fsck` validates bup repositories much the way `git fsck`
-validates git repositories. When *packfile*s (which must end in
-.pack) are specified, pack-related operations are limited to those
-files, otherwise all packfiles in the current repository are
-considered.
-
-It can also generate and/or use "recovery blocks" using the
-`par2`(1) tool (if you have it installed). This allows you
-to recover from damaged blocks covering up to 5% of your
-`.pack` files.
+When *packfile*s (which must end in .pack) are specified, pack-related
+operations are limited to those files, otherwise all packfiles in the
+current repository are considered.
+
+Currently `bup fsck` checks the data in the repository for corruption.
+More specifically, it checks the integrity of the data *packfile*s and
+their corresponding indexes to ensure that they have not changed since
+they were written. It does not check higher level concerns like
+connectivity (missing objects), e.g. whether all the data referred to
+by a save actually exists in the repository. For some higher level
+checks, see `bup-validate-object-links`(1) and `bup-validate-refs`(1).
+The checks `bup fsck` performs are focused on detecting, and
+potentially repairing, file corruption, while the higher level
+problems are more likely to be caused by (hopefully rarer) bugs.
+
+When checking the packfiles and indexes, right now fsck will normally
+rely on `git-verify-pack`(1), but with `--quick` (more below), bup
+will just check the index and packfile checksums itself.
+
+To allow repairs, fsck must be asked via `--generate` to generate
+"recovery blocks" using `par2`(1) (if you have it installed). These
+blocks allow you to recover from damage affecting up to 5% of your
+`.pack` files.
In a normal backup system, damaged blocks are less
important, because there tends to be enough data duplicated
@@ -124,7 +137,8 @@ errors and a value other than zero or one for errors.
# SEE ALSO
-`bup-damage`(1), `fsck`(1), `git-fsck`(1)
+`bup-damage`(1), `fsck`(1), `git-fsck`(1),
+`bup-validate-object-links`(1), and `bup-validate-refs`(1)
# BUP
diff --git a/Documentation/bup-gc.1.md b/Documentation/bup-gc.1.md
index 147bcb80..9e5c2e2f 100644
--- a/Documentation/bup-gc.1.md
+++ b/Documentation/bup-gc.1.md
@@ -12,11 +12,12 @@ bup gc [-#|\--verbose] <*branch*|*save*...>
# DESCRIPTION
-`bup gc` removes (permanently deletes) unreachable data from the
-repository, data that isn't referred to directly or indirectly by the
-current set of branches (backup sets) and tags. But bear in mind that
-given deduplication, deleting a save and running the garbage collector
-might or might not actually delete anything (or reclaim any space).
+`bup gc` removes (permanently deletes) unreachable data, also referred
+to as garbage, from the repository; this is data that isn't referred
+to directly or indirectly by the current set of branches (backup sets)
+and tags. But bear in mind that given deduplication, deleting a save
+and running the garbage collector might or might not actually delete
+anything (or reclaim any space).
With the current, probabilistic implementation, some fraction of the
unreachable data may be retained. In exchange, the garbage collection
@@ -33,12 +34,22 @@ succeeds, that's a fairly encouraging sign that the commands worked
correctly. (The `dev/compare-trees` command in the source tree can be
used to help test before/after results.)
+The collection proceeds by rewriting packfiles to remove unreachable
+objects, and the new packfiles will respect `pack.packSizeLimit`
+(`bup-config`(5)). Each will combine content from any number of
+existing packfiles.
+
+When an existing packfile is removed, any affiliated files
+(e.g. `.idx`, `.par2`) should also be removed; `.par2` files can
+be regenerated by `bup-fsck`(1).
+
# OPTIONS
\--threshold=N
: only rewrite a packfile if it's over N percent garbage and
- contains no unreachable trees or commits. The default threshold
- is 10%.
+ contains no unreachable trees or commits. If the packfile does
+ contain unreachable trees or commits, it will be rewritten
+ regardless. The default threshold is 10%.
-v, \--verbose
: increase verbosity (can be given more than once).
@@ -47,7 +58,7 @@ used to help test before/after results.)
: set the compression level to # (a value from 0-9, where 9 is the
highest and 0 is no compression). Defaults to a configured
pack.compression or core.compression, or 1 (fast, loose
- compression).
+ compression). This applies to any rewritten packfiles.
\--ignore-missing
: report missing objects, but don't stop the collection.
@@ -59,8 +70,13 @@ Encountering any missing object is considered an error.
# EXAMPLES
- # Remove all saves of "home" and most of the otherwise unreferenced data.
+ # Remove all saves of "home" (remove the branch).
$ bup rm home
+
+ # Remove two saves from the archives branch.
+ $ bup rm archives/2025-01-01-030405 archives/2025-10-04-134117
+
+ # Remove most of the otherwise unreferenced data.
$ bup gc
# SEE ALSO
diff --git a/Documentation/bup-rm.1.md b/Documentation/bup-rm.1.md
index ddc52a74..e9b7d07b 100644
--- a/Documentation/bup-rm.1.md
+++ b/Documentation/bup-rm.1.md
@@ -18,9 +18,16 @@ any storage space), but it may make it very difficult or impossible to
refer to the deleted items, unless there are other references to them
(e.g. tags).
-A subsequent garbage collection, either by a `bup gc`, or by a normal
-`git gc`, may permanently delete data that is no longer reachable from
-the remaining branches or tags, and reclaim the related storage space.
+A subsequent garbage collection by `bup gc` may permanently delete
+data that is no longer reachable from the remaining branches or tags
+and reclaim the related storage space. (The same would apply to a
+`git gc` on the repository, but the safety of running `git gc` on a
+bup repository has not been carefully evaluated, and should be
+avoided.)
+
+If you are familiar with `git`(1), removing branches is similar to
+`git branch -D BRANCH` and removing saves is like a `git-rebase
+--interactive` that removes commits.
WARNING: This is one of the few bup commands that modifies your
archive in intentionally destructive ways.
@@ -33,7 +40,7 @@ archive in intentionally destructive ways.
-*#*, \--compress=*#*
: set the compression level to # (a value from 0-9, where
9 is the highest and 0 is no compression). The default
- is 6. Note that `bup rm` may only write new commits.
+ is 6. This applies to the rewritten branches (commits).
# EXAMPLES
diff --git a/Documentation/bup-validate-object-links.1.md b/Documentation/bup-validate-object-links.1.md
index 91e76a7a..8334f00e 100644
--- a/Documentation/bup-validate-object-links.1.md
+++ b/Documentation/bup-validate-object-links.1.md
@@ -12,12 +12,15 @@ bup validate-object-links
# DESCRIPTION
-`bup validate-object-links` scans the objects in the repository for
-and reports any "broken links" it finds, i.e. any links from a tree or
-commit in the repository to an object that doesn't exist. Currently,
-it doesn't include "loose objects" (those not in packfiles -- which
-git may create, but bup doesn't), and it can't handle tag objects
-(which bup also doesn't create).
+`bup validate-object-links` scans the objects in the repository and
+reports any references from a tree or commit to an object that does
+not exist in the repository. Currently, it doesn't scan "loose
+objects" (those not in packfiles) or notice them when checking for
+existence, and it cannot handle tag objects. Note that `bup` doesn't
+create tags or loose objects, but `git` may.
+
+The existence check only consults the repository indexes; it does not
+try to read the object, so it could be misled by an incorrect index.
Whenever a broken link (missing reference) is found, an ASCII encoded
line formatted like this will be printed to standard output:
diff --git a/Documentation/bup-validate-refs.1.md b/Documentation/bup-validate-refs.1.md
index 6386104c..2e8fc930 100644
--- a/Documentation/bup-validate-refs.1.md
+++ b/Documentation/bup-validate-refs.1.md
@@ -12,14 +12,21 @@ bup validate-refs [\--links] [\--bupm] [*ref*...]
# DESCRIPTION
-`bup validate-refs` can check repository references (e.g. saves) for
-commits or trees (directories) that refer to missing objects, and for
-abridged bupm files (metadata storage), reporting the paths to those
-it finds. If no *ref*s are provided, it checks all refs, otherwise it
-only checks those specified. If no checks are explicitly requested,
-then a default set of checks will be performed, currently `--links`
-and `--bupm`. If problems are found, `bup-get`(1) `--repair` may be
-able to help.
+`bup validate-refs` can check repository *ref*s (e.g. branches (backup
+sets) or saves) for commits or trees (directories) that refer to
+missing objects, and for damaged bupm files (metadata storage),
+reporting the paths to any it finds. If no *ref*s are provided, it
+checks all refs, otherwise it only checks those specified. If no
+checks are explicitly requested, then a default set of checks will be
+performed, currently `--links` and `--bupm`, and if problems are
+found, `bup-get`(1) `--repair` may be able to help.
+
+`validate-refs` checks everything reachable from a given branch or
+save, which includes all of the saves preceding it. See EXAMPLES
+below.
+
+The existence check only consults the repository indexes; it does not
+try to read the object, so it could be misled by an incorrect index.
At the moment, the broken path information is only logged to standard
error, and is not well specified (i.e. suitable for inspection, but
@@ -51,6 +58,15 @@ has encountered before.
checks for the existence of the leaf (blob) data, it does not
attempt to read that data.
+# EXAMPLES
+
+ # Check --links and --bupm for all refs
+ $ bup validate-refs
+
+ # Check --links for archives/2025-01-01-030405 and
+ # all of the saves before it.
+ $ bup validate-refs --links archives/2025-01-01-030405
+
# EXIT STATUS
The exit status will be 1 if any broken links are found, 0 if none are
--
2.47.3