[PATCH 1/1] Unmap mmapped mem explicitly to free the fd soon

23 views
Skip to first unread message

Luca Carlon

unread,
Apr 4, 2020, 2:27:03 PM4/4/20
to bup-...@googlegroups.com, Luca Carlon
Mmaping files keeps fds open on the original idx files. Removing open files
on shares mounted thorugh the cifs kernel driver (and probably other cases)
keeps entries in the directory, leading to a crash when opening removed
files. This patch unmaps mem to free the fd sooner.

Signed-off-by: Luca Carlon <carlo...@gmail.com>
---
lib/bup/gc.py | 3 +++
lib/bup/git.py | 6 ++++++
2 files changed, 9 insertions(+)

diff --git a/lib/bup/gc.py b/lib/bup/gc.py
index b23029d..9824838 100644
--- a/lib/bup/gc.py
+++ b/lib/bup/gc.py
@@ -177,6 +177,7 @@ def sweep(live_objects, existing_count, cat_pipe, threshold, compression,
% path_msg(git.repo_rel(basename(idx_name))))
ns.stale_files.append(idx_name)
ns.stale_files.append(idx_name[:-3] + b'pack')
+ idx.close()
continue

live_frac = idx_live_count / float(len(idx))
@@ -184,6 +185,7 @@ def sweep(live_objects, existing_count, cat_pipe, threshold, compression,
if verbosity:
log('keeping %s (%d%% live)\n' % (git.repo_rel(basename(idx_name)),
live_frac * 100))
+ idx.close()
continue

if verbosity:
@@ -197,6 +199,7 @@ def sweep(live_objects, existing_count, cat_pipe, threshold, compression,

ns.stale_files.append(idx_name)
ns.stale_files.append(idx_name[:-3] + b'pack')
+ idx.close()

if verbosity:
progress('preserving live data (%d%% complete)\n'
diff --git a/lib/bup/git.py b/lib/bup/git.py
index d1515e3..c7bf899 100644
--- a/lib/bup/git.py
+++ b/lib/bup/git.py
@@ -425,6 +425,9 @@ class PackIdxV1(PackIdx):
for ofs in range(start, start + 24 * self.nsha, 24):
yield self.map[ofs : ofs + 20]

+ def close(self):
+ self.map.close();
+

class PackIdxV2(PackIdx):
"""Object representation of a Git pack index (version 2) file."""
@@ -468,6 +471,9 @@ class PackIdxV2(PackIdx):
for ofs in range(start, start + 20 * self.nsha, 20):
yield self.map[ofs : ofs + 20]

+ def close(self):
+ self.map.close();
+

_mpi_count = 0
class PackIdxList:
--
2.20.1

Johannes Berg

unread,
Apr 5, 2020, 5:14:35 AM4/5/20
to Luca Carlon, bup-...@googlegroups.com
On Sat, 2020-04-04 at 20:26 +0200, Luca Carlon wrote:
> Mmaping files keeps fds open on the original idx files. Removing open files
> on shares mounted thorugh the cifs kernel driver (and probably other cases)
> keeps entries in the directory, leading to a crash when opening removed
> files. This patch unmaps mem to free the fd sooner.

Makes sense.

I'm surprised you fixed only 'bup gc' - but I guess my earlier fix
addressed this issue for "normal" operation?

> live_frac = idx_live_count / float(len(idx))
> @@ -184,6 +185,7 @@ def sweep(live_objects, existing_count, cat_pipe, threshold, compression,
> if verbosity:
> log('keeping %s (%d%% live)\n' % (git.repo_rel(basename(idx_name)),
> live_frac * 100))
> + idx.close()

It's not _really_ needed here, as we will not delete this one ("keeping"
branch), but of course it still makes sense...

Maybe we should wrap the entire loop in a try/finally: idx.close()
instead of adding it in all branches?

johannes


Johannes Berg

unread,
Apr 5, 2020, 5:46:33 AM4/5/20
to Luca Carlon, bup-...@googlegroups.com
On Sun, 2020-04-05 at 11:14 +0200, Johannes Berg wrote:
>
> Maybe we should wrap the entire loop in a try/finally: idx.close()
> instead of adding it in all branches?

Or, perhaps separately as a further cleanup, maybe we should teach the
index objects the context manager protocol, so we can write


with git.open_idx(..) as idx:
# do whatever is needed

instead of try/finally.

johannes

Luca Carlon

unread,
Apr 5, 2020, 6:05:06 AM4/5/20
to Johannes Berg, bup-list
I tried to keep the changes to the minimum, but my opinion is that
mmap objects should be closed asap, just like files. So context
manager would be a good idea.
If you want I can update my patch accordingly.
Regards.
--
Dr. Luca Carlon
Software Engineer

Johannes Berg

unread,
Apr 5, 2020, 8:45:38 AM4/5/20
to Luca Carlon, bup-list
On Sun, 2020-04-05 at 12:04 +0200, Luca Carlon wrote:
> I tried to keep the changes to the minimum, but my opinion is that
> mmap objects should be closed asap, just like files. So context
> manager would be a good idea.

Actually, the idx goes out of scope at the end of the loop, so could a
__delete__ method even already do it? But I guess that's not guaranteed,
since the gc need not kick in there.

> If you want I can update my patch accordingly.

Up to Rob, really. I'd prefer to have that solution in the end, but for
ease of cherry-picking the bugfix it may well be better to have that as
two separate patches.

johannes

Greg Troxel

unread,
Apr 5, 2020, 11:39:10 AM4/5/20
to Johannes Berg, Luca Carlon, bup-list
I have only looked at this briefly.

I am a fan of all objects being closed/freed as soon as they are no
longer needed, to minimize resource usage.


Can you explain the sequence that happens, what goes wrong with cifs,
and why it's ok with a normal fs? I don't doubt you - would just like
to understand. Is it something like

open file
mmap file to handle
close file
(leave mmap handle open)


try to open file


but I think I have that pretty wrong. SO that's just an example of
what I'd like to understand.


As for the fix form, if there is a general pattern that does the
close/free mostly automatically and is prompt, that sounds good. As
I think Johannes meant, even in cases that don't trigger a bug with
cifs, it is still good to free resources.

Johannes Berg

unread,
Apr 5, 2020, 2:58:47 PM4/5/20
to Greg Troxel, Luca Carlon, bup-list
On Sun, 2020-04-05 at 11:39 -0400, Greg Troxel wrote:

> Can you explain the sequence that happens, what goes wrong with cifs,
> and why it's ok with a normal fs? I don't doubt you - would just like
> to understand. Is it something like
>
> open file
> mmap file to handle
> close file
> (leave mmap handle open)
>
>
> try to open file
>

I think it's "try to delete file" at the end, instead, but otherwise
seems right?

> As for the fix form, if there is a general pattern that does the
> close/free mostly automatically and is prompt, that sounds good. As
> I think Johannes meant, even in cases that don't trigger a bug with
> cifs, it is still good to free resources.

Well, I was really thinking two things:

1) It's really easy (perhaps less so in this case) to add another path
here that "forgets" to release the map, getting us into this
situation again; using a try/finally or context manager (which looks
nicer) would avoid that

2) Using a context manager makes this really simple and thus easier to
apply in other places, in fact, that could easily become the
preferred way of opening an index, refactoring other places as
needed, since it's so easy to use.

johannes

Greg Troxel

unread,
Apr 5, 2020, 3:09:41 PM4/5/20
to Johannes Berg, Luca Carlon, bup-list
Johannes Berg <joha...@sipsolutions.net> writes:

> On Sun, 2020-04-05 at 11:39 -0400, Greg Troxel wrote:
>
>> Can you explain the sequence that happens, what goes wrong with cifs,
>> and why it's ok with a normal fs? I don't doubt you - would just like
>> to understand. Is it something like
>>
>> open file
>> mmap file to handle
>> close file
>> (leave mmap handle open)
>>
>>
>> try to open file
>>
>
> I think it's "try to delete file" at the end, instead, but otherwise
> seems right?

So you are saying that delete fails if the file is still mmapped, and it
remains in the directory?

How does that lead to the observed symptoms of a file that exists not
being able to be opened?

> 1) It's really easy (perhaps less so in this case) to add another path
> here that "forgets" to release the map, getting us into this
> situation again; using a try/finally or context manager (which looks
> nicer) would avoid that

> 2) Using a context manager makes this really simple and thus easier to
> apply in other places, in fact, that could easily become the
> preferred way of opening an index, refactoring other places as
> needed, since it's so easy to use.

That sounds fine to me.

Luca Carlon

unread,
Apr 5, 2020, 3:15:00 PM4/5/20
to Greg Troxel, Johannes Berg, bup-list
> Can you explain the sequence that happens, what goes wrong with cifs,
> and why it's ok with a normal fs? I don't doubt you - would just like
> to understand. Is it something like
>
> open file
> mmap file to handle
> close file
> (leave mmap handle open)
>
>
> try to open file
>
>
> but I think I have that pretty wrong. SO that's just an example of
> what I'd like to understand.

Sequence is:
open file
mmap file
close file
(leave mmap handle open)
remove file
(handle results now dangling)
list directory
open file
crash on cifs (and probably also elsewhere)

Python docs say: "on Windows, attempting to remove a file that is in
use causes an exception to be raised; on Unix, the directory entry is
removed but the storage allocated to the file is not made available
until the original file is no longer in use.". This seems to be true
for ext4 (at least), but not on cifs, it is simple to verify this. The
result is that the following step tries to open a removed file, which
results in the crash I reported. Also, anyone who knows what happens
in the "remove file" step on cygwin? According to the docs, I'd say it
may crash.
I also investigated a bit about this difference between ext4 and cifs
in the kernel, but my investigation is not over (and probably not that
much relevant after all).

> As for the fix form, if there is a general pattern that does the
> close/free mostly automatically and is prompt, that sounds good. As
> I think Johannes meant, even in cases that don't trigger a bug with
> cifs, it is still good to free resources.

I agree. close() is in fact called in the the examples:
https://docs.python.org/3/library/mmap.html.

Regards.

Luca Carlon

unread,
Apr 5, 2020, 3:21:58 PM4/5/20
to Johannes Berg, Greg Troxel, bup-list
> 2) Using a context manager makes this really simple and thus easier to
> apply in other places, in fact, that could easily become the
> preferred way of opening an index, refactoring other places as
> needed, since it's so easy to use.

I'm not a python dev, but this is the way I would go in other
"managed" languages.
Regards.

Greg Troxel

unread,
Apr 5, 2020, 3:31:31 PM4/5/20
to Luca Carlon, Johannes Berg, bup-list
Luca Carlon <carlo...@gmail.com> writes:

> Sequence is:
> open file
> mmap file
> close file
> (leave mmap handle open)
> remove file
> (handle results now dangling)
> list directory
> open file
> crash on cifs (and probably also elsewhere)

What does crash mean? Do you mean "python system call function throws
error"?

Is this a file that should be read? Or is it that it ought to be gone
but because it's (wrongly, in my unixhead view) in the directory, the
code tries to open it. But because it's a not-really-file, the open
fails as if it weren't there?

> Python docs say: "on Windows, attempting to remove a file that is in
> use causes an exception to be raised; on Unix, the directory entry is
> removed but the storage allocated to the file is not made available
> until the original file is no longer in use.". This seems to be true
> for ext4 (at least), but not on cifs, it is simple to verify this.

It seems the remove function does the unlink syscall and that's ok.

> The
> result is that the following step tries to open a removed file, which
> results in the crash I reported. Also, anyone who knows what happens
> in the "remove file" step on cygwin? According to the docs, I'd say it
> may crash.

I see what you mean - that remove may fail. But cygwin tries to
present posix, and I don't know what posix says.

> I also investigated a bit about this difference between ext4 and cifs
> in the kernel, but my investigation is not over (and probably not that
> much relevant after all).

Well, it seems possible that cifs continuing to present a file that has
been unlinked as being in the directory may be wrong, since cifs is a
file system on unix that provides access to a remote filesystem. But
certainly bup should avoid the bad behavior that exposes this bug, even
if it is a bug.

>> As for the fix form, if there is a general pattern that does the
>> close/free mostly automatically and is prompt, that sounds good. As
>> I think Johannes meant, even in cases that don't trigger a bug with
>> cifs, it is still good to free resources.
>
> I agree. close() is in fact called in the the examples:
> https://docs.python.org/3/library/mmap.html.

All sounds good to me.

Luca Carlon

unread,
Apr 5, 2020, 6:52:49 PM4/5/20
to Greg Troxel, Johannes Berg, bup-list
> > Sequence is:
> > open file
> > mmap file
> > close file
> > (leave mmap handle open)
> > remove file
> > (handle results now dangling)
> > list directory
> > open file
> > crash on cifs (and probably also elsewhere)
>
> What does crash mean? Do you mean "python system call function throws
> error"?

I mean bup quits unexpectedly, printing a stack trace.

> Is this a file that should be read? Or is it that it ought to be gone
> but because it's (wrongly, in my unixhead view) in the directory, the
> code tries to open it. But because it's a not-really-file, the open
> fails as if it weren't there?

My understanding is that it is only read because it is reported as an
entry of the directory.

Regards.

Greg Troxel

unread,
Apr 5, 2020, 7:58:01 PM4/5/20
to Luca Carlon, Johannes Berg, bup-list
Luca Carlon <carlo...@gmail.com> writes:

>> What does crash mean? Do you mean "python system call function throws
>> error"?
>
> I mean bup quits unexpectedly, printing a stack trace.

Thanks. Just trying to separate bup exiting from kernel or fs issues.

bup is very pythonic, including the attitude that it's ok to present
stack traces to users :-)

>> Is this a file that should be read? Or is it that it ought to be gone
>> but because it's (wrongly, in my unixhead view) in the directory, the
>> code tries to open it. But because it's a not-really-file, the open
>> fails as if it weren't there?
>
> My understanding is that it is only read because it is reported as an
> entry of the directory.

Thanks - now I think I understand.

Rob Browning

unread,
Apr 25, 2020, 5:56:35 PM4/25/20
to Luca Carlon, Johannes Berg, bup-list
Luca Carlon <carlo...@gmail.com> writes:

> I tried to keep the changes to the minimum, but my opinion is that
> mmap objects should be closed asap, just like files. So context
> manager would be a good idea.
> If you want I can update my patch accordingly.

Agreed, that sounds good. Please do. i.e. let's handle this either via
try/finally, or likely preferable, a context manager in master (at
least). We already use @contextmanager in buptest.py and
lib/bup/helpers.py, so there's even precedent.

And if it's not too disruptive, I'd like to put that on 0.30.x too.
(I'm prepping a branch I'll push soon (preview at 0.30.x-proposed) for a
stable release.)

Thanks for the help
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Rob Browning

unread,
Apr 25, 2020, 5:58:36 PM4/25/20
to Johannes Berg, Luca Carlon, bup-list
Johannes Berg <joha...@sipsolutions.net> writes:

> Actually, the idx goes out of scope at the end of the loop, so could a
> __delete__ method even already do it? But I guess that's not guaranteed,
> since the gc need not kick in there.

Exactly. I believe it's "unspecified", even though in current cpython
it may well get cleaned up "immediately", always or nearly always.

Luca Carlon

unread,
May 11, 2020, 1:58:39 PM5/11/20
to Rob Browning, bup-list
Hello,
I tried to do what we planned, but the patch is getting pretty large.
I'm not sure if you guys want the patch to be placed here, but it is
only a wip at the moment. I'd appreciate if someone could have a look
at it: https://github.com/carlonluca/bup/commit/dfbb2c6edf6c96d74ef937a850608c3da339153e.
There are still a few cases where the concept of "free asap" is not
implemented: in particular for the PackWriter class I guess. But I'd
appreciate a comment from you before going on. long-check succeeds and
pruning on smb seems to succeed as well.
Regards.

Johannes Berg

unread,
May 11, 2020, 3:56:05 PM5/11/20
to Luca Carlon, Rob Browning, bup-list
Hi,

On Mon, 2020-05-11 at 19:58 +0200, Luca Carlon wrote:

> I tried to do what we planned, but the patch is getting pretty large.

Oh, nice. I didn't even think we'd do that in one big patch, but rather
add the ability and use it here to fix the issue, and then convert
things later.

> I'm not sure if you guys want the patch to be placed here, but it is
> only a wip at the moment. I'd appreciate if someone could have a look
> at it: https://github.com/carlonluca/bup/commit/dfbb2c6edf6c96d74ef937a850608c3da339153e.
> There are still a few cases where the concept of "free asap" is not
> implemented: in particular for the PackWriter class I guess. But I'd
> appreciate a comment from you before going on. long-check succeeds and
> pruning on smb seems to succeed as well.

Looks fine. IMHO inline on the list is fine too, would give a chance to
reply directly to pieces.

> @contextmanager
> def open_idx(filename):
> idx = open_idx_noctx(filename)
> try:
> yield idx
> finally:
> idx.close()


I think for piece-meal conversion (especially for fixing only your bug
first, so as to not hide the bugfix in the other changes) it would'be
good to not use @contextmanager here, but instead make the classes
context managers.

That is, leave open_idx() alone, and change PackIdx and PackMidx to have
__enter__() and __exit__() methods:

def __enter__(self):
return self
def __exit__(self):
self.close()


That way, you can write *both*

ix = open_idx()
try:
....
finally:
ix.close()


and

with open_idx() as ix:
...


and so you don't have to change most code, at least not right away. Same
for PackIdxList, instead of the @contextmanager open_idx_list().


The nice thing is that that's a trivial addition that you can do for the
bugfix, and then convert the other stuff separately where it's not as
important.


But that's really just my personal preference I guess, not sure how much
weight that has :-)

johannes

Luca Carlon

unread,
May 12, 2020, 1:29:21 PM5/12/20
to Johannes Berg, Rob Browning, bup-list
Hello,
I thought you wanted a complete patch.
What about this then? A minimal patch that seems to fix my issue
pruning over smb and introducing context manager for PackIdx.

From 36a31c973af273bc39ce6b0c29cae06f9a4d2fe7 Mon Sep 17 00:00:00 2001
From: Luca Carlon <carlo...@gmail.com>
Date: Tue, 12 May 2020 14:08:11 +0200
Subject: [PATCH 1/1] Use an optional context manager on the packidx class.

---
lib/bup/gc.py | 59 +++++++++++++++++++++++++-------------------------
lib/bup/git.py | 22 +++++++++++++++++++
2 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/lib/bup/gc.py b/lib/bup/gc.py
index b23029d..e663cc0 100644
--- a/lib/bup/gc.py
+++ b/lib/bup/gc.py
@@ -163,40 +163,39 @@ def sweep(live_objects, existing_count,
cat_pipe, threshold, compression,
if verbosity:
qprogress('preserving live data (%d%% complete)\r'
% ((float(collect_count) / existing_count) * 100))
- idx = git.open_idx(idx_name)
-
- idx_live_count = 0
- for sha in idx:
- if live_objects.exists(sha):
- idx_live_count += 1
+ with git.open_idx(idx_name) as idx:
+ idx_live_count = 0
+ for sha in idx:
+ if live_objects.exists(sha):
+ idx_live_count += 1
+
+ collect_count += idx_live_count
+ if idx_live_count == 0:
+ if verbosity:
+ log('deleting %s\n'
+ % path_msg(git.repo_rel(basename(idx_name))))
+ ns.stale_files.append(idx_name)
+ ns.stale_files.append(idx_name[:-3] + b'pack')
+ continue
+
+ live_frac = idx_live_count / float(len(idx))
+ if live_frac > ((100 - threshold) / 100.0):
+ if verbosity:
+ log('keeping %s (%d%% live)\n' %
(git.repo_rel(basename(idx_name)),
+ live_frac * 100))
+ continue

- collect_count += idx_live_count
- if idx_live_count == 0:
if verbosity:
- log('deleting %s\n'
- % path_msg(git.repo_rel(basename(idx_name))))
+ log('rewriting %s (%.2f%% live)\n' % (basename(idx_name),
+ live_frac * 100))
+ for sha in idx:
+ if live_objects.exists(sha):
+ item_it = cat_pipe.get(hexlify(sha))
+ _, typ, _ = next(item_it)
+ writer.just_write(sha, typ, b''.join(item_it))
+
ns.stale_files.append(idx_name)
ns.stale_files.append(idx_name[:-3] + b'pack')
- continue
-
- live_frac = idx_live_count / float(len(idx))
- if live_frac > ((100 - threshold) / 100.0):
- if verbosity:
- log('keeping %s (%d%% live)\n' %
(git.repo_rel(basename(idx_name)),
- live_frac * 100))
- continue
-
- if verbosity:
- log('rewriting %s (%.2f%% live)\n' % (basename(idx_name),
- live_frac * 100))
- for sha in idx:
- if live_objects.exists(sha):
- item_it = cat_pipe.get(hexlify(sha))
- _, typ, _ = next(item_it)
- writer.just_write(sha, typ, b''.join(item_it))
-
- ns.stale_files.append(idx_name)
- ns.stale_files.append(idx_name[:-3] + b'pack')

if verbosity:
progress('preserving live data (%d%% complete)\n'
diff --git a/lib/bup/git.py b/lib/bup/git.py
index dd78aa4..ead9140 100644
--- a/lib/bup/git.py
+++ b/lib/bup/git.py
@@ -414,6 +414,12 @@ class PackIdxV1(PackIdx):
# Avoid slicing shatable for individual hashes (very high overhead)
self.shatable = buffer(self.map, self.sha_ofs, self.nsha * 24)

+ def __enter__(self):
+ return self
+
+ def __exit__(self, *args):
+ self.close()
+
def __len__(self):
return int(self.nsha) # int() from long for python 2

@@ -433,6 +439,11 @@ class PackIdxV1(PackIdx):
start = self.sha_ofs + 4
for ofs in range(start, start + 24 * self.nsha, 24):
yield self.map[ofs : ofs + 20]
+
+ def close(self):
+ if self.map is not None:
+ self.map.close()
+ self.map = None


class PackIdxV2(PackIdx):
@@ -451,6 +462,12 @@ class PackIdxV2(PackIdx):
self.ofs64table_ofs = self.ofstable_ofs + self.nsha * 4
# Avoid slicing this for individual hashes (very high overhead)
self.shatable = buffer(self.map, self.sha_ofs, self.nsha*20)
+
+ def __enter__(self):
+ return self
+
+ def __exit__(self, *args):
+ self.close()

def __len__(self):
return int(self.nsha) # int() from long for python 2
@@ -476,6 +493,11 @@ class PackIdxV2(PackIdx):
start = self.sha_ofs
for ofs in range(start, start + 20 * self.nsha, 20):
yield self.map[ofs : ofs + 20]
+
+ def close(self):
+ if self.map is not None:
+ self.map.close()
+ self.map = None


_mpi_count = 0
--
2.25.1

Regards.

Johannes Berg

unread,
May 12, 2020, 1:44:23 PM5/12/20
to Luca Carlon, Rob Browning, bup-list
On Tue, 2020-05-12 at 19:29 +0200, Luca Carlon wrote:
> Hello,
> I thought you wanted a complete patch.

Well I guess we do eventually want that :)

> What about this then? A minimal patch that seems to fix my issue
> pruning over smb and introducing context manager for PackIdx.

But that looks good to me as an intermediate fixup. Then all the
conversions to use context manager for other open_idx() calls can be
separate etc.

(FWIW, the patch got line-wrapped)

johannes

Luca Carlon

unread,
May 12, 2020, 6:14:35 PM5/12/20
to Johannes Berg, Rob Browning, bup-list
On Tue, May 12, 2020 at 7:44 PM Johannes Berg <joha...@sipsolutions.net> wrote:
> But that looks good to me as an intermediate fixup. Then all the
> conversions to use context manager for other open_idx() calls can be
> separate etc.

Or another option would probably be to move the implementation to the
base class:

From 63dcca5c84c21c1625a742c6fd5f38ab7d987015 Mon Sep 17 00:00:00 2001
From: Luca Carlon <carlo...@gmail.com>
Date: Tue, 12 May 2020 14:08:11 +0200
Subject: [PATCH 1/1] Use an optional context manager on the packidx class.

---
lib/bup/gc.py | 59 +++++++++++++++++++++++++-------------------------
lib/bup/git.py | 12 ++++++++++
2 files changed, 41 insertions(+), 30 deletions(-)
index dd78aa4..4c12df8 100644
--- a/lib/bup/git.py
+++ b/lib/bup/git.py
@@ -363,8 +363,15 @@ def _decode_packobj(buf):

class PackIdx:
def __init__(self):
+ self.map = None
assert(0)

+ def __enter__(self):
+ return self
+
+ def __exit__(self, *args):
+ self.close()
+
def find_offset(self, hash):
"""Get the offset of an object inside the index file."""
idx = self._idx_from_hash(hash)
@@ -399,6 +406,11 @@ class PackIdx:
return mid
return None

+ def close(self):
+ if self.map is not None:
+ self.map.close()
+ self.map = None
+

class PackIdxV1(PackIdx):
"""Object representation of a Git pack index (version 1) file."""
--
2.25.1

Btw, I hope it does not get wrapped again.
Regards.

Johannes Berg

unread,
May 12, 2020, 6:16:35 PM5/12/20
to Luca Carlon, Rob Browning, bup-list
On Wed, 2020-05-13 at 00:14 +0200, Luca Carlon wrote:
> On Tue, May 12, 2020 at 7:44 PM Johannes Berg <joha...@sipsolutions.net> wrote:
> > But that looks good to me as an intermediate fixup. Then all the
> > conversions to use context manager for other open_idx() calls can be
> > separate etc.
>
> Or another option would probably be to move the implementation to the
> base class:

I don't think the midx inherits from it?

> + if verbosity:
> + log('keeping %s (%d%% live)\n' %
> (git.repo_rel(basename(idx_name)),
> + live_frac * 100))
[snip]
> Btw, I hope it does not get wrapped again.

It did get wrapped there, it seems.

johannes

Luca Carlon

unread,
May 13, 2020, 1:52:38 PM5/13/20
to Johannes Berg, Rob Browning, bup-list
Not sure what is wrapping, maybe gmail... Can someone tell me if there
is something else needed to integrate the patches? Is one of these
patches acceptable?
Thanks.

Rob Browning

unread,
May 19, 2020, 8:08:21 PM5/19/20
to Luca Carlon, Johannes Berg, bup-list
Luca Carlon <carlo...@gmail.com> writes:

> Not sure what is wrapping, maybe gmail... Can someone tell me if there
> is something else needed to integrate the patches? Is one of these
> patches acceptable?

I'm tied up a bit at the moment, but specifically with respect to
sending patches, I'd suggest if you can arrange it, setting up
git-send-email. See HACKING for some related information, and (I'm
assuming based on your address) the git manpage also mentions
configuring it for gmail. https://git-scm.com/docs/git-send-email

Hope this helps

Luca Carlon

unread,
May 20, 2020, 1:26:04 AM5/20/20
to Rob Browning, Johannes Berg, bup-list
Hello,
ok, but which alternative do you prefer?
Regards.

---

Dr. Luca Carlon
Software Engineer

Greg Troxel

unread,
May 20, 2020, 6:09:20 PM5/20/20
to Luca Carlon, bup-list
Luca Carlon <carlo...@gmail.com> writes:

> ok, but which alternative do you prefer?

Rob is known to prefer sending patches (with format-patch or equivalent)
to the list. That way the proposed code change is thrust in all list
readers' faces, provoking more review and comments.

Luca Carlon

unread,
May 20, 2020, 7:09:15 PM5/20/20
to Greg Troxel, bup-list
I provided already around 4 different patches that fix the issue I reported. All the patches were sent here.
If there is a specific request to include one of the patches, please le me know, otherwise I spent more than enough time on this issue, no reason to invest more.
Regards.

---
Dr. Luca Carlon
Software Engineer

Johannes Berg

unread,
May 21, 2020, 1:27:24 PM5/21/20
to Luca Carlon, Greg Troxel, bup-list
Hi,

> I provided already around 4 different patches that fix the issue I
> reported.

Right, thanks for that.

> All the patches were sent here.

Well, there's a bit of an impedance mismatch here - I'm not going to
argue you didn't send them, of course you did, but you sent them inline
in other emails and they got line-wrapped (rather than with git send-
email), so from the perspective of the tooling they didn't really exist
as patches.

Which really is fine, you're reporting a bug and helping fix it.

> If there is a specific request to include one of the patches, please
> le me know, otherwise I spent more than enough time on this issue, no
> reason to invest more.

Fair enough.

I'll do what I think best and integrate it into my tree, and then Rob
can deal with it more easily in git. (In a couple of hours, once my kids
are all in bed, that is.)

johannes

Luca Carlon

unread,
May 21, 2020, 1:48:59 PM5/21/20
to Johannes Berg, Greg Troxel, bup-list
On Thu, May 21, 2020 at 7:27 PM Johannes Berg <joha...@sipsolutions.net> wrote:
> > All the patches were sent here.
>
> Well, there's a bit of an impedance mismatch here - I'm not going to
> argue you didn't send them, of course you did, but you sent them inline
> in other emails and they got line-wrapped (rather than with git send-
> email), so from the perspective of the tooling they didn't really exist
> as patches.

The first patch was sent following your guidelines. I then asked which
of the 4 approaches is the one you prefer. I'm still waiting for an
answer. I intended to follow the guidelines as soon as a choice was
made.
Regards.

--

Johannes Berg

unread,
May 21, 2020, 2:26:47 PM5/21/20
to Luca Carlon, Greg Troxel, bup-list
Right, sorry. I don't know. I just play the maintainer when Rob isn't
answering ;-)

But I guess my preference would be to have two patches:

1) 36a31c973af ("Use an optional context manager on the packidx class.")
but I guess it should also be added to the midx class?
(I wouldn't move it to the base class since the map object isn't
there)
Adding here just the change in "bup gc".

2) dfbb2c6edf6 ("Unmap mmapped mem explicitly to free the fd asap.")

Or at least similar to it, i.e. an incremental patch on top of 1) to
convert (most of the?) other users to the context manager. Since in
most cases the files aren't going to be deleted, this isn't as
important.

Then I'd (again, personally) take 1) for the soon-to-be-released 0.30.x
bugfix release, and 2) only in master for the next release.

But like I said, I can do that too.

(However, if you do resend, please send with Signed-off-by)

Thanks,
johannes

Luca Carlon

unread,
May 21, 2020, 4:36:49 PM5/21/20
to Johannes Berg, Greg Troxel, bup-list
I sent both those patches. Hope I followed the indications properly.
Regards.
Reply all
Reply to author
Forward
0 new messages