Assistance Implementing a git-remotehelpers remote

xloem

May 23, 2022, 1:19:02 PM
to dulwich-discuss
Hi,

After https://github.com/jelmer/dulwich/issues/952 I thought I'd ask for some advice about implementing a git-remotehelpers remote with dulwich.

I have git repository storage set up, but I'm not sure how to hook it into dulwich. To me, the Repo internals seem designed around a local filesystem, and the HTTP client class, where I could hook the request function, does not have all the functions of the Repo class.

I was looking into using the TCP server helpers for git-*-pack and ran into the same problem: they expect a Repo-like class.

It looked to me like a dev more familiar with the insides of dulwich might be able to do it much more effectively, but I'm pretty new to dulwich so am likely missing some options.

Where would people start, implementing a full-featured git-remotehelpers remote with dulwich?

Jelmer Vernooij

May 27, 2022, 3:18:40 PM
to xloem, dulwich-discuss
Hi there,

Can you perhaps provide a bit of background around what you're ultimately trying to do that you need remote helper support in dulwich for? Do you have an existing remote helper that you'd like Dulwich to interact with?

Jelmer

xloem

May 28, 2022, 12:14:21 PM
to dulwich-discuss
Hi Jelmer,

Unlike the linked issue (which is why I came here), I'd like to build a git remote helper using Dulwich. I'm not trying to get Dulwich to interact with one.

Dulwich already has support for the git-*-pack protocols, so it seems a good base for this. What do you think?

I've implemented remote helpers before, for example in Node.js, but to make something quick and full-featured I generally call out to a git subprocess that runs git-send-pack in a live shadow repository on the filesystem. That leaves a lot of room for the files to get corrupted or otherwise fall out of sync with remote storage. I'd like to pull the git command-line binary out of the loop.

xloem

Jelmer Vernooij

Jun 2, 2022, 3:15:08 PM
to xloem, dulwich-discuss
> I've implemented remotehelpers before in for example nodejs, but generally
> in order to make something quick and full-featured, I call out to a git
> subprocess to run git-send-pack in a live shadow repository in the
> filesystem, which leaves a lot of room for the files to get corrupt or
> otherwise out of sync with remote storage. I'd like to pull the git
> commandline binary out of the loop.

I'd be open to adding a module that makes it easy to write remote helpers. Perhaps named dulwich.remote_helper.

There's some code in breezy that implements a remote helper for bazaar on top of dulwich. It may be possible to reuse some code from that.

Let me know what you think - happy to help, even if it's just reviewing PRs for dulwich.

Jelmer

Karl Semich

Jun 2, 2022, 4:09:00 PM
to dulwich-discuss
>> commandline binary out of the loop.
>
> I'd be open to adding a module that makes it easy to write remote helpers.
> Perhaps named dulwich.remote_helper.
>
> There's some code in breezy that implements a remote helper for bazaar on
> top of dulwich. It may be possible to reuse some code from that.
>
> Let me know what you think - happy to help, even if it's just reviewing PRs
> for dulwich.

What would you pursue first to plug a non-filesystem backend into
dulwich's git-*-pack TCP protocols, with a plan to use them over
stdin/stdout instead of TCP?

Breezy looks interesting at first glance. I found
https://github.com/breezy-team/breezy/blob/master/breezy/git/git_remote_helper.py
which supports fetch, option, push, and import, but doesn't connect to
the git pack protocols. I don't see where it connects to dulwich yet.

Jelmer Vernooij

Jun 6, 2022, 10:41:59 AM
to Karl Semich, dulwich-discuss
It uses dulwich under the hood, and it implements the basics of the remote helper protocol. It doesn't support the "connect" command today, though, although that's optional.

In order to implement the "connect" command as well, you'd probably want to invoke/refactor some of the functionality in dulwich.client. I'd suggest starting without "connect" though, since that command is optional and it'll probably be the most complicated one to implement.
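As a rough illustration of starting without "connect", here is a minimal sketch of the line-based command loop a helper would run over stdin/stdout, handling only "capabilities", "list", "option", "fetch" and "push". It uses only the Python standard library and does not call dulwich. The names run_helper and read_batch and the fetch/push callbacks are hypothetical placeholders for whatever actually moves objects to and from the storage.

```python
import io
from typing import Callable, Dict, List, TextIO


def read_batch(inp: TextIO) -> List[str]:
    """Read the remaining lines of a fetch/push batch, up to the blank line."""
    lines = []
    while True:
        line = inp.readline().rstrip("\n")
        if not line:
            return lines
        lines.append(line)


def run_helper(
    inp: TextIO,
    out: TextIO,
    refs: Dict[str, str],
    fetch: Callable[[str, str], None],       # hypothetical: fetch(sha, ref_name)
    push: Callable[[str, str, bool], None],  # hypothetical: push(src, dst, force)
) -> None:
    """Answer remote-helper commands read from inp, writing replies to out."""
    while True:
        line = inp.readline()
        cmd = line.rstrip("\n")
        if not line or cmd == "":
            break  # EOF or a blank line outside a batch ends the session
        if cmd == "capabilities":
            # Capability list, terminated by a blank line. No "connect" here.
            out.write("fetch\npush\noption\n\n")
        elif cmd == "list" or cmd.startswith("list "):
            for name, sha in sorted(refs.items()):
                out.write(f"{sha} {name}\n")
            out.write("\n")
        elif cmd.startswith("option "):
            out.write("unsupported\n")  # this sketch ignores every option
        elif cmd.startswith("fetch "):
            for entry in [cmd] + read_batch(inp):
                _, sha, name = entry.split(" ", 2)
                fetch(sha, name)
            out.write("\n")  # blank line: the whole batch is done
        elif cmd.startswith("push "):
            for entry in [cmd] + read_batch(inp):
                spec = entry[len("push "):]
                force = spec.startswith("+")
                src, dst = spec.lstrip("+").split(":", 1)
                try:
                    push(src, dst, force)
                    out.write(f"ok {dst}\n")
                except Exception as exc:
                    out.write(f"error {dst} {exc}\n")
            out.write("\n")
        else:
            raise ValueError(f"unsupported command: {cmd!r}")
        out.flush()
```

In a real helper this would live in an executable named git-remote-<name> that git starts with the remote name and URL as arguments, called as run_helper(sys.stdin, sys.stdout, ...). The fetch callback is where new objects would be written into the local repository, with dulwich or otherwise.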

Hope that helps,

Jelmer


 

Karl Semich

Jun 6, 2022, 4:34:22 PM
to dulwich-discuss
> It uses dulwich under the hood, and it implements the basics of the remote
> helper protocol. It doesn't support the "connect" command today, though,
> although that's optional.
>
> In order to implement the "connect" command as well, you'd probably want to
> invoke/refactor some of the functionality in dulwich.client. I'd suggest
> starting without "connect" though, since that command is optional and it'll
> probably be the most complicated one to implement.

I'd really like to keep the option open of going the full mile and
implementing "connect", which supersedes and is preferred by git to
most of the commands. Why do you mention dulwich.client ? It seems
like the needed workings are in dulwich.repo ?

I'm kind of maybe inferring it could make sense to factor out the
concept of "storage backend" with http and filesystem providers. Then
a user could implement other providers to make protocol servers that
work with other systems.
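
To make the "storage backend" idea concrete, here is a minimal sketch of what such a provider interface might look like, with an in-memory provider as the simplest case. None of this is dulwich's API: StorageBackend, MemoryBackend and object_id are hypothetical names, and a real provider would also need packs, symrefs and so on.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Dict, Tuple


def object_id(type_name: str, data: bytes) -> str:
    """Git object id: SHA-1 over '<type> <size>\\0' followed by the content."""
    header = f"{type_name} {len(data)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class StorageBackend(ABC):
    """Hypothetical provider interface: refs plus content-addressed objects."""

    @abstractmethod
    def get_refs(self) -> Dict[str, str]: ...

    @abstractmethod
    def set_ref(self, name: str, sha: str) -> None: ...

    @abstractmethod
    def get_object(self, sha: str) -> Tuple[str, bytes]: ...

    @abstractmethod
    def put_object(self, type_name: str, data: bytes) -> str: ...


class MemoryBackend(StorageBackend):
    """Simplest possible provider: everything lives in dictionaries."""

    def __init__(self) -> None:
        self._refs: Dict[str, str] = {}
        self._objects: Dict[str, Tuple[str, bytes]] = {}

    def get_refs(self) -> Dict[str, str]:
        return dict(self._refs)

    def set_ref(self, name: str, sha: str) -> None:
        if sha not in self._objects:
            raise KeyError(f"unknown object {sha}")
        self._refs[name] = sha

    def get_object(self, sha: str) -> Tuple[str, bytes]:
        return self._objects[sha]

    def put_object(self, type_name: str, data: bytes) -> str:
        sha = object_id(type_name, data)
        self._objects[sha] = (type_name, data)
        return sha
```

An HTTP or filesystem provider would implement the same four methods, and the protocol code would only ever talk to the interface.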

I do somewhat see that the other commands are sufficient functionality
and much simpler to implement. Hadn't thought of that.

Jelmer Vernooij

Jun 6, 2022, 7:45:52 PM
to Karl Semich, dulwich-discuss
On Mon, 6 Jun 2022 at 21:34, Karl Semich <0xl...@gmail.com> wrote:
>> It uses dulwich under the hood, and it implements the basics of the remote
>> helper protocol. It doesn't support the "connect" command today, though,
>> although that's optional.
>>
>> In order to implement the "connect" command as well, you'd probably want to
>> invoke/refactor some of the functionality in dulwich.client. I'd suggest
>> starting without "connect" though, since that command is optional and it'll
>> probably be the most complicated one to implement.

> I'd really like to keep the option open of going the full mile and
> implementing "connect", which supersedes and is preferred by git to
> most of the commands. Why do you mention dulwich.client ? It seems
> like the needed workings are in dulwich.repo ?

Even if you start with the other commands, nothing stops you from implementing the more complicated commands like "connect" later.
 
dulwich.client implements the protocol that sits behind the "connect" command. "connect" doesn't involve repository operations, it just allows the remote helper to stream pack files using the standard git pack sending/receiving protocol that's also used over SSH. The "fetch" command allows the remote helper to somehow insert new objects into the repository. How it does that is up to it - it can use dulwich, but also e.g. core Git or even something completely different.
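
For context on what "connect" hands over: after the helper answers "connect <service>" with an empty line, its stdin and stdout carry the regular pack protocol, which is framed in pkt-lines. Below is a minimal, standard-library-only sketch of that framing layer. The function names are illustrative rather than dulwich's own API, and the sketch ignores the special short packets used by protocol v2.

```python
import io
from typing import BinaryIO, Optional

MAX_PAYLOAD = 65516  # 65520 bytes per packet minus the 4-byte length header


def write_pkt_line(out: BinaryIO, payload: Optional[bytes]) -> None:
    """Write one pkt-line; None writes a flush packet ("0000")."""
    if payload is None:
        out.write(b"0000")
        return
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a single pkt-line")
    # The 4 hex digits count the payload plus the header itself.
    out.write(b"%04x" % (len(payload) + 4) + payload)


def read_pkt_line(inp: BinaryIO) -> Optional[bytes]:
    """Read one pkt-line; returns None for a flush packet."""
    header = inp.read(4)
    if len(header) < 4:
        raise EOFError("stream ended inside a pkt-line header")
    size = int(header.decode("ascii"), 16)
    if size == 0:
        return None
    if size < 4:
        raise ValueError(f"unsupported special packet {header!r}")
    payload = inp.read(size - 4)
    if len(payload) != size - 4:
        raise EOFError("stream ended inside a pkt-line payload")
    return payload
```

Everything above this layer (ref advertisement, want/have negotiation, the pack stream itself) is the part Dulwich already implements for its clients and servers.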
 
> I'm kind of maybe inferring it could make sense to factor out the
> concept of "storage backend" with http and filesystem providers. Then
> a user could implement other providers to make protocol servers that
> work with other systems.

I'm not sure I understand this last paragraph.

Jelmer 