Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Issues in the Patch Tagging Guidelines

9 views
Skip to first unread message

Guillem Jover

unread,
Aug 7, 2023, 7:30:04 PM8/7/23
to
Hi!

Lately I've been updating metadata in patches in packages I maintain and
noticed several issues with the Patch Tagging Guidelines, and after Lucas
created the new great patches UDD service [P] and we discussed some
other issues there, it looked like the guidelines could do with some
fixes and updates.

[P] <https://udd.debian.org/patches.cgi> also part of DMD.

The following list are some of the issues or things that might deserve
to be clarified, fixed or updated (for reference the current guidelines
can be found at <https://dep.debian.net/deps/dep3>):

* dpatch complicates parsing, it is deprecated, probably worth dropping
support for it.

* Although the guidelines seem to intend for git formatted patches to be
supported, the actual specification of the format is not very clear
on its usage and a strict reading does not really allow it.

- There is a requirement for the first field to appear on the first
line, but git formatted patches start with «From ».
- You cannot easily add custom Patch Guideline patches into the first
git stanza, those need to go into the git trailers in the commit
message, separated by whatever text is found in the description,
which does not follow the deb822 format.
- Having to store the patch guideline fields in git commit trailer
fields might mean these pollute the patch which might need to be
removed before submitting upstream.

* Forwarded does not support recording it being sent to an email address.
Not all upstreams have a public mailing list or web site to file reports
or send comments to, and it might be worth allowing a mailto: reference,
or simply an email address.

* Forwarded lists "yes" as a valid implicit value, but does not make it
clear whether an explicit one should be supported. If a mailto: or
email is accepted then this is probably less of an issue. I've also
used this value when I have sent the patch upstream and it has been
applied before I have gotten around to updating the patch metadata,
but I guess at that point not providing the field would also be
fine.

* Forwarded fuzzy specification means parsing its values is rather hard,
see UDD <https://lists.debian.org/debian-qa/2023/01/msg00022.html>.
It would be better to be strict here. In general choosing to be
fuzzy about the whole specification with the intent to help humans,
I think means that programs have a hard time (see UDD above, and
lintian, where both disagree on semantics) and even humans can get
more confused when crafting or parsing them.

* It is not clear, but I think «Origin: vendor» should be clarified to
state that the actual vendor name should be included if there is no
other reference (such as an URL) as in say «Origin: vendor, Debian»?
Otherwise an undefined vendor is not really distinguishable from the
«other» value as it could be any vendor. Also because if a «vendor»
maintainer has created the patch then there might be no URL to point
to except for the VCS it is kept in (if any at all).

* For Applied-Upstream it is not easy to predict what will be the
future upstream version that the patch will be included in. I've opted
for stuff like «3.2.1+, commit:<id>» when 3.2.1 is the last release,
but that does not seem optimal.

* The git Date field could somehow be used in place of Last-Update (even
though the Committer Date instead of the Author Date is what matches
here more closely, but which is not available from «git format-patch»).

* Add new metadata to track single-debian-patch autogenerated patches,
perhaps a new «Autogenerated: yes» or perhaps better something like
«Origin: autogenerated, by:dpkg-source» (or similar descriptive thing)?
These should in general not be warned as needing to be forwarded
upstream, at least not as-is (dpkg-source in 1.22.0 will add a
«Forwarded: not-needed» for these though).

* The language could use some clarification and standardize on the field
and stanza naming used in other parts of the deb822 ecosystem instead
of headers and sets of headers and similar.


This is my current list, Lucas (CCed) probably has other stuff.

Thanks,
Guillem

Lucas Nussbaum

unread,
Aug 10, 2023, 9:50:05 AM8/10/23
to
Hi,

On 08/08/23 at 01:25 +0200, Guillem Jover wrote:
> Hi!
>
> Lately I've been updating metadata in patches in packages I maintain and
> noticed several issues with the Patch Tagging Guidelines, and after Lucas
> created the new great patches UDD service [P] and we discussed some
> other issues there, it looked like the guidelines could do with some
> fixes and updates.
>
> [P] <https://udd.debian.org/patches.cgi> also part of DMD.
>
> The following list are some of the issues or things that might deserve
> to be clarified, fixed or updated (for reference the current guidelines
> can be found at <https://dep.debian.net/deps/dep3>):

For context, there are several bugs about UDD's implementation of patch parsing:
#1028503 UDD: Unknown "yes" value for Forwarded field in patch metadata
#1031381 UDD/dmd: Incorrectly considers patches needing work
#1043043 UDD patches: marks Forwarded as invalid if not 'no', 'not-needed', 'yes' or URL

The UDD parser of DEP3 metadata is
https://salsa.debian.org/qa/udd/-/blob/master/rimporters/patches.rb#L163

The issues listed below fall in two categories:
1/ general issues about DEP3
2/ issues about computation of the "Forwarded" field value

1/ General issues about DEP3
============================

> * dpatch complicates parsing, it is deprecated, probably worth dropping
> support for it.

I believe this is solved: dpatch is no longer available in
stable/testing/unstable, and we don't have any packages build-depending
on it. There are still some patches that are dpatch-formatted but do not
depend on dpatch-specific features, so this is not a big issue.

See:
select source, patch from sources_patches inner join patches using (hash, file)
where release='trixie' and headers ~* 'dpatch' order by 1,2;
=> 96 source packages, 295 patches

> * Although the guidelines seem to intend for git formatted patches to be
> supported, the actual specification of the format is not very clear
> on its usage and a strict reading does not really allow it.
>
> - There is a requirement for the first field to appear on the first
> line, but git formatted patches start with «From ».

That's OK, since From is an alternative to Author, so it's a valid DEP3
field.

> - You cannot easily add custom Patch Guideline patches into the first
> git stanza, those need to go into the git trailers in the commit
> message, separated by whatever text is found in the description,
> which does not follow the deb822 format.

That's OK. You can have random text before the next DEP3 headers. See
the first example in the DEP3 specification.

> - Having to store the patch guideline fields in git commit trailer
> fields might mean these pollute the patch which might need to be
> removed before submitting upstream.

In practice, supporting both git-formatted patches and custom-formatted
patches was a non-issue when writing a parser.

> * For Applied-Upstream it is not easy to predict what will be the
> future upstream version that the patch will be included in. I've opted
> for stuff like «3.2.1+, commit:<id>» when 3.2.1 is the last release,
> but that does not seem optimal.
>
> * The git Date field could somehow be used in place of Last-Update (even
> though the Committer Date instead of the Author Date is what matches
> here more closely, but which is not available from «git format-patch»).

Yes, the UDD implementation does that.

> * Add new metadata to track single-debian-patch autogenerated patches,
> perhaps a new «Autogenerated: yes» or perhaps better something like
> «Origin: autogenerated, by:dpkg-source» (or similar descriptive thing)?
> These should in general not be warned as needing to be forwarded
> upstream, at least not as-is (dpkg-source in 1.22.0 will add a
> «Forwarded: not-needed» for these though).

A default of 'Forwarded: no' would be more appropriate, since some of
individual changes could be relevant for upstream...

I was surprised to see that there are only 144 packages in testing using
single-debian-patch, according to
select source, patch, headers
from sources_patches inner join patches using (hash, file)
where release='trixie' and patch ='debian-changes' order by 1 ,2;

> * The language could use some clarification and standardize on the field
> and stanza naming used in other parts of the deb822 ecosystem instead
> of headers and sets of headers and similar.

> * It is not clear, but I think «Origin: vendor» should be clarified to
> state that the actual vendor name should be included if there is no
> other reference (such as an URL) as in say «Origin: vendor, Debian»?
> Otherwise an undefined vendor is not really distinguishable from the
> «other» value as it could be any vendor. Also because if a «vendor»
> maintainer has created the patch then there might be no URL to point
> to except for the VCS it is kept in (if any at all).

2/ issues about computation of the "Forwarded" field value
==========================================================

A clear status for each patch is useful to include that information in
various dashboards (tracker.d.o etc.)

Unfortunately, the fact that the value for Forwarded is not necessarily
explicit makes it hard to write a parser. It would be better if there
was a simple way to determine:
- if the patch was forwarded uptream (and how)
- if the patch does not need to be forwarded upstream
- if the patch still needs to be investigated

We could keep the current scheme of computing the real value of
Forwarded based on other fields, but then I think that DEP3 should
include pseudo-code to explain how to compute it.
I'm extremely biased of course, but I think that the UDD implementation could
serve as a basis. :-)
(And I would be happy to update the UDD implementation to match an updated DEP3).

> * Forwarded does not support recording it being sent to an email address.
> Not all upstreams have a public mailing list or web site to file reports
> or send comments to, and it might be worth allowing a mailto: reference,
> or simply an email address.
>
> * Forwarded lists "yes" as a valid implicit value, but does not make it
> clear whether an explicit one should be supported. If a mailto: or
> email is accepted then this is probably less of an issue. I've also
> used this value when I have sent the patch upstream and it has been
> applied before I have gotten around to updating the patch metadata,
> but I guess at that point not providing the field would also be
> fine.
>
> * Forwarded fuzzy specification means parsing its values is rather hard,
> see UDD <https://lists.debian.org/debian-qa/2023/01/msg00022.html>.
> It would be better to be strict here. In general choosing to be
> fuzzy about the whole specification with the intent to help humans,
> I think means that programs have a hard time (see UDD above, and
> lintian, where both disagree on semantics) and even humans can get
> more confused when crafting or parsing them.

So I think that the next step would be for someone to start a draft to
update DEP3. I might do it myself at some point, but would be very happy
if someone else did it.

Lucas

Jonathan Dowland

unread,
Aug 16, 2023, 6:10:04 AM8/16/23
to
On Thu, Aug 10, 2023 at 03:42:03PM +0200, Lucas Nussbaum wrote:
>On 08/08/23 at 01:25 +0200, Guillem Jover wrote:
>> - There is a requirement for the first field to appear on the first
>> line, but git formatted patches start with «From ».
>
>That's OK, since From is an alternative to Author, so it's a valid DEP3
>field.

Is "From " an alternative to "Author:"? "From:" is, but Git patches are
more mbox-like with "From " (space suffixed, not colon). DEP3 seems
remarkably ambiguous on the syntax. (RFC 2822 seems fairly clear though,
only colon delimits the header field name.)

>So I think that the next step would be for someone to start a draft to
>update DEP3. I might do it myself at some point, but would be very happy
>if someone else did it.

+1

--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
jm...@debian.org
🔗 https://jmtd.net

Guillem Jover

unread,
Aug 16, 2023, 6:30:05 AM8/16/23
to
Hi!

[ Started this some days ago and only finished it now, I see Jonathan has
covered some parts of this. ]
Yes, probably unclear, my point was to drop the dpatch support from
the Patch Guidelines, precisely due to this, and because otherwise a
strictly conforming parser would need to support it as it stands.

> > * Although the guidelines seem to intend for git formatted patches to be
> > supported, the actual specification of the format is not very clear
> > on its usage and a strict reading does not really allow it.
> >
> > - There is a requirement for the first field to appear on the first
> > line, but git formatted patches start with «From ».
>
> That's OK, since From is an alternative to Author, so it's a valid DEP3
> field.

Notice the space. Here's what a git format-patch patch looks like:

,---
From 9767e87d08803d78208464be3652a9231b6b08e3 Mon Sep 17 00:00:00 2001
From: Guillem Jover <gui...@hadrons.org>
Date: Thu, 10 Aug 2023 20:31:02 +0200
Subject: [PATCH] Patch description

Long explanation.
---
test | 1 +
1 file changed, 1 insertion(+)
create mode 100644 test

diff --git a/test b/test
new file mode 100644
index 0000000..58bc469
--- /dev/null
+++ b/test
@@ -0,0 +1 @@
+Something.
--
2.40.1
`---

Which is not what the example looks like.

> > - You cannot easily add custom Patch Guideline patches into the first
> > git stanza, those need to go into the git trailers in the commit
> > message, separated by whatever text is found in the description,
> > which does not follow the deb822 format.
>
> That's OK. You can have random text before the next DEP3 headers. See
> the first example in the DEP3 specification.

Right. I guess my core issue here is that the specification mixes git
format patches with something that is deb822 formatted, which seems
like a bad idea, because even though they are close they have different
formatting. I think ideally it would state that you have to use either
format, but not mix them?

> > - Having to store the patch guideline fields in git commit trailer
> > fields might mean these pollute the patch which might need to be
> > removed before submitting upstream.
>
> In practice, supporting both git-formatted patches and custom-formatted
> patches was a non-issue when writing a parser.

This is an issue for developers not parsers though.

> > * For Applied-Upstream it is not easy to predict what will be the
> > future upstream version that the patch will be included in. I've opted
> > for stuff like «3.2.1+, commit:<id>» when 3.2.1 is the last release,
> > but that does not seem optimal.
> >
> > * The git Date field could somehow be used in place of Last-Update (even
> > though the Committer Date instead of the Author Date is what matches
> > here more closely, but which is not available from «git format-patch»).
>
> Yes, the UDD implementation does that.
>
> > * Add new metadata to track single-debian-patch autogenerated patches,
> > perhaps a new «Autogenerated: yes» or perhaps better something like
> > «Origin: autogenerated, by:dpkg-source» (or similar descriptive thing)?
> > These should in general not be warned as needing to be forwarded
> > upstream, at least not as-is (dpkg-source in 1.22.0 will add a
> > «Forwarded: not-needed» for these though).
>
> A default of 'Forwarded: no' would be more appropriate, since some of
> individual changes could be relevant for upstream...

I do not maintain any package like this, but I think this would be
more annoying than helpful. Even if some parts have been forwarded,
that cannot be recorded in here, which would then become misleading.
The information might be present in the original split patches if
it's maintained in some VCS, but gets lost when applied. dpkg-source
adds a Description field mentioning this, perhaps it could also
reference the contents of any Vcs-* field as the possible source of
better metadata, but that seems redundant and can get out of sync,
maybe just a reference to debian/control would do. The patch header
injected by dpkg-source from git HEAD is currently:

,---
Description: Autogenerated patch header for a single-debian-patch file.
The delta against upstream is either kept as a single patch, or maintained
in some VCS, and exported as a single patch instead of more manageable
atomic patches.
Forwarded: not-needed

---
`---

> I was surprised to see that there are only 144 packages in testing using
> single-debian-patch, according to
> select source, patch, headers
> from sources_patches inner join patches using (hash, file)
> where release='trixie' and patch ='debian-changes' order by 1 ,2;

Right, although given that these can also be consumed by upstreams or
other distributions (not based on Debian), when patch fishing and
similar, I think making this clear in the guidelines would be the
better option.

> > * The language could use some clarification and standardize on the field
> > and stanza naming used in other parts of the deb822 ecosystem instead
> > of headers and sets of headers and similar.
>
> > * It is not clear, but I think «Origin: vendor» should be clarified to
> > state that the actual vendor name should be included if there is no
> > other reference (such as an URL) as in say «Origin: vendor, Debian»?
> > Otherwise an undefined vendor is not really distinguishable from the
> > «other» value as it could be any vendor. Also because if a «vendor»
> > maintainer has created the patch then there might be no URL to point
> > to except for the VCS it is kept in (if any at all).
>
> 2/ issues about computation of the "Forwarded" field value
> ==========================================================
>
> A clear status for each patch is useful to include that information in
> various dashboards (tracker.d.o etc.)
>
> Unfortunately, the fact that the value for Forwarded is not necessarily
> explicit makes it hard to write a parser. It would be better if there
> was a simple way to determine:
> - if the patch was forwarded uptream (and how)
> - if the patch does not need to be forwarded upstream
> - if the patch still needs to be investigated
>
> We could keep the current scheme of computing the real value of
> Forwarded based on other fields, but then I think that DEP3 should
> include pseudo-code to explain how to compute it.

One issue I see with always requiring a Forwarded field is that this
might imply being WET and thus possibly getting the information out of
sync. But perhaps the root problem is that multiple fields convey
related state, and it might be better to just have a single Status
field that conveys this information as it progresses over its various
states. For example I used something like that to track the status of
patches I sent to upstreams (before VCS and similar):

https://git.hadrons.org/cgit/users/guillem/patches.git/tree/FORMAT

(Which generates https://hadrons.org/~guillem/patches.html).

One problem with a single field, is that it might lose some of the
status change history, but meh, as it stands, the semantics seem to be
duplicating information and encode it in multiple places in slightly
subtle different ways.

So perhaps «Status» could subsume, at least Applied-Upstream and
Forwarded or some of Origin. For example:

Status: local (| Status: no-forward-needed ?)
Status: not-sent
Status: sent, commit:<id>|<url>
Status: applied, commit:<id>|<url>
Status: upstream, commit:<id>|<url>
Status: backport, commit:<id>|<url>
Status: fixed, commit:<id>|<url>
Status: rejected, <url>

Or similar.

> I'm extremely biased of course, but I think that the UDD implementation could
> serve as a basis. :-)

I don't think UDD currently takes into account stuff like:

Origin: upstream, commit:<id>
Origin: backport, commit:<id>
Applied-Upstream: <version>, commit:<id>

though, or variants with URLs, which clearly denote that forwarding is
not needed or has already been done in some way (from #1031381 above).

> (And I would be happy to update the UDD implementation to match an updated DEP3).
>
> > * Forwarded does not support recording it being sent to an email address.
> > Not all upstreams have a public mailing list or web site to file reports
> > or send comments to, and it might be worth allowing a mailto: reference,
> > or simply an email address.
> >
> > * Forwarded lists "yes" as a valid implicit value, but does not make it
> > clear whether an explicit one should be supported. If a mailto: or
> > email is accepted then this is probably less of an issue. I've also
> > used this value when I have sent the patch upstream and it has been
> > applied before I have gotten around to updating the patch metadata,
> > but I guess at that point not providing the field would also be
> > fine.
> >
> > * Forwarded fuzzy specification means parsing its values is rather hard,
> > see UDD <https://lists.debian.org/debian-qa/2023/01/msg00022.html>.
> > It would be better to be strict here. In general choosing to be
> > fuzzy about the whole specification with the intent to help humans,
> > I think means that programs have a hard time (see UDD above, and
> > lintian, where both disagree on semantics) and even humans can get
> > more confused when crafting or parsing them.
>
> So I think that the next step would be for someone to start a draft to
> update DEP3. I might do it myself at some point, but would be very happy
> if someone else did it.

I'd like to see opinions on some of these before starting to draft
stuff, but perhaps making some of these proposals more concrete with
wording might be easier to review or discuss?

Thanks,
Guillem

Simon McVittie

unread,
Aug 16, 2023, 7:30:04 AM8/16/23
to
On Wed, 16 Aug 2023 at 12:25:13 +0200, Guillem Jover wrote:
> Here's what a git format-patch patch looks like:
>
> ,---
> From 9767e87d08803d78208464be3652a9231b6b08e3 Mon Sep 17 00:00:00 2001
> From: Guillem Jover <gui...@hadrons.org>
> Date: Thu, 10 Aug 2023 20:31:02 +0200

One particularly popular tool for managing debian/patches is gbp pq, which
removes the leading "From " line, and the trailing "-- " and git version,
in an effort to normalize the patch content in a way that will reduce
spurious changes when the patch series is rebased.

I think ideally the Patch Tagging Guidelines should *allow* the first line
to start with "From ", with no semantic meaning, so that maintainers of
simple packages can drop `git format-patch` output directly into
debian/patches; but should not *require* that first line, so that the
current output of `gbp pq export` is considered to be equally correct.

I personally think the extent to which git has "won", both upstream and in
Debian package maintenance, means that the Patch Tagging Guidelines should
be encouraging the git-like style as primary, with From/Date/Subject
in a header and all Debian-specific fields in a trailing pseudo-header,
the same layout that's familiar for Signed-off-by and similar tags used
by many upstream developers:

From <anything here, optional, ignored>
From: ...
Date: ...
Subject: <short description>

<long description>

<more long description>

Signed-off-by: ...
Bug: https://...
Origin: vendor, Debian
Forwarded: https://...
---
<the actual diff>

That would encourage interoperability with how patches are represented and
tagged (formally or informally) in many non-Debian environments.

When interacting with developers of non-Debian-based distributions (for
example one of my $work projects is Arch-based), I'd like to be able to
point to DEP-3 and say "consider using this format" with a minimum of what
developers of the other project would consider to be weird Debianisms:
concepts like tracking whether a patch is upstreamable in principle,
and how far it has got in actually going upstream, are equally useful
for any downstream distribution. Emphasizing the familiar git-like format
in preference to deb822 would be a step in that direction.

smcv

Andrea Bolognani

unread,
Aug 16, 2023, 11:31:00 AM8/16/23
to
On Wed, Aug 16, 2023 at 12:22:33PM +0100, Simon McVittie wrote:
> I personally think the extent to which git has "won", both upstream and in
> Debian package maintenance, means that the Patch Tagging Guidelines should
> be encouraging the git-like style as primary, with From/Date/Subject
> in a header and all Debian-specific fields in a trailing pseudo-header,
> the same layout that's familiar for Signed-off-by and similar tags used
> by many upstream developers:
>
> From <anything here, optional, ignored>
> From: ...
> Date: ...
> Subject: <short description>
>
> <long description>
>
> <more long description>
>
> Signed-off-by: ...
> Bug: https://...
> Origin: vendor, Debian
> Forwarded: https://...
> ---
> <the actual diff>

While I completely agree with everything that you've said, I will
additionally point out that a notable use of patches is backporting
upstream fixes to older package releases.

In that context, it's generally useful (best practice?) to call 'git
cherry-pick -x' on the upstream commit, which will result in git
generating a line like

(cherry-picked from commit ...)

In that case, the patch metadata should IMO go *after* that line,
leaving the upstream commit message completely unchanged. See [1] for
an example of how that looks like in practice.

Incidentally, notice how I've had to include

Forwarded: not-needed
Origin: https://gitlab.com/libvirt/libvirt/-/commit/...

in order for tracker.d.o not to report this backport as a patch that
needs forwarding upstream. I feel that the presence of a
"(cherry-picked from commit ...)" line should imply at least the
first bit, but maybe there are scenarios that I haven't considered
and that make being explicit about it a necessity.


[1] https://salsa.debian.org/libvirt-team/libvirt/-/blob/26c4e495779fc9e7649956983818735bbc7300fd/debian/patches/backport/src-fix-max-file-limits-in-systemd-services.patch
--
Andrea Bolognani <e...@kiyuko.org>
Resistance is futile, you will be garbage collected.
signature.asc

Raphael Hertzog

unread,
Aug 18, 2023, 6:50:04 AM8/18/23
to
Hi,

On Tue, 08 Aug 2023, Guillem Jover wrote:
> Lately I've been updating metadata in patches in packages I maintain and
> noticed several issues with the Patch Tagging Guidelines, and after Lucas
> created the new great patches UDD service [P] and we discussed some
> other issues there, it looked like the guidelines could do with some
> fixes and updates.

I agree that the DEP could benefit from a refresh. However I will not
have the time to coordinate this work, sorry.

Cheers,
--
⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <her...@debian.org>
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

Christian Kastner

unread,
Aug 18, 2023, 9:40:05 AM8/18/23
to
On 2023-08-16 13:22, Simon McVittie wrote:
> I personally think the extent to which git has "won", both upstream and in
> Debian package maintenance, means that the Patch Tagging Guidelines should
> be encouraging the git-like style as primary
I think this is an excellent argument. DEP-3 hails from a time where we
had (IIRC) 6 VCS supported on alioth, with svn being the dominant one.
Today, practically everyone uses git, so the reasonable thing to do
would be to optimize for that case.

Best,
Christian
0 new messages