Bug#1020248: debian-policy: Clarifying nomenclature for control file names

Guillem Jover

Sep 18, 2022, 4:40:03 PM

Package: debian-policy
Version: 4.6.1.1
Severity: wishlist

Hi!

This is a followup from my comment at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998165#43

To summarize, we have IMO confusing naming and nomenclature for the
various control files and paragraphs/stanzas, and this is even
confusing me when having to deal with dpkg code, so I'd like to give
these more clear and unambiguous new names, and I'd very strongly
prefer to agree on the same naming for Debian policy and dpkg, to
avoid further and worse confusion (even though they currently do not
match exactly anyway, but I'd prefer to not make it worse…).

Just for reference and to give some context, I've got the following
WIP branches, trying to clarify the names in documentation and in the
API, which I'll probably rework (split/merge) and reword as needed,
so do not take them as anything set in stone:

https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=next/clarify-control-filenames
https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=next/deb822-field-types


File descriptions
-----------------

For example we have:

* debian/control:
policy → «Source package control file»
dpkg → «Debian source packages' master control file»

* .dsc:
policy → «Debian source control file»
dpkg → «Debian source packages' control file»

* DEBIAN/control
policy → «Binary package control files»
dpkg → «Debian binary packages' master control file»

These are quite confusingly close.

I've been considering naming debian/control something like
«Debian template source package control file», as that is used to
generate both the source and binary control files. And always
prefixing with Debian, so that would end up as:

* debian/control: «Debian source package template control file»
* .dsc: «Debian source package control file»
* DEBIAN/control: «Debian binary package control file»

This also removes the «master» usage in dpkg, for me for the same
reasons as I covered at
<https://lists.debian.org/debian-dpkg/2021/03/msg00002.html>.


File contents
-------------

We have references to the various parts being called as «paragraphs»,
«stanza», «blocks», but this seems to be more of an issue with dpkg, as
the usage in the Debian policy is quite clear and uniform now, so I'll
at least try to remove the «block» usage there, stanza has the nice
property of being shorter and policy already mentions that this is
currently a common alias, so I might keep paragraph and stanza for now
in dpkg.

The other thing affecting dpkg and debian-policy is how the parts
within the control files are referred to. We have for example:

dpkg → «general section of control info file»
«source stanza»
policy → «general paragraph»

dpkg → «package's section of control info file»
policy → «binary package paragraphs»


So, how does «source package paragraph» and «binary package paragraph»
(of the «template control file») sound instead?
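
To make the proposed terms concrete, here is a minimal Python sketch that
splits a template control file (debian/control) into stanzas and labels
the first one as the source package stanza and the rest as binary package
stanzas. The sample contents, the package name "hello" and the
split_stanzas helper are illustrative assumptions, not dpkg or Policy
code, and the parser deliberately ignores comments and other details of
the format.

```python
# Hypothetical debian/control contents, used only for illustration.
SAMPLE_CONTROL = """\
Source: hello
Maintainer: Jane Doe <jane@example.org>
Build-Depends: debhelper-compat (= 13)

Package: hello
Architecture: any
Description: example greeting program
 Longer description text goes here.
"""


def split_stanzas(text):
    """Split deb822-style text into a list of stanzas (field -> value dicts)."""
    stanzas = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, last_field = {}, None
        elif line[0] in " \t" and last_field is not None:
            # Continuation line: append it to the previous field's value.
            current[last_field] += "\n" + line[1:]
        else:
            name, _, value = line.partition(":")
            last_field = name
            current[name] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


stanzas = split_stanzas(SAMPLE_CONTROL)
source_stanza = stanzas[0]      # the «source package stanza»
binary_stanzas = stanzas[1:]    # the «binary package stanzas»
print("source package stanza:", source_stanza["Source"])
for stanza in binary_stanzas:
    print("binary package stanza:", stanza["Package"])
```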


If I've missed any other problematic nomenclature, I'm happy to
discuss and update those on the dpkg side.

Thanks,
Guillem

Guillem Jover

Sep 18, 2022, 6:50:03 PM

On Sun, 2022-09-18 at 14:53:30 -0700, Sean Whitton wrote:
> On Sun 18 Sep 2022 at 10:28PM +02, Guillem Jover wrote:
>
> > So, how does «source package paragraph» and «binary package paragraph»
> > (of the «template control file») sound instead?
>
> Can we standardise on 'stanza', please?
>
> I thought that was already standard, and "paragraph" is for prose.

I was also thinking about whether I'd prefer paragraph or stanza, and
the latter seems more specific to deb822 "blocks", and as you say
paragraph seems more for prose.

I went for paragraph, because dpkg has some instances of it already in
docs and code (and stanza only in code), and mainly because the Debian
policy uses almost exclusively paragraph for this with a single
mention of "stanza" in a footnote to mention it's a common alias or
similar.

So, personally, I'd be happy to fully switch to stanza TBH, because
it seems more specific to our use, probably easier to search for, and
it's shorter.

Thanks,
Guillem

Russ Allbery

Sep 18, 2022, 8:40:03 PM

Sean Whitton <spwh...@spwhitton.name> writes:
> On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:

>> So, personally, I'd be happy to fully switch to stanza TBH, because it
>> seems more specific to our use, probably easier to search for, and
>> it's shorter.

> I think this is fine for Policy to do.

I vote for switching to stanza. Paragraph is going to be confusing when
talking about package descriptions, which often have multiple paragraphs
in the normal English meaning of the term.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>

Russ Allbery

Sep 18, 2022, 9:00:02 PM

Guillem Jover <gui...@debian.org> writes:

> I've been considering naming debian/control something like
> «Debian template source package control file», as that is used to
> generate both the source and binary control files. And always
> prefixing with Debian, so that would end up as:

> * debian/control: «Debian source package template control file»
> * .dsc: «Debian source package control file»
> * DEBIAN/control: «Debian binary package control file»

> This also removes the «master» usage in dpkg, for me for the same
> reasons as I covered at
> <https://lists.debian.org/debian-dpkg/2021/03/msg00002.html>.

I like this. It took a bit for my brain to adjust to it because
"template" felt wrong, but the more I thought about it, the more I think
that's correct and it's pointing out an error in my default way of
thinking about packages.

> File contents
> -------------

> We have references to the various parts being called as «paragraphs»,
> «stanza», «blocks», but this seems to be more of an issue with dpkg, as
> the usage in the Debian policy is quite clear and uniform now, so I'll
> at least try to remove the «block» usage there, stanza has the nice
> property of being shorter and policy already mentions that this is
> currently a common alias, so I might keep paragraph and stanza for now
> in dpkg.

> The other thing affecting dpkg and debian-policy is how the parts
> within the control files are referred to. We have for example:

> dpkg → «general section of control info file»
> «source stanza»
> policy → «general paragraph»

> dpkg → «package's section of control info file»
> policy → «binary package paragraphs»

> So, how does «source package paragraph» and «binary package paragraph»
> (of the «template control file») sound instead?

As mentioned in the other thread, I think source package stanza and binary
package stanza (of the template control file) sound great.

Obviously a patch to Policy would be delightful, but it's not blocking.
Just let us know if that's more than you have time for.

Guillem Jover

Sep 19, 2022, 5:00:03 PM

Hi!

On Sun, 2022-09-18 at 17:34:57 -0700, Russ Allbery wrote:
> Sean Whitton <spwh...@spwhitton.name> writes:
> > On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:
> >> So, personally, I'd be happy to fully switch to stanza TBH, because it
> >> seems more specific to our use, probably easier to search for, and
> >> it's shorter.
>
> > I think this is fine for Policy to do.
>
> I vote for switching to stanza. Paragraph is going to be confusing when
> talking about package descriptions, which often have multiple paragraphs
> in the normal English meaning of the term.

Ok, I've prepared the attached incremental patch, which only switches
from paragraph(s) to stanza(s) all over the place.

I've updated all the specs for consistency. I've updated the footnote
to swap the preference and to mention paragraph is now discouraged
nomenclature. I've also updated all «id»s out of consistency, which
might break links, so I can revert that if you'd prefer. And I've
preserved the (upper) casing for one of the titles (“Stand-alone
License Stanza”, although that was not consistent with the other
titles, such as “Files stanza”, I'm happy to lower case that one).

I've gone one by one, but please review carefully as I might have
perhaps switched in excess!

Thanks,
Guillem
0001-Use-stanza-to-refer-to-deb822-parts-instead-of-parag.patch

Russ Allbery

Sep 20, 2022, 12:20:02 PM

Guillem Jover <gui...@debian.org> writes:

> Ok, I've prepared the attached incremental patch, which only switches
> from paragraph(s) to stanza(s) all over the place.

Thanks, applied.

> I've updated all the specs for consistency. I've updated the footnote to
> swap the preference and to mention paragraph is now discouraged
> nomenclature. I've also updated all «id»s out of consistency, which
> might break links, so I can revert that if you'd prefer.

It looks like it was primarily in the copyright-format specification. I
think that's fine; we haven't historically tried hard to preserve anchors,
and if we ever did, we should probably use some scheme to assign stable
anchors rather than using the text of the heading.

> And I've preserved the (upper) casing for one of the titles
> (“Stand-alone License Stanza”, although that was not consistent with the
> other titles, such as “Files stanza”, I'm happy to lower case that one).

I personally have been convinced by a co-worker who did the research that
one should stop using title-casing in technical documents, since it's
mostly a US convention, US readers don't mind lowercase, and title-casing
can look weird to European readers. But that's a fix for another day.

> I've gone one by one, but please review carefully as I might have
> perhaps switched in excess!

Reviewed, and also checked for remaining uses of "paragraph." Everything
looked good.

Charles Plessy

Sep 20, 2022, 8:40:02 PM

Hi all,

while I do not want to pull the handbrake I would like to add my
minority opinion to that change:

On Tue, Sep 20, 2022 at 04:11:43PM +0000, Russ Allbery (@rra) wrote:
>
> The «stanza» name is a commonly used and understood term when referring
> to deb822 blocks. Although «paragraph» is commonly used it has the
> problem of being confusing as it then makes it hard to distinguish
> actual text paragraphs in prose, while «stanza» is a very specific
> term that is not applied anywhere else in the deb822 context, so it's
> always more clear and specific.

I disagree with this point of view. In my own case I had to take a
dictionary to learn what a stanza is, while the word paragraph is surely
known at least to anybody who studied English in a classroom.

In my own field, (molecular biology) we (or at least some of us) are
putting some effort to eliminate jargon and use simple words that make
written documents more accessible to the public. This is why I preferred
paragraph to stanza when working on the specification.

If I were to redo such a specification from scratch, I would ask
non-European language speakers their opinion too.

Have a nice day,

--
Charles

Russ Allbery

Sep 20, 2022, 9:20:02 PM

Charles Plessy <ple...@debian.org> writes:

> I disagree with this point of view. In my own case I had to take a
> dictionary to learn what a stanza is, while the word paragraph is surely
> known at least to anybody who studied English in a classroom.

> In my own field, (molecular biology) we (or at least some of us) are
> putting some effort to eliminate jargon and use simple words that make
> written documents more accessible to the public. This is why I preferred
> paragraph to stanza when working on the specification.

This is a very valid point, and I appreciate you bringing this up!

My personal opinion is that I don't think jargon is necessarily good or
bad. It has advantages and drawbacks.

One drawback that you're correctly pointing out is accessibility: jargon
can make things that would otherwise be comprehensible harder to
understand. It can also be off-putting and alienating to people, and thus
make it harder for them to get involved in a shared project.

The advantage of jargon, and the reason why jargon exists and why humans
keep inventing it, is that it's precise. You don't need as much context
to disambiguate what a sentence may be talking about. I do find the use
of paragraph the way we were previously using it to be confusing,
particularly given that the paragraphs contain fields which in turn
contain actual paragraphs in the normal sense of the term. In some
contexts, precision doesn't matter, but Debian Policy is one place where
we should try to be precise.
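
As a small illustration of that nesting, the sketch below takes the value
of a Description field and splits its extended description into prose
paragraphs, using the convention that a line containing only a space and
a full stop stands for a blank line. The sample text and the
prose_paragraphs helper are made up for this example, and the sketch
ignores verbatim lines (those indented by two or more spaces).

```python
# Hypothetical value of a Description field inside one stanza.
DESCRIPTION_VALUE = """\
example greeting program
 The first prose paragraph of the extended description,
 which spans two lines.
 .
 The second prose paragraph."""


def prose_paragraphs(description):
    """Return (synopsis, list of prose paragraphs) for a Description value."""
    lines = description.split("\n")
    synopsis, body = lines[0], lines[1:]
    paragraphs, current = [], []
    for line in body:
        # Drop the single leading space that marks a continuation line.
        text = line[1:] if line.startswith(" ") else line
        if text == ".":
            # " ." stands for an empty line, which separates prose paragraphs.
            if current:
                paragraphs.append(" ".join(current))
            current = []
        else:
            current.append(text)
    if current:
        paragraphs.append(" ".join(current))
    return synopsis, paragraphs


synopsis, paragraphs = prose_paragraphs(DESCRIPTION_VALUE)
print(synopsis)
for number, paragraph in enumerate(paragraphs, start=1):
    print(number, paragraph)
```

So a single stanza (the old "paragraph") can hold one field whose value
is itself several paragraphs in the everyday sense.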

Stanza has the significant drawback that its dictionary definition is
specific to poetry. In poetry, it's a close analog to how we're using it,
but that does make it somewhat obscure. It does have the minor advantage
of being terminology that was already in use for exactly this construction
in deb822 files.

I don't want to keep using paragraph, but I'd be open to some other term
that Guillem was also open to (I think matching the terminology in dpkg is
very important). Section or block are commonly used for things like this,
but aren't very precise, so I'm not that enthused by them.

> If I were to redo such a specification from scratch, I would ask
> non-European language speakers their opinion too.

I'm definitely interested in that opinion from anyone who is listening in!

Charles Plessy

Sep 22, 2022, 1:40:02 AM

Hi Russ,

On Tue, Sep 20, 2022 at 06:08:16PM -0700, Russ Allbery wrote:
>
> I do find the use of paragraph the way we were previously using it to
> be confusing, particularly given that the paragraphs contain fields
> which in turn contain actual paragraphs in the normal sense of the
> term.

> I don't want to keep using paragraph, but I'd be open to some other term
> that Guillem was also open to (I think matching the terminology in dpkg is
> very important). Section or block are commonly used for things like this,
> but aren't very precise, so I'm not that enthused by them.

In the spec, the word "paragraph" is only used in the specified context,
so I always felt that there is no ambiguity. But of course, it can
create opportunities for misunderstanding when discussing about the
spec. So point taken about "paragraph", although interestingly, the
Simple English definition of "paragraph" is quite spot on if one would
replace "sentence" with "field": ”one or more sentences that are written
together with no line breaks separating them. Usually they are
connected by a single idea.” (<https://simple.wiktionary.org/wiki/paragraph>)

The use of "paragraph" in the current spec is also consistent with
Chapter 5 of the Policy, which also uses the word "paragraph". By the
way, in section 5.6.26 of the Policy, the word "stanza" is also used to
mean something else than a "paragraph".

I do not mind the word "section". It is the term used in the manual
page "systemd.syntax" that describes systemd's unit files, which means
that readers may be already familiar with the concept. One could argue
that its definition in Simple English
(<https://simple.wiktionary.org/wiki/section>, “A section of a thing or
place is a part of it”) would allow a reader to think that a Field is
also a section, but I feel it is unlikely to happen. This said, one big
disadvantage of "section" is that when searching for this word in a
document, there may be a lot of noisy hits such as "refer to section xyz
for details".

I understand about avoiding ambiguity, but in my opinion it is the price
to pay to be able to translate information into simple words from
English to non-European languages. Although the Policy itself is not
going to be translated, I think that it can be advantageous if its
contents can be discussed in simple words in people's native languages.

Cheers,

-- Charles

Guillem Jover

Sep 22, 2022, 7:10:03 PM

Hi!

On Thu, 2022-09-22 at 14:26:38 +0900, Charles Plessy wrote:
> On Tue, Sep 20, 2022 at 06:08:16PM -0700, Russ Allbery wrote:
> > I do find the use of paragraph the way we were previously using it to
> > be confusing, particularly given that the paragraphs contain fields
> > which in turn contain actual paragraphs in the normal sense of the
> > term.

Idem.

> > I don't want to keep using paragraph, but I'd be open to some other term
> > that Guillem was also open to (I think matching the terminology in dpkg is
> > very important). Section or block are commonly used for things like this,
> > but aren't very precise, so I'm not that enthused by them.
>
> In the spec, the word "paragraph" is only used in the specified context,
> so I always felt that there is no ambiguity. But of course, it can
> create opportunities for misunderstanding when discussing about the
> spec. So point taken about "paragraph", although interestingly, the
> Simple English definition of "paragraph" is quite spot on if one would
> replace "sentence" with "field": ”one or more sentences that are written
> together with no line breaks separating them. Usually they are
> connected by a single idea.” (<https://simple.wiktionary.org/wiki/paragraph>)

In the end nothing will match exactly, and we need to choose some
terminology. In this case, as previously mentioned, «stanza» has the
good properties of not usually applying to prose, being short, distinct
from the other terms and the least ambiguous of them all. It also makes
constructing sentences to describe things less cumbersome.

> I do not mind the word "section". It is the term used in the manual
> page "systemd.syntax" that describes systemd's unit files, which means
> that readers may be already familiar with the concept. One could argue
> that its definition in Simple English
> (<https://simple.wiktionary.org/wiki/section>, “A section of a thing or
> place is a part of it”) would allow a reader to think that a Field is
> also a section, but I feel it is unlikely to happen. This said, one big
> disadvantage of "section" is that when searching for this word in a
> document, there may be a lot of noisy hits such as "refer to section xyz
> for details".

The problem with section is that, as you mention, it is not very
searchable, but worse, we already have a field with the same name!

> I understand about avoiding ambiguity, but in my opinion it is the price
> to pay to be able to translate information into simple words from
> English to non-European languages. Although the Policy itself is not
> going to be translated, I think that it can be advantageous if its
> contents can be discussed in simple words in people's native languages.

As a non-native speaker, and a translator, I agree having clear
wording in the original text is important, as otherwise that tends to
make translation work harder. But then, part of that work is to find
or create terminology, in many cases not existing yet in the
translated language, that might be suitable there, trying several terms
that might not necessarily be direct translations.

For a translation anecdote related to finding the right terms, when
triggers got introduced, and having to translate them to Catalan, we
initially used «gallets» (which would be the direct translation). But
when reading them that was bothering several of us as it sounded weird,
it could be read as “small roosters” («gall» being rooster, and «ets»
forming the plural diminutive), or being too close to «galets» which is
a type of pasta used for example in «sopa de galets» ("galets" soup). We
then switched to «activadors» which sounds way nicer, even though it's
not a direct translation. But if we had to translate the spec today,
that would be annoying as it uses «activating» all over the place, so
perhaps using «disparador» would be better. So, in the end this is a
process too, and terms can be changed if they are deemed confusing or
not helping convey the meaning. And in some others, you just need to
simply create new terminology, and describe what it means in specific
contexts.


For example for Catalan/Spanish «stanza» is simply «estrofa» which
seems like a nice term to use here.

Thanks,
Guillem

Russ Allbery

Sep 22, 2022, 10:00:02 PM

Charles Plessy <ple...@debian.org> writes:

> In the spec, the word "paragraph" is only used in the specified context,
> so I always felt that there is no ambiguity. But of course, it can
> create opportunities for misunderstanding when discussing about the
> spec. So point taken about "paragraph", although interestingly, the
> Simple English definition of "paragraph" is quite spot on if one would
> replace "sentence" with "field": ”one or more sentences that are written
> together with no line breaks separating them. Usually they are
> connected by a single idea.”
> (<https://simple.wiktionary.org/wiki/paragraph>)

> The use of "paragraph" in the current spec is also consistent with
> Chapter 5 of the Policy, which also uses the word "paragraph".

Right, that's the motivation of this change. It didn't start as being
about the copyright file, but about Policy. Guillem was standardizing
terminology in dpkg.

I don't have a strong opinion about what word we choose. I care more
about a few surrounding principles, specifically:

* We should use the same terminology when describing the copyright file as
when describing Debian control files (and every other deb822 file).

* Policy should use the same terminology as dpkg.

* I'd prefer we not use the word "paragraph" because we also use that word
to talk about normal prose paragraphs in the Description control field,
and may similarly need to talk about prose paragraphs in the copyright
file.

> By the way, in section 5.6.26 of the Policy, the word "stanza" is also
> used to mean something else than a "paragraph".

Thanks, I think regardless of how we resolve this bug that usage was
confusing. It was also using two terms for the same concept in the same
section, since earlier the same construction was referred to as a
"portion." I've fixed this to use "portion" consistently in this section.

gregor herrmann

Sep 23, 2022, 3:10:02 PM

On Fri, 23 Sep 2022 01:03:28 +0200, Guillem Jover wrote:

> In the end nothing will match exactly, and we need to choose some
> terminology. In this case, as previously mentioned, «stanza» has the
> good properties of not usually applying to prose, being short, distinct
> from the other terms and the least ambiguous of them all. It also makes
> constructing sentences to describe things less cumbersome.

FWIW, and knowing this is not a popularity vote, as yet another
non-native speaker (with a different first language), I also like
"stanza" and agree with Guillem's arguments.

Additionally I'd like to mention that also some software uses this
term:

/usr/share/perl5/Debian/Control/Stanza
/usr/share/perl5/Debian/Control/Stanza.pm
/usr/share/perl5/Debian/Control/Stanza/Binary.pm
/usr/share/perl5/Debian/Control/Stanza/CommaSeparated.pm
/usr/share/perl5/Debian/Control/Stanza/Source.pm
/usr/share/perl5/Debian/Copyright/Stanza
/usr/share/perl5/Debian/Copyright/Stanza.pm
/usr/share/perl5/Debian/Copyright/Stanza/Files.pm
/usr/share/perl5/Debian/Copyright/Stanza/Header.pm
/usr/share/perl5/Debian/Copyright/Stanza/License.pm
/usr/share/perl5/Debian/Copyright/Stanza/OrSeparated.pm

(That's dh-make-perl and libdebian-copyright-perl. Also
libparse-debcontrol-perl talks about "stanzas" in its documentation.)


Cheers
gregor

--
.''`. https://info.comodo.priv.at -- Debian Developer https://www.debian.org
: :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
`. `' Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
`-

Charles Plessy

Sep 24, 2022, 12:10:03 AM

Hi Russ and Gregor,

thanks for your feedback,

I think that I made most of the points I was thinking about and hope
that some of them related to Simple English and jargon can be useful in
the future. I also understand your point of view. One final comment I
would like to make is that the format is used in other contexts, such as
Apt (using "paragraph" in https://wiki.debian.org/DebianRepository/Format),
DEP-11 (using "block" in https://wiki.debian.org/DEP-11), and probably
other files generated for the Debian archive. I recommend to keep
producers and consumers of these files in the loop before changing
Chapter 5. As for which word to standardise on, I trust the Policy
Editors to make a good choice, even if it is not my favourite.

Have a nice week-end,

--
Charles

Guillem Jover

Dec 17, 2022, 10:50:05 AM

Control: reopen -1

Hi!

Sorry, probably my fault! As I tend to use «Fixes:» git pseudo-fields
for things that fix part of a bug, but are not intended yet to close it,
for which I use «Closes:».

And for some reason I think I also got the impression, even though
the stanza changes had been committed, they could still be backed out.
(BTW I've now gone over the wiki and updated all paragraph references
that applied to stanza.)

In any case, I've sat down and gone over the meat of the original
report. See below.

On Sat, 2022-12-17 at 03:09:10 +0000, Debian Bug Tracking System wrote:
> Date: Sun, 18 Sep 2022 22:28:00 +0200
> From: Guillem Jover <gui...@debian.org>
> To: sub...@bugs.debian.org
> Subject: debian-policy: Clarifying nomenclature for control file names
>
> Package: debian-policy
> Version: 4.6.1.1
> Severity: wishlist

Seems I missed another file:

* .changes:
policy → «upload control file» / «Debian changes file»
dpkg → «upload control file» / «.changes control file» /
«Debian .changes file» / «Debian changes file»

> These are quite confusingly close.
>
> I've been considering naming debian/control something like
> «Debian template source package control file», as that is used to
> generate both the source and binary control files. And always
> prefixing with Debian, so that would end up as:
>
> * debian/control: «Debian source package template control file»
> * .dsc: «Debian source package control file»
> * DEBIAN/control: «Debian binary package control file»

For changes I think something like the following might be a more clear
option (and has the minor bonus of aligning perfectly on the first
words! :), with it mentioning explicitly this is about changes being
uploaded, and that it is a control file (but I'm not sure I'm entirely
convinced about it):

* .changes: «Debian upload changes control files»
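
For reference, the names proposed so far can be summarised as a small
lookup table. This is only a sketch of the proposal as it stands in this
thread (the .changes name in particular is tentative); the PROPOSED_NAMES
table and the proposed_name helper are hypothetical and exist neither in
dpkg nor in Policy.

```python
import fnmatch

# File name pattern -> name proposed in this bug report.
# The .changes entry uses the singular form for consistency with the others.
PROPOSED_NAMES = [
    ("debian/control", "Debian source package template control file"),
    ("*.dsc", "Debian source package control file"),
    ("DEBIAN/control", "Debian binary package control file"),
    ("*.changes", "Debian upload changes control file"),
]


def proposed_name(path):
    """Return the proposed name for a control file path, or None if unknown."""
    for pattern, name in PROPOSED_NAMES:
        # Case-sensitive match, so debian/control and DEBIAN/control differ.
        if fnmatch.fnmatchcase(path, pattern):
            return name
    return None


for path in ("debian/control", "hello_1.0-1.dsc", "DEBIAN/control",
             "hello_1.0-1_amd64.changes"):
    print(path, "->", proposed_name(path))
```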

> This also removes the «master» usage in dpkg, for me for the same
> reasons as I covered at
> <https://lists.debian.org/debian-dpkg/2021/03/msg00002.html>.

> File contents
> -------------
>
> We have references to the various parts being called as «paragraphs»,
> «stanza», «blocks», but this seems to be more of an issue with dpkg, as
> the usage in the Debian policy is quite clear and uniform now, so I'll
> at least try to remove the «block» usage there, stanza has the nice
> property of being shorter and policy already mentions that this is
> currently a common alias, so I might keep paragraph and stanza for now
> in dpkg.

I've also found instances of «record» and «section» referring to fields
or stanzas.

> The other thing affecting dpkg and debian-policy is how the parts
> within the control files are referred to. We have for example:
>
> dpkg → «general section of control info file»
> «source stanza»
> policy → «general paragraph»
>
> dpkg → «package's section of control info file»
> policy → «binary package paragraphs»
>
>
> So, how does «source package paragraph» and «binary package paragraph»
> (of the «template control file») sound instead?

> If I've missed any other problematic nomenclature, I'm happy to
> discuss and update those on the dpkg side.

I also recalled another term that has always seemed very confusing in
context: «control information files» or «control information area». For
example in a sentence such as “the control file is a control information
file in the control information area in a .deb archive”. :) This also
seems confusing when some of the files in the .deb control member are
not really “control files” with a deb822(5) format.

My thinking has been going into calling these as the «metadata files»,
and being located in either the «metadata part of the .deb archive» or
explicitly the «control member of the .deb archive», in contrast to the
filesystem part. In dpkg I'd be eventually switching to meta/metadata
and fsys/filesystem, from control or info and data. I've added a patch
with the proposed change, but again nothing set in stone, and I'm again
open to discussing pros/cons of this.

Attached the proposals for discussion/review, and I might again have
perhaps missed instances or similar.

Thanks,
Guillem
0001-Use-field-instead-of-record.patch
0002-Use-stanza-instead-of-section.patch
0003-Markup-Files-field-name.patch
0004-Markup-.changes-and-.dsc.patch
0005-Clarify-terminology-for-Debian-control-files.patch
0006-Use-package-metadata-instead-of-control-information.patch

Russ Allbery

Dec 17, 2022, 11:40:03 AM

Guillem Jover <gui...@debian.org> writes:

> Sorry, probably my fault! As I tend to use «Fixes:» git pseudo-fields
> for things that fix part of a bug, but are not intended yet to close it,
> for which I use «Closes:».

Ack, sorry, this was my fault. I optimistically added a bug closer when I
started merging patches from this bug in the hope that we'd get them all
merged before the next release and then forgot about it.

(That said, and this is only personal preference and I don't feel that
strongly about it, I usually err on the side of creating lots of bugs so
that there can be roughly one bug per patch. It can make it a bit harder
to track things if there's one bug following a bunch of semi-related but
separable changes. Unfortunately, the BTS doesn't support the concept of
a hierarchy with a tracking issue and a bunch of underlying implementation
issues very well.)

> And for some reason I think I also got the impression, even though
> the stanza changes had been committed, they could still be backed out.
> (BTW I've now gone over the wiki and updated all paragraph references
> that applied to stanza.)

I'm personally happy to stick with stanza.

> I also recalled another term that has always seemed very confusing in
> context: «control information files» or «control information area». For
> example in a sentence such as “the control file is a control information
> file in the control information area in a .deb archive”. :) This also
> seems confusing when some of the files in the .deb control member are
> not really “control files” with a deb822(5) format.

> My thinking has been going into calling these as the «metadata files»,
> and being located in either the «metadata part of the .deb archive» or
> explicitly the «control member of the .deb archive», in contrast to the
> filesystem part. In dpkg I'd be eventually switching to meta/metadata
> and fsys/filesystem, from control or info and data. I've added a patch
> with the proposed change, but again nothing set in stone, and I'm again
> open to discussing pros/cons of this.

I like metadata file, but I think I prefer talking about the "control
member" than the "metadata part" because it more closely matches what one
sees if one takes the *.deb file apart. But I haven't looked at your
diffs yet.
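
To show what taking a .deb apart looks like, here is a rough Python
sketch that lists the members of an ar archive, which is the container
format a .deb uses. It builds a fake archive in memory with the usual
member names (debian-binary, control.tar.xz, data.tar.xz); the payloads
are placeholders rather than real tarballs, and the ar_member and
list_ar_members helpers are written for this example only and skip
error handling and long member names.

```python
import io

AR_MAGIC = b"!<arch>\n"


def ar_member(name, data):
    """Build one ar member: a 60-byte header followed by the padded data."""
    header = (
        name.encode().ljust(16)                # member name
        + b"0".ljust(12)                       # modification time
        + b"0".ljust(6)                        # owner id
        + b"0".ljust(6)                        # group id
        + b"100644".ljust(8)                   # file mode (octal)
        + str(len(data)).encode().ljust(10)    # size in bytes
        + b"`\n"                               # header terminator
    )
    padding = b"\n" if len(data) % 2 else b""
    return header + data + padding


def list_ar_members(stream):
    """Yield (name, size) for each member of an ar archive."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(60)
        if len(header) < 60:
            break
        name = header[:16].decode().rstrip(" /")
        size = int(header[48:58].decode())
        yield name, size
        # Member data is padded to an even length.
        stream.seek(size + size % 2, io.SEEK_CUR)


# Placeholder payloads: the "control member" and the filesystem part.
fake_deb = io.BytesIO(
    AR_MAGIC
    + ar_member("debian-binary", b"2.0\n")
    + ar_member("control.tar.xz", b"<metadata files>")
    + ar_member("data.tar.xz", b"<filesystem part>")
)
members = [name for name, _ in list_ar_members(fake_deb)]
print(members)
```

With a real package, the same function could be given a file opened in
binary mode instead of the in-memory stream.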

> Attached the proposals for discussion/review, and I might again have
> perhaps missed instances or similar.

Will take a look soon!

Guillem Jover

Dec 17, 2022, 12:20:03 PM

On Sat, 2022-12-17 at 08:35:02 -0800, Russ Allbery wrote:
> (That said, and this is only personal preference and I don't feel that
> strongly about it, I usually err on the side of creating lots of bugs so
> that there can be roughly one bug per patch. It can make it a bit harder
> to track things if there's one bug following a bunch of semi-related but
> separable changes. Unfortunately, the BTS doesn't support the concept of
> a hierarchy with a tracking issue and a bunch of underlying implementation
> issues very well.)

(Right, personally I don't think I'd split one bug per patch, as long
as the patch is covering the same thing, perhaps. In this case I
pondered about opening a new one, but given the initial discussion
and context was here it seemed best to keep it together. Should have
perhaps split the stanza stuff into another report, though. :)

(In any case, hope this is all not too inconvenient!)

> Guillem Jover <gui...@debian.org> writes:
> > And for some reason I think I also got the impression, even though
> > the stanza changes had been committed, they could still be backed out.
> > (BTW I've now gone over the wiki and updated all paragraph references
> > that applied to stanza.)
>
> I'm personally happy to stick with stanza.

Sorry, rereading that paragraph, it is not clear what I was trying to
convey. :) I meant that because I thought this could be backed out
until an upload, I guess I subconsciously postponed further changes
for this (both in dpkg and here) based on that. Should have asked.

> > I also recalled another term that has always seemed very confusing in
> > context: «control information files» or «control information area». For
> > example in a sentence such as “the control file is a control information
> > file in the control information area in a .deb archive”. :) This also
> > seems confusing when some of the files in the .deb control member are
> > not really “control files” with a deb822(5) format.
>
> > My thinking has been going into calling these as the «metadata files»,
> > and being located in either the «metadata part of the .deb archive» or
> > explicitly the «control member of the .deb archive», in contrast to the
> > filesystem part. In dpkg I'd be eventually switching to meta/metadata
> > and fsys/filesystem, from control or info and data. I've added a patch
> > with the proposed change, but again nothing set in stone, and I'm again
> > open to discussing pros/cons of this.
>
> I like metadata file, but I think I prefer talking about the "control
> member" rather than the "metadata part" because it more closely matches
> what one sees if one takes the *.deb file apart. But I haven't looked at
> your diffs yet.

Probably clearer currently, yeah. To expand on the above, my thinking
has been, for example, that if we ever have a deb 3.0 format, I'd probably
like to name the members something like «meta.tar.*» and «fsys.tar.*»,
and that's also what I've kind of been moving the dpkg internals to
(functions, structs and similar). Also, as part of the file metadata work,
I have pending work to split the dpkg db info/ dir into a dir that
contains things shipped by the .deb, and a dir for stuff generated by
dpkg itself, so that they cannot conflict or overwrite each other, and
so that dpkg does not need to take such conflicts into account.

Thanks,
Guillem
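
For readers who have not taken a .deb apart: the "members" discussed
above are the entries of the outer ar archive. The following is only a
minimal sketch, not dpkg code, assuming the common ar layout (an 8-byte
global magic, then 60-byte member headers with the size in bytes 48-57).
The function names list_ar_members and classify are made up for this
illustration, and the labels simply reuse the terms from the thread.

```python
import io

AR_MAGIC = b"!<arch>\n"  # global header of an ar archive


def list_ar_members(stream):
    """Return (name, size) for each member of an ar archive."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    members = []
    while True:
        header = stream.read(60)
        if not header:
            break  # clean end of archive
        if len(header) != 60 or header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Name is space padded; some writers append a trailing "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        # Member data is padded to an even number of bytes.
        stream.seek(size + (size % 2), io.SEEK_CUR)
    return members


def classify(name):
    """Map a member name to the terminology used in this thread."""
    if name.startswith("control.tar"):
        return "control member (metadata)"
    if name.startswith("data.tar"):
        return "data member (filesystem)"
    return "other"
```

Calling list_ar_members() on an open binary .deb would typically list
debian-binary, a control.tar.* member and a data.tar.* member, which is
the structure the "control member" wording refers to.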

Guillem Jover

Jan 14, 2023, 1:20:03 PM
Hi!

On Sat, 2022-12-17 at 17:24:57 -0700, Sean Whitton wrote:
> On Sat 17 Dec 2022 at 04:43PM +01, Guillem Jover wrote:
> > Sorry, probably my fault! As I tend to use «Fixes:» git pseudo-fields
> > for things that fix part of a bug, but are not intended yet to close it,
> > for which I use «Closes:».
> >
> > And for some reason I think I also got the impression, even though
> > the stanza changes had been committed, they could still be backed out.
> > (BTW I've now gone over the wiki and updated all paragraph references
> > that applied to stanza.)
> >
> > In any case, I've sat down and gone over the meat of the original
> > report. See below.

> We're sticking with 'stanza', and in light of that, could you confirm
> that the bug is reopened in order to make additional fixes, rather than
> back anything else out?

In case my other replies to Russ didn't make this clear: this comment
was in reference to the various replies in the sub-thread started by
Charles, where it looked to me like whether to back that out was still
an open question for the editors.

In any case, as I mentioned, given the changes being included in the
release, I took that as indeed sticking with the term, and that's why
I reopened and submitted the actual changes this original report was
intended to request. :)

As an aside I've since updated the dpkg code and docs to refer to
these as stanzas everywhere applicable.

Thanks,
Guillem
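
As a small illustration of what "stanza" means in practice: a
deb822-style control file is a sequence of stanzas separated by blank
lines, each holding "Field: value" lines, with continuation lines
starting with a space or tab. The sketch below is a simplified reading
of that format, not dpkg's parser; split_stanzas is a hypothetical
helper name, and edge cases such as duplicate fields or signed files
are deliberately ignored.

```python
def split_stanzas(text):
    """Split deb822-style text into a list of {field: value} dicts."""
    stanzas = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            # A blank line ends the current stanza, if any.
            if current:
                stanzas.append(current)
            current, last_field = {}, None
            continue
        if line[0] in " \t":
            # Continuation of the previous field's value.
            if last_field is None:
                raise ValueError("continuation line without a field")
            current[last_field] += "\n" + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed field line: {line!r}")
        last_field = name
        current[name] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas
```

Fed a debian/control file, this would return one dict for the source
stanza followed by one dict per binary package stanza.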

Russ Allbery

Sep 10, 2023, 2:40:03 PM
Guillem Jover <gui...@debian.org> writes:

> Seems I missed another file:

> * .changes:
> policy → «upload control file» / «Debian changes file»
> dpkg → «upload control file» / «.changes control file» /
> «Debian .changes file» / «Debian changes file»

[...]

> For changes I think something like the following might be a more clear
> option (and has the minor bonus of aligning perfectly on the first
> words! :), with it mentioning explicitly this is about changes being
> uploaded, and that it is a control file (but I'm not sure I'm entirely
> convinced about it):

> * .changes: «Debian upload changes control files»

[...]

> I've also found instances of «record» and «section» referring to fields
> or stanzas.

[...]

> I also recalled another term that has always seemed very confusing in
> context: «control information files» or «control information area». For
> example in a sentence such as “the control file is a control information
> file in the control information area in a .deb archive”. :) This also
> seems confusing when some of the files in the .deb control member are
> not really “control files” with a deb822(5) format.

> My thinking has been going into calling these as the «metadata files»,
> and being located in either the «metadata part of the .deb archive» or
> explicitly the «control member of the .deb archive», in contrast to the
> filesystem part. In dpkg I'd be eventually switching to meta/metadata
> and fsys/filesystem, from control or info and data. I've added a patch
> with the proposed change, but again nothing set in stone, and I'm again
> open to discussing pros/cons of this.

> Attached the proposals for discussion/review, and I might again have
> perhaps missed instances or similar.

All of these changes seem straightforward and uncontroversial to me, and
there are huge advantages to using consistent terminology between Policy
and dpkg. I have applied all of them for the next Policy release. Thank
you!

Edward Little

Sep 10, 2023, 9:20:05 PM
Please remove the following email address:  e.lit...@gmail.com