
Debian (would like) to do list


Drew Scott Daniels

Jul 26, 2002, 5:10:09 PM
I would like to become a Debian developer to help accomplish these tasks,
but my time is limited and I do not need to be a developer to help if some
developers pick up these tasks. Also my computing resources are limited so
projects like scanning source code and brute force "now" checking of
packages would be too time consuming without help or more resources. I
also haven't done all the checking that may be necessary, and thus some of
these tasks may be irrelevant or already underway. Having three exams
coming up, this may not be the best time for me to discuss anything, but
I got tired of waiting and, as you can see, my list is growing long.

I'm not sure what the appropriate forum is for discussing my list, but
debian-user seemed to be the best fit as I am a user.

First some clarification. When I say "before, after, now" I mean that the
uploader should(?) check this before uploading (perhaps this can be
automated), the archive maintainers or upload procedure should(?) check
this after it has been uploaded (perhaps this can be automated), and this
should(?) be checked for now to catch any violations that have been missed
(perhaps this can be automated).


Debian related tasks:

QA and improvements:
Continue the spell check campaign and look at improving it (before, after,
now)

Add grammar checking (before, after, now).

Add watch files to as many package sources (or diffs) as possible (before,
now, after may be unnecessarily complex).

Add Trove descriptions to packages and source (before, after, now). This
would be nice. Perhaps this may help improve the trove format.

Why are packages removed from the archives? There are many reasons, but
sometimes it's hard to find out. There should be some way of recording
this especially for those who track unstable on an infrequent basis.
Perhaps an entry into the Debian BTS under the package name?

Packages should purge configuration files before purging directories
otherwise empty directories can be left behind. (now)

Scanning package descriptions, documentation and other package related
areas for URL's and seeing if they are active URL's. (before, now)
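
A rough sketch of how such a URL scan might look, assuming the package description or documentation has already been read into a string. The function names extract_urls and is_alive are made up for illustration, the pattern is deliberately loose, and a real check would need retries and politeness towards the servers being probed.

```python
import re
import urllib.error
import urllib.request

# Loose pattern (an assumption, not a full URL grammar): a scheme followed by
# anything up to whitespace, a quote, or a closing bracket.
URL_RE = re.compile(r"""(?:https?|ftp)://[^\s<>"')\]]+""")


def extract_urls(text):
    """Return the URLs found in text, in order of appearance, without duplicates."""
    found = []
    for match in URL_RE.finditer(text):
        url = match.group(0).rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in found:
            found.append(url)
    return found


def is_alive(url, timeout=10):
    """Best-effort liveness check; any error is treated as a dead URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = getattr(response, "status", None)  # None for non-HTTP schemes
            return status is None or status < 400
    except (urllib.error.URLError, ValueError, OSError):
        return False
```

The extraction step can run before upload with no network access; the liveness step is the part that would have to be repeated "now" across the archive.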

Check to see if a package depends on a pseudopackage, transitional (also
dummy packages?) or other package that will be removed from the archives.
(before, after, now) Should Debian have a way to mark packages that are
going to be removed from the archives, pseudopackages, transitional and
dummy packages? A common word to describe such packages may help users to
better identify these packages and deal with them (users may want to
remove them, developers must want to depend on other packages).

Check for bash specific pieces of shell scripts where it may cause
problems such as in install scripts. (before, after, now)
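
As a very rough illustration of what an automated check could look for, here is a sketch that flags a handful of well-known bash-only constructs in scripts that claim to be /bin/sh. The pattern list is an assumption and far from complete, the name find_bashisms is hypothetical, and a line-based scan like this will miss a lot and misfire on text inside quotes.

```python
import re

# A few bash-only constructs; illustrative, not a complete list.
BASHISMS = [
    (re.compile(r"\[\[\s"), "[[ ... ]] test"),
    (re.compile(r"^\s*function\s+\w+"), "'function' keyword"),
    (re.compile(r"^\s*source\s"), "'source' instead of '.'"),
    (re.compile(r"\$\{\w+/"), "${var/pattern/replacement} substitution"),
]


def find_bashisms(script):
    """Return (line_number, description) pairs for a script with a #!/bin/sh shebang."""
    lines = script.splitlines()
    if not lines or not re.match(r"#!\s*/bin/sh\b", lines[0]):
        return []  # only scripts that claim to be plain sh are of interest
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith("#"):
            continue  # skip the shebang and whole-line comments
        for pattern, description in BASHISMS:
            if pattern.search(line):
                hits.append((number, description))
    return hits
```

Run over a maintainer script, this returns line numbers to review by hand rather than a verdict.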

Checking for policy violations or better fits:
Section 2.3.4 of the policy manual says:
"Packages are not required to declare any dependencies they have on other
packages which are marked Essential (see below), and should not do so
unless they depend on a particular version of that package.", this should
be checked for (before, after, now).
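
A minimal sketch of that check, assuming the Depends field is available as a string and the set of Essential package names has been collected elsewhere (for example from the Packages file). The function name and the sample set used in the test are placeholders.

```python
def unversioned_essential_deps(depends, essential):
    """Return essential packages named in a Depends field without a version constraint."""
    flagged = []
    for group in depends.split(","):           # comma separates requirements
        for alternative in group.split("|"):   # pipe separates alternatives
            alternative = alternative.strip()
            if not alternative:
                continue
            name = alternative.split("(", 1)[0].strip()
            has_version = "(" in alternative   # e.g. "dpkg (>= 1.10)"
            if name in essential and not has_version:
                flagged.append(name)
    return flagged
```

Versioned dependencies on Essential packages are left alone, since the policy text explicitly allows them.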

Check for packages that use old policies (before, after, now) and see if
the policy version can be updated or what needs to be done and file bug
reports against the package.

Check for contrib packages that can be moved to main (before, after, now).

Check for non-free packages that can be moved (before, after, now).

From dpkg (1.10.1) unstable's changelog:
"* Add conflict with dpkg-iasearch which intruded on our namespace." by:
-- Wichert Akkerman <wakk...@debian.org> Tue, 2 Jul 2002 12:34:07
+0200. Is this a policy violation? Did dpkg-iasearch violate a policy?

Automate testing of policy musts and where approval must be met create an
automated system for approval by people (may require authority structure
to be created). Many parts of Debian policy say to get approval from
debian-devel. I would like to avoid having people upload packages without
explicit approval which an automated mechanism could check for. (after,
now)

Reducing the size of the distribution & packages, cleaning up, and backing
up:
Look at not only gzipping documentation but also compressing other files
such as png files using pngcrush or other files using other utilities.
(before, after, now)

Why not bzip2 instead of gzip? New upcoming algorithms are being worked on
and there are known deficiencies in bzip2. See the bzip2 homepage and read
about how the author thinks that he can make some significant
improvements. Also see http://www.compression.ca for some comparisons of
archives and note that PPM variants compress things more. CTW is pretty
good too, but the algorithm that bzip2 is based on is lower on the list
for compression ratio. Using bzip2 on source files is a wishlist item for
Debian policy. I'm arguing that it's a good idea to look at algorithms
other than gzip, but jumping on bzip2 may be a large transition that may
be made unnecessary by another large transition to a new compression
format. I'm hoping to help in the development of new compression formats
some of which should have better performance than bzip2.

Section 2.4.1 of the policy manual says:
"only the first three components of the policy version are significant in
the Standards-Version control field, and so either these three components
or the all four components may be specified." As this is a may, I would
prefer the saved space over the acknowledgement of cosmetic differences.
If the cosmetic difference is found to cause a meaning to change then a
higher version number will be changed.
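
For what it's worth, reducing a Standards-Version to its three significant components, and comparing a declared version against the current one, could be sketched as below. The function names are hypothetical and the code assumes purely numeric, dot-separated components.

```python
def significant_standards_version(value):
    """Keep only the first three components of a Standards-Version string."""
    return ".".join(value.strip().split(".")[:3])


def is_older(declared, current):
    """True if the declared version is older than current, ignoring a fourth component."""
    def key(version):
        return tuple(int(part) for part in significant_standards_version(version).split("."))
    return key(declared) < key(current)
```

The same comparison could drive the check for packages that still declare an old policy version.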

A policy for reducing the length of changelogs may help reduce package
sizes. "Before, after, now" only after a policy has been chosen. I know
changelogs can be needed and useful. Changelogs can also be useless and
consume precious space, especially on minimal installations. Perhaps
packages could have a ranking of what files in them are necessary? This
may imply splitting the archives, but you can't split some files like
changelogs as they are required(2.4.4) for every package.

Optimize ordering of files in tar archives. tar files are usually
compressed, but if files of similar types are put closer together they can
compress better. I am looking at a simple method using 'file' and sorting
by 'file' type first, then looking at mime types, and then looking at
doing some statistical testing for file information. I may also create a
utility for using brute force to try every combination and then
compressing them and checking for the best order. Note that this may be
affected by concepts discussed in the gzip/bzip argument above as
compression methods do prefer different orderings in different cases.
(before, after, now)
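
To make the ordering idea concrete, here is a small sketch that builds a tar in memory in a chosen member order and measures the gzip-compressed size. Grouping by file extension is only a crude stand-in for the 'file' output mentioned above, all names are illustrative, and the brute-force search is factorial in the number of files, so it is only usable on a handful of members.

```python
import gzip
import io
import itertools
import os
import tarfile


def build_tar(files, order):
    """Build an uncompressed tar in memory from {name: bytes}, members in the given order."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in order:
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def order_by_type(files):
    """Group names by extension, then by name."""
    return sorted(files, key=lambda name: (os.path.splitext(name)[1], name))


def compressed_size(files, order):
    """Size in bytes of the gzip-compressed tar for this member order."""
    return len(gzip.compress(build_tar(files, order)))


def best_order_bruteforce(files):
    """Try every permutation; only feasible for very small archives."""
    return min(itertools.permutations(files),
               key=lambda order: compressed_size(files, order))
```

Comparing compressed_size for the original order, order_by_type and (where feasible) best_order_bruteforce would give actual numbers for whether reordering is worth the trouble with a given compressor.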

Removing unnecessary directories from package listings. Some .deb's
contain lists of directories that they need. Even when it is not required
that they list certain directories, they are still allowed to. (before,
and now, but as this is a 'may' then not after)

Detecting the 'want' for virtual packages (when many "depend" and/or
"require" have or's, or a virtual package is provided by few packages).
This may cause virtual packages to be either created or removed. (before,
after, now)

Using upx or alike for minimal installs, boot disks, base? Making it an
option? Perhaps this could be an option integrated into apt.

Some programs use static code for things like regular expressions and
handling tar archives. A program to go through the source code of all the
programs (or a developer effort) may help to find common code that could
be put into a library or that already is in a library. This could make
packages smaller, but if we're not careful, creating new libraries could
increase the overall installed size for a program. (before, now) An
additional benefit would be fewer places to change code (good for
security, good for efficiency, good for all updates). Are there any
security issues to exporting code from packages? (This should be looked at
whenever code's exported.)

Searching for more ways of removing unnecessary content from debs.

Using a thesaurus such as Aiksaurus may help to reduce the size of
descriptions. Shorter descriptions and more clear descriptions would be a
good project (aka laconic's good). Automated tools could help (before,
now).

http://lists.debian.org/debian-mentors/1999/debian-mentors-199901/msg00051.html
talks about putting datasets into Debian or non-free. I wonder what has
become of this particular dataset and if there has been a policy developed
for datasets. I would like to see astronomical, meteorological,
geographical and other data sets easily available. If a data set is
DFSG-free then I feel it should be put in main, but segregated somehow (in
extra?). Data sets may require maintenance too. For example, recently new,
more accurate data was collected about the distance certain stars are from
our sun. When I did some more investigation, I found out that a vote to
include a dataset section was made and it was decided to create a dataset
section. No such section was created and the astronomical data is sitting
with the person who made this proposal. Special handling of datasets may be
required to reduce the impact on Debian distribution infrastructure. I
recommend updates and distribution only be allowed through diffs or some
other method that uses less bandwidth than is used now.

findimagedups and other such packages could be used to search for
duplicate or near duplicate files in Debian packages. Then 'common' packages
which have these files can be created and/or symbolic links may be used to
save space. (before, now) Perhaps a program that makes symbolic links to
common files where necessary?
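
For exact duplicates, a content hash is enough. The sketch below walks a directory tree (say, a set of unpacked packages) and groups files with identical content; the function name is made up, and near-duplicate detection of the kind findimagedups does is a different problem that this does not attempt.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Map SHA-256 digest -> sorted list of paths under root with identical content."""
    by_digest = defaultdict(list)
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # existing symlinks and special files are not candidates
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    return {digest: sorted(paths)
            for digest, paths in by_digest.items() if len(paths) > 1}
```

Each group in the result is a candidate for a 'common' package or for replacing all but one copy with symbolic links.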

Create Debian cleanup procedure and program(s) (cruft, deborphan...), now.

Create Debian backup procedure and program(s) (debian cleanup, cruft to
backup, dpkg --get-selections > myselections, backup config files possibly
checking md5's which more than "should" be in every package), now.

Creating a version of Debian that binds a writeable filesystem onto a read
only filesystem (floppy writeable with a readonly CD). I would love to
have this to carry around and run Linux on any machine with a CDROM drive
that I could boot with. upx may be useful. A compressed filesystem for
writing may be useful. Support for umsdos, NFS, samba, and/or mounting
file systems, creating a file and mounting the file could be useful.
http://www.debian.org/CD/faq/index#live-cd is something I later found. I
would like to see more development, and official Debian development. Upon
further investigation bootcd seems to be a start, but how much of this can
it do? Maybe these features should be wishlist bugs, but the CD faq needs
to be updated, and I would still like to see an official CD image.

Should CD images be optimized for space? I saw an option to optimize CD
images for space in Roxio's Ez-cdcreator (formerly by Adaptec).


Security, Policy and other bug stuff:
Automated rough security audit of all source code (rats, splint & other
programs can be used, before, after, now).

Programs that use keyahead or mouseahead routines may be a security risk
or cause other undesirable results. One example: my apt-get, which uses
readline, has keyahead, so if I accidentally hit enter, the keystroke is
saved until the next question and then entered. Instead I'd prefer it be
disregarded so I can read the arbitrary (it's hard to predict the order)
question that appears next. Mouseahead can be very dangerous: if the
program hasn't updated the interface, the user will likely have no idea
what they will have clicked on ("it didn't work. I'll click again. What? I
didn't select that second option."). Yes, these are probably wishlist
bugs, but they could be normal bugs as this can affect the desired
functionality of programs. These bugs may also need to be tagged security
in some situations (the default password gets set by accident, etc). This may
tie into scanning code for security vulnerabilities. (With scanning before
and after, but this should be checked for now, especially where system
security can be involved).

'popularity-contest' and other methods can be very helpful in finding out
what users are interested in seeing being developed and maintained.
Perhaps this, archive (mirrors too?) statistics and other methods can be
used to create a priority list for the qa group. Perhaps a system should
be put in place to allow user input into package importance/maintenance
priority levels. Currently I would assume that a good system would be by
the subtype such as essential, optional...

Campaigning for signed debs to be a must (if not already). Signed debs
more than "should" be used (before, after, now).

Campaigning for md5 lists in debs to be a must. md5 values for all files
in packages more than "should" be done (before, after, now).
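
A sketch of what checking installed files against such a list could look like, assuming the list uses the md5sum-style layout of one "digest  relative/path" pair per line. The function name and the root parameter are illustrative assumptions.

```python
import hashlib
import os


def verify_md5sums(listing, root="/"):
    """Return the paths from an md5sum-style listing that are missing or differ."""
    bad = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        expected, relative = line.split(None, 1)  # digest, then the path
        relative = relative.strip()
        try:
            with open(os.path.join(root, relative), "rb") as handle:
                actual = hashlib.md5(handle.read()).hexdigest()
        except OSError:
            actual = None  # a missing or unreadable file counts as a mismatch
        if actual != expected:
            bad.append(relative)
    return bad
```

A list like this only detects accidental corruption or local changes; it does not replace signatures, since anyone who can alter the files can usually alter the list too.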

A procedure should be put in place to ensure installation starvation due
to dependencies does not occur in the unstable distribution. (perhaps
waiting a day for dependencies to catch up?) I feel this could be
automated or automated better (before, after, now).

Find a way to reduce the chance of bad NMU's (accidental, malicious,
poorly done, etc.). I haven't looked into how this is done now (if at
all), but the developer making the NMU should be warned that it's an NMU.
It may be good to list NMU policy for the first time for an NMU by a
specific developer and ask for confirmation. It may be good to have an
automated system where maintainers can block NMU's except by permission of
an authority such as the security or qa group.

Joey says at http://www.debian.org/devel/website/todo#misc that security
updates are on the same server as the signatures for the updates. This
could be a potential security issue as if one method is exploited to
change the files, it can be used to change the signatures at the same
time. wyrmbait at debianplanet.org says in his article Security with apt
(found at http://www.debianplanet.org/article.php?sid=643 ), that apt can
be viewed as a single point of failure. While his arguments may not quite
be thorough, he does bring up some issues of security. Why not have a
package for keys/certificates, and then have dpkg complain if a new package
has not been correctly signed? Also packages in the archive should be
signed by a public key that is available on many public key servers and
available offline (on CD perhaps). Changes to the keyring packages would
need to have the appropriate signature(s).

Packages being signed by multiple people and allowing users to assign
trust levels (checked before installing an upgrade) to people could
improve security.

I would like to encourage distribution of public key server media. Having
keys stored online lends them to potential man in the middle attacks even
if multiple protocols are used. It's much more difficult to circumvent an
offline signature.

One of the reasons for the delay of the release of Woody was said to have
been security concerns. It has also been reported (see the glibc example
at http://www.debianplanet.org/article.php?sid=568 ) that it takes a long
time for security patches to get through due to the compiling and testing
on the 68k and arm architectures. I would like to bring forward the idea
of using emulation to help speed things up. There was recently (March?) a
patch for UAE (a 68k emulator) to support running Linux. There are also
emulators for the arm architecture such as arcem (
http://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no\&bug=136844 for
details ). arcem is said to be quite fast on Intel architecture. Emulation
of old architecture brings two advantages and one disadvantage: it's
usually faster and it's easier to get, but it could have trouble being a
completely accurate emulation of the original hardware (bugs not emulated
or new bugs not yet found/patched).

Many of the patches and programs found at
http://www.theaimsgroup.com/~hlein/haqs/ can be quite useful. The programs
can be packaged. The patches, when useful, should be added to existing
packages or modified to make them run time options. For example the idle
connection traffic patch for ssh may be a useful option that may be
possible to be chosen at runtime.

Performance issues:
Someone mentioned the idea of ordering the startup scripts into a
dependency tree, and have programs startup in parallel. I feel this would
be useful for many people running many startup scripts such as myself.
Perhaps this should be a before, after, now. If nothing else, it should be
looked at now and a policy document regarding this may be useful. I forget
which Debian developer I read this idea from.
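
The dependency tree idea boils down to a topological sort where everything that becomes ready at the same time can be started together. A minimal sketch using Python's graphlib (3.9 or later); the service names in the usage note are invented, and real init scripts would first need some way to declare their dependencies.

```python
from graphlib import TopologicalSorter


def startup_waves(dependencies):
    """Group services into waves; everything within one wave can start in parallel.

    dependencies maps each service to the set of services it needs first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

With hypothetical dependencies such as ssh needing networking and syslog, cron needing syslog, and both networking and syslog needing mountall, this yields three waves: mountall alone, then networking and syslog together, then cron and ssh together.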

Having packages install in parallel may speed up installation. This may be
a wishlist item for dpkg or apt.

Should CD images be optimized for speed? I saw an option to optimize CD
images for creation speed in Roxio's Ez-cdcreator (formerly by Adaptec).
Also speed of installation or other reads of the CD. Seek time might also
be a consideration when choosing what order to put data on CD's.

Using programs like cmix, the performance of programs may be able to be
optimized (before and now, but not after upload as optimizing programs may
not work desirably).

Sometimes threads come up about performance optimization done at compile
time. Yes, numbers have not been provided, but some should be. As
such, a comparison of compilers such as gcc, icc (Intel's compiler, if allowed),
tcc and any other compilers should be done. Binaries could be compiled
with each available compiler and then checked to see which produces the
best results in application performance, binary size and perhaps compile
time. (Smallest binary size usually means high compile time and better
application performance, or so I've been told.) This should be done before
upload and now.


Other sometimes bigger ideas:
Should there be a method to force retirement of developers? I don't
believe so, I believe that a new category should be created for developers
who are not active. Why separate inactive developers? To limit the
security risk and make managing developers easier.

A restructuring of the online distribution protocol is needed. Recently in
the Debian Weekly news this was mentioned and this has been discussed. A
BTS location may be a good place to start putting won't fix, wishlist and
other bug information regarding the distribution protocol(s). Personally,
I'd like to see low server loads, compressed files, deltas, and have
upgrade priorities visible before downloading the package/archive.

It might be nice to make debian/watch files separately available and to
have a watch file for all upstream sources even when it's version
specific. It would also be nice to carry md5's for upstream sources (last
known version of course) so when upstream sources get modified (like the
dsniff security issue), users of watch files to grab the current source
get a heads up that there may be something wrong.

Support for installing Debian via a netboot/bootp by distributing an
official netboot image.

A comparison of xwindows terminals (or is it terminal emulators?) is
desirable. xvt seems to have a smaller footprint than rxvt which, I
thought, was supposed to be reduced xvt.
http://dickey.his.com/xterm/xterm.faq.html has some starting information.
This would be useful for creating a small RAM xwindows install.


Other related projects that aren't Debian specific:
RATS for GNU assembly (note: intel2gas) would be useful if it existed,
but it doesn't, yet.

An open source grammar checker (not EBNF or alike) doesn't seem to exist.
Openoffice lacks a grammar checker and does not plan to add one. A grammar
checker is a major proofing tool that would be extremely useful to many
people. I did find one open source grammar checker called Link Grammar
http://www.link.cs.cmu.edu/link/ . I disagree that English evolves too
fast for a static grammar checker to be created, as many in the commercial
world have created one.

Update File's database. This would be useful for my projects looking at
reordering files in tar archives and other compression projects of mine.
(This may have to be a Debian thing as I don't see updates to the database
very often.)

Other related projects (to those discussed):
I'm working on some compression algorithms (Charles Bloom at
http://www.cbloom.com and xiph have some starting work of what I can do).
I believe I can improve existing compression. I don't have much time as
I'm a full time student and I need money to pay rent. I will be graduating
with a computer science degree in December.

Drew Daniels

PS: Any help at finding me a good job would be appreciated (contract or
otherwise). A reasonable version of my resume is available at
http://home.cc.umanitoba.ca/~umdanie8/resume.html


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org

Joey Hess

Jul 26, 2002, 8:20:08 PM
Drew Scott Daniels wrote:
> I would like to become a Debian developer to help accomplish these tasks,
> but my time is limited and I do not need to be a developer to help if some
> developers pick up these tasks. Also my computing resources are limited so
> projects like scanning source code and brute force "now" checking of
> packages would be too time consuming without help or more resources. I
> also haven't done all the checking that may be necessary, and thus some of
> these tasks may be irrelevant or already underway. Having three exams
> coming up, this may not be the best time for me to discuss anything, but
> I got tired of waiting and, as you can see, my list is growing long.

Good list. It reminds me of the list I had when I became a developer, 7
years ago, except it looks like it could easily keep someone busy for
20, rather than just 10 years.

You should not find it hard to become a developer given your quality of
thought on debian, attention to detail, and wide variety of stuff you
want to see fixed -- if you follow through on it. I'm going to annotate
this list to note where we are on some things. There is also a ton of
stuff here that falls under the heading QA and can easily be worked on
by non-developers.

> I'm not sure what the appropriate forum is for discussing my list, but
> debian-user seemed to be the best fit as I am a user.

debian-devel (or debian-qa) would really be better, cc'd.

> First some clarification. When I say "before, after, now" I mean that the
> uploader should(?) check this before uploading (perhaps this can be
> automated), the archive maintainers or upload procedure should(?) check
> this after it has been uploaded (perhaps this can be automated), and this
> should(?) be checked for now to catch any violations that have been missed
> (perhaps this can be automated).
>
>
> Debian related tasks:
>
> QA and improvements:
> Continue the spell check campaign and look at improving it (before, after,
> now)
>
> Add grammar checking (before, after, now).
>
> Add watch files to as many package sources (or diffs) as possible (before,
> now, after may be unnecessarily complex).

Some kind of semi-automated tool to do this might help. I have watch
files on all my packages that can have watch files, but it seems to take
at least 5 minutes to add and test one.

> Add Trove descriptions to packages and source (before, after, now). This
> would be nice. Perhaps this may help improve the trove format.
>
> Why are packages removed from the archives? There are many reasons, but
> sometimes it's hard to find out. There should be some way of recording
> this especially for those who track unstable on an infrequent basis.
> Perhaps an entry into the Debian BTS under the package name?

There is a file on the archive whose name I cannot remember that lists
removed packages and why.

> Packages should purge configuration files before purging directories
> otherwise empty directories can be left behind. (now)

It is rumored that an in-progress rewrite of dpkg-deb may address this.

> Scanning package descriptions, documentation and other package related
> areas for URL's and seeing if they are active URL's. (before, now)

Good idea and even somewhat easy.

> Check to see if a package depends on a pseudopackage, transitional (also
> dummy packages?) or other package that will be removed from the archives.
> (before, after, now) Should Debian have a way to mark packages that are
> going to be removed from the archives, pseudopackages, transitional and
> dummy packages? A common word to describe such packages may help users to
> better identify these packages and deal with them (users may want to
> remove them, developers must want to depend on other packages).

I think the word is "deprecated", but maybe you mean a formal control
file field.

> Check for bash specific pieces of shell scripts where it may cause
> problems such as in install scripts. (before, after, now)

Partly done already by lintian, a full check is hard.

> Checking for policy violations or better fits:
> Section 2.3.4 of the policy manual says:
> "Packages are not required to declare any dependencies they have on other
> packages which are marked Essential (see below), and should not do so
> unless they depend on a particular version of that package.", this should
> be checked for (before, after, now).

I think there is something on qa.debian.org that lists these.

> Check for packages that use old policies (before, after, now) and see if
> the policy version can be updated or what needs to be done and file bug
> reports against the package.

The check is already done by lintian, and the rest is already done
occasionally for the very old policy versions. Any work toward tightening
it up to more recent versions is all to the good.

> Check for contrib packages that can be moved to main (before, after, now).
>
> Check for non-free packages that can be moved (before, after, now).

Generally done by their respective maintainers, I think we do a pretty
good job here.

> From dpkg (1.10.1) unstable's changelog:
> "* Add conflict with dpkg-iasearch which intruded on our namespace." by:
> -- Wichert Akkerman <wakk...@debian.org> Tue, 2 Jul 2002 12:34:07
> +0200. Is this a policy violation? Did dpkg-iasearch violate a policy?

No, it's just Wichert being inconsistent (dpkg-repack intruded on that
"namespace" long ago, and need I mention debhelper?)

> Automate testing of policy musts and where approval must be met create an
> automated system for approval by people (may require authority structure
> to be created). Many parts of Debian policy say to get approval from
> debian-devel. I would like to avoid having people upload packages without
> explicit approval which an automated mechanism could check for. (after,
> now)

I'm not sure I understand this one.

> Reducing the size of the distribution & packages, cleaning up, and backing
> up:
> Look at not only gzipping documentation but also compressing other files
> such as png files using pngcrush or other files using other utilities.
> (before, after, now)
>
> Why not bzip2 instead of gzip? New upcoming algorithms are being worked on
> and there are known deficiencies in bzip2. See the bzip2 homepage and read
> about how the author thinks that he can make some significant
> improvements. Also see http://www.compression.ca for some comparisons of
> archives and note that PPM variants compress things more. CTW is pretty
> good too, but the algorithm that bzip2 is based on is lower on the list
> for compression ratio. Using bzip2 on source files is a wishlist item for
> Debian policy. I'm arguing that it's a good idea to look at algorithms
> other than gzip, but jumping on bzip2 may be a large transition that may
> be made unnecessary by another large transition to a new compression
> format. I'm hoping to help in the development of new compression formats
> some of which should have better performance than bzip2.

It seems certain that the new source format will support bzip2'd source
packages. There are as you note performance issues, and those have been
used to shoot down suggestions to use bzip2 in binary packages in the
past.

> Section 2.4.1 of the policy manual says:
> "only the first three components of the policy version are significant in
> the Standards-Version control field, and so either these three components
> or the all four components may be specified." As this is a may, I would
> prefer the saved space over the acknowledgement of cosmetic differences.
> If the cosmetic difference is found to cause a meaning to change then a
> higher version number will be changed.

I'm ambivalent.

> A policy for reducing the length of changelogs may help reduce package
> sizes. "Before, after, now" only after a policy has been chosen. I know
> changelogs can be needed and useful. Changelogs can also be useless and
> consume precious space, especially on minimal installations. Perhaps
> packages could have a ranking of what files in them are necessary? This
> may imply splitting the archives, but you can't split some files like
> changelogs as they are required(2.4.4) for every package.

I'd love to see some kind of a formal policy on this. I think we should
keep old debian changelogs for ever in the source package, but there is
little value to the user in most cases in the 3 hundredth changelog
entry down being in a binary package. There are exceptions, though (or
maybe I'm the only one who goes trawling through ancient bits of
debhelper's changelog, I don't know). I know that some trimming is
already happening on an ad-hoc basis, and it would be useful if this
were formalized. Perhaps there should be a mark that can be placed in a
changelog, below which it is truncated when being put in a binary
package. Or perhaps changelogs could be truncated to X years back after
installation, based on the user's preferences. Just making sure that we
symlink changelogs together when possible between binary packages that
share a source package would save a lot of space.
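
Truncation by entry count is the simplest variant to sketch. The snippet below assumes the usual changelog layout, where each entry starts with a header line in the first column and every other line is indented or blank; the function name and the idea of a fixed count (rather than a mark or an age limit) are assumptions for illustration.

```python
def truncate_changelog(text, keep):
    """Keep only the newest `keep` entries of a Debian-style changelog."""
    kept = []
    entries = 0
    for line in text.splitlines(keepends=True):
        # Entry headers begin in column 0; body and trailer lines are indented.
        if line and not line[0].isspace():
            entries += 1
            if entries > keep:
                break
        kept.append(line)
    return "".join(kept)
```

A truncation mark or a cutoff date would slot into the same loop by changing the condition that triggers the break.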

> Optimize ordering of files in tar archives. tar files are usually
> compressed, but if files of similar types are put closer together they can
> compress better. I am looking at a simple method using 'file' and sorting
> by 'file' type first, then looking at mime types, and then looking at
> doing some statistical testing for file information. I may also create a
> utility for using brute force to try every combination and then
> compressing them and checking for the best order. Note that this may be
> affected by concepts discussed in the gzip/bzip argument above as
> compression methods do prefer different orderings in different cases.
> (before, after, now)

Heh, interesting idea, not very debian specific.

> Removing unnecessary directories from package listings. Some .deb's
> contain lists of directories that they need. Even when it is not required
> that they list certain directories, they are still allowed to. (before,
> and now, but as this is a 'may' then not after)

Do you refer to debs that happen to include an empty directory that is
already on the system, like a random deb that puts programs in /usr/bin
having an empty /sbin directory?

> Detecting the 'want' for virtual packages (when many "depend" and/or
> "require" have or's, or a virtual package is provided by few packages).
> This may cause virtual packages to be either created or removed. (before,
> after, now)
>
> Using upx or alike for minimal installs, boot disks, base? Making it an
> option? Perhaps this could be an option integrated into apt.

It has been discussed on debian-boot in the past; there are plenty of
issues. See archives.

> Some programs use static code for things like regular expressions and
> handling tar archives. A program to go through the source code of all the
> programs (or a developer effort) may help to find common code that could
> be put into a library or that already is in a library. This could make
> packages smaller, but if we're not careful, creating new libraries could
> increase the overall installed size for a program. (before, now) An
> additional benefit would be fewer places to change code (good for
> security, good for efficiency, good for all updates). Are there any
> security issues to exporting code from packages? (This should be looked at
> whenever code's exported.)

Very cute idea.

> Searching for more ways of removing unnecessary content from debs.
>
> Using a thesaurus such as Aiksaurus may help to reduce the size of
> descriptions. Shorter descriptions and more clear descriptions would be a
> good project (aka laconic's good). Automated tools could help (before,
> now).

I wouldn't trust an automated tool, but neat idea.

> http://lists.debian.org/debian-mentors/1999/debian-mentors-199901/msg00051.html
> talks about putting datasets into Debian or non-free. I wonder what has
> become of this particular dataset and if there has been a policy developed
> for datasets. I would like to see astronomical, meteorological,
> geographical and other data sets easily available. If a data set is
> DFSG-free then I feel it should be put in main, but segregated somehow (in
> extra?). Data sets may require maintenance too. For example, recently new,
> more accurate data was collected about the distance certain stars are from
> our sun. When I did some more investigation, I found out that a vote to
> include a dataset section was made and it was decided to create a dataset
> section. No such section was created and the astronomical data is sitting
> with the person who made this proposal. Special handling of datasets may be
> required to reduce the impact on Debian distribution infrastructure. I
> recommend updates and distribution only be allowed through diffs or some
> other method that uses less bandwidth than is used now.

Yes, that idea seems to be stalled.

> findimagedups and other such packages could be used to search for
> duplicate or near duplicate files in Debian packages. Then 'common' packages
> which have these files can be created and/or symbolic links may be used to
> save space. (before, now) Perhaps a program that makes symbolic links to
> common files where necessary?

Beware of overfragmentation of packages though.

> Create Debian cleanup procedure and program(s) (cruft, deborphan...), now.
>
> Create Debian backup procedure and program(s) (debian cleanup, cruft to
> backup, dpkg --getselections > myselections, backup config files possibly
> checking md5's which more than should be in every package), now.
>
> Creating a version of Debian that binds a writeable filesystem onto a read
> only filesystem (floppy writeable with a readonly CD). I would love to
> have this to carry around and run Linux on any machine with a CDROM drive
> that I could boot with. upx may be useful. A compressed filesystem for
> writing may be useful. Support for umsdos, NFS, samba, and/or mounting
> file systems, creating a file and mounting the file could be useful.
> http://www.debian.org/CD/faq/index#live-cd is something I later found. I
> would like to see more development, and official Debian development. Upon
> further investigation bootcd seems to be a start, but how much of this can
> it do? Maybe these features should be wishlist bugs, but the CD faq needs
> to be updated, and I would still like to see an official CD image.

Somebody please do this, mmmkay? IIRC I saw Robster and someone
discussing this on irc the other day.

> Should CD images be optimized for space? I saw an option to optimize CD
> images for space in Roxio's Ez-cdcreator (formerly by Adaptec).

Depends on how much slower or less portable it makes them and how much
space is gained, I imagine.

> Security, Policy and other bug stuff:
> Automated rough security audit of all source code (rats, splint & other
> programs can be used, before, after, now).
>
> Programs that use keyahead or mouseahead routines may be a security risk
> or cause other undesirable results. One example is my apt-get using
> readline has keyahead so if I accidentally hit enter, the enter is saved
> until the next question and then inputted. Instead I'd prefer it be
> disregarded so I can read the arbitrary (it's hard to predict the order)
> question that appears next. Mouseahead can be very dangerous if the
> program hasn't updated the interface, the user will likely have no idea
> what they will have clicked on ("it didn't work. I'll click again. What? I
> didn't select that second option."). Yes, these are probably wishlist
> bugs, but they could be a normal bug as this can affect the desired
> functionality of programs. These bugs may also need to be tagged security in
> some situations (the default password gets set by accident, etc). This may
> tie into scanning code for security vulnerabilities. (With scanning before
> and after, but this should be checked for now, especially where system
> security can be involved).

I think that in all cases except for new password prompts, it can be
successfully argued that _someone_ will be able to anticipate what prompts
are coming up and type ahead, and so that will get shot down. Maybe
modifying a terminal emulator to let typeahead be turned off, or
displayed at the bottom of the screen and cleared with a keypress, or
something like that, would be more effective?

> 'popularity-contest' and other methods can be very helpful in finding out
> what users are interested in seeing being developed and maintained.
> Perhaps this, archive (mirrors too?) statistics and other methods can be
> used to create a priority list for the qa group. Perhaps a system should
> be put in place to allow user input into package importance/maintenance
> priority levels. Currently I would assume that a good system would be by
> the subtype such as essential, optional...
>
> Campaigning for signed debs to be a must (if not already). Signed debs
> more than should be used (before, after, now).
>
> Campaigning for md5 lists in debs to be a must. md5 values for all files
> in packages more than should be done (before, after, now).

If the dpkg people ever get around to making this part of the package
format for real, it will happen. People seem to take malicious delight
in shooting down the idea that the current files become required by
policy since they are not perfect, while ignoring the fact that they are
very useful to a lot of people. Sigh.

> A procedure should be put in place to ensure installation starvation due
> to dependencies does not occur in the unstable distribution. (perhaps
> waiting a day for dependencies to catch up?) I feel this could be
> automated or automated better (before, after, now).

Testing already does this, and putting it in unstable would just add
another hurdle to making big changes to unstable. It's really not a good
idea, and it has been discussed before.

> Find a way to reduce the chance of bad NMU's (accidental, malicious,
> poorly done, etc.). I haven't looked into how this is done now (if at
> all), but the developer making the NMU should be warned that it's an NMU.
> It may be good to list NMU policy for the first time for an NMU by a
> specific developer and ask for confirmation. It may be good to have an
> automated system where maintainers can block NMU's except by permission of
> an authority such as the security or qa group.

Social solutions are the right ones here. We do a pretty good job.

> Joey says at http://www.debian.org/devel/website/todo#misc that security
> updates are on the same server as the signatures for the updates. This
> could be a potential security issue as if one method is exploited to
> change the files, it can be used to change the signatures at the same
> time. wyrmbait at debianplanet.org says in his article Security with apt
> (found at http://www.debianplanet.org/article.php?sid=643 ), that apt can
> be viewed as a single point of failure. While his arguments may not quite
> be thorough, he does bring up some issues of security. Why not have a
> package for keys/certificates, then have dpkg complain if a new package
> has not been correctly signed. Also packages in the archive should be
> signed by a public key that is available on many public key servers and
> available offline (on CD perhaps). Changes to the keyring packages would
> need to have the appropriate signature(s).

We just need an end-to-end package signing infrastructure with no holes,
really.

> Packages being signed by multiple people and allowing users to assign
> trust levels (checked before installing an upgrade) to people could
> improve security.
>
> I would like to encourage distribution of public key server media. Having
> keys stored online lends them to potential man in the middle attacks even
> if multiple protocols are used. It's much more difficult to circumvent an
> offline signature.

debian-keyring.deb, you mean? It's on the non-US cd's anyway, and will
be on the regular cd's for sarge I hope.

Henrique de Moraes Holschuh. He has this well in hand I think.

> Having package install in parallel may speed up installation. This may be
> a wishlist item for dpkg or apt.

Very unlikely. On a system fast enough to not run into CPU problems
doing this, you'd probably become quickly disk bound, and this would
just churn the disk more. It's also rather hard to do right.

> Using programs like cmix, the performance of programs may be able to be
> optimized (before and now, but not after upload as optimizing programs may
> not work desirably).
>
> Sometimes threads come up about performance optimization done at compile
> time. Yes numbers have not been provided, but some number should be. As
> such a comparison of compilers of gcc, lc (Intel's compiler if allowed),
> tcc and any other compilers should be done. Binaries could be compiled
> with each available compiler and then checked to see which produces the
> best results in application performance, binary size and perhaps compile
> time. (Smallest binary size usually means high compile time and better
> application performance, or so I've been told.) This should be done before
> upload and now.

By all means I'd like to see some real numbers.

> Other sometimes bigger ideas:
> Should there be a method to force retirement of developers? I don't
> believe so, I believe that a new category should be created for developers
> who are not active. Why separate inactive developers? To limit the
> security risk and make managing developers easier.

It has been discussed; search for "MIA" and "emeritus" around the DPL
campaigning times.

> A restructuring of the online distribution protocol is needed. Recently in
> the Debian Weekly news this was mentioned and this has been discussed. A
> BTS location may be a good place to start putting won't fix, wishlist and
> other bug information regarding the distribution protocol(s). Personally,
> I'd like to see low server loads, compressed files, deltas, and have
> upgrade priorities visible before downloading the package/archive.
>
> It might be nice to make debian/watch files separately available and to
> have a watch file for all upstream sources even when it's version
> specific. It would also be nice to carry md5's for upstream sources (last
> known version of course) so when upstream sources get modified (like the
> dsniff security issue), users of watch files to grab the current source
> get a heads up that there may be something wrong.

So long as packages use pristine upstream sources, we already have such
md5's in the .dsc files.

--
see shy jo

Oohara Yuuma

Jul 26, 2002, 9:50:05 PM
[moving the thread to debian-devel --- see Mail-Followup-To:]

On Fri, 26 Jul 2002 16:07:41 -0500 (CDT),
Drew Scott Daniels <umda...@cc.UManitoba.CA> wrote:
> I would like to become a Debian developer to help accomplish these tasks,
> but my time is limited and I do not need to be a developer to help if some
> developers pick up these tasks. Also my computing resources are limited so
> projects like scanning source code and brute force "now" checking of
> packages would be too time consuming without help or more resources.

As you said, you can help Debian even if you don't have a Debian
account. Why don't you pick a package you are interested in and
check it?

> I'm not sure what the appropriate forum for discussing my list, but
> debian-user seemed to be the best fit as I am a user.

debian-devel is better

> First some clarification. When I say "before, after, now" I mean that the
> uploader should(?) check this before uploading (perhaps this can be
> automated), the archive maintainers or upload procedure should(?) check
> this after it has been uploaded (perhaps this can be automated), and this
> should(?) be checked for now to catch any violations that have been missed
> (perhaps this can be automated).

> Debian related tasks:

> QA and improvements:
> Continue the spell check campaign and look at improving it (before, after,
> now)
>
> Add grammar checking (before, after, now).

Go ahead, do it yourself.

> Add watch files to as many package sources (or diffs) as possible (before,
> now, after may be unnecessarily complex).

Patches welcome.

> Add Trove descriptions to packages and source (before, after, now). This
> would be nice. Perhaps this may help improve the trove format.

I don't know why Debian should bother about
http://www.tuxedo.org/~esr/trove/ .

> Why are packages removed from the archives? There are many reasons, but
> sometimes it's hard to find out. There should be some way of recording
> this especially for those who track unstable on an infrequent basis.
> Perhaps an entry into the Debian BTS under the package name?

http://ftp-master.debian.org/removals.txt

> Packages should purge configuration files before purging directories
> otherwise empty directories can be left behind. (now)

If a package foo has package.d/ style configuration files and another
package bar creates a configuration file in package.d/, foo must not
remove package.d/bar .

> Scanning package descriptions, documentation and other package related
> areas for URL's and seeing if they are active URL's. (before, now)

Go ahead, do it yourself.
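
As a starting point for the URL scan quoted above, here is a minimal sketch that only pulls candidate URLs out of a block of text such as a package description. The regular expression and the function name are assumptions made for illustration; checking whether each URL is still active is left out, since that needs network access and a policy for timeouts and redirects.

```python
import re

# Deliberately simple pattern: http(s) URLs up to whitespace, angle brackets or ")".
URL_RE = re.compile(r"https?://[^\s<>)]+")


def extract_urls(text):
    """Return the distinct URLs found in `text`, in order of first appearance."""
    urls = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls
```

Each returned URL could then be fetched, for example with urllib.request from the standard library, and any failures collected into a report for the maintainer.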

> Check to see if a package depends on a pseudopackage, transitional (also
> dummy packages?) or other package that will be removed from the archives.
> (before, after, now) Should Debian have a way to mark packages that are
> going to be removed from the archives, pseudopackages, transitional and
> dummy packages? A common word to describe such packages may help users to
> better identify these packages and deal with them (users may want to
> remove them, developers must want to depend on other packages).

Why don't you read Description: of installed packages?
grep-status is your friend.

> Check for bash specific pieces of shell scripts where it may cause
> problems such as in install scripts. (before, after, now)
>
> Checking for policy violations or better fits:
> Section 2.3.4 of the policy manual says:
> "Packages are not required to declare any dependencies they have on other
> packages which are marked Essential (see below), and should not do so
> unless they depend on a particular version of that package.", this should
> be checked for (before, after, now).

Go ahead, do it yourself.
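
The check on Essential packages quoted above is mechanical enough to sketch. This is an illustration only: it takes the value of a Depends: field as a string and a caller-supplied set of essential package names (both hypothetical inputs here), and reports essential packages that are depended on without a version constraint. It is not a description of what any existing tool does.

```python
def unversioned_essential_deps(depends, essential):
    """List essential packages named in `depends` without a version constraint."""
    flagged = []
    for clause in depends.split(","):
        # A clause may offer alternatives separated by "|".
        for alternative in clause.split("|"):
            alternative = alternative.strip()
            if not alternative:
                continue
            name = alternative.split("(", 1)[0].strip()
            has_version = "(" in alternative
            if name in essential and not has_version:
                flagged.append(name)
    return flagged
```

For example, with the Depends value "libc6 (>= 2.2.5), bash, dpkg (>= 1.10)" and an essential set containing bash and dpkg, only bash is reported, because the dpkg dependency carries a version.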

> Check for packages that use old policies (before, after, now) and see if
> the policy version can be updated or what needs to be done and file bug
> reports against the package.

lintian does this.

> Check for contrib packages that can be moved to main (before, after, now).
>
> Check for non-free packages that can be moved (before, after, now).

Go ahead, do it yourself. (Yes, I moved gpgp to main.)

> From dpkg (1.10.1) unstable's changelog:
> "* Add conflict with dpkg-iasearch which intruded on our namespace." by:
> -- Wichert Akkerman <wakk...@debian.org> Tue, 2 Jul 2002 12:34:07
> +0200. Is this a policy violation? Did dpkg-iasearch violate a policy?

Seems like a first-come, first-served situation.

> Automate testing of policy musts and where approval must be met create an
> automated system for approval by people (may require authority structure
> to be created). Many parts of Debian policy say to get approval from
> debian-devel. I would like to avoid having people upload packages without
> explicit approval which an automated mechanism could check for. (after,
> now)

Debian does not need bureaucracy.

> Reducing the size of the distribution & packages, cleaning up, and backing
> up:
> Look at not only gziping documentation but also compressing other files
> such as png files using pngcrush or other files using other utilities.
> (before, after, now)

I don't think png files can be compressed well without reducing the number
of colors in them.

> Why not bzip2 instead of gzip? New upcoming algorithms are being worked on
> and there are known deficiencies in bzip2. See the bzip2 homepage and read
> about how the author thinks that he can make some significant
> improvements. Also see http://www.compression.ca for some comparisons of
> archives and note that PPM variants compress things more. CTW is pretty
> good too, but the algorithm that bzip2 is based on is lower on the list
> for compression ratio. Using bzip2 on source files is a wishlist item for
> Debian policy. I'm arguing that it's a good idea to look at algorithms
> other than gzip, but jumping on bzip2 may be a large transition that may
> be made unnecessary by another large transition to a new compression
> format. I'm hoping to help in the development of new compression formats
> some of which should have better performance than bzip2.

You don't know dbs or dpkg-source v2. Read debian-devel.

> Section 2.4.1 of the policy manual says:
> "only the first three components of the policy version are significant in
> the Standards-Version control field, and so either these three components
> or the all four components may be specified." As this is a may, I would
> prefer the saved space over the acknowledgement of cosmetic differences.
> If the cosmetic difference is found to cause a meaning to change then a
> higher version number will be changed.

Replacing 3.5.6.1 with 3.5.6 saves only 2 bytes. Re-building the entire
archive (takes at least one week) for (at most) 20k bytes is too much.

> A policy for reducing the length of changelogs may help reduce package
> sizes. "Before, after, now" only after a policy has been chosen. I know
> changelogs can be needed and useful. Changelogs can also be useless and
> consume precious space, especially on minimal installations. Perhaps
> packages could have a ranking of what files in them are necessary? This
> may imply splitting the archives, but you can't split some files like
> changelogs as they are required(2.4.4) for every package.

I think changelogs should be more verbose. You can remove
/usr/share/doc/package/changelog.Debian.gz (or even /usr/share/doc/ itself)
manually after install.

> Optimize ordering of files in tar archives. tar files are usually
> compressed, but if files of similar types are put closer together they can
> compress better. I am looking at a simple method using 'file' and sorting
> by 'file' type first, then looking at mime types, and then looking at
> doing some statistical testing for file information. I may also create a
> utility for using brute force to try every combination and then
> compressing them and checking for the best order. Note that this may be
> affected by concepts discussed in the gzip/bzip argument above as
> compression methods do prefer different orderings in different cases.
> (before, after, now)

Contact the dpkg maintainers. Note that such brute force optimization is
a nightmare for autobuilders.

> Removing unnecessary directories from package listings. Some .deb's
> contain lists of directories that they need. Even when it is not required
> that they list certain directories, they are still allowed to. (before,
> and now, but as this is a 'may' then not after)

Go ahead, do it yourself. I don't think there are many.

> Detecting the 'want' for virtual packages (when many "depend" and/or
> "require" have or's, or a virtual package is provided by few packages).
> This may cause virtual packages to be either created or removed. (before,
> after, now)

Sounds like Provides:.

> Using upx or alike for minimal installs, boot disks, base? Making it an
> option? Perhaps this could be an option integrated into apt.

Working debian-installer has much higher priority.

> Some programs use static code for things like regex expressions and
> handling tar archives. A program to go through the source code of all the
> programs (or a developer effort) may help to find common code that could
> be put into a library or that already is in a library. This could make
> packages smaller, but if we're not careful, creating new libraries could
> increase the overall installed size for a program. (before, now) An
> additional benefit would be fewer places to change code (good for
> security, good for efficiency, good for all updates). Are there any
> security issues to exporting code from packages? (This should be looked at
> whenever code's exported.)

Do you know why apt-get works so wonderfully?
Have you ever heard anything about SONAME?

> Searching for more ways of removing unnecessary content from debs.

Why do you install unnecessary packages?

> Using a thesaurus such as Aiksaurus may help to reduce the size of
> descriptions. Shorter descriptions and more clear descriptions would be a
> good project (aka laconic's good). Automated tools could help (before,
> now).

I don't think newspeak is the best language for Description: .

> http://lists.debian.org/debian-mentors/1999/debian-mentors-199901/msg00051.html
> talks about putting datasets into Debian or non-free. I wonder what has
> become of this particular dataset and if there has been a policy developed
> for datasets. I would like to see astronomical, meteorological,
> geographical and other data sets easily available. If a data set is
> DFSG-free then I feel it should be put in main, but segregated somehow (in
> extra?). Data sets may require maintenance too. For example, recently new,
> more accurate data was collected about the distance certain stars are from
> our sun. When I did some more investigation, I found out that a vote to
> include a dataset section was made and it was decided to create a dataset
> section. No such section was created and the astronomical data is sitting
> with the person who made this proposal. Special handling of datasets may be
> required to reduce the impact on Debian distribution infrastructure. I
> recommend updates and distribution only be allowed through diffs or some
> other method that uses less bandwidth than is used now.

I didn't hear about such a vote.

> findimagedups and other such packages could be used to search for
> duplicate or near duplicate files in Debian packages. Then 'common' packages
> which have these files can be created and/or symbolic links may be used to
> save space. (before, now) Perhaps a program that makes symbolic links to
> common files where necessary?

Did you find any such duplicate files?
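
One way to answer that question is to measure. The following is a minimal sketch, assuming the packages of interest have already been unpacked under one directory tree. It finds only byte-identical files by hashing their contents, so it does not cover the "near duplicate" case that a tool like findimagedups targets. The function name is hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

The size of each duplicate group times the file size gives a rough upper bound on the space a 'common' package or symlinks could save.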

> Create Debian cleanup procedure and program(s) (cruft, deborphan...), now.

debfoster

> Create Debian backup procedure and program(s) (debian cleanup, cruft to
> backup, dpkg --getselections > myselections, backup config files possibly
> checking md5's which more than should be in every package), now.

Use "tar cf etc.tar /etc". If you want to verify the tarball later,
sign it with gpg.

> Creating a version of Debian that binds a writeable filesystem onto a read
> only filesystem (floppy writeable with a readonly CD). I would love to
> have this to carry around and run Linux on any machine with a CDROM drive
> that I could boot with. upx may be useful. A compressed filesystem for
> writing may be useful. Support for umsdos, NFS, samba, and/or mounting
> file systems, creating a file and mounting the file could be useful.
> http://www.debian.org/CD/faq/index#live-cd is something I later found. I
> would like to see more development, and official Debian development. Upon
> further investigation bootcd seems to be a start, but how much of this can
> it do? Maybe these features should be wishlist bugs, but the CD faq needs
> to be updated, and I would still like to see an official CD image.

Go ahead, do it yourself.

> Should CD images be optimized for space? I saw an option to optimize CD
> images for space in Roxio's Ez-cdcreator (formerly by Adaptec).

Does it work? Is it free (in the sense of DFSG)?

> Security, Policy and other bug stuff:
> Automated rough security audit of all source code (rats, splint & other
> programs can be used, before, after, now).

A _rough_ security audit does more harm than good because it provides
a false sense of security. Are you sure such an auditing tool has no bugs,
or are you volunteering for a line-by-line audit of all 10k+ packages?

> Programs that use keyahead or mouseahead routines may be a security risk
> or cause other undesirable results. One example is my apt-get using
> readline has keyahead so if I accidentally hit enter, the enter is saved
> until the next question and then inputted. Instead I'd prefer it be
> disregarded so I can read the arbitrary (it's hard to predict the order)
> question that appears next. Mouseahead can be very dangerous if the
> program hasn't updated the interface, the user will likely have no idea
> what they will have clicked on ("it didn't work. I'll click again. What? I
> didn't select that second option."). Yes, these are probably wishlist
> bugs, but they could be a normal bug as this can affect the desired
> functionality of programs. These bugs may also need to be tagged security in
> some situations (the default password gets set by accident, etc). This may
> tie into scanning code for security vulnerabilities. (With scanning before
> and after, but this should be checked for now, especially where system
> security can be involved).

Think before you type.

> 'popularity-contest' and other methods can be very helpful in finding out
> what users are interested in seeing being developed and maintained.
> Perhaps this, archive (mirrors too?) statistics and other methods can be
> used to create a priority list for the qa group. Perhaps a system should
> be put in place to allow user input into package importance/maintenance
> priority levels. Currently I would assume that a good system would be by
> the subtype such as essential, optional...

If you want some package well-maintained, help with it yourself.
Note that a "thanks" mail to a maintainer (or the upstream) is much more
appreciated than a vote on popularity-contest.

> Campaigning for signed debs to be a must (if not already). Signed debs
> more than should be used (before, after, now).

I have heard of a signed Packages.gz. Individual .debs are not signed.
I don't know what the dpkg maintainers are planning for signed .debs.

> Campaigning for md5 lists in debs to be a must. md5 values for all files
> in packages more than should be done (before, after, now).

I don't think md5 lists are worth a must clause of the Policy.
Why do you trust md5 lists in /var/lib/dpkg/info ?
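
For context, here is a minimal sketch of what checking such a list involves. It assumes the list uses the md5sum-style layout of one "digest  relative/path" pair per line, with paths relative to the filesystem root; that layout and both function names are assumptions of this sketch. As the question above implies, a list stored on the same machine can be altered along with the files, so a check like this catches accidental corruption rather than a deliberate intruder.

```python
import hashlib
import os


def parse_md5sums(text):
    """Parse 'digest  relative/path' lines into a {path: digest} mapping."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, path = line.split(None, 1)
        expected[path.strip()] = digest
    return expected


def verify(expected, root="/"):
    """Return the sorted paths that are missing or whose MD5 digest differs."""
    bad = []
    for path, digest in expected.items():
        full = os.path.join(root, path)
        try:
            with open(full, "rb") as handle:
                actual = hashlib.md5(handle.read()).hexdigest()
        except OSError:
            bad.append(path)  # missing or unreadable counts as a failure
            continue
        if actual != digest:
            bad.append(path)
    return sorted(bad)
```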

> A procedure should be put in place to ensure installation starvation due
> to dependencies does not occur in the unstable distribution. (perhaps
> waiting a day for dependencies to catch up?) I feel this could be
> automated or automated better (before, after, now).

Testing is for this.

> Find a way to reduce the chance of bad NMU's (accidental, malicious,
> poorly done, etc.). I haven't looked into how this is done now (if at
> all), but the developer making the NMU should be warned that it's an NMU.
> It may be good to list NMU policy for the first time for an NMU by a
> specific developer and ask for confirmation. It may be good to have an
> automated system where maintainers can block NMU's except by permission of
> an authority such as the security or qa group.

I repeat: Debian does not need bureaucracy. By the way, currently
the QA group is defined as "subscribers of debian-qa" or
"anyone who is interested in QA work".

> Joey says at http://www.debian.org/devel/website/todo#misc that security
> updates are on the same server as the signatures for the updates. This
> could be a potential security issue as if one method is exploited to
> change the files, it can be used to change the signatures at the same
> time. wyrmbait at debianplanet.org says in his article Security with apt
> (found at http://www.debianplanet.org/article.php?sid=643 ), that apt can
> be viewed as a single point of failure. While his arguments may not quite
> be thorough, he does bring up some issues of security. Why not have a
> package for keys/certificates, then have dpkg complain if a new package
> has not been correctly signed. Also packages in the archive should be
> signed by a public key that is available on many public key servers and
> available offline (on CD perhaps). Changes to the keyring packages would
> need to have the appropriate signature(s).

Sounds like a signed Packages.gz.

> Packages being signed by multiple people and allowing users to assign
> trust levels (checked before installing an upgrade) to people could
> improve security.

Signing a package means "_I_ can't find a bug", not "there is no bug".

> I would like to encourage distribution of public key server media. Having
> keys stored online lends them to potential man in the middle attacks even
> if multiple protocols are used. It's much more difficult to circumvent an
> offline signature.

Why do you trust the post office (or whatever)?
Do you know about the web of trust?

> One of the reasons for the delay of the release of Woody was said to have
> been security concerns. It has also been reported (see the glibc example
> at http://www.debianplanet.org/article.php?sid=568 ) that it takes a long
> time for security patches to get through due to the compiling and testing
> on the 68k and arm architectures. I would like to bring forward the idea
> of using emulation to help speed things up. There was recently (March?) a
> patch for UAE (a 68k emulator) to support running Linux. There are also
> emulators for the arm architecture such as arcem (
> http://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no\&bug=136844 for
> details ). arcem is said to be quite fast on Intel architecture. Emulation
> of old architecture brings two advantages and one disadvantage: it's
> usually faster, it's easier to get, it could have trouble being a
> completely accurate emulation of the original hardware (bugs not emulated
> or new bugs not yet found/patched).

The new security infrastructure seems to work fast enough.
Read the recent ssh story.

> Many of the patches and programs found at
> http://www.theaimsgroup.com/~hlein/haqs/ can be quite useful. The programs
> can be packaged. The patches, when useful, should be added to existing
> packages or modified to make them run time options. For example the idle
> connection traffic patch for ssh may be a useful option that may be
> possible to be chosen at runtime.

File an RFP bug against wnpp or a wishlist bug against a specific package.

> Performance issues:
> Someone mentioned the idea of ordering the startup scripts into a
> dependency tree, and have programs startup in parallel. I feel this would
> be useful for many people running many startup scripts such as myself.
> Perhaps this should be a before, after, now. If nothing else, it should be
> looked at now and a policy document regarding this may be useful. I forget
> which Debian developer I read this idea from.

Write working code and provide a transition plan.
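
The core of the dependency-tree idea quoted above is a topological sort. As a minimal sketch, assuming each startup script declares which other scripts must finish first, the scripts can be grouped into batches whose members could start in parallel. The service names in the usage note are hypothetical, and graphlib requires Python 3.9 or later. This says nothing about the hard part, which is the transition plan.

```python
from graphlib import TopologicalSorter


def parallel_batches(depends_on):
    """Group services into batches; all services in one batch can start together.

    `depends_on` maps each service to the services it must wait for.
    A dependency cycle raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Given a mapping where network and syslog both wait for mountall, and sshd waits for network and syslog, the batches come out as mountall first, then network and syslog together, then sshd.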

> Having package install in parallel may speed up installation. This may be
> a wishlist item for dpkg or apt.

It will break the system.

> Should CD images be optimized for speed? I saw an option to optimize CD
> images for creation speed in Roxio's Ez-cdcreator (formerly by Adaptec).
> Also speed of installation or other reads of the CD. Seek time might also
> be a consideration when choosing what order to put data on CD's.

Does it work? Is it free (in the sense of DFSG)?
I think CDs are fast enough.

> Using programs like cmix, the performance of programs may be able to be
> optimized (before and now, but not after upload as optimizing programs may
> not work desirably).

Is it better than gcc -O flags?

> Sometimes threads come up about performance optimization done at compile
> time. Yes, numbers have not been provided, but some numbers should be. As
> such, a comparison of gcc, icc (Intel's compiler, if allowed),
> tcc and any other compilers should be done. Binaries could be compiled
> with each available compiler and then checked to see which produces the
> best results in application performance, binary size and perhaps compile
> time. (Smallest binary size usually means high compile time and better
> application performance, or so I've been told.) This should be done before
> upload and now.

I think gcc -O2 is enough in most cases.

> Other sometimes bigger ideas:
> Should there be a method to force retirement of developers? I don't
> believe so, I believe that a new category should be created for developers
> who are not active. Why separate inactive developers? To limit the
> security risk and make managing developers easier.

They may be active again. Insert a rant about bureaucracy here.
Any Debian developer can take over an unmaintained package as a last resort.

> A restructuring of the online distribution protocol is needed. Recently in
> the Debian Weekly news this was mentioned and this has been discussed. A
> BTS location may be a good place to start putting won't fix, wishlist and
> other bug information regarding the distribution protocol(s). Personally,
> I'd like to see low server loads, compressed files, deltas, and have
> upgrade priorities visible before downloading the package/archive.

Go ahead, do it yourself.

> It might be nice to make debian/watch files separately available and to
> have a watch file for all upstream sources even when it's version
> specific. It would also be nice to carry md5's for upstream sources (last
> known version of course) so when upstream sources get modified (like the
> dsniff security issue), users of watch files to grab the current source
> get a heads up that there may be something wrong.

A URL in the Description: field will help you. I have no means of knowing
the md5sum of the next upstream release before it is released.
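
For the last known version, though, the check the quoted text asks for is
simple. A minimal sketch, assuming the md5sum of the last known upstream
tarball has been recorded somewhere (the function names are hypothetical,
not part of uscan or any existing tool):

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upstream_changed(path, recorded_md5):
    """True if the downloaded tarball no longer matches the recorded sum."""
    return md5_of_file(path) != recorded_md5.lower()
```

If upstream_changed() returns True for a version that was already known,
the tarball was modified after release and deserves a closer look.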

> Support for installing Debian via a netboot/bootp by distributing an
> official netboot image.

I think there are already such boot floppies.

> A comparison of xwindows terminals (or is it terminal emulators?) is
> desirable. xvt seems to have a smaller footprint than rxvt which, I
> thought, was supposed to be reduced xvt.
> http://dickey.his.com/xterm/xterm.faq.html has some starting information.
> This would be useful for creating a small RAM xwindows install.

The priority of a terminal emulator is 20.
See the Policy, section 12.8.3.

> Other related projects that aren't Debian specific:
> RATS for gnu assembly (note: intel2gas) may be more useful if it existed,
> but it doesn't, yet.

File an RFP bug.

> An open source grammar checker (not EBNF or alike) doesn't seem to exist.
> Openoffice lacks a grammar checker and does not plan to add one. A grammar
> checker is a major proofing tool that would be extremely useful to many
> people. I did find one open source grammar checker called Link Grammar
> http://www.link.cs.cmu.edu/link/ . I disagree with the evolution of
> English being too fast for creating a static grammar checker as many in
> the commercial world have done so.

Debian has nothing to do with the evolution of English.

> Update File's database. This would be useful for my projects looking at
> reordering files in tar archives and other compression projects of mine.
> (This may have to be a Debian thing as I don't see updates to the database
> very often.)

What kind of update are you talking about?

> Other related projects (to those discussed):
> I'm working on some compression algorithms (Charles Bloom at
> http://www.cbloom.com and xiph have some starting work of what I can do).
> I believe I can improve existing compression. I don't have much time as
> I'm a full time student and I need money to pay rent. I will be graduating
> with a computer science degree in December.

great

--
Oohara Yuuma <ooh...@libra.interq.or.jp>
Debian developer
PGP key (key ID F464A695) http://www.interq.or.jp/libra/oohara/pub-key.txt
Key fingerprint = 6142 8D07 9C5B 159B C170 1F4A 40D6 F42E F464 A695

her occasionally near suicidal sense of loyal self-sacrifice
--- Luke Seubert, about what Rei Ayanami and Debian developers have in common


--
To UNSUBSCRIBE, email to debian-dev...@lists.debian.org

Matt Zimmerman

Jul 26, 2002, 10:10:06 PM
On Fri, Jul 26, 2002 at 08:16:27PM -0400, Joey Hess wrote:

> Drew Scott Daniels wrote:
> > Some programs use static code for things like regex expressions and
> > handling tar archives. A program to go through the source code of all
> > the programs (or a developer effort) may help to find common code that
> > could be put into a library or that already is in a library. This could
> > make packages smaller, but if we're not careful, creating new libraries
> > could increase the overall installed size for a program. (before, now)
> > An additional benefit would be fewer places to change code (good for
> > security, good for efficiency, good for all updates). Are there any
> > security issues to exporting code from packages? (This should be looked
> > at whenever code's exported.)
>
> Very cute idea.

Being a proponent of code sharing and reuse, and having made some
halfhearted attempts in this area before, I must say that there are some
nontechnical issues to consider here (in addition to the nontrivial
technical issue of maintaining a proper shared library). For example,
upstream developers are often wary of (or downright hostile toward) the idea
of introducing additional dependencies in their packages. In a Debian
environment, dependencies are in general so smoothly managed that this is
not something that we worry about, but it is not an easy thing to convince
some program authors to merge some of their functionality into a library.
Some crazy people still insist on building their software by hand, and this
makes life more difficult for those people (misguided though they may be).
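
On the technical side, the scan for common code could start out very
crude. A minimal sketch, assuming the source of each package is already
available as text (the function names and the window size are arbitrary
choices for illustration): hash every run of a few whitespace-normalized
lines and report the runs that show up in more than one package.

```python
import hashlib
from collections import defaultdict


def normalize(line):
    """Collapse whitespace so indentation differences do not matter."""
    return " ".join(line.split())


def window_hashes(text, window=5):
    """Yield a hash for every run of `window` consecutive non-blank lines."""
    lines = [normalize(line) for line in text.splitlines()]
    lines = [line for line in lines if line]
    for start in range(len(lines) - window + 1):
        chunk = "\n".join(lines[start:start + window])
        yield hashlib.sha1(chunk.encode("utf-8")).hexdigest()


def shared_windows(sources, window=5):
    """Map each window hash seen in two or more packages to those packages.

    sources maps a package name to its source text.
    """
    seen = defaultdict(set)
    for package, text in sources.items():
        for digest in window_hashes(text, window):
            seen[digest].add(package)
    return {d: pkgs for d, pkgs in seen.items() if len(pkgs) > 1}
```

This only finds near-verbatim copies. Renamed variables or reordered
code would need something smarter, and the output is only a list of
candidates for a human to look at.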

> > http://lists.debian.org/debian-mentors/1999/debian-mentors-199901/msg00051.html
> > talks about putting datasets into Debian or non-free. I wonder what has
> > become of this particular dataset and if there has been a policy developed
> > for datasets. I would like to see astronomical, meteorological,
> > geographical and other data sets easily available. If a data set is
> > DFSG-free then I feel it should be put in main, but segregated somehow (in
> > extra?). Data sets may require maintenance too. For example, recently new,
> > more accurate data was collected about the distance certain stars are from
> > our sun. When I did some more investigation, I found out that a vote to
> > include a dataset section was made and it was decided to create a dataset
> > section. No such section was created and the astronomical data is sitting
> > with the person who made this proposal. Special handling of datasets may be
> > required to reduce the impact on Debian distribution infrastructure. I
> > recommend updates and distribution only be allowed through diffs or some
> > other method that uses less bandwidth than is used now.
>
> Yes, that idea seems to be stalled.

The best way to get this project going would be for one of the proponents to
set up a private archive to manage this, and run it as an experiment for a
while. If it seems to work out, then we can work through the storage and
mirroring issues.

> > Create Debian backup procedure and program(s) (debian cleanup, cruft to
> > backup, dpkg --get-selections > myselections, backup config files possibly
> > checking md5's which more than should be in every package), now.

I've been doing these kinds of backups in various custom ways for a while
now, and would very much like to see a tool which understands FHS and a
little bit of Debian which could be used by an average user.

I have in mind something like the Backup/Restore applet on the Zaurus. With
a few taps of the stylus, the user can back up all of their data that is not
part of the stock system, wipe it, install a new one, and restore their
data. I think that this is achievable for Debian.
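
One piece of such a tool can be sketched already: finding the
configuration files that differ from what the package shipped, using the
md5sums dpkg records in the Conffiles: fields of /var/lib/dpkg/status.
This is only a sketch under that assumption, and the function names are
hypothetical. The status text is passed in as a string so the parsing can
be tried without touching a real system.

```python
import hashlib
import os


def parse_conffiles(status_text):
    """Return {path: recorded_md5} from the Conffiles: fields."""
    conffiles = {}
    in_conffiles = False
    for line in status_text.splitlines():
        if line.startswith("Conffiles:"):
            in_conffiles = True
        elif in_conffiles and line.startswith(" "):
            # Continuation line: " /etc/foo.conf <md5sum> [flags]"
            parts = line.split()
            if len(parts) >= 2:
                conffiles[parts[0]] = parts[1]
        else:
            in_conffiles = False
    return conffiles


def file_md5(path):
    """Return the hex md5 digest of a (small) file."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def files_to_back_up(conffiles):
    """List conffiles that exist and differ from the packaged version."""
    changed = []
    for path, recorded in sorted(conffiles.items()):
        if os.path.isfile(path) and file_md5(path) != recorded:
            changed.append(path)
    return changed
```

Together with the output of dpkg --get-selections and the user's own
data, the list from files_to_back_up() covers most of what is not part
of the stock system.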

> > [live Debian CDs]
>
> Somebody please do this, mmmkay? IIRC I saw Robster and someone discussing
> this on irc the other day.

For Debian Zaurus, I wrote a little init script which, given a read-only
root filesystem, creates and mounts a read/write one (in this case flash
memory, but the same could be done with a RAM disk, etc.), copies on
pre-populated symlink farms, and then pivot_roots into it, producing a
read-write system of which the read-only components are symlinks into the
read-only filesystem and the read-write components are copies of the initial
system state. This makes it possible to (for example) install additional
packages (requiring only enough space for the package contents), mess
around, whatever, and "roll back" to the base read-only system at any time.

If anyone is interested in using something like this for a live Debian CD,
feel free to get in touch with me.
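
The symlink farm part of that can be sketched in a few lines. This is
not the init script itself, only an illustration of the idea with a
hypothetical function name: directories are created for real in the
writable tree, and every file becomes a symlink back into the read-only
tree. Mounting the writable filesystem and the pivot_root step are left
out.

```python
import os


def build_symlink_farm(readonly_root, writable_root):
    """Mirror readonly_root into writable_root using symlinks for files."""
    readonly_root = os.path.abspath(readonly_root)
    for dirpath, dirnames, filenames in os.walk(readonly_root):
        rel = os.path.relpath(dirpath, readonly_root)
        target_dir = os.path.normpath(os.path.join(writable_root, rel))
        # Real directories, so new files can be created next to the links.
        os.makedirs(target_dir, exist_ok=True)

        # os.walk does not descend into symlinked directories, so link
        # them here together with the regular files.
        linked_dirs = [d for d in dirnames
                       if os.path.islink(os.path.join(dirpath, d))]
        for name in filenames + linked_dirs:
            link = os.path.join(target_dir, name)
            if not os.path.lexists(link):
                os.symlink(os.path.join(dirpath, name), link)
```

Replacing a symlink with a real file in the writable tree then shadows
the read-only version, and throwing the writable tree away rolls the
system back.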

> > One of the reasons for the delay of the release of Woody was said to
> > have been security concerns. It has also been reported (see the glibc
> > example at http://www.debianplanet.org/article.php?sid=568 ) that it
> > takes a long time for security patches to get through due to the
> > compiling and testing on the 68k and arm architectures. I would like to
> > bring forward the idea of using emulation to help speed things up. There
> > was recently (March?) a patch for UAE (a 68k emulator) to support
> > running Linux. There are also emulators for the arm architecture such as
> > arcem (
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no\&bug=136844 for
> > details ). arcem is said to be quite fast on Intel architecture.
> > Emulation of old architecture brings two advantages and one
> > disadvantage: it's usually faster, it's easier to get, it could have
> > trouble being a completely accurate emulation of the original hardware
> > (bugs not emulated or new bugs not yet found/patched).

Making an emulator both accurate enough and fast enough to fill this role is
a challenge, and I am not sure that any emulator is up to the task of
running an autobuilder.

Things like ccache might prove to be of more benefit in this situation, but
there are arguments against it as well (increased complexity and unknowns,
lack of disk space and network bandwidth).

> > Many of the patches and programs found at
> > http://www.theaimsgroup.com/~hlein/haqs/ can be quite useful. The programs
> > can be packaged. The patches, when useful, should be added to existing
> > packages or modified to make them run time options. For example the idle
> > connection traffic patch for ssh may be a useful option that may be
> > possible to be chosen at runtime.

The ssh example that you use is:

http://marc.theaimsgroup.com/?l=openssh-unix-dev&m=100283400026301&w=2

yes? Current versions of ssh have both KeepAlive and ProtocolKeepAlive
which serve this need well enough for me. That message is quite old; it
is possible that these features did not exist when it was written.

--
- mdz

Junichi Uekawa

Jul 27, 2002, 2:50:07 AM
Matt Zimmerman <m...@debian.org> immo vero scripsit:

> > > [live Debian CDs]
> >
> > Somebody please do this, mmmkay? IIRC I saw Robster and someone discussing
> > this on irc the other day.

I have a few scripts sitting around for booting nodes up from CD-Rom
images. I was playing around with it around this time last year,
for creating cluster-base installation systems.

It's fun. I'll dig them up. (It's in shell, as usual).


regards,
junichi


--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/



Junichi Uekawa

Jul 27, 2002, 2:50:08 AM
Matt Zimmerman <m...@debian.org> immo vero scripsit:

> I must say that there are some nontechnical issues to consider here

Yes. There are many upstream developers who don't like
extra dependencies, and try to include every dependency in their
distribution package.

This is silly, but this is a point where Debian stands strong.
We can make everything depend on everything else (tm) and
still keep things trivial to install.

hurray.

(if people manage to get shared libraries right).


--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/



Goswin Brederlow

Jul 27, 2002, 9:30:14 AM
Junichi Uekawa <dan...@netfort.gr.jp> writes:

> Matt Zimmerman <m...@debian.org> immo vero scripsit:
>
> > > > [live Debian CDs]
> > >
> > > Somebody please do this, mmmkay? IIRC I saw Robster and someone discussing
> > > this on irc the other day.
>
> I have a few scripts sitting around for booting nodes up from CD-Rom
> images. I was playing around with it around this time last year,
> for creating cluster-base installation systems.
>
> It's fun. I'll dig them up. (It's in shell, as usual).

There's always Knoppix:

http://www.knopper.net/knoppix/

MfG
Goswin
