basics: creating and distributing distributions


John Gabriele

Jan 2, 2010, 8:45:52 PM
to The Hitchhiker's Guide to Packaging
Hi,

I'd like to put together a brief "Basics: Creating and Distributing
Distributions" article for the Guide. Again, the goal is to cover only
the bare best-practice basics of what's required for turning some
useful pure Python modules on your hard disk into a distribution on
the PyPI that others can benefit from. If you have any tips regarding
modern best-practice module creation and/or distribution, please post.

For project directory structure, I was going to suggest:

    MyProject/           -- note CamelCase naming convention
        README.txt       -- also including a list of contributors/thanks-to
                            would be nice
        CHANGES.txt      -- including dates in addition to version numbers
                            here would be appreciated
        LICENSE.txt
        setup.py
        bin/             -- standalone scripts this distribution provides
        my_project/      -- the modules your project provides go into this
            __init__.py     top-level package
            my_module.py
        docs/
        tests/

Regarding version numbers, I was going to recommend using the
revision.version.subversion (major.minor.bugfix) scheme.

Does anything special need to be done so that the project's README.txt
will properly render on the main PyPI page for the project?

Can anyone supply more concrete details on how to properly implement
the advice given
[here](http://packages.python.org/distribute/using.html)?

Recommendations regarding testing, preferred modules to use, automation
to make use of?

Best practice regarding supplying distribution meta-data?

Thanks!

Tarek Ziadé

Jan 2, 2010, 10:44:23 PM
to packagi...@googlegroups.com
On Sun, Jan 3, 2010 at 2:45 AM, John Gabriele <jmg...@gmail.com> wrote:
> Hi,
>
> I'd like to put together a brief "Basics: Creating and Distributing
> Distributions" article for the Guide. Again, the goal is to cover only
> the bare best-practice basics of what's required for turning some
> useful pure Python modules on your hard disk into a distribution on
> the PyPI that others can benefit from.  If you have any tips regarding
> modern best-practice module creation and/or distribution, please post.

That's to be done in example.txt. It's fine if you start it, no one
has started it yet.

>
> For project directory structure, I was going to suggest:
>
>    MyProject/           -- note CamelCase naming convention

I think you should do a real project example, so it speaks to people.
A known project, that uses a good structure, would be the best pick.

>        README.txt       -- also including a list of contributors/
> thanks-to
>                            would be nice
>

or, CONTRIBUTORS.txt, but that's just one way to do it.


[..]


> Regarding version numbers, I was going to recommend using the
> revision.version.subversion (major.minor.bugfix) scheme.

Just link to versioning.txt, and use the scheme/practice described there.
If you find an issue, or something is missing from it, let's discuss it here.

>
> Does anything special need to be done so that the project's README.txt
> will properly render on the main PyPI page for the project?

Yes. It has to be reStructuredText-compliant, and linked to the
long_description metadata.

>
> Can anyone supply more concrete details on how to properly implement
> the advice given
> [here](http://packages.python.org/distribute/using.html)?

Don't use Distribute for now, and provide a plain Distutils example.
A Distribute-based setup.py should be done in a second phase if possible,
after we explain why it can be helpful.

>
> Recommendations regarding testing, preferred modules to use,
> automation
> to make use of?

The project should have a tests/ folder in each package, or a global
tests/ folder if it only has plain modules. This tests/ package
should use the unittest framework, as this is the most universal
format for test runners to run tests (nosetests, Python's new unittest
discovery code, py.test, ...); that is, one test_module.py per module.py.
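
As a rough illustration of the one-test-module-per-module convention, here is a minimal sketch. The function `greet` stands in for the contents of a hypothetical `my_module.py`, and the test class is what a matching `test_my_module.py` might hold; both names are assumptions, not taken from the Guide.

```python
import unittest


def greet(name):
    """Stand-in for a function that would live in my_module.py (hypothetical)."""
    return "Hello, %s!" % name


class GreetTests(unittest.TestCase):
    """What a matching tests/test_my_module.py might contain."""

    def test_greets_by_name(self):
        self.assertEqual(greet("Ford"), "Hello, Ford!")

    def test_accepts_empty_name(self):
        self.assertEqual(greet(""), "Hello, !")


def run_tests():
    """Run the tests in-process and return the TestResult object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(GreetTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the tests are plain `unittest.TestCase` subclasses, other runners should be able to collect them without changes.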

>
> Best practice regarding supplying distribution meta-data?
>

setup.py needs to be as dumb as possible and provide all options to setup().
Avoid adding any logic in it, and KISS.


Tarek
--
Tarek Ziadé | http://ziade.org

Carl Meyer

Jan 3, 2010, 1:45:09 AM
to packagi...@googlegroups.com
Hi John,

Great idea, thanks for contributing all this!

John Gabriele wrote:
[snip]


> For project directory structure, I was going to suggest:
>
>     MyProject/           -- note CamelCase naming convention
>         README.txt       -- also including a list of contributors/thanks-to
>                             would be nice
>         CHANGES.txt      -- including dates in addition to version numbers
>                             here would be appreciated
>         LICENSE.txt
>         setup.py
>         bin/             -- standalone scripts this distribution provides
>         my_project/      -- the modules your project provides go into this
>             __init__.py     top-level package
>             my_module.py
>         docs/
>         tests/

PEP 8 does say that "the use of underscores is discouraged" in package
names. I'm not sure why, to be honest.

> Regarding version numbers, I was going to recommend using the
> revision.version.subversion (major.minor.bugfix) scheme.

You could link to Semantic Versioning (semver.org)?

> Does anything special need to be done so that the project's README.txt
> will properly render on the main PyPI page for the project?

A common approach is something like this:

    setup(
        # ...
        long_description=open('README.txt').read(),
        # ...
    )

Without that, README.txt won't appear anywhere on PyPI.
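
A slightly fuller sketch of the same idea, assuming a README.txt sits next to setup.py. The project name, version, and package name are placeholders, and the helper names `read_text` and `build_setup_kwargs` are made up for illustration. The `setup()` call itself is shown only in a comment so the snippet can be read or imported without triggering a build.

```python
def read_text(path):
    """Return the contents of a text file such as README.txt."""
    with open(path) as handle:
        return handle.read()


def build_setup_kwargs(readme_text):
    """Collect the keyword arguments that setup() would receive.

    All metadata values here are placeholders for illustration.
    """
    return dict(
        name="MyProject",
        version="0.1.0",
        description="One-line summary of the project",
        long_description=readme_text,  # should be valid reStructuredText
        packages=["my_project"],
    )


# In setup.py itself this would be used roughly as:
#
#     from distutils.core import setup
#     setup(**build_setup_kwargs(read_text("README.txt")))
```

Keeping the metadata in one flat dictionary stays close to the "no logic in setup.py" advice given earlier in the thread.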

Carl

Screwtape

Jan 3, 2010, 5:00:55 AM
to The Hitchhiker's Guide to Packaging
On Jan 3, 12:45 pm, John Gabriele <jmg3...@gmail.com> wrote:
> For project directory structure, I was going to suggest:

For my projects, I've always followed Exarkun's project layout advice,
here:

http://jcalderone.livejournal.com/39794.html

It's worked out pretty well for me!

> Regarding version numbers, I was going to recommend using the
> revision.version.subversion (major.minor.bugfix) scheme.

I would have guessed that 'revision' and 'subversion' were the same
thing; expressing it as "major.minor.bugfix" is better, I think.

Also, I'm sure there's been lots of projects that have wrestled with
how to make version-numbers that will be properly sorted by all the
major package management tools (rpm, dpkg, PyPI) including such tricks
as marking beta and release-candidate versions. Unfortunately, I'm not
sure what the right answer here is - if anybody happens to know, I'd
be interested in adopting it myself.

> Recommendations regarding testing, preferred modules to use,
> automation
> to make use of?

Use unittest unless it makes testing your code unbearably difficult
(for example, code that uses Twisted needs to inherit from
twisted.trial.unittest.TestCase instead of the stdlib's
unittest.TestCase; I believe Django has its own crazy proprietary
non-unittest-based test system that Django users will be familiar with
and expect).

Get in the habit of using a test-runner like Nose or Twisted Trial
before you commit, when you're distracted, or before you start working
on something else. Bonus points for adding a commit-hook to your
version-control system that rejects commits if they fail tests.
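
One possible shape for such a hook, as a hedged sketch: a small script that runs unittest discovery in a subprocess and turns the result into an exit status. The `tests` directory name and the function names are assumptions, and how the script gets wired into a particular version-control system is left out, since each one has its own hook mechanism.

```python
import subprocess
import sys


def tests_pass(start_dir="tests"):
    """Run unittest discovery under start_dir; return True if all tests pass.

    Uses the "python -m unittest discover" command line, which is
    available in Python 2.7 / 3.2 and later.
    """
    command = [sys.executable, "-m", "unittest", "discover", "-s", start_dir]
    return subprocess.call(command) == 0


def main():
    """Hook entry point: a non-zero return value means "reject the commit"."""
    if tests_pass():
        return 0
    sys.stderr.write("Tests failed; commit rejected.\n")
    return 1


# A real hook script would end with:  sys.exit(main())
```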

> Best practice regarding supplying distribution meta-data?

As somebody else mentioned, put no logic or cleverness into setup.py,
just pass values to the setup function. If you want to do any
cleverness at all (such as automatically picking a version number from
your version-control system), make a library function somewhere in
your package that does the required calculation, then import that into
setup.py and call it there.
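
A minimal sketch of that pattern, assuming a hypothetical module `my_project/version.py` that imports nothing else from the package. The names `VERSION_INFO` and `get_version` are illustrative, not part of any existing API.

```python
# Contents of a hypothetical my_project/version.py.
# It deliberately imports nothing, so setup.py can use it cheaply.

VERSION_INFO = (0, 1, 0)  # (major, minor, micro); placeholder values


def get_version(info=VERSION_INFO):
    """Format a version tuple as a dotted string, e.g. (0, 1, 0) -> "0.1.0"."""
    return ".".join(str(part) for part in info)


# setup.py would then contain something like:
#
#     from my_project.version import get_version
#     setup(name="MyProject", version=get_version(), ...)
```

Since the function lives in the package, it can be covered by the ordinary test suite, which is harder to do for code inside setup.py.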

Carl Meyer

Jan 3, 2010, 9:43:50 AM
to packagi...@googlegroups.com

Screwtape wrote:
> On Jan 3, 12:45 pm, John Gabriele <jmg3...@gmail.com> wrote:
>> For project directory structure, I was going to suggest:
>
> For my projects, I've always followed Exarkun's project layout advice,
> here:
>
> http://jcalderone.livejournal.com/39794.html

Good advice. It's already included in the Guide (example.txt), but much
of it would make sense for this intro.

> Also, I'm sure there's been lots of projects that have wrestled with
> how to make version-numbers that will be properly sorted by all the
> major package management tools (rpm, dpkg, PyPI) including such tricks
> as marking beta and release-candidate versions. Unfortunately, I'm not
> sure what the right answer here is - if anybody happens to know, I'd
> be interested in adopting it myself.

There are currently at least two mutually incompatible version
comparison systems in use, distutils' and setuptools'. But the emerging
standard to use (and teach new users) is what's outlined by PEP 386:
http://www.python.org/dev/peps/pep-0386/
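
To see why a shared comparison scheme matters, here is a small sketch showing how plain string sorting misorders releases, next to a simplified sort key. The key function is an illustration only: it handles dotted numbers with an optional `a`, `b`, or `rc` suffix, does not treat "1.2" and "1.2.0" as equal, and is not an implementation of PEP 386.

```python
import re

# Pre-release stages sort before the final release of the same number.
_STAGE_ORDER = {"a": 0, "b": 1, "rc": 2, "": 3}
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def version_key(version):
    """Return a tuple that sorts versions like "1.2b3" before "1.2"."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    numbers, stage, stage_number = match.groups()
    release = tuple(int(part) for part in numbers.split("."))
    return (release, _STAGE_ORDER[stage or ""], int(stage_number or 0))


versions = ["1.10", "1.2", "1.2rc1", "1.2a1", "1.2b3"]

# Character-by-character comparison puts 1.10 first and 1.2 before its own
# pre-releases: ['1.10', '1.2', '1.2a1', '1.2b3', '1.2rc1']
by_string = sorted(versions)

# Numeric, stage-aware comparison: ['1.2a1', '1.2b3', '1.2rc1', '1.2', '1.10']
by_key = sorted(versions, key=version_key)
```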

> unittest.TestCase; I believe Django has its own crazy proprietary
> non-unittest-based test system that Django users will be familiar with
> and expect).

This is an interesting bit of uninformed FUD. Django's test system uses
unittest (or doctest, though that's not encouraged) and there's nothing
"proprietary" about it, it just provides some simple conveniences useful
for testing Django web apps.

> As somebody else mentioned, put no logic or cleverness into setup.py,
> just pass values to the setup function. If you want to do any
> cleverness at all (such as automatically picking a version number from
> your version-control system), make a library function somewhere in
> your package that does the required calculation, then import that into
> setup.py and call it there.

This can be dangerous advice. It's not good to require the import of
your entire package just to run setup.py, so if you do this, make sure
this library module is self-contained and doesn't cause cascade imports
of pretty much everything else in your package.

Carl

Screwtape

Jan 3, 2010, 7:00:22 PM
to packagi...@googlegroups.com
On Sun, Jan 03, 2010 at 09:43:50AM -0500, Carl Meyer wrote:
> Screwtape wrote:
> > On Jan 3, 12:45 pm, John Gabriele <jmg3...@gmail.com> wrote:
> > Also, I'm sure there's been lots of projects that have wrestled with
> > how to make version-numbers that will be properly sorted by all the
> > major package management tools (rpm, dpkg, PyPI) including such tricks
> > as marking beta and release-candidate versions. Unfortunately, I'm not
> > sure what the right answer here is - if anybody happens to know, I'd
> > be interested in adopting it myself.
>
> There are currently at least two mutually-incompatible version
> comparison systems in use, distutils' and setuptools. But the emerging
> standard to use (and teach new users) is what's outlined by PEP 386:
> http://www.python.org/dev/peps/pep-0386/

Ah, interesting. I'll take a look, thanks!

> > unittest.TestCase; I believe Django has its own crazy proprietary
> > non-unittest-based test system that Django users will be familiar with
> > and expect).
>
> This is an interesting bit of uninformed FUD. Django's test system uses
> unittest (or doctest, though that's not encouraged) and there's nothing
> "proprietary" about it, it just provides some simple conveniences useful
> for testing Django web apps.

That's good to know. I was talking to a friend once who was trying to
integrate coverage-tracking into Django's test framework, and
I suggested temporarily switching to a test-runner that already had such
support, but he seemed to be under the impression that switching to
a unittest-based test-runner would require him to rewrite all his tests
from scratch. Perhaps I misunderstood him; at any rate, I'm glad to hear
that's not the case.

> > As somebody else mentioned, put no logic or cleverness into setup.py,
> > just pass values to the setup function. If you want to do any
> > cleverness at all (such as automatically picking a version number from
> > your version-control system), make a library function somewhere in
> > your package that does the required calculation, then import that into
> > setup.py and call it there.
>
> This can be dangerous advice. It's not good to require the import of
> your entire package just to run setup.py, so if you do this, make sure
> this library module is self-contained and doesn't cause cascade imports
> of pretty much everything else in your package.

The rationale (which I should have mentioned to begin with, I guess) is
that code in setup.py can't easily be tested, so any code that is even
minimally complex should go into your package.

I haven't heard about problems caused by cascade imports before; what
are the symptoms?

Carl Meyer

Jan 3, 2010, 9:24:39 PM
to packagi...@googlegroups.com

Screwtape wrote:
> That's good to know. I was talking to a friend once who was trying to
> integrate coverage-tracking into Django's test framework, and
> I suggested temporarily switching to a test-runner that already had such
> support, but he seemed to be under the impression that switching to
> a unittest-based test-runner would require him to rewrite all his tests
> from scratch. Perhaps I misunderstood him; at any rate, I'm glad to hear
> that's not the case.

Django does provide its own unittest-based test-runner which provides
some of the Django-specific conveniences, so switching to another
test-runner (such as nose or whatnot) can require some work if you've
made use of those conveniences, but it's not particularly onerous.

> The rationale (which I should have mentioned to begin with, I guess) is
> that code in setup.py can't easily be tested, so one should put any
> minimally-complex code into your package.

True.

> I haven't heard about problems caused by cascade imports before; what
> are the symptoms?

The problematic symptom is just that any of the dependencies of your
entire distribution become dependencies of running setup.py. It's bad if
a prospective user can't even run "python setup.py --long-description"
without first setting up an environment that includes all your runtime
dependencies. Of course if your distribution has no external runtime
dependencies, that's not an issue.
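
One way to sidestep the problem, sketched here under assumptions: have setup.py read the version string out of the package source as text rather than importing the package. The `__version__` attribute name, the function names, and the file path in the docstring are assumptions, not something the Guide prescribes.

```python
import re

_VERSION_LINE = re.compile(
    r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", re.MULTILINE
)


def find_version(source_text):
    """Extract the value of __version__ from module source without importing it."""
    match = _VERSION_LINE.search(source_text)
    if match is None:
        raise RuntimeError("no __version__ assignment found")
    return match.group(1)


def read_version(path):
    """Read a file such as my_project/__init__.py and return its __version__."""
    with open(path) as source_file:
        return find_version(source_file.read())
```

Nothing in the package is executed, so a missing runtime dependency cannot break setup.py. The trade-off is a little parsing logic in or near setup.py, which the "keep it dumb" advice argues against.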

Carl

John Gabriele

Jan 3, 2010, 10:13:12 PM
to The Hitchhiker's Guide to Packaging
On Jan 2, 10:44 pm, Tarek Ziadé <ziade.ta...@gmail.com> wrote:

> On Sun, Jan 3, 2010 at 2:45 AM, John Gabriele <jmg3...@gmail.com> wrote:
> > Hi,
>
> > I'd like to put together a brief "Basics: Creating and Distributing
> > Distributions" article for the Guide. {snip}

>
> That's to be done in example.txt. It's fine if you start it, no one
> has started it yet.

Ok. Just finished writing and committing it. Not sure how long it
takes to make it to the html version online.

Tarek, my goal is to provide a brief, concise, opinionated, and to-the-
point article that gives new contributors exactly what they need to
get their useful modules off their hard drive and into the Cheeseshop.
I think the Guide has room for both this type of article (which I
think the community needs) and a longer, more complete and detailed
article (example.txt). As you can see, I provided links to example.txt
when appropriate, and I think the two will work well together.

>
>
> > For project directory structure, I was going to suggest:
>
> >    MyProject/           -- note CamelCase naming convention
>
> I think you should do a real project example, so it speaks to people.

Ok. I uploaded the project detailed in the article. :)

>
> >        README.txt       -- also including a list of contributors/
> > thanks-to
> >                            would be nice
>
> or, CONTRIBUTORS.txt, but that's just one way to do it.

Ok. Addressed this in the article. Since everyone (most everyone?)
reads the README, I think it's most gracious to put the list of
contributors in the readme. It doesn't take up much room, and lets you
have one or two fewer files in the distribution. Also, possibly, increases
your cosmic karma by a point or two. :)

> [..]
>
> > Regarding version numbers, I was going to recommend using the
> > revision.version.subversion (major.minor.bugfix) scheme.
>
> Just put a link on versioning.txt, and use the scheme/practice described there.
> If you find a issue, or something is missing in it, let's discuss it here.

Ok. Recommended major.minor.bugfix and then provided a link to
versioning.txt. I needed to put some advice there, and personally
think that major.minor.bugfix is probably best-practice for most
common needs.

>
>
> > Does anything special need to be done so that the project's README.txt
> > will properly render on the main PyPI page for the project?
>
> Yes. It has to be reStructuredText-compliant, and linked to the
> long_description metadata.

Great. Added Carl Meyer's tip to the article. Oh darn. Forgot to add
him (and you!) to the "Thanks to" section! Will fix that.

>
>
> > Recommendations regarding testing, preferred modules to use,
> > automation
> > to make use of?
>
> The project should have a tests/ folder in each package, or a global
> tests/ folder if it does have only plain modules.  This tests/ package
> should be using the unittest framework, as this is the most universal
> format for test runners to run tests (nosetests, python's unitest new
> discovery code, py.test..) that is, one test_module.py per module.py.

Ok. I think I struck a good balance in the article between giving good
advice and not bogging the prospective PyPI contributor down with
requirements.

>
>
> > Best practice regarding supplying distribution meta-data?
>
> setup.py needs to be as dumb as possible and provide all options to setup().
> Avoid adding any logic in it, and KISS.

Check.

Thanks all. Please let me know if you find any inaccuracies or areas
where it can be even more streamlined.

---John

John Gabriele

Jan 3, 2010, 10:22:18 PM
to The Hitchhiker's Guide to Packaging
On Jan 3, 10:13 pm, John Gabriele <jmg3...@gmail.com> wrote:
>
> Great. Added Carl Meyer's tip to the article. Oh darn. Forgot to add
> him (and you!) to the "Thanks to" section! Will fix that.

Whoops. Confused. Already in credits.txt.

Tarek Ziadé

Jan 4, 2010, 6:24:58 AM
to packagi...@googlegroups.com
On Mon, Jan 4, 2010 at 4:13 AM, John Gabriele <jmg...@gmail.com> wrote:
[..]

>
> Ok. Recommended major.minor.bugfix and then provided a link to
> versioning.txt. I needed to put some advice there, and personally
> think that major.minor.bugfix is probably best-practice for most
> common needs.

What's "bugfix"? Is it the same as "micro", or does it behave differently?

Can you look at versioning.txt and see if it's similar to "micro"?

- if it's similar, let's use "micro" everywhere
- if it's not, you need to describe here precisely how it works,
  then we need to rework versioning.txt in order to add "bugfix"

Cheers
Tarek

--
Tarek Ziadé | http://ziade.org

John Gabriele

Jan 4, 2010, 10:08:11 AM
to The Hitchhiker's Guide to Packaging
On Jan 4, 6:24 am, Tarek Ziadé <ziade.ta...@gmail.com> wrote:

> On Mon, Jan 4, 2010 at 4:13 AM, John Gabriele <jmg3...@gmail.com> wrote:
>
> [..]
>
> > Ok. Recommended major.minor.bugfix and then provided a link to
> > versioning.txt. I needed to put some advice there, and personally
> > think that major.minor.bugfix is probably best-practice for most
> > common needs.
>
> What's "bugfix" ? is this similar to "micro" or behaves differently ?
> Can you look at versions.txt and see if its similar to "micro".

Hm. I think "bugfix" is the same as "micro", just as described in
versioning.txt:

| Some softwares use a third level: MICRO. This level is used when the
| release cycle of minor release is quite long. In that case, micro
| releases are dedicated to bug fixes.

I've heard Perl folk use revision.version.subversion to mean (I think)
pretty much the same thing, for example:
http://search.cpan.org/~andya/Perl-Version/lib/Perl/Version.pm

But Wikipedia (http://en.wikipedia.org/wiki/Software_versioning) says
that some people (including GNU) use "revision" to mean the tiny
changes.

Looks like Apache does it a bit differently too: http://apr.apache.org/versioning.html

---John

Tarek Ziadé

Jan 4, 2010, 10:31:10 AM
to packagi...@googlegroups.com
On Mon, Jan 4, 2010 at 4:08 PM, John Gabriele <jmg...@gmail.com> wrote:
> On Jan 4, 6:24 am, Tarek Ziadé <ziade.ta...@gmail.com> wrote:
>> On Mon, Jan 4, 2010 at 4:13 AM, John Gabriele <jmg3...@gmail.com> wrote:
>>
>> [..]
>>
>> > Ok. Recommended major.minor.bugfix and then provided a link to
>> > versioning.txt. I needed to put some advice there, and personally
>> > think that major.minor.bugfix is probably best-practice for most
>> > common needs.
>>
>> What's "bugfix" ? is this similar to "micro" or behaves differently ?
>> Can you look at versions.txt and see if its similar to "micro".
>
> Hm. I think "bugfix" is the same as "micro", just as described in
> versioning.txt:
>
> | Some softwares use a third level: MICRO. This level is used when the
> | release cycle of minor release is quite long. In that case, micro
> | releases are dedicated to bug fixes.
>
>

OK, let's use the same PEP 386 term everywhere then, for clarity, i.e. "micro".
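
As a small illustration of the three levels under that naming, the sketch below bumps one level and resets the lower ones. The function name and the reset-to-zero behaviour are assumptions about common practice, not something quoted from versioning.txt or PEP 386.

```python
LEVELS = ("major", "minor", "micro")


def bump(version, level):
    """Return a new "major.minor.micro" string with the given level incremented.

    Levels below the bumped one are reset to zero.
    """
    parts = [int(part) for part in version.split(".")]
    if len(parts) != 3:
        raise ValueError("expected major.minor.micro, got %r" % version)
    index = LEVELS.index(level)  # raises ValueError for an unknown level
    parts[index] += 1
    for lower in range(index + 1, len(parts)):
        parts[lower] = 0
    return ".".join(str(part) for part in parts)
```

Under this reading, a bug-fix release of 1.2.3 would be `bump("1.2.3", "micro")`, giving 1.2.4.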

Thanks
Tarek

John Gabriele

Jan 6, 2010, 9:44:14 AM
to The Hitchhiker's Guide to Packaging
Tarek, in the "basics: creating dists" doc, I noticed that you added a
line to the MANIFEST.in file:

recursive-include towelstuff *.py

Isn't that redundant, since we're already specifying that in the
`setup()` call like so

packages=['towelstuff', 'towelstuff.test'],

?

Thanks,
---John

Tarek Ziadé

Jan 6, 2010, 10:13:34 AM
to packagi...@googlegroups.com

From the Distutils docs:

"""
The manifest template is just a list of instructions for how to
generate your manifest file, MANIFEST, which is the exact list of
files to include in your source distribution. The sdist command
processes this template and generates a manifest based on its
instructions and what it finds in the filesystem.
"""

IOW, sdist is driven by the template when its own mechanism doesn't
suffice. So it's a bad practice to have a MANIFEST.in template that
just lists *partially* some files to be included, because the file
listing then relies on two mechanisms.

For instance, if you have things other than .py files in your packages,
they won't be added by the packages option, so you might forget to
update the MANIFEST.in at some point.

The best practice I can see so far is one of:

- don't use a MANIFEST.in, and let Distutils add all files for you by
  reading the options and running an explicit algorithm to include files
  (and data files are now added as well since 2.7+); see
  http://docs.python.org/dev/distutils/sourcedist.html#manifest

  For the guide this means: use a standard list of files. In fact, I think
  this "Basics" chapter should not mention MANIFEST.in at all.

- use a MANIFEST.in, but specify all files in there; see
  http://docs.python.org/dev/distutils/sourcedist.html#the-manifest-in-template
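
To make the point about non-.py files concrete, here is a hedged sketch that walks a package directory and lists the files a `packages=[...]` entry alone would not pick up, i.e. the candidates for MANIFEST.in or a data-files option. The function names and the example tree (`defaults.cfg`, `templates/page.html`) are made up for illustration, and the code does not call into Distutils.

```python
import os
import tempfile


def find_non_python_files(package_dir):
    """Return sorted relative paths under package_dir that are not Python
    source or bytecode files."""
    found = []
    for dirpath, dirnames, filenames in os.walk(package_dir):
        for filename in filenames:
            if filename.endswith((".py", ".pyc", ".pyo")):
                continue
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, package_dir))
    return sorted(found)


def demo():
    """Build a throwaway package tree and report its non-Python files."""
    root = tempfile.mkdtemp()
    package = os.path.join(root, "towelstuff")
    os.makedirs(os.path.join(package, "templates"))
    for relative in ("__init__.py", "utils.py", "defaults.cfg",
                     os.path.join("templates", "page.html")):
        with open(os.path.join(package, relative), "w") as handle:
            handle.write("")
    return find_non_python_files(package)
```

Running a check like this before a release is one way to notice files that would silently be left out of the sdist.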

Tarek

> Thanks,

Carl Meyer

Jan 6, 2010, 11:18:32 AM
to packagi...@googlegroups.com

Tarek Ziadé wrote:
> IOW, sdist is driven by the template when its own mechanism doesn't
> suffice. So it's a bad practice to have a MANIFEST.in template that
> just lists *partially* some files to be included, because the file
> listing then relies on two mechanisms.
>
> For instance, if you have other things than py files in your
> packages, they won't be added by package so you might forget to
> update the MANIFEST.in at some point.

Interesting. My experience has been that this is not the common
practice. I've very rarely seen *.py in a MANIFEST.in. In practice I've
found it very easy to think of MANIFEST.in as "everything but the
Python" and let the Python be taken care of automatically. To me the
non-DRYness of having to specify my Python packages in two different
places is worse than having my file listing generated from two different
places.

> The best practice I can see so far is:
>
> - don't use a MANIFEST.in, and let Distutils add all files for you by
> reading the options and running an explicit algorithm to include
> files (and data files are now added as well since 2.7+) see
> http://docs.python.org/dev/distutils/sourcedist.html#manifest

This seems best for the future, but is obviously not an option for
supporting Python versions < 2.7, which means pretty much all
distributions for several years yet.

> For the guide this means = use a standard list of files. I think this
> "Basics" chapter should not mention MANIFEST.in at all in fact

By "standard list of files" do you mean specifying the MANIFEST in full,
or what? That seems like the worst (most error-prone) option. If you
don't want to mention MANIFEST.in, I'm confused about what "best
practice" solution you are recommending for Python < 2.7.

> - use a MANIFEST.in, but specify all files in there
>
> see
> http://docs.python.org/dev/distutils/sourcedist.html#the-manifest-in-template

I'm not convinced. I think I would continue to use (and recommend) a
MANIFEST.in that contains only non-Python files as the best option
available for now.

Carl

Tarek Ziadé

Jan 6, 2010, 11:51:19 AM
to packagi...@googlegroups.com
On Wed, Jan 6, 2010 at 5:18 PM, Carl Meyer <carl.j...@gmail.com> wrote:
[..]

>>
>> For instance, if you have other things than py files in your
>> packages, they won't be added by package so you might forget to
>> update the MANIFEST.in at some point.
>
> Interesting. My experience has been that this is not the common
> practice. I've very rarely seen *.py in a MANIFEST.in.

Since you can have .py files in places other than the directories listed
in packages (docs, tools, etc.), how do you deal with them? (if you
want to include or exclude them)

See for instance FA : http://formalchemy.googlecode.com/hg/MANIFEST.in

In that case, if one of them becomes a package, it will compete with MANIFEST.in

-> one single place to list all files is less confusing


> In practice I've
> found it very easy to think of MANIFEST.in as "everything but the
> Python" and let the Python be taken care of automatically.

Yes but the problem is that "Python" can mean many things here :
if you use setuptools + subversion and you rely on setuptools magics,
you might lose some files the day you drop svn. (since setuptools automagically
adds files listed in .svn)

or worse : someone that takes your archive without the .svn file (because you
don't distribute it) is unable to rebuild the same archive by running sdist.

Explicit is better than implicit in that case, so I recommend either way:

- an explicit inclusion (done by distutils) + one place to look at to
know what's included (distutils docs)
or
- an explicit MANIFEST.in + one place to look at to know what's
included (MANIFEST.in)


>To me the
> non-DRYness of having to specify my Python packages in two different
> places is worse than having my file listing generated from two different
> places.

What do you mean by "having to specify my Python packages in two
different places"?

Listing files to include in the distribution is orthogonal to
specifying the packages your project has.


>
>> The best practice I can see so far is:
>>
>> - don't use a MANIFEST.in, and let Distutils add all files for you by
>>  reading the options and running an explicit algorithm to include
>> files (and data files are now added as well since 2.7+) see
>> http://docs.python.org/dev/distutils/sourcedist.html#manifest
>
> This seems best for the future, but is obviously not an option for
> supporting Python versions < 2.7, which means pretty much all
> distributions for several years yet.

It's still the best option if you don't have data files (as the
majority of distributions)

>
>> For the guide this means = use a standard list of files. I think this
>> "Basics" chapter should not mention MANIFEST.in at all in fact
>
> By "standard list of files" do you mean specifying the MANIFEST in full,
> or what? That seems like the worst (most error-prone) option.  If you
> don't want to mention MANIFEST.in, I'm confused about what "best
> practice" solution you are recommending for Python < 2.7.

By "standard list of files" I mean not using MANIFEST.in at all and relying
on what gets included by default (see the doc I've mentioned, e.g. the
list of files that are added by sdist when no MANIFEST.in is provided).

Using MANIFEST.in for data files is just a fix for the bug < 2.7 has.


>> - use a MANIFEST.in, but specify all files in there
>>
>> see
>> http://docs.python.org/dev/distutils/sourcedist.html#the-manifest-in-template
>
> I'm not convinced. I think I would continue to use (and recommend) a
> MANIFEST.in that contains only non-Python files as the best option
> available for now.

The question then is: what do you have left in this MANIFEST.in file?
(besides the data files listed in setup() options)

As a side note:

MANIFEST.in has always been a big source of confusion because it
breaks the "one way to do things" philosophy.
I am wondering if an "extra_files" option in setup.py for sdist
wouldn't be better.

Carl Meyer

Jan 6, 2010, 12:10:27 PM
to packagi...@googlegroups.com

Tarek Ziadé wrote:
> Since you can have .py files in other places than in directory listed
> in packages, (docs, tools, etc) how do you do with them ? (if you
> want to include or exclude them)

That's a case I haven't run into, but in those cases I would put
them in MANIFEST.in too.

> Yes but the problem is that "Python" can mean many things here :
> if you use setuptools + subversion and you rely on setuptools magics,
> you might lose some files the day you drop svn. (since setuptools automagically
> adds files listed in .svn)
>
> or worse : someone that takes your archive without the .svn file (because you
> don't distribute it) is unable to rebuild the same archive by running sdist.

Right. That's why I think the whole VCS magic is a bad idea, and I don't
use it at all. Explicit is better than implicit: in my mind MANIFEST.in
+ explicitly-specified packages option is quite explicit.

> what you mean by "having to specify my Python packages in two
> different places" ?
>
> listing files to include in the distribution is orthogonal to
> specifying the packages your project has.

In theory it might be orthogonal, in practice it is not. If my
distribution includes a Python package, it should be included in the
distribution (is that a tautology or what? ;-> ). When adding a new
package to my distribution, I prefer not to specify that in two
different places (setup.py and MANIFEST.in).

> By "standard list of files" I mean, not using MANIFEST.in at all and relying
> on what gets included by default (see the doc i've mentioned, eg the
> list of files that are
> added by sdist when no MANIFEST.in is provided)

We agree on this then. If you don't have data files, don't use a
MANIFEST.in. That's the same thing that I would do. But if you do have
data files...

> using MANIFEST.in for data files is just a fix for teh bug < 2.7 has.

Sure, fine. But AFAICT it's the best fix available until we can
deprecate Python < 2.7, and using that fix doesn't mean that I should
ALSO list all my Python packages in MANIFEST.in.

>> I'm not convinced. I think I would continue to use (and recommend) a
>> MANIFEST.in that contains only non-Python files as the best option
>> available for now.
>
> The question then is : what do you have left in this MANIFEST.in file
> ? (besides the data files listed in setup() options)

That's it. Just data files. Clearly I have to list the data files in two
places, and I'd rather not, but as you say that's a necessary workaround
for now.

> MANIFEST.in has always been a big source of confusion because it
> breaks the "one way to do thing" philosophy.
> I am wondering if an "extra_files" option in setup.py for sdist
> wouldn't be better.

I think we're really talking about two different things here: what's
best for now, and what's best in a theoretical future. I will gladly
drop MANIFEST.in completely in favor of the setup.py options, once that
all works correctly. And personally I don't even have a need for
"extra_files", though I can see the use case (including extra files in
an sdist that should not be installed).

All I am saying is that in the _current_ situation, for a real-world
package, I recommend the workaround of listing data files in
MANIFEST.in. And I don't see a good reason to also list Python code
files that are part of a package.

But really, it's not a big deal :-)

Carl

Carl Meyer

Jan 6, 2010, 12:13:54 PM
to packagi...@googlegroups.com

Carl Meyer wrote:
> That's it. Just data files. Clearly I have to list the data files two
> places, and I'd rather not, but as you say that's a necessary workaround
> for now.

Actually I have to correct myself. I do also list in MANIFEST.in files
like README, CHANGES, LICENSE, etc. I want those in the sdist, but they
don't need to be installed. So I guess that would be a use for the
possible extra_files setup option you mentioned.

Carl

Tarek Ziadé

Jan 6, 2010, 12:15:37 PM
to packagi...@googlegroups.com

Yes, I was thinking about those, because the defaults make too many
assumptions about the names of these files.

Tarek Ziadé

Jan 6, 2010, 12:18:02 PM
to packagi...@googlegroups.com

There's one feature MANIFEST.in provides though: the ability to
*remove* files added by setup().

But that's very rare I guess

Reinout van Rees

Jan 6, 2010, 4:08:33 PM
to The Hitchhiker's Guide to Packaging
On 3 jan, 02:45, John Gabriele <jmg3...@gmail.com> wrote:
>
> For project directory structure, I was going to suggest:
>
>     MyProject/           -- note CamelCase naming convention

Why the camel case? All lower case seems to be preferred nowadays.
Modules being lowercase, pep8-wise.


Reinout

Tarek Ziadé

Jan 6, 2010, 4:33:40 PM
to packagi...@googlegroups.com

That's the project layout on the system, not a module or a package.

And it generally matches the project's distutils name, which has been
renamed to TowelStuff/ in the guide.

Now for projects that are not part of a framework namespace (zope.*,
plone.*, etc), CamelCase is quite widely used by the community. But I don't
think there's any real convention.

OTOH, most installers are case-insensitive these days, meaning that
"pip install towelstuff" will work too.

Tarek

John Gabriele

Jan 6, 2010, 4:47:01 PM
to The Hitchhiker's Guide to Packaging

Well, I think it's quite sensible to have your "project name" be the
same as your distribution name. And using CamelCase for distribution
names distinguishes them nicely from package and module names (which I
think is important, given how the terms are sometimes confusingly used
interchangeably).

---John
