RFC: Docudo Reborn


Kevin H

Oct 18, 2007, 3:04:24 PM

to Docudo

As promised (last week) I am now posting my notes on what I think
should go into the new and (hopefully) improved iteration of Docudo.
My intent is not to say "how it's going to be" but to generate discussion
regarding the direction this project should go.

Please feel free to sound off if you see any features missing that you
would like to see included, or any features included that you think
should be left out.

I'd also like to hear which things you like (or don't like) about what
I've proposed below, as well as any potential pitfalls that you think
may crop up.

Thanks for your input!

Kevin Horn

============================================

~~~~~~~~~~~~~
Docudo Reborn
~~~~~~~~~~~~~

Motivation
==========
Docudo is a cool idea and needs to happen.

What is Docudo?
===============
Docudo is an application, first proposed (AFAIK) by Kevin Dangoor, renowned
founder of the TurboGears project. The idea is to have a web interface to
the documentation for a software project.

Desired Features
================
- Data storage backend which can support version control.
(possibly multiple backends)

- Ability to support multiple "versions" of a project simultaneously.

- Simultaneous editing of docs?

- multiple "collections" under the same app

- ability to copy docs from one version to a later version ("promote"?),
which may then be edited to fit the later version of the software

- anti-spam measures, such as access restrictions, CAPTCHAs, and
moderation of posts/comments, etc.

- ability for users/visitors to add comments to documentation pages

- interface for user administration

- interface for "collection" administration

- interface for "version" administration

- interface for managing comments

- ability to view "diffs" between revisions of documents

- ability to view history of changes of documents

- interface for managing table of contents of a collection

- ability to upload images/media/other binary files

- ability to configure for single or multi collection operation

- full text search (using TurboLucene?)

- RSS feeds for new posts, edits to docs, comments?

- support for multiple markup languages, though admins should be able to
restrict usage of these on an app, collection, version, or individual
document basis

- make the app (relatively) easily "skinnable"/"themeable"

- it would be really nice to have a way to cleanly/seamlessly integrate
auto-generated API docs into the system alongside hand-written
narrative docs (not sure how to do this yet, though)

- export of documents to a number of formats, especially:
* HTML
* PDF
* CHM
* XML? DocBook?

- easy setup: when installed, app should ask user for config vars

Types of documentation
======================
- API/Technical docs (sometimes auto-generated)
- Introductory docs
- Tutorials
- HOWTOs
- man pages
- FAQs

Glossary/Terminology
====================
"Collection" - a grouping of documents for a particular project
(e.g. docs for Docudo)
"Version" - the version of the software being documented
(e.g. Docudo _v2.0_)
"Page" - a web view of a Document, the Document after it has been
rendered to the browser by Docudo
"Revision" - a particular saved version of a Document
"Document" - the source markup of the content for a Page, a single logical
piece of documentation
"Promotion" - the process by which a document from an earlier version
within a collection can be copied to a later version.

e.g. Docudo 2.0 is released and the maintainers of the documentation
want to copy the contents of the 1.0 documentation into the 2.0
"version" of the docs. They "promote" the docs from the 1.0 version
to the 2.0 version.
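
A minimal sketch of what promotion could look like, assuming a collection is held as a plain mapping of version to documents. The `promote` function, the dictionary layout, and the rule that existing target documents are never overwritten are all illustrative assumptions, not part of the proposal:

```python
import copy


def promote(collection, from_version, to_version):
    """Copy documents from one version to a later one.

    `collection` is a hypothetical mapping:
        version -> {document name -> document dict}
    Documents that already exist in the target version are left untouched.
    Returns the names of the documents that were copied.
    """
    source = collection[from_version]
    target = collection.setdefault(to_version, {})
    promoted = []
    for name, doc in source.items():
        if name in target:
            continue  # assumption: never clobber docs written for the new version
        new_doc = copy.deepcopy(doc)
        new_doc["status"] = "promoted"  # one of the statuses listed under Document Model
        target[name] = new_doc
        promoted.append(name)
    return promoted


docs = {
    "1.0": {
        "install": {"markup": "Install with easy_install.", "status": "official"},
        "tutorial": {"markup": "A first tutorial.", "status": "official"},
    },
    "2.0": {
        "install": {"markup": "Rewritten for 2.0.", "status": "official"},
    },
}
promoted = promote(docs, "1.0", "2.0")
```

Here only "tutorial" is copied, because "install" already has a 2.0 document. Whether promotion should skip, overwrite, or ask is an open design question.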

-> What should a group of collections be called?
Do we even need a term for this?

User Model
==========
- app visitors should be able to optionally register for a user account
- app admins should be able to restrict registration to a desired level
(e.g. open, wait for admin approval, no registration)
- user accounts should be grouped by access privileges into access groups
(e.g. readers, contributors, reviewers, editors, administrators)

Document Model
==============
- Documents should be marked up in a simple language (reST, Markdown, etc.)
- Documents should be stored in one of several storage backends (svn, cvs,
bazaar, filesystem, etc.)
- Documents should have a "status" (e.g. contributed, "promoted",
official, etc.)
- Documents may have other metadata associated with them, especially "tags"
- Should be able to add "tags" to a collection and/or version which will
apply to all documents in that collection/version

File Structure
==============
/collection
/version
/media
/toc
/documents?

URL Structure
=============
[possibly make this configurable using "routes"]
APP_ROOT/collections - list of collections
COLLECTION_ROOT/welcome - welcome page, list of versions
COLLECTION_ROOT/VERSION/ - version home page, show TOC
COLLECTION_ROOT/VERSION/Page - display page "Page"
COLLECTION_ROOT/VERSION/Page/edit - edit page "Page"
COLLECTION_ROOT/admin/ - main admin interface
COLLECTION_ROOT/admin/comments
COLLECTION_ROOT/admin/users
COLLECTION_ROOT/admin/catalog
COLLECTION_ROOT/admin/versions
COLLECTION_ROOT/documents - list of all documents in collection
COLLECTION_ROOT/help/ - root for online help pages
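
To make the URL list concrete, here is a rough sketch of a route table built with plain regular expressions. It assumes that COLLECTION_ROOT is simply `/<collection>` under the app root, which the list above does not actually specify. The handler names and the `resolve` helper are made up for illustration:

```python
import re

# Hypothetical route table. Patterns are tried in order, so the fixed
# segments (welcome, admin, documents, help) are listed before the
# catch-all VERSION patterns.
ROUTES = [
    (r"^/collections/?$", "list_collections"),
    (r"^/(?P<collection>[^/]+)/welcome/?$", "welcome"),
    (r"^/(?P<collection>[^/]+)/admin"
     r"(?:/(?P<section>comments|users|catalog|versions))?/?$", "admin"),
    (r"^/(?P<collection>[^/]+)/documents/?$", "list_documents"),
    (r"^/(?P<collection>[^/]+)/help(?:/(?P<topic>.*))?$", "help"),
    (r"^/(?P<collection>[^/]+)/(?P<version>[^/]+)/?$", "version_home"),
    (r"^/(?P<collection>[^/]+)/(?P<version>[^/]+)/(?P<page>[^/]+)/edit/?$",
     "edit_page"),
    (r"^/(?P<collection>[^/]+)/(?P<version>[^/]+)/(?P<page>[^/]+)/?$",
     "show_page"),
]
COMPILED = [(re.compile(pattern), name) for pattern, name in ROUTES]


def resolve(path):
    """Return (handler name, parameters) for a path, or (None, {})."""
    for pattern, name in COMPILED:
        match = pattern.match(path)
        if match:
            # Drop optional groups that did not take part in the match.
            params = {k: v for k, v in match.groupdict().items() if v is not None}
            return name, params
    return None, {}
```

One consequence of this layout is that a version can never be named "admin", "documents", "help" or "welcome", which may be an argument for a different URL scheme.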

Widgets
=======
Document editing form widget
Comment form widget
List other versions of this document widget
List related documents (what should this mean exactly?)
TOC navigation widget (prev, next, up, home links)
various administration widgets
...

Storage Backends
================
- each collection should be able to have a separate backend if desired
- should leverage setuptools, so outside developers can create their own
storage backend modules as plugins
- each storage backend must provide a way to store both markup AND metadata
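
As a starting point for discussion, a storage backend interface might look something like the sketch below. The class and method names (`StorageBackend`, `save`, `load`, `history`) are hypothetical, and the in-memory backend exists only to show the contract. In a real plugin setup the `BACKENDS` registry would presumably be filled from setuptools entry points rather than written by hand:

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical contract: every backend stores markup AND metadata."""

    @abstractmethod
    def save(self, path, markup, metadata):
        """Store a new revision and return its revision number."""

    @abstractmethod
    def load(self, path, revision=None):
        """Return (markup, metadata) for a revision (latest if None)."""

    @abstractmethod
    def history(self, path):
        """Return the list of revision numbers for a document."""


class MemoryBackend(StorageBackend):
    """Toy backend that keeps every revision in a dictionary."""

    def __init__(self):
        self._revisions = {}

    def save(self, path, markup, metadata):
        revisions = self._revisions.setdefault(path, [])
        revisions.append((markup, dict(metadata)))
        return len(revisions)

    def load(self, path, revision=None):
        revisions = self._revisions[path]
        markup, metadata = revisions[-1] if revision is None else revisions[revision - 1]
        return markup, dict(metadata)

    def history(self, path):
        return list(range(1, len(self._revisions.get(path, [])) + 1))


# Stand-in for plugin discovery; the name "memory" is illustrative only.
BACKENDS = {"memory": MemoryBackend}


def get_backend(name):
    """Instantiate the backend registered under `name`."""
    return BACKENDS[name]()
```

Keeping the interface this small is deliberate: the fewer operations the rest of the app relies on, the easier it should be to back it with SVN, a database, or flat files.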

Possible Storage Backends
=========================
- Subversion (SVN)
- CVS
- bazaar?
- SQL database (SQLite, MySQL, PostgreSQL, MSSQL, Oracle, Firebird,
etc.)
- flat files?

Comment system
==============
Docudo should have a flexible comment system, with various anti-spam
measures that are configurable by administrators.

At least the following security measures should be implemented:
- Administrators should be able to restrict comments to:
  - anonymous users
  - registered users
  - any of several user groups (reviewers, editors, admins)
- an optional CAPTCHA system (maybe pluggable?) to verify that the
  commenter is an actual person
- a moderation system
  - admins should be able to specify whether comments are moderated
  - moderation should perhaps be applied differently to different groups,
    so, for example, admins' comments might be unmoderated, but
    everyone else's comments are moderated
  - there needs to be an interface for approving comments
  - there should be a "permission" that allows approving comments
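
The moderation rules above could be reduced to a small decision function. This is only a sketch: the settings keys (`allowed_groups`, `moderation_enabled`, `unmoderated_groups`) and the three outcomes are assumptions for illustration, and the CAPTCHA step is left out entirely:

```python
def comment_action(user_groups, settings):
    """Decide what happens to a new comment: 'reject', 'moderate' or 'publish'.

    `user_groups` is the list of groups the commenter belongs to; a visitor
    with no groups is treated as "anonymous" (an assumption of this sketch).
    """
    groups = set(user_groups) or {"anonymous"}
    if not groups & settings["allowed_groups"]:
        return "reject"      # commenter is not allowed to comment at all
    if not settings["moderation_enabled"]:
        return "publish"     # moderation is switched off for this collection
    if groups & settings["unmoderated_groups"]:
        return "publish"     # e.g. admins' comments skip the queue
    return "moderate"        # everyone else waits for approval


settings = {
    "allowed_groups": {"registered", "reviewers", "editors", "admins"},
    "moderation_enabled": True,
    "unmoderated_groups": {"admins"},
}
```

With these example settings, anonymous visitors are rejected, registered users go to the moderation queue, and admins are published immediately.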

Christoph Zwerschke

Oct 19, 2007, 5:37:09 AM

to doc...@googlegroups.com

Thank you, Kevin. That list looks like a good start.

As far as I understand, the Docudo project is aiming at two things:

* A collaboration tool for writing software documentation, similar to a
Wiki, but more restricted, specific and focused on software docs.
* Integration of this written documentation with auto-generated
reference created by tools such as PythonDoc, epydoc or Pudge.

Did I get this main idea right?

Here are some additions to your list, plus quotes from one of the first
postings here by Ian Bicking.

= Desired Features =

- Another very important feature is syntax highlighting in code
snippets, and the possibility to easily cut & paste code snippets from
the documentation.

- When entering docs using a markup language, it should be very easy to
look up the syntax (without going to another website). I like the way
how this is solved here: http://rst2a.com/create/type/

- It should be easy to link to the auto generated reference.

- The search function could optionally cover an accompanying mailing
list or other external documentation systems.

- Multiple languages (maybe even computer aided translation, i.e. create
a first draft of a translation by using something like Babelfish)

- A menu for switching between the different languages and versions of a
page should be available on all pages.

= Comment system =

- IB: "Allow for inline comments. This is as open as possible, so long
as it keeps out spam. I'm not sure if the comments should be always
present, present in a parallel part of the site that is linked from each
page, or what. If the comments are inline, it's a lot easier to process
and integrate them into documentation."

- Another comment: "An idea for inline comments: The SQL object docs have
an interesting approach to this with their sticky tag things. Not having
to log in is great. This encourages anyone to contribute in minimal
time. Yum. However, the notes seem to interfere with the flow of the
documentation. It would be interesting to make this more of a footnote
notation. As in, a user double clicks (or whatever action is most
intuitive) on a spot in the doc, a sticky note comes up, comments are
put in, and instead of leaving the note there, a footnote is added to
the doc. Clicking the footnote causes the sticky note to reappear. If
these are editable by all, then there could also be a way to
automagically integrate these comments into the doc by the moderator."

= Possible Storage Backends =

- Mercurial and Codeville should be considered as well.

- Maybe instead of supporting several backends we should choose only one
best match, so that we can make use of all its features? Otherwise we
can only use the lowest common denominator of all the backends, and that
may not be very powerful, plus some of them have completely different
concepts so it will be difficult to reconcile all of them. We may have
to spend too much energy on supporting different backends or inventing
a universal plugin API.

- Checks of the markup syntax and spelling as well as previews should be
as easy as possible.

= Types of documentation =

- We should also support documentation that is not aimed at the end user
of the software, but used by the developers internally: e.g. the kind of
things we are currently discussing, brainstorming, feature lists, todo
lists, milestones, etc.

= Document Model =

- Should we support TeX/LaTeX, too?

- Should tags be definable by all documentation writers ("folksonomy",
"wiki style"), or only by admins who also define a certain "tag
hierarchy" from which a hierarchical table of contents for a collection
can be derived? I prefer the latter.

- IB: "I don't think user contributed documentation needs to be
structured like a wiki. I think a simple index (maybe with categories)
is sufficient. Maintaining wiki-style navigation is not useful or needed
for software projects."

- Each collection should be subdivided into one or more "books". All
documents should be assigned to exactly one of these books. The books
should be exportable in docbook format. All books and their tables of
content should be accessible from the start page. The auto generated API
reference should go to a separate book or to the appendix of one or
more of these books.

- Use an intermediate document format (e.g. docbook) describing the
semantics of the document, independent of the syntax used for
entering/importing the document? If we want to support inline comments
and the like, then this will be needed.

= Glossary/Terminology =

> What should a group of collections be called? Do we even need a term
> for this?

I think we should group like that: collection - book - chapter - etc.
i.e. the "collection" is the top level, and "book" is probably more like
your "collection".

By the way, I think we should start using Docudo to document itself as
soon as possible.

-- Christoph

Kevin Horn

Oct 19, 2007, 1:32:36 PM

to doc...@googlegroups.com

On 10/19/07, Christoph Zwerschke <ci...@online.de> wrote:

> Thank you, Kevin. That list looks like a good start.
>
> As far as I understand, the Docudo project is aiming at two things:
>
> * A collaboration tool for writing software documentation, similar to a
> Wiki, but more restricted, specific and focused on software docs.

Correct. Though I think it may be possible (or even likely) that users will want to use Docudo for other types of technical documentation as well. We should try to anticipate these needs, as long as it doesn't distract too much from the primary mission of Docudo, which is documenting software tools.

> * Integration of this written documentation with auto-generated
> reference created by tools such as PythonDoc, epydoc or Pudge.

I'm not actually sure if this was an original design goal (I can't find any real notes or discussion of this topic from back then), but I certainly think it should be. This will be tricky to get right, I think. Python has an awful lot of documentation systems, and pythonistas tend to be somewhat "religious" about their choice of API doc tools. This may make it difficult to automate/integrate with API docs, unless we choose only one tool to integrate with. I'm not really an expert on any of the doc tools out there, so I don't really know how much of an issue this is likely to be.

> Did I get this main idea right?

Yup.

> Here are some additions to your list, plus quotes from one of the first
> postings here by Ian Bicking.
>
> = Desired Features =
>
> - Another very important feature is syntax highlighting in code
> snippets, and the possibility to easily cut & paste code snippets from
> the documentation.

YES! I was actually thinking about this, but forgot to put it in the notes. It drives me nuts when it's difficult to put code snippets in systems which are focused on discussing/documenting code. The cut/paste issue is a big one for me as well.

And code snippets in comments are critical. I basically learned PHP (yuck!) from code snippets in the comments of their docs.

> - When entering docs using a markup language, it should be very easy to
> look up the syntax (without going to another website). I like the way
> how this is solved here: http://rst2a.com/create/type/

That's quite a nice layout.

> - It should be easy to link to the auto generated reference.

Agreed. The problem is how?

> - The search function could optionally cover an accompanying mailing
> list or other external documentation systems.

Hmmm. Maybe for a later iteration. It would undoubtedly be useful to some. I'd prefer to have a way to turn this off. Maybe some checkboxes in the search interface, one for each source.

> - Multiple languages (maybe even computer aided translation, i.e. create
> a first draft of a translation by using something like Babelfish)

That's an interesting idea. I don't have much experience with translation of docs, so I don't know how useful this would be. Any opinions out there?

> - A menu for switching between the different languages and versions of a
> page should be available on all pages.

Definitely.

> = Comment system =
>
> - IB: "Allow for inline comments. This is as open as possible, so long
> as it keeps out spam. I'm not sure if the comments should be always
> present, present in a parallel part of the site that is linked from each
> page, or what. If the comments are inline, it's a lot easier to process
> and integrate them into documentation."

I'm not a huge fan of inline comments. While they always sound good in theory, I have yet to see them done in a way where they don't seem to just get in my way.

> - Another comment: "An idea for inline comments: The SQL object docs have
> an interesting approach to this with their sticky tag things. Not having
> to log in is great. This encourages anyone to contribute in minimal
> time. Yum. However, the notes seem to interfere with the flow of the
> documentation. It would be interesting to make this more of a footnote
> notation. As in, a user double clicks (or whatever action is most
> intuitive) on a spot in the doc, a sticky note comes up, comments are
> put in, and instead of leaving the note there, a footnote is added to
> the doc. Clicking the footnote causes the sticky note to reappear. If
> these are editable by all, then there could also be a way to
> automagically integrate these comments into the doc by the moderator."

This is an interesting idea. I think this could work nicely, at least for short comments.

Maybe what we really need is 2 different "comment-like" concepts. The problem I've always had with inline comments is that some people will put huge comments inline, which, even with a pop-up balloon type of interface, really interferes with document flow (at least for me).

Would it be helpful to have both an "annotation", which would be a short comment about something minor, like maybe a spelling error, and a ... hmmm, call it a "gloss", which would be a longer bit of explanatory text about the concepts in the document?

Maybe your footnote would bypass the need for this, I don't know. Of course, there are technical issues with inline comments, too...

> = Possible Storage Backends =
>
> - Mercurial and Codeville should be considered as well.

I just started looking into Mercurial yesterday, and I have to say, being Python-based is a huge plus. Though I confess, the whole distributed version control thing kinda makes my head spin. Guess I'm just set in my ways.

> - Maybe instead of supporting several backends we should choose only one
> best match, so that we can make use of all its features? Otherwise we
> can only use the lowest common denominator of all the backends, and that
> may not be very powerful, plus some of them have completely different
> concepts so it will be difficult to reconcile all of them. We may have
> to spend too much energy on supporting different backends or inventing
> a universal plugin API.

Maybe. I don't know that the concepts are really all that different, at least among systems that allow parallel editing by multiple users.

I think that if it's possible, we should support at least a handful of storage backends. If it turns out not to be possible, well, I guess it's not possible. I'd really like to see it though.

For the first version of the new Docudo, I would settle for having one storage backend, that is as loosely coupled _as possible_ from the rest of the system. I don't think this will be too bad. The original Docudo had only a couple of points where it touched SVN.

> - Checks of the markup syntax and spelling as well as previews should be
> as easy as possible.

Yes, this was (and still is) missing in the original Docudo. And I learned the hard way that your web application does Bad Things (tm) when docutils can't process your files :(

> = Types of documentation =
>
> - We should also support documentation that is not aimed at the end user
> of the software, but used by the developers internally: e.g. the kind of
> things we are currently discussing, brainstorming, feature lists, todo
> lists, milestones, etc.

Any specific ideas on how to do that? I think we might be better off leaving this to the individual users of Docudo, rather than trying to make some kind of special effort to specifically support these things. I mean, they're all just documents, right?

Though I wonder if some kind of Trac/issue tracking system integration would be useful, perhaps as addons.

> = Document Model =
>
> - Should we support TeX/LaTeX, too?

As an output format, I have no problem with this, though I don't know much about it. I think they are a bit complex to use as an input format though. :)

> - Should tags be definable by all documentation writers ("folksonomy",
> "wiki style"), or only by admins who also define a certain "tag
> hierarchy" from which a hierarchical table of contents for a collection
> can be derived? I prefer the latter.

Hmmm...not sure about the hierarchical bit. As far as who can define tags, I think it should be configurable. It should just be a permission in the auth system.

OK, I think I just understood what you meant by the "tag hierarchy" bit. Are you saying that the hierarchy would be separate from the tags themselves? That could maybe work, though how would you handle getting docs in a particular order within a given level of the hierarchy?

> - IB: "I don't think user contributed documentation needs to be
> structured like a wiki. I think a simple index (maybe with categories)
> is sufficient. Maintaining wiki-style navigation is not useful or needed
> for software projects."

Fair enough. I think the TOC system in the original Docudo was fairly nice in concept, though it was a little buggy, and wasn't quite finished.

> - Each collection should be subdivided into one or more "books". All
> documents should be assigned to exactly one of these books. The books
> should be exportable in docbook format. All books and their tables of
> content should be accessible from the start page. The auto generated API
> reference should go to a separate book or to the appendix of one or
> more of these books.

Would these "books" be analogous to what I'm calling a "version"? Or would they be different? Can you provide some examples?

> - Use an intermediate document format (e.g. docbook) describing the
> semantics of the document, independent of the syntax used for
> entering/importing the document? If we want to support inline comments
> and the like, then this will be needed.

This is tricky because whatever intermediate format you use has to translate both ways (both from markup, and to markup). Otherwise you can't ever edit the document again! The potential pitfalls for this almost make my eyes bleed, though maybe it isn't as bad as I think.

This would likely prevent the use of multiple markup languages as well. (NOT within a given Docudo instance!! I just mean that the admin could configure Docudo to use one of several options)

Docutils has a "pseudo-xml" format that might be worth looking into.

> = Glossary/Terminology =
>
> > What should a group of collections be called? Do we even need a term
> > for this?
>
> I think we should group like that: collection - book - chapter - etc.
> i.e. the "collection" is the top level, and "book" is probably more like
> your "collection".

Where do versions come in?

> By the way, I think we should start using Docudo to document itself as
> soon as possible.

This was a goal for the original project, and IMO caused some problems. Because we wanted to use Docudo to document itself, and Docudo couldn't really do that, very little actually got documented. I think a better plan is to choose whatever primary storage backend we want to use and set up a "collection" in that and edit it by hand (in a text editor) at first. This will help us actually document as we go, and will also give us some "test data" to use within Docudo.

BTW, one of the original design goals of Docudo was to allow something like this, where people could use either Docudo or SVN + editor to edit the documentation. I'm not sure this has been mentioned in a while, but I think it is a worthy goal.

> -- Christoph

Whew! Thanks for all the great input, Christoph. We have a lot to think about, and a lot of work to do!

Kevin Horn

Christoph Zwerschke

Oct 20, 2007, 5:50:40 AM

to doc...@googlegroups.com

Kevin Horn wrote:
> And code snippets in comments are critical. I basically learned PHP (yuck!)
> from code snippets in the comments of their docs.

Right. One idea is that comments support markup (of the same type as the
main document). Then it would be easy to "promote" comments by
integrating them into the main text. On the other hand, comments are
often written by users who may not be acquainted with the markup and do
not want to care about it.

> - It should be easy to link to the auto generated reference.
>
> Agreed. The problem is how?

We can define a placeholder for the root URL of the epydoc generated
pages, then it is very easy to refer to certain modules or classes by
adding "-module" or "-class" to the dotted name etc. These things should
be explained in the online help along with the markup syntax.
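
A rough sketch of how such a placeholder could be resolved. The function name `api_url` and the example root URL are made up, and the trailing ".html" is my assumption about how the generated pages are named; only the "-module" / "-class" suffix on the dotted name comes from the description above:

```python
def api_url(root_url, dotted_name, kind):
    """Build a link into the generated API reference.

    `kind` is "module" or "class". The result has the form
    <root>/<dotted name>-<kind>.html (the .html suffix is an assumption).
    """
    if kind not in ("module", "class"):
        raise ValueError("kind must be 'module' or 'class', got %r" % (kind,))
    return "%s/%s-%s.html" % (root_url.rstrip("/"), dotted_name, kind)


# Hypothetical root URL, configured once per collection/version.
API_ROOT = "http://example.org/api/"
module_link = api_url(API_ROOT, "docudo.storage", "module")
class_link = api_url(API_ROOT, "docudo.storage.Backend", "class")
```

The root URL would be a per-collection (or per-version) setting, so the same document source keeps working when the API docs move.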

> - The search function could optionally cover an accompanying mailing
> list or other external documentation systems.
>
> Hmmm. Maybe for a later iteration. It would undoubtedly be useful to
> some. I'd prefer to have a way to turn this off. Maybe some checkboxes in
> the search interface, one for each source.

Yes, that's a "nice to have" that can be added later. Anyway there
should be an advanced search and a simple search. In the simple search,
instead of having to click "title" or "text" (as in the MoinMoin wiki),
the search should list first the hits where the word is in the title,
and then those where it is only in the text. I.e. automatic sorting by
relevance, as Google does. I really don't want to search a 2nd time when
my term has not been found in the title.
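
The "title hits first, then text hits" ordering is easy to sketch. This toy version does a case-insensitive substring match over in-memory dictionaries; a real implementation would sit on top of whatever full-text engine is chosen, and the `simple_search` name and document layout are illustrative only:

```python
def simple_search(term, documents):
    """Return matching titles: title hits first, then text-only hits.

    `documents` is a list of dicts with "title" and "text" keys.
    Within each group the original document order is kept.
    """
    needle = term.lower()
    title_hits, text_hits = [], []
    for doc in documents:
        if needle in doc["title"].lower():
            title_hits.append(doc["title"])
        elif needle in doc["text"].lower():
            text_hits.append(doc["title"])
    return title_hits + text_hits


pages = [
    {"title": "Getting Started", "text": "How to install widgets."},
    {"title": "Widgets", "text": "Reference for form widgets."},
    {"title": "Deployment", "text": "Running behind Apache."},
]
results = simple_search("widgets", pages)
```

Here "Widgets" is listed before "Getting Started" even though it comes later in the collection, because the term appears in its title.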

> I'm not a huge fan of inline comments. While they always sound good in
> theory, I have yet to see them done in a way where they don't seem to just
> get in my way.

Like you, I also used PHP some years ago and for me the most attractive
aspect of PHP was the commented online docs. Since they are divided into
very small chunks, the comments were quasi inline, though in fact they
were always at the bottom of the page and never in the way.

Maybe we can use this principle of small chunks also for editing, i.e.
you edit only a subsection at a time, and you have buttons for adding
sections, subsections etc.? Then, you wouldn't even need to remember the
markup syntax for these.

> For the first version of the new Docudo, I would settle for having one
> storage backend, that is as loosely coupled _as possible_ from the rest of
> the system. I don't think this will be too bad. The original Docudo had
> only a couple of points where it touched SVN.

Agree, that's the best way to get started.

> Though I wonder if some kind of Trac/issue tracking system integration would
> be useful, perhaps as addons.

We could use a similar mechanism for referring to software revisions and
bug tickets as we are using for referring to the API reference. We can
provide configuration settings for the bug tracking and version control
used by the documented software (not the one used as the Docudo
backend), and their root URLs, and then Docudo would automatically
create the right links.

> = Document Model =
>> - Should we support Tex/LaTex, too?
>
> As an output format, I have no problem with this, though I don't know much
> about it. I think they are a bit complex to use as an input format though.
> :)

For output, it's so much nicer than anything else. For input, I'm not
sure if we need it. Mathematical formulas can be included as images in
case of need, but usually you don't need them for describing software.

> OK, I think I just understood what you meant by the "tag hierarchy" bit.
> Are you saying that the hierarchy would be separate from the tags
> themselves? That could maybe work, though how would you handle getting docs
> in a particular order within a given level of the hierarchy?

I meant that the tags should be controlled and ordered hierarchically,
instead of having only a flat tag cloud with duplicates, synonyms etc.
Something like this:

+-model
  +-orm
    +-sqlalchemy
    +-sqlobject
+-view
  +-templates
    +-kid
    +-genshi
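
One simple way to hold such a controlled hierarchy is a nested dictionary. The sketch below assumes the tree nests as model > orm > (sqlalchemy, sqlobject) and view > templates > (kid, genshi); the `tag_path` helper is hypothetical:

```python
# Controlled tag hierarchy as nested dicts (leaf tags map to empty dicts).
TAG_TREE = {
    "model": {"orm": {"sqlalchemy": {}, "sqlobject": {}}},
    "view": {"templates": {"kid": {}, "genshi": {}}},
}


def tag_path(tree, tag, prefix=()):
    """Return the path from the root to `tag` as a tuple, or None if unknown."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == tag:
            return path
        found = tag_path(children, tag, path)
        if found is not None:
            return found
    return None
```

A document tagged "genshi" could then be listed under view / templates / genshi in a generated table of contents, and unknown tags can be rejected because `tag_path` returns None for them.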

>> - Each collection should be subdivided into one or more "books". All
>> documents should be assigned to exactly one of these books. The books
>> should be exportable in docbook format. All books and their tables of
>> content should be accessible from the start page. The auto generated API
>> reference should go to a separate book or to the appendix of one or
>> more of these books.
>
> Would these "books" be analogous to what I'm calling a "version"? Or would
> they be different. Can you provide some examples?

No, the versions (software version described and doc version maintained
by the underlying versioning backend) would be orthogonal to books and
collections (of books). As an example, I was thinking of
http://docs.sun.com/app/docs/prod/software
When you click through the hierarchy, you will get to symbols with three
books ("collections"), and below them the actual books (the symbols look
like pages, but they actually stand for whole books).

>> - Use an intermediate document format (e.g. docbook) describing the
>> semantics of the document, independent of the syntax used for
>> entering/importing the document? If we want to support inline comments
>> and the like, then this will be needed.
>
> This is tricky because whatever intermediate format you use has to translate
> both ways (both from markup, and to markup). Otherwise you can't ever edit
> the document again! The potential pitfalls for this almost make my eyes
> bleed, though maybe it isn't as bad as I think.
>
> This would likely prevent the use of multiple markup languages as well.
> (NOT within a given Docudo instance!! I just mean that the admin could
> configure Docudo to use one of several options)

Maybe we should distinguish between the higher level markup defining the
document structure into chapters, sections, subsections, and the lower
level markup used inside these sections for bullet lists, emphasis etc.

If we edit the document in small chunks, providing buttons for adding
and deleting chapters, sections etc. (in a convenient AJAXian way), then
we can forget about the higher level markup, and use our own format for
storing the document structure, keeping the lower level markup in the
chunks. Each chunk could be written in a different markup language. Then
it would be also easy to add comments to individual subsections.

It should be possible to browse the docs in chunks with and without
comments or as whole books (single html, pdf, chm etc.)
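
To illustrate the chunk idea, here is a minimal sketch of a document stored as a tree of chunks, each with its own markup language and comments. The `Chunk` class, its field names, and the `outline` helper are all assumptions made for the example:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    """One editable piece of a book: a chapter, section or subsection."""
    title: str
    markup_language: str              # low-level markup used inside this chunk
    body: str = ""
    comments: List[str] = field(default_factory=list)
    children: List["Chunk"] = field(default_factory=list)


def outline(chunk, depth=0, with_comments=False):
    """Flatten a chunk tree into indented outline lines."""
    indent = "  " * depth
    lines = ["%s%s [%s]" % (indent, chunk.title, chunk.markup_language)]
    if with_comments:
        for comment in chunk.comments:
            lines.append("%s  # %s" % (indent, comment))
    for child in chunk.children:
        lines.extend(outline(child, depth + 1, with_comments))
    return lines


book = Chunk("Guide", "rest", children=[
    Chunk("Install", "rest", "Run the installer.",
          comments=["Works on Windows too."]),
    Chunk("Templates", "markdown", "Use *Kid* or *Genshi*."),
])
```

The structure (titles and nesting) lives outside the markup, so the same tree can be rendered chunk by chunk, with or without comments, or walked in order to produce a whole book.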

-- Chris

Kevin Horn

Oct 22, 2007, 10:42:04 PM

to doc...@googlegroups.com

On 10/20/07, Christoph Zwerschke <ci...@online.de> wrote:

> Kevin Horn wrote:
> > And code snippets in comments are critical. I basically learned PHP (yuck!)
> > from code snippets in the comments of their docs.
>
> Right. One idea is that comments support markup (of the same type as the
> main document). Then it would be easy to "promote" comments by
> integrating them into the main text. On the other hand, comments are
> often written by users who may not be acquainted with the markup and do
> not want to care about it.

I think markup should definitely be supported in comments. ReST makes this easy, since it is easily readable as plain text, so if someone doesn't want to bother with markup, they just type plain text.

> > - It should be easy to link to the auto generated reference.
> >
> > Agreed. The problem is how?
>
> We can define a placeholder for the root URL of the epydoc generated
> pages, then it is very easy to refer to certain modules or classes by
> adding "-module" or "-class" to the dotted name etc. These things should
> be explained in the online help along with the markup syntax.

Something like this could work, but would we want to implement it as extensions to whatever markup language (probably reST, initially), or would we want to add in a "preprocessing" step?

Maybe when the user submits edits, the system could automatically resolve some of these things and translate the "-module" or "-class" bits into markup?

> > - The search function could optionally cover an accompanying mailing
> > list or other external documentation systems.
> >
> > Hmmm. Maybe for a later iteration. It would undoubtedly be useful to
> > some. I'd prefer to have a way to turn this off. Maybe some checkboxes in
> > the search interface, one for each source.
>
> Yes, that's a "nice to have" that can be added later. Anyway there
> should be an advanced search and a simple search. In the simple search,
> instead of having to click "title" or "text" (as in the MoinMoin wiki),
> the search should list first the hits where the word is in the title,
> and then those where it is only in the text. I.e. automatic sorting by
> relevance, as Google does. I really don't want to search a 2nd time when
> my term has not been found in the title.

Agreed. The "text" vs. "title" thing bugs me too. :)
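
The "title hits first, then text hits" ordering can be sketched in a few lines. This assumes documents are simple dicts with `title` and `text` keys and uses plain substring matching; a real search backend would do its own ranking.

```python
def simple_search(term, docs):
    """Return matching docs: title hits first, then text-only hits."""
    needle = term.lower()
    title_hits = []
    text_hits = []
    for doc in docs:
        if needle in doc["title"].lower():
            title_hits.append(doc)
        elif needle in doc["text"].lower():
            text_hits.append(doc)
    return title_hits + text_hits


# Made-up sample documents, only for illustration.
docs = [
    {"title": "Installation", "text": "How to set up widgets."},
    {"title": "Widgets", "text": "Form building blocks."},
    {"title": "Templates", "text": "Kid and Genshi."},
]
results = [d["title"] for d in simple_search("widgets", docs)]
```

Here "Widgets" comes before "Installation", because the term appears in its title rather than only in its text.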

> I'm not a huge fan of inline comments. While they always sound good in
> theory, I have yet to see them done in a way where they don't seem to just
> get in my way.

Like you, I also used PHP some years ago and for me the most attractive
aspect of PHP was the commented online docs. Since they are divided into
very small chunks, the comments were quasi inline, though in fact they
were always at the bottom of the page and never in the way.

Maybe we can use this principle of small chunks also for editing, i.e.
you edit only a subsection at a time, and you have buttons for adding
sections, subsections etc.? Then, you wouldn't even need to remember the
markup syntax for these.

I'm worried that this will enforce too much structure on the users. I think Docudo should be flexible enough to handle lots of different cases, and what we decide on might not work for everyone.

> For the first version of the new Docudo, I would settle for having one
> storage backend, that is as loosely coupled _as possible_ from the rest of
> the system. I don't think this will be too bad. The original Docudo had
> only a couple of points where it touched SVN.

Agree, that's the best way to get started.

> Though I wonder if some kind of Trac/issue tracking system integration would
> be useful, perhaps as addons.

We could use a similar mechanism for referring to software revisions and
bug tickets as we are using for referring to the API reference. We can
provide configuration settings for the bug tracking and version control
used by the documented software (not the one used as the Docudo
backend), and their root URLs, and then Docudo would automatically
create the right links.

Hmmm. Maybe we could go with the "preprocessing" on saving an edit idea that I mentioned above, and have a set of filters that get applied that resolve things to markup. These filters could then be made pluggable through setuptools. Then it would be easy to add support for multiple systems as time goes on. Whaddya think?
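
To make the filter idea a bit more concrete, here is a minimal sketch. The `ticket:123` and `changeset:42` syntax, the settings keys and the URLs are all assumptions for illustration, and the filter list is hard-coded where a real system might discover it through setuptools entry points.

```python
import re

# Hypothetical settings for the tracker and repository browser
# of the documented software (not the Docudo backend).
SETTINGS = {
    "ticket_root": "http://example.org/trac/ticket",
    "changeset_root": "http://example.org/trac/changeset",
}


def ticket_filter(text, settings):
    """Turn ticket:123 into a reST link to the bug tracker."""
    def to_link(match):
        num = match.group(1)
        return "`#%s <%s/%s>`_" % (num, settings["ticket_root"], num)
    return re.sub(r"\bticket:(\d+)", to_link, text)


def changeset_filter(text, settings):
    """Turn changeset:42 into a reST link to the repository browser."""
    def to_link(match):
        num = match.group(1)
        return "`r%s <%s/%s>`_" % (num, settings["changeset_root"], num)
    return re.sub(r"\bchangeset:(\d+)", to_link, text)


# Hard-coded here; plugins could extend this list.
FILTERS = [ticket_filter, changeset_filter]


def preprocess(text, settings=SETTINGS, filters=FILTERS):
    """Apply every filter in order to the submitted text."""
    for apply_filter in filters:
        text = apply_filter(text, settings)
    return text
```

Each filter is just a function of `(text, settings)`, so supporting another tracker would mean adding one more function to the list.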

> = Document Model =
>> - Should we support TeX/LaTeX, too?
>
> As an output format, I have no problem with this, though I don't know much
> about it. I think they are a bit complex to use as an input format though.
> :)

For output, it's so much nicer than anything else. For input, I'm not
sure if we need it. Mathematical formulas can be included as images in
case of need, but usually you don't need them for describing software.

LaTeX output should be pretty easy when using reST, since there are already tools available. In fact, we may want to use LaTeX as an intermediate step to provide PDF output (Google for "pdflatex").

> OK, I think I just understood what you meant by the "tag hierarchy" bit.
> Are you saying that the hierarchy would be separate from the tags
> themselves?  That could maybe work, though how would you handle getting docs
> in a particular order within a given level of the hierarchy?

I meant that the tags should be controlled and ordered hierarchically,
instead of having only a flat tag cloud with duplicates, synonyms etc.
Something like this:

+-model
   +-orm
     +-sqlalchemy
     +-sqlobject
+-view
   +-templates
     +-kid
     +-genshi

Hmmm. I'm still undecided on this. I think it should definitely be possible for users to manually create/manage a table of contents themselves, but an auto-generated one might be an interesting option.

That said, I think one of the most useful things about tags is that they can cut across a hierarchy, so it's easy to find related things even when they aren't "nearby" in a set of documents.
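
Both views can coexist. As a sketch, the controlled tag tree from the example above can be held as nested dicts, while documents still carry flat tags that may come from different branches. The function names and the sample documents are illustrative only.

```python
# The controlled tag tree from the example above, as nested dicts.
TAG_TREE = {
    "model": {"orm": {"sqlalchemy": {}, "sqlobject": {}}},
    "view": {"templates": {"kid": {}, "genshi": {}}},
}


def tag_path(tag, tree=TAG_TREE, prefix=()):
    """Return the path from the root to tag, or None if tag is unknown."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == tag:
            return path
        found = tag_path(tag, children, path)
        if found is not None:
            return found
    return None


def docs_tagged(tag, doc_tags):
    """Return docs carrying tag itself or any tag below it in the tree."""
    hits = []
    for doc, tags in sorted(doc_tags.items()):
        for t in tags:
            path = tag_path(t)
            if path is not None and tag in path:
                hits.append(doc)
                break
    return hits


# Made-up documents; "migration" is tagged in both branches.
doc_tags = {
    "db_setup": ["sqlalchemy"],
    "layouts": ["kid"],
    "migration": ["sqlobject", "genshi"],
}
```

With this data, `docs_tagged("orm", doc_tags)` and `docs_tagged("view", doc_tags)` both include "migration", which is the cross-cutting behaviour that makes tags useful.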

> - Each collection should be subdivided into one or more "books". All
>> documents should be assigned to exactly one of these books. The books
>> should be exportable in docbook format. All books and their tables of
>> content should be accessible from the start page. The auto generated API
>> reference should go to a separate book or to the appendix of one or
>> more of these books.
>
> Would these "books" be analogous to what I'm calling a "version"? Or would
> they be different. Can you provide some examples?

No, the versions (software version described and doc version maintained
by the underlying versioning backend) would be orthogonal to books and
collections (of books). As an example, I was thinking of
http://docs.sun.com/app/docs/prod/software
When you click through the hierarchy, you will get to symbols with three
books ("collections"), and below them the actual books (the symbols look
like pages, but they actually stand for whole books).

So is something like the following what you are proposing?

- turbogears collection
  - Users Guide (book)
  - Dev Guide (book)
  - HowTo Collection (book)
[then have each one "versioned"]

What kind of URL scheme would we use for this?

> - Use an intermediate document format (e.g. docbook) describing the
>> semantic of the document, independent of the syntax used for
>> entering/importing the document? If we want to support inline comments
>> and the like, then this will be needed.
>
> This is tricky because whatever intermediate format you use has to translate
> both ways (both from markup, and to markup).  Otherwise you can't ever edit
> the document again!  The potential pitfalls for this almost make my eyes
> bleed, though maybe it isn't as bad as I think.
>
> This would likely prevent the use of multiple markup languages as well.
> (NOT within a given Docudo instance!!  I just mean that the admin could
> configure Docudo to use one of several options)

Maybe we should distinguish between the higher level markup defining the
document structure into chapters, sections, subsections, and the lower
level markup used inside these sections for bullet lists, emphasis etc.

If we edit the document in small chunks, providing buttons for adding
and deleting chapters, sections etc. (in a convenient AJAXian way), then
we can forget about the higher level markup, and use our own format for
storing the document structure, keeping the lower level markup in the
chunks. Each chunk could be written in a different markup language. Then
it would be also easy to add comments to individual subsections.

It should be possible to browse the docs in chunks with and without
comments or as whole books (single html, pdf, chm etc.)

-- Chris

If we do this, it will keep us from using a lot of existing code, or at least keep us from using it in the most effective way. We'd have to come up with our own document format, and then be responsible for tracking where the "outer" markup stops, and the "inner" markup begins. I think it's adding a lot of complexity, for very little gain.

Also, it would make the documents tricky to edit "by hand" (i.e. in a text editor). Playing nice with manual editing has always been a design goal of Docudo, and I think it would be a mistake to move away from that.

I can see that it might help with the inline comments issue, but other than that I think it creates more problems than it solves.

Kevin Horn

Christoph Zwerschke

Oct 24, 2007, 2:51:26 AM10/24/07

to doc...@googlegroups.com

Kevin Horn wrote:
> On 10/20/07, Christoph Zwerschke <ci...@online.de> wrote:

>> We can define a placeholder for the root URL of the epydoc generated
>> pages, then it is very easy to refer to certain modules or classes by
>> adding "-module" or "-class" to the dotted name etc. These things should
>> be explained in the online help along with the markup syntax.
>
> Something like this could work, but would we want to implement it as
> extensions to whatever markup language (probably reST, initially), or would
> we want to add in a "preprocessing" step?

Maybe both? We could add a simple syntax to reST for API reference links
(similar to the Trac Links in the Trac Wiki) creating special URLs
which could then be postprocessed to point to the right location. These
URLs could also be created when the syntax extension is not available.

>> Maybe we can use this principle of small chunks also for editing, i.e.
>> you edit only a subsection at a time, and you have buttons for adding
>> sections, subsections etc.? Then, you wouldn't even need to remember the
>> markup syntax for these.
>
> I'm worried that this will enforce too much structure on the users. I think
> Docudo should be flexible enough to handle lots of different cases, and what
> we decide on might not work for everyone.

Maybe we should decouple the question of how to store the documents
(source markup or structured format) from how to enter them. Concerning
the latter, we could support both: Editing in chunks and creating the
structure with buttons, or importing documents with structure markup.
And again independently from that, it should be possible to output in
chunks or full documents.

The only difficult thing seems to be how to implement inline comments in
a way that they are not destroyed when the document is edited. One idea
would be to integrate them into the source markup with the possibility
for the document author to delete comments, i.e. removing them from the
markup without trace, by simply clicking a delete button instead of
having to remove the markup.

>> I meant that the tags should be controlled and ordered hierarchically,
>> instead of having only a flat tag cloud with duplicates, synonyms etc.
>> Something like this:
>>
>> +-model
>>    +-orm
>>      +-sqlalchemy
>>      +-sqlobject
>> +-view
>>    +-templates
>>      +-kid
>>      +-genshi
>
> Hmmm. I'm still undecided on this. I think it should definitely be
> possible for users to manually create/manage a table of contents
> themselves, but an auto-generated one might be an interesting option.

Yes, this should only be supplemental to manually or automatically (from
the section titles) created tables of contents.

By the way, if we support inline comments for sections, we could also
support tagging of sections.

> So is something like the following what you are proposing?
>
> - turbogears collection
>   - Users Guide (book)
>   - Dev Guide (book)
>   - HowTo Collection (book)
> [then have each one "versioned"]
>
> What kind of URL scheme would we use for this?

Collections and books should have file names (short titles) as part of
their meta data, so the URL could be collection_filename/book_filename.

Concerning storage format, inline comments etc. this needs more
thought, experimentation and discussion, which we should postpone. I
agree we had better start by getting a very simple working system
with no additional syntax or formats, and then we can experiment with
different ways to implement inline comments, links to API references
etc. and make additions or changes as necessary.

-- Chris

Kevin Horn

Oct 24, 2007, 12:05:32 PM10/24/07

to doc...@googlegroups.com

Yes, but where does the "version" (of the documented software) come in? Should it be:

collection_name/book_name/1.0/doc_name

or:

collection_name/1.0/book_name/doc_name ?

I'm leaning toward the second, myself, but I'd be interested in suggestions as to why one might be better than the other.

Concerning storage format, inline comments etc. this needs more
thought, experimentation and discussion, which we should postpone. I
agree we had better start by getting a very simple working system
with no additional syntax or formats, and then we can experiment with
different ways to implement inline comments, links to API references
etc. and make additions or changes as necessary.

-- Chris

Agreed, we could hypothesize forever about what might be better, but that wouldn't get us any closer to usable software. :)

I think the main things that need to be supported in a first release (first re-release?) are:
- ability to store and retrieve docs from a storage backend
- ability to add/edit/preview/delete(?) documents through the web interface (basically CRUD + preview)
- ability to manage "versions", including ability to "promote" docs to a new version
- basic comments system (not inline), and comment management tools for admins
- user management and/or registration

We should get started on this as soon as we have a TG 2 beta release (or maybe sooner?) to work from.

Kevin Horn

Christoph Zwerschke

Oct 24, 2007, 3:34:47 PM10/24/07

to doc...@googlegroups.com

Kevin Horn wrote:
> Yes, but where does the "version" (of the documented software) come in?
> Should it be:
>
> collection_name/book_name/1.0/doc_name
> or:
> collection_name/1.0/book_name/doc_name ?
>
> I'm leaning toward the second, myself, but I'd be interested in suggestions
> as to why one might be better than the other.

Yes, the 2nd looks better: turbogears/1.0/installation_guide

But we can make that configurable.
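
One way the configurable scheme might work is to treat the order of the URL parts as a template. The `URL_TEMPLATE` setting and the function names below are hypothetical, and the sketch assumes no part contains a slash.

```python
# Hypothetical setting: the order of the URL parts is a template.
URL_TEMPLATE = "{collection}/{version}/{book}/{doc}"


def doc_url(collection, version, book, doc, template=URL_TEMPLATE):
    """Build the URL path of a document from its parts."""
    return template.format(
        collection=collection, version=version, book=book, doc=doc
    )


def parse_doc_url(path, template=URL_TEMPLATE):
    """Split a URL path back into named parts, following the template order."""
    names = [part.strip("{}") for part in template.split("/")]
    values = path.strip("/").split("/")
    if len(values) != len(names):
        raise ValueError(
            "expected %d path segments, got %d" % (len(names), len(values))
        )
    return dict(zip(names, values))
```

Switching a site to the other ordering would then be a one-line change, e.g. `"{collection}/{book}/{version}/{doc}"`.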

> We should get started on this as soon as we have a TG 2 beta release (or
> maybe sooner?) to work from.

I think we should start sooner so this can also push TG2 development.

-- Chris

Kevin Horn

Oct 25, 2007, 1:21:08 PM10/25/07

to doc...@googlegroups.com

On 10/24/07, Christoph Zwerschke <ci...@online.de> wrote:

> We should get started on this as soon as we have a TG 2 beta release (or
> maybe sooner?) to work from.

I think we should start sooner so this can also push TG2 development.

-- Chris

This isn't a bad idea, but we'll need to come up with a standardized way of setting up a dev environment, so that we're working from the same page.

Mark, are you listening? Any suggestions?

Kevin Horn

Mark Ramm

Oct 25, 2007, 1:32:21 PM10/25/07

to doc...@googlegroups.com

I'd say we'll have a pretty good TG2 environment next week.

One thing I'm wondering about, and it's not something I know how to do
easily, is 3 way merge.

If we had that we could do things like import all the pylons docs,
edit them with TG specific content, and then when the pylons folks
update their docs we can merge their changes back in.

--Mark

--
Mark Ramm-Christensen
email: mark at compoundthinking dot com
blog: www.compoundthinking.com/blog

Kevin Horn

Oct 25, 2007, 1:56:58 PM10/25/07

to doc...@googlegroups.com

On 10/25/07, Mark Ramm <mark.mch...@gmail.com> wrote:

I'd say we'll have a pretty good TG2 environment next week.

Excellent!

One thing I'm wondering about, and it's not something I know how to do
easily, is 3 way merge.

If we had that we could do things like import all the pylons docs,
edit them with TG specific content, and then when the pylons folks
update their docs we can merge their changes back in.

--Mark

Hmmm...I was wondering about how 3-way merge would be useful in the context of documentation, and this seems an excellent use case. I still have to wrap my head around the whole concept though, and will probably have to actually use it for a bit to really see how it would be useful. Or at least useful _enough_ to spend a lot of effort supporting it. Not to mention getting users to understand it and use it. I'm sure we'll need more discussion on this issue.

This does bring up a couple of issues though. As much as "locking" a file is anathema in version control these days, when dealing with a web application like this, you have to account for the possibility that multiple users will be trying to edit a file at the same time. Some measures must be taken to warn the users and make sure that they don't overwrite each other's changes. Sure, the changes will be recorded in the repository, but that's not really helpful. I see 2 major ways of handling this:
1) Lock files that are being edited by Docudo
2) Add a bunch of logic to warn users, detect conflicts, resolve conflicts, etc.

#1 is way easier, and is the route that we took in Docudo 0.1.
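
For reference, a minimal sketch of what option #1 could look like. This is not the Docudo 0.1 code; the class, the in-memory dict and the 15-minute timeout are assumptions. The timeout is there so that an abandoned browser session does not block a document forever.

```python
import time


class EditLocks:
    """In-memory edit locks that expire after a timeout."""

    def __init__(self, timeout=900, clock=time.time):
        self.timeout = timeout      # seconds before a lock goes stale
        self.clock = clock          # injectable, to make testing easy
        self._locks = {}            # doc path -> (user, time acquired)

    def acquire(self, path, user):
        """Return True if user now holds the lock on path."""
        holder = self._locks.get(path)
        now = self.clock()
        if holder is not None:
            owner, since = holder
            if owner != user and now - since < self.timeout:
                return False        # someone else is still editing
        self._locks[path] = (user, now)
        return True

    def release(self, path, user):
        """Drop the lock, but only if user is the one holding it."""
        holder = self._locks.get(path)
        if holder is not None and holder[0] == user:
            del self._locks[path]
```

The edit view would call `acquire` before showing the form and `release` after a save or cancel; a refused `acquire` is where the "someone else is editing this page" warning would go.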

Another issue is what I call (when I'm talking to myself :) the "blame issue". There needs to be some way of tying the user of the web application to the changes in the repository. Otherwise all changes made through Docudo will be listed in the repository as coming from the same user, which is ... sub-optimal, IMO.

Anyway, things to think about...

Kevin Horn

Erich Heine

Oct 25, 2007, 2:33:05 PM10/25/07

to doc...@googlegroups.com

Thinking about #2:

Why not take advantage of the built-in code that source control systems already have? For instance in svn, this would be handled by doing an
svn up on the appropriate dir/file, then resolving any conflicts. The backend could always change the doc, svn up on it (in case someone else changed it), then just prettify the conflicted sections. Then the author that caused the conflict could resolve it. A workflow something like this (assuming most SCMs have similar functionality):

Person 1 starts editing doc in head, rev # is sent in vars.
Person 2 edits same doc in head, rev # is sent in vars.
Person 2 saves changes*
Person 1 saves changes*; a conflict is found, svn's conflict output is prettified, and person 1 resolves it
Person 1 saves changes.

* saving changes would be something like this:
svn co -r rev# doc /tmp
write(doc, /tmp/doc)
svn up /tmp/doc
if conflict -> return pretty conflict.
else -> commit.
rm /tmp/

The prettified conflict could display a warning that someone else changed the doc while you were editing it, with your changes shown below in green and theirs in red, and that you should edit accordingly.
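
The revision check at the heart of this workflow can be sketched without touching svn at all. The store below is an in-memory stand-in for the repository, with made-up class and method names. Unlike `svn up`, it does not try to merge: any change made by someone else since the edit started is reported as a conflict.

```python
class Conflict(Exception):
    """Raised when the doc changed after the edit form was opened."""

    def __init__(self, mine, theirs):
        Exception.__init__(self, "document changed since editing started")
        self.mine = mine        # text the user tried to save
        self.theirs = theirs    # text currently at head


class DocStore:
    """In-memory stand-in for the repository: one doc, numbered revisions."""

    def __init__(self, text):
        self.revisions = [text]

    @property
    def head(self):
        return len(self.revisions) - 1

    def start_edit(self):
        """Return (rev, text); the rev travels with the edit form."""
        return self.head, self.revisions[self.head]

    def save(self, base_rev, new_text):
        """Commit new_text if nobody else saved since base_rev."""
        if base_rev != self.head:
            raise Conflict(mine=new_text, theirs=self.revisions[self.head])
        self.revisions.append(new_text)
        return self.head
```

Catching `Conflict` in the save view gives both texts needed for the "yours in green, theirs in red" page; a real backend would hand that comparison to the version control system's own merge instead.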

Just a thought.
Erich

On 10/25/07, Kevin Horn <kevin...@gmail.com> wrote:

Kevin Horn

Oct 25, 2007, 3:08:13 PM10/25/07

to doc...@googlegroups.com

Yeah, something like that would work, and was basically what I meant in option #2.

It's not rocket science (though I've done rocket science, and it's not _that_ hard) but it is more work. Not saying it's a lot more, but it is more. The extra stuff is mostly (hopefully) calls to underlying storage libs and interface stuff. It's do-able, just may not be what we want to spend our resources (valuable time) on. Then again it might be. I'm pretty sure it wouldn't be too difficult to do with SVN as an underlying engine, but I don't know other version control systems well enough to say whether they would be easier, harder, or about the same to deal with. I assume they all have ways of dealing with this problem.

Kevin Horn
