Should we rethink comments?


Michael C. Harris

Mar 23, 2008, 1:33:31 AM
to Habari Dev
From the department of random thoughts.

Why aren't comments a type of post with a content type of comment,
with a relationship to a post of another type (entry or page)? As far
as I can see there's a distinction being drawn between content that's
created by a blog's authors and content created by others. But while
thinking about content types recently (in the context of creating a
'discussion' type), I've had a vague feeling that this distinction is a
bit arbitrary and limiting.

Obviously there are lots of web services (in a generic sense) being
developed where the 'others' are the primary producers of content.
Think Digg, or even Twitter (yes, we could argue about that, but
that's not my point). I think Habari is in a good position to be able
to be a platform for services like that.

So, would it make sense to reimplement comments as a kind of post, and
use this as an example of how posts can be related to other posts,
either of the same or different content types?

Anyone have any thoughts?

--
Michael C. Harris, School of CS&IT, RMIT University
http://twofishcreative.com/michael/blog

Scott Merrill

Mar 23, 2008, 8:57:33 AM
to habar...@googlegroups.com
> Why aren't comments a type of post with a content type of comment,
> with a relationship to a post of another type (entry or page)? As far
> as I can see there's a distinction being drawn between content that's
> created by a blog's authors and content created by others. But when
> thinking about content types recently (in the context of creating a
> 'discussion' type) I've had a vague feeling lately that this is a bit
> arbitrary and limiting.
...

> So, would it make sense to reimplement comments as a kind of post, and
> use this as an example of how posts can be related to other posts,
> either of the same or different content types?

Posts have the following characteristics about them stored in the DB:
id, slug, content_type, title, guid, content, cached_content,
user_id, status, pubdate, updated

Comments have the following characteristics about them stored in the DB:
id, post_id, name, email, url, ip, content, status, date, type

If we were to store comments as a new post type, we'd necessarily need
to generate slugs and GUIDs for each comment (not necessarily a bad
thing). It also means that for each post we'd be storing an author
name, email, and URL, which doesn't make for a very normalized
database when we have a users table designed to hold that stuff.
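
As a rough illustration of how much the two column lists above overlap, here is a minimal Python sketch. It is not Habari code: the column names are copied from the lists above, and the comparison is purely by name (so pubdate/date and content_type/type count as different even though they look analogous).

```python
# Column names as listed above for the posts and comments tables.
post_columns = {
    "id", "slug", "content_type", "title", "guid", "content",
    "cached_content", "user_id", "status", "pubdate", "updated",
}
comment_columns = {
    "id", "post_id", "name", "email", "url", "ip",
    "content", "status", "date", "type",
}

# Columns that carry the same name in both tables.
shared = sorted(post_columns & comment_columns)
# Columns that exist only for comments and would need a home elsewhere.
comment_only = sorted(comment_columns - post_columns)

print(shared)
print(comment_only)
```

Only three names match exactly; the comment-only columns are the ones a unified table would have to absorb or move elsewhere.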

An early idea put forward by Chris Davis was to create a user account
for every person who commented, with the idea that after X number of
comments that account might be "promoted" to be given the ability to
create new posts on the site. Unifying posts and comments would
perhaps make Chris' idea a little more meaningful.

Renaming "post_id" to "parent_id" would permit us to have a
hierarchical (many-to-one) relationship between posts, such that a page
could be a child of another page. It would also permit us to have
threaded comments pretty easily: a feature quite a few people seem to
like.
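
To make the threading idea concrete, a minimal Python sketch of turning rows with a "parent_id" into threaded display order. The rows, ids and the thread() helper are made up for illustration; nothing here is existing Habari code.

```python
from collections import defaultdict

# Hypothetical rows: (id, parent_id, content). parent_id is None for a
# top-level post; the ids and text are invented for this example.
rows = [
    (1, None, "An entry"),
    (2, 1, "First comment"),
    (3, 2, "Reply to first comment"),
    (4, 1, "Second comment"),
]

# Group rows by their parent so each level can be looked up directly.
children = defaultdict(list)
for row_id, parent_id, content in rows:
    children[parent_id].append((row_id, content))


def thread(parent_id=None, depth=0):
    """Return (depth, id, content) tuples in threaded display order."""
    out = []
    for row_id, content in children.get(parent_id, []):
        out.append((depth, row_id, content))
        out.extend(thread(row_id, depth + 1))
    return out


for depth, row_id, content in thread():
    print("  " * depth + content)
```

The same walk would serve sub-pages, since it only depends on the parent pointer and not on the content type.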

I admit that my first reaction to this proposition is a negative one:
I think that the content stored for posts and comments is sufficiently
distinct that we'd be creating long-term headaches for ourselves by
trying to store them both in the same table, with the same set of
columns. It's an interesting idea, though, and I'm willing to explore
it further in discussion.

Cheers,
Scott

Sean T Evans

Mar 23, 2008, 8:33:30 AM
to habar...@googlegroups.com
What's interesting is that your post here made me think that it's
actually a good idea. Comments are content, and they have the same
things going on that a post does in many ways.

Post => Comment
id => id
slug => this could actually be useful for linking directly to a specific
comment. Rather than http://example.com/post-slug#comment-8019 we could
use something like http://example.com/post-slug/comment-username
content_type => this would obviously be "comment"
title => I can see sites using the ability to add a title to a comment
guid => guid
content => content
cached_content => I don't know what this does.
user_id => could be problematic if we don't require commenters to
"login" somehow, as I'm likely to end up with comments from my brother
ending up with "Geoff" "Geoffrey" "Yamichi" "Little Brother" being
listed as users.
status => status
pubdate => date
updated => I can see the eventual creation of a plugin that will allow
users to edit comments.
? => type (could this simply be an expansion of status?)

Plus, the addition of a "Parent ID" would give us the options that
Skippy mentioned as well as allowing sub-pages, which we've seen some
interest in.

This represents a fairly significant change in how "comments" are
considered in the grand scheme of things, and to some extent elevates
them. I don't think this is a bad thing, and in the long run, I think it
would streamline the code somewhat as all content would be handled the
same way.

If nothing else, it's a chance to explore some assumptions about content.
--
Sean T Evans

Michael C. Harris

Mar 23, 2008, 6:15:48 PM
to habar...@googlegroups.com
On Sun, Mar 23, 2008 at 08:57:33AM -0400, Scott Merrill wrote:
>
> > Why aren't comments a type of post with a content type of comment,
> > with a relationship to a post of another type (entry or page)? As far
> > as I can see there's a distinction being drawn between content that's
> > created by a blog's authors and content created by others. But when
> > thinking about content types recently (in the context of creating a
> > 'discussion' type) I've had a vague feeling lately that this is a bit
> > arbitrary and limiting.
> ...
> > So, would it make sense to reimplement comments as a kind of post, and
> > use this as an example of how posts can be related to other posts,
> > either of the same or different content types?
>
> Posts have the following characteristics about them stored in the DB:
> id, slug, content_type, title, guid, content, cached_content,
> user_id, status, pubdate, updated
>
> Comments have the following characteristics about them stored in the DB:
> id, post_id, name, email, url, ip, content, status, date, type
>
> If we were to store comments as a new post type, we'd necessarily need
> to generate slugs and GUIDs for each comment (not necessarily a bad
> thing). It also means that for each post we'd be storing an author
> name, email, and URL, which doesn't make for a very normalized
> database when we have a users table designed to hold that stuff.

I haven't delved into content types terribly deeply yet (starting is
what raised the question) but isn't the whole point of different
content types that we can store different information about them?
There is a base level of information stored about posts, while comments,
discussions, contact pages and polls each have their own specific
extra data. If that's not easy, the whole idea of content types
becomes much less interesting.

> Renaming "post_id" to "parent_id" would permit us to have a
> hierarchical (one-to-one) relationship between posts, such that a page
> could be a child of another page. It would also permit us to have
> threaded comments pretty easily: a feature quite a few people seem to
> like.

It's partly the relationship between post types that I wanted to
explore. If we can have a simple and flexible way to define a content
type that has certain relationships with other posts, I think
interesting things will happen.

> I admit that my first reaction to this proposition is a negative one:
> I think that the content stored for posts and comments is sufficiently
> distinct that we'd be creating long-term headaches for ourselves by
> trying to store them both in the same table, with the same set of
> columns. It's an interesting idea, though, and I'm willing to explore
> it further in discussion.

Again, if it is a headache to store different information for
different content types, now is when we should make it easier. Perhaps
we do need a different table for each content type that's created,
rather than just dumping everything in post info.

Michael C. Harris

Mar 23, 2008, 6:21:02 PM
to habar...@googlegroups.com
On Sun, Mar 23, 2008 at 08:33:30AM -0400, Sean T Evans wrote:
>
> user_id => could be problematic if we don't require commenters to
> "login" somehow, as I'm likely to end up with comments from my brother
> ending up with "Geoff" "Geoffrey" "Yamichi" "Little Brother" being
> listed as users.

We could have 'anonymous' as the default author for content created by
'others'.

> Plus, the addition of a "Parent ID" would give us the options that
> Skippy mentioned as well as allowing sub-pages, which we've seen some
> interest in.

The ability to define relationships between posts would give us the
ability to have hierarchical pages and threaded comments almost
trivially, and it's the fact that that would be trivial that makes me
excited about other possibilities I haven't even thought of yet.

> This represents a fairly significant change in how "comments" are
> considered in the grand scheme of things, and to some extent elevates
> them. I don't think this is a bad thing, and in the long run, I think it
> would streamline the code somewhat as all content would be handled the
> same way.

For 'comments' think 'content created by others'.

> If nothing else, it's a chance to explore some assumption about content.

Questioning assumptions has got to be a good thing :)

Scott Merrill

Mar 23, 2008, 8:51:52 PM
to habar...@googlegroups.com
> > If we were to store comments as a new post type, we'd necessarily need
> > to generate slugs and GUIDs for each comment (not necessarily a bad
> > thing). It also means that for each post we'd be storing an author
> > name, email, and URL, which doesn't make for a very normalized
> > database when we have a users table designed to hold that stuff.
>
> I haven't delved into content types terribly deeply yet (starting is
> what raised the question) but isn't the whole point of different
> content types that we can store different information about them?
> There is a base level of information stored about posts and comments,
> discussions, contact pages and polls each have their own specific
> extra data. If that's not easy, the whole idea of content types
> becomes much less interesting.

The point of defining content types is to allow plugins to "do stuff"
with those content types. The example we most often bandy about is
podcasting: a "podcast" content type would allow a plugin to
automatically generate enclosures for linked media. It would allow a
plugin to render a different set of input controls when composing the
item. Most notable would be the input for the media file: this is
something that is useful primarily for podcasts, and not necessarily
useful for posts and pages.

Another example of a content type would be the so-called "CommentBack"
feature discussed on our Summer of Code ideas page.

I'm still leery of the notion of storing comments in the same table as
posts. From a database normalization perspective, it seems wrong to
me to conflate comments and posts together. I admit, though, that I'm
not a DB expert, so I'm interested in the perspective of folks with
more expertise on the matter than me.

> > Renaming "post_id" to "parent_id" would permit us to have a
> > hierarchical (one-to-one) relationship between posts, such that a page
> > could be a child of another page. It would also permit us to have
> > threaded comments pretty easily: a feature quite a few people seem to
> > like.
>
> It's partly the relationship between post types that I wanted to
> explore. If we can have a simple and flexible way to define a content
> type that has certain relationships with other posts, I think
> interesting things will happen.

For the record, I believe this can be accomplished now through the use
of postinfo records. A plugin could create a many-to-many
relationship among posts, whereas the addition of a "parent_id" field
in the core post table would only support a one-to-one (or
many-to-one) relationship.
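
A minimal sketch of what a plugin-level many-to-many link through key/value records might look like, using Python's sqlite3 module. This assumes a simplified postinfo-like table (post_id, name, value) with no uniqueness constraint on (post_id, name); the real Habari postinfo table may well differ, and the 'related' key and both helper functions are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed, simplified shape of a key/value info table; not the real schema.
db.execute("CREATE TABLE postinfo (post_id INTEGER, name TEXT, value TEXT)")


def relate(post_id, other_id):
    """Record a symmetric 'related' link between two posts."""
    db.executemany(
        "INSERT INTO postinfo (post_id, name, value) VALUES (?, 'related', ?)",
        [(post_id, str(other_id)), (other_id, str(post_id))],
    )


def related(post_id):
    """Return the ids of all posts linked to post_id, in ascending order."""
    cur = db.execute(
        "SELECT value FROM postinfo WHERE post_id = ? AND name = 'related'",
        (post_id,),
    )
    return sorted(int(value) for (value,) in cur)


relate(1, 2)
relate(1, 3)
relate(2, 3)
print(related(1))
print(related(3))
```

Because each link is just another info row, a post can be related to any number of other posts, which a single "parent_id" column cannot express.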

> > I admit that my first reaction to this proposition is a negative one:
> > I think that the content stored for posts and comments is sufficiently
> > distinct that we'd be creating long-term headaches for ourselves by
> > trying to store them both in the same table, with the same set of
> > columns. It's an interesting idea, though, and I'm willing to explore
> > it further in discussion.
>
> Again, if it is a headache to store different information for
> different content types, now is when we should make it easier. Perhaps
> we do need a different table for each content type that's created,
> rather than just dumping everything in post info.

It's not currently a headache to store different information for
different content types, because all posts have the same set of basic
info. Additional data can be stored in the postinfo table as needed.
Comments have a fundamentally different (albeit similar) set of data,
which suggests to me that they should be stored separately.

I agree that challenging assumptions is a fine thing to do; and I'm
not at all opposed to continuing this discussion. What do folks see
as the long-term benefit(s) of storing comments in the post table?
How will it make things easier for Habari? How will it improve things
for our users?

Cheers,
Scott

Michael C. Harris

Mar 23, 2008, 9:16:26 PM
to habar...@googlegroups.com
On Sun, Mar 23, 2008 at 08:51:52PM -0400, Scott Merrill wrote:
>
> > > If we were to store comments as a new post type, we'd necessarily need
> > > to generate slugs and GUIDs for each comment (not necessarily a bad
> > > thing). It also means that for each post we'd be storing an author
> > > name, email, and URL, which doesn't make for a very normalized
> > > database when we have a users table designed to hold that stuff.
> >
> > I haven't delved into content types terribly deeply yet (starting is
> > what raised the question) but isn't the whole point of different
> > content types that we can store different information about them?
> > There is a base level of information stored about posts and comments,
> > discussions, contact pages and polls each have their own specific
> > extra data. If that's not easy, the whole idea of content types
> > becomes much less interesting.
>
> The point of defining content types is to allow plugins to "do stuff"
> with those content types. The example we most often bandy about is
> podcasting: a "podcast" content type would allow a plugin to
> automatically generate enclosures for linked media. It would allow a
> plugin to render a different set of input controls when composing the
> item. Most notably would be the input for the media file: this is
> something that is useful primarily for podcasts, and not necessarily
> useful for posts and pages.

By saying "a different set of input controls" you're basically
agreeing with me that content types "store different information." Why
else would you have different input controls? Surely the inputs don't
all just put stuff in content?

> Another example of a content type would be the so-called "CommentBack"
> feature discussed on our Summer of Code ideas page.
>
> I'm still leery of the notion of storing comments in the same table as
> posts. From a database normalization perspective, it seems wrong to
> me to conflate comments and posts together. I admit, though, that I'm
> not a DB expert, so I'm interested in the perspective of folks with
> more expertise on the matter than me.

They're just content. If we can work out how we can have "just
content" stored happily, I think we will give ourselves an enormous
amount of flexibility.

> > > Renaming "post_id" to "parent_id" would permit us to have a
> > > hierarchical (one-to-one) relationship between posts, such that a page
> > > could be a child of another page. It would also permit us to have
> > > threaded comments pretty easily: a feature quite a few people seem to
> > > like.
> >
> > It's partly the relationship between post types that I wanted to
> > explore. If we can have a simple and flexible way to define a content
> > type that has certain relationships with other posts, I think
> > interesting things will happen.
>
> For the record, I believe this can be accomplished now through the use
> of postinfo records. A plugin could create a many-to-many
> relationship among posts, whereas the addition of a "parent_id" field
> in the core post table would only support a one-to-one (or
> many-to-one) relationship.

For the record, it wasn't me who said anything about renaming post_id
to parent_id. I was suggesting a generic way of expressing
relationships between posts. I can see how this could be done in the
postinfo record. I'm also no DB expert, but I'm not sure that is the
best way to go about it. The more I think about it the more I think a
new content type should automatically create an info table that
holds information specific to that type.

> > > I admit that my first reaction to this proposition is a negative one:
> > > I think that the content stored for posts and comments is sufficiently
> > > distinct that we'd be creating long-term headaches for ourselves by
> > > trying to store them both in the same table, with the same set of
> > > columns. It's an interesting idea, though, and I'm willing to explore
> > > it further in discussion.
> >
> > Again, if it is a headache to store different information for
> > different content types, now is when we should make it easier. Perhaps
> > we do need a different table for each content type that's created,
> > rather than just dumping everything in post info.
>
> It's not currently a headache to store different information for
> different content types, because all posts have the same set of basic
> info. Additional data can be stored in the postinfo table as needed.
> Comments have a fundamentally different (albeit similar) set of data,
> which suggests to me that they should be stored separately.

Which suggests to me that we are not taking full advantage of content
types, and that there may be some architectural blocks to it.

> I agree that challenging assumptions is a fine thing to do; and I'm
> not at all opposed to continuing this discussion. What do folks see
> as the long-term benefit(s) of storing comments in the post table?
> How will it make things easier for Habari? How will it improve things
> for our users?

I think you're asking the wrong question (although maybe I started
it). What do folks see as the long-term benefit of comments and other
content generated by non-authors being considered posts?

I think I've said what I think, so I'll try to shut up for a while and
see if others have opinions.

Sean T Evans

Mar 23, 2008, 11:21:47 PM
to habar...@googlegroups.com
>> I agree that challenging assumptions is a fine thing to do; and I'm
>> not at all opposed to continuing this discussion. What do folks see
>> as the long-term benefit(s) of storing comments in the post table?
>> How will it make things easier for Habari? How will it improve things
>> for our users?
>
> I think you're asking the wrong question (although maybe I started
> it). What do folks see as the long-term benefit of comments and other
> content generated by non-authors being considered posts?
>
> I think I've said what I think, so I'll try to shut up for a while and
> see if others have opinions.
>

Perhaps the very ideas of "We don't care what kind of content you have,
we want to be able to process it." and "Content is content, regardless
of where it comes from." are good starting points for the pro side of
this discussion (which is where I'm finding myself at this point).

By using comments as a starting point, we have a basis for integrating
this mindset into Habari. Then when, down the road, someone says "I want
to use foo as my primary content type." we're ready to fit it into our
system. Whether this is podcasts, a photo blog, recipes or something we
haven't even thought of yet, we'll be able to say "This is where this
data goes, and this is where the other data goes."

Setting aside whatever we're currently saving for comments,
fundamentally how does a comment differ from a post? I think that for the
end user, there probably isn't much difference between storing comments
in their own table vs. the post table, but I feel that for designers and
developers, having a consistent way of dealing with all the content on a
site would be beneficial. I can envision people tagging comments, and
retrieving comments for something like a "recent comments" list in a
sidebar, or even searching comments, could be made simpler.

I think the most beneficial aspect of making this change may be that
we have a single point of reference to get at content, regardless
of type or source.
--
Sean T.



Michael C. Harris

Mar 24, 2008, 1:25:54 AM
to habar...@googlegroups.com
On Sun, Mar 23, 2008 at 11:21:47PM -0400, Sean T Evans wrote:
> >> I agree that challenging assumptions is a fine thing to do; and I'm
> >> not at all opposed to continuing this discussion. What do folks see
> >> as the long-term benefit(s) of storing comments in the post table?
> >> How will it make things easier for Habari? How will it improve things
> >> for our users?
> >
> > I think you're asking the wrong question (although maybe I started
> > it). What do folks see as the long-term benefit of comments and other
> > content generated by non-authors being considered posts?
> >
> > I think I've said what I think, so I'll try to shut up for a while and
> > see if others have opinions.

I should never say this. I can't shut up :)

> Perhaps the very idea of "We don't care what kind of content you have,
> we want to be able to process it." and "Content is content, regardless
> of where it comes from." are good starting points for the pro side of
> this discussion (which is where I'm finding myself at this point).
>
> By using comments as a starting point, we have a basis for integrating
> this mindset into Habari. Then, when down the road, someone says "I want
> to use foo as my primary content type." we're ready to fit it into our
> system. Whether this is podcasts, a photo blog, recipes or something we
> haven't even thought of yet, we'll be able to say "This is where this
> data goes, and this is where the other data goes."

Looking at it another way, if comments hadn't been built in as their
own type, it would make perfect sense for someone to write a plugin to
create a comments content type.

My relationships between posts argument would be completely addressed
by having a table created for new content types. Comments get a
CommentInfo table, discussions get a DiscussionInfo table and polls
get a PollInfo table, defining whatever data and relationships they
want.

> Setting aside whatever we're currently saving for comments,
> fundamentally how does a comment differ from a post? Think that for the
> end user, there probably isn't much difference between storing comments
> in their own table vs. the post table, but I feel that for designers and
> developers, having a consistent way of dealing with all the content on a
> site would be beneficial. I can envision people making use of tagging
> comments, and retrieving for the sake of things like a "recent comments"
> thing in a sidebar or even being able to search comments could be made
> more simple.

Or having moddable posts/comments.

> I think this may be the most beneficial aspect of making this change is
> that we have a single point of reference to get at content, regardless
> of type or source.

+1

Randy Walker

Mar 25, 2008, 9:23:22 PM
to habar...@googlegroups.com
This seems like a good idea to me. Comments are content... their only
difference is they're not created by the blog author... and that's not
even always true.

+1 from me for comments as content type.

~Randy

Graham Christensen

Mar 25, 2008, 9:33:07 PM
to habar...@googlegroups.com
My concern with making comments & posts the same is that filtering would
have to be on a content-type basis; I want to be able to put HTML/etc in
my posts, but not have my commenters do the same.

Graham Christensen

http://itrebal.com - Customized Web Hosting
Graham.Ch...@iamgraham.net

Michael C. Harris

Mar 25, 2008, 9:48:50 PM
to habar...@googlegroups.com
On Tue, Mar 25, 2008 at 09:33:07PM -0400, Graham Christensen wrote:
>
> My concern with making comments & posts the same, is filtering would
> have to be on a content-type basis; I want to be able to put HTML/etc in
> my posts, but not have my commenter's do the same.

I don't see any reason why we can't have content type based filtering.
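
A minimal Python sketch of what content-type-based filtering could look like. The FILTERS mapping, the type names used as keys and the filter_content() helper are hypothetical; this only shows the dispatch idea, not an existing Habari API.

```python
import html


def passthrough(text):
    """Trusted author content: keep markup as written."""
    return text


# Hypothetical mapping from content type name to its output filter.
FILTERS = {
    "entry": passthrough,
    "page": passthrough,
    "comment": html.escape,
}


def filter_content(content_type, text):
    """Apply the filter for this content type; escape unknown types."""
    return FILTERS.get(content_type, html.escape)(text)


print(filter_content("entry", "<em>hi</em>"))
print(filter_content("comment", "<em>hi</em>"))
```

Defaulting unknown types to the escaping filter is a deliberate choice in this sketch, so that a newly registered type is safe until someone decides otherwise.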

Arthus Erea

Mar 25, 2008, 10:01:06 PM
to habari-dev
I am definitely for comments as a content type because of the reasons
discussed before. Of course, we will have to add some more content-type
filtering and API features, but in the end I think they are worth
it. Think of the possibilities this investment and precedent can
create: podcasting, photos, etc.

Plus, a many-to-one relationship between posts would allow for some
very interesting things like subpages, (subentries!), podcasts as
parts of an entry, etc. Truly mind-boggling.

By regarding comments as another post type, we can save repetitive
code, set a standard for other content types, and lots more.


Andrew da Silva

Mar 25, 2008, 10:09:50 PM
to habari-dev
Could we make a single class, like templateengine.php, which would
store content, and give each content type its own class to override or
implement functions like get()?
(Does that make sense?)
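
Roughly the shape being asked about, as a minimal Python sketch. The Content, Entry and Comment class names, their fields and the behaviour of get() are all hypothetical and chosen only to show a shared base class with per-type overrides; Habari itself is PHP and its classes are not reproduced here.

```python
class Content:
    """Hypothetical base class holding fields common to all content."""

    def __init__(self, content_id, text):
        self.id = content_id
        self.text = text

    def get(self):
        """Return the fields every content type shares."""
        return {
            "id": self.id,
            "type": type(self).__name__.lower(),
            "content": self.text,
        }


class Entry(Content):
    """An author-created post with a title."""

    def __init__(self, content_id, text, title):
        super().__init__(content_id, text)
        self.title = title

    def get(self):
        data = super().get()
        data["title"] = self.title
        return data


class Comment(Content):
    """Visitor-created content attached to another piece of content."""

    def __init__(self, content_id, text, post_id, name):
        super().__init__(content_id, text)
        self.post_id = post_id
        self.name = name

    def get(self):
        data = super().get()
        data.update({"post_id": self.post_id, "name": self.name})
        return data


for item in (Entry(1, "Hello", "First post"), Comment(2, "Nice", 1, "Geoff")):
    print(item.get())
```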

Graham Christensen

Mar 25, 2008, 10:37:14 PM
to habar...@googlegroups.com
Many-to-many support would be more fun :D

Graham Christensen

Michael C. Harris

Mar 28, 2008, 4:31:09 AM
to habar...@googlegroups.com
On Tue, Mar 25, 2008 at 07:09:50PM -0700, Andrew da Silva wrote:
>
> Could we make a single class, like templateengine.php, which would
> store content, and have each content-type a class to override or
> implement functions like get()...
> (this that make sense?)

Before getting into implementation details, I'd like some agreement as
to whether or not it's a good idea to have a comments content type. It
seems to me that there is support for the idea, so perhaps it's worth
a branch. Is anyone willing to work on this? I will, but it will very
much be time permitting (and given my work on the other branches I'm
supposed to be working on, time isn't very permitting).

jaypipes

Mar 30, 2008, 2:05:53 PM
to habari-dev
Hi all! Long time no talkie... :)

This is an interesting discussion, and one that I think exposes a
number of ideas about how to handle somewhat similar types of
information. From a purely relational-theory perspective, I tend to
frown on schema tables which store information about two separate
entities, with the same field being used for different attributes
depending on what kind of "thing" the record represents. I'll just
post here a schema which has worked well for other applications I've
worked on which had to tackle this type of problem...feel free to
scrap this idea totally or use it as a jumping board... :)

Everyone on this thread is right that comments and posts share many of
the same attributes: a created/updated date, some text, etc. However,
comments and posts have
important differences that also shouldn't be ignored (like the
"anonymous commenter" point discussed above). These differences
shouldn't really be glossed over in a single table with the same field
representing different things depending on whether the record
represents a comment or a post (status column, for instance..).
Instead, I think the best of both worlds can be established by having
a centralized content table which stores all the common attributes of
a piece of content (be it a post, a comment, a page, whatever) and
then having separate tables for each content type which requires
additional attributes be stored in the DB. This will mean no
redundant information is stored, which is A Good Thing. Any
complexity it adds on a schema level can and should be hidden in the
object model itself.

As I see it from looking at the current schema in
system/schema/mysql/schema.sql, I'd go for something like this:

CREATE TABLE {$prefix}content (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
content_type SMALLINT UNSIGNED NOT NULL,
content LONGTEXT NOT NULL,
cached_content LONGTEXT NOT NULL,
pubdate DATETIME NOT NULL,
updated TIMESTAMP NOT NULL,
PRIMARY KEY (id)
);

CREATE TABLE {$prefix}posts (
content_id INT UNSIGNED NOT NULL, /* Foreign Key to content.id */
slug VARCHAR(255) NOT NULL,
title VARCHAR(255) NOT NULL,
guid VARCHAR(255) NOT NULL,
user_id SMALLINT UNSIGNED NOT NULL,
status SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY (content_id),
UNIQUE KEY slug (slug(80))
);

CREATE TABLE {$prefix}comments (
content_id INT UNSIGNED NOT NULL, /* Foreign Key to content.id */
post_id INT UNSIGNED NOT NULL,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) NOT NULL,
url VARCHAR(255) NULL,
ip INT SIGNED NOT NULL,
status SMALLINT UNSIGNED NOT NULL,
type SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY (content_id),
KEY post_id (post_id)
);

With this kind of schema, you gain flexibility to add new specific
content types as you need (with separate tables for specific
attributes of the new content type) but keep normalization rules and
don't store redundant or non-entity-specific information...
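
To see how the split plays out in practice, here is a minimal sketch that exercises a cut-down version of the schema above in an in-memory SQLite database via Python's sqlite3 module. The column set is trimmed, the types are SQLite's rather than MySQL's, the table prefix is dropped, and the content_type codes, the add_content() helper and all sample rows are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    content_type INTEGER NOT NULL,
    content TEXT NOT NULL,
    pubdate TEXT NOT NULL
);
CREATE TABLE posts (
    content_id INTEGER PRIMARY KEY REFERENCES content(id),
    slug TEXT NOT NULL UNIQUE,
    title TEXT NOT NULL
);
CREATE TABLE comments (
    content_id INTEGER PRIMARY KEY REFERENCES content(id),
    post_id INTEGER NOT NULL REFERENCES content(id),
    name TEXT NOT NULL
);
""")

ENTRY, COMMENT = 1, 2  # hypothetical content_type codes


def add_content(content_type, text, pubdate):
    """Insert the common fields and return the new content id."""
    cur = db.execute(
        "INSERT INTO content (content_type, content, pubdate) VALUES (?, ?, ?)",
        (content_type, text, pubdate),
    )
    return cur.lastrowid


post_id = add_content(ENTRY, "Example entry text", "2008-03-23")
db.execute("INSERT INTO posts VALUES (?, ?, ?)",
           (post_id, "example-entry", "Example entry"))
comment_id = add_content(COMMENT, "Example comment text", "2008-03-25")
db.execute("INSERT INTO comments VALUES (?, ?, ?)",
           (comment_id, post_id, "Example Commenter"))

# Common fields come from content; comment-specific ones from the join.
rows = db.execute("""
    SELECT c.content, c.pubdate, m.name
    FROM content c JOIN comments m ON m.content_id = c.id
    WHERE m.post_id = ?
""", (post_id,)).fetchall()
total = db.execute("SELECT COUNT(*) FROM content").fetchone()[0]
print(rows, total)
```

Queries that only care about "all content" (search, recent items) can stay on the content table, while type-specific attributes cost one join.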

Also, someone mentioned the ability to relate posts to other posts
(sub-posts) and also having threaded comments. Both of these can be
accomplished using the schema above by adding a new simple
relationship table that stores an adjacency list model:

CREATE TABLE {$prefix}content2content (
parent_content_id INT UNSIGNED NOT NULL, /* Foreign Key to content.id */
child_content_id INT UNSIGNED NOT NULL, /* Foreign Key to content.id */
PRIMARY KEY (parent_content_id, child_content_id),
KEY child_content_id (child_content_id)
);
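
A minimal sketch of walking more than one level of such a relationship table, using an in-memory SQLite database from Python. It relies on a recursive common table expression, which SQLite supports but which should not be assumed to exist in every database Habari targets; the ids are made up and the descendants() helper is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE content2content (
    parent_content_id INTEGER NOT NULL,
    child_content_id INTEGER NOT NULL,
    PRIMARY KEY (parent_content_id, child_content_id)
)
""")
# Made-up ids: 1 is an entry, 2 and 4 are comments on it, 3 replies to 2.
db.executemany("INSERT INTO content2content VALUES (?, ?)",
               [(1, 2), (2, 3), (1, 4)])


def descendants(root_id):
    """Return (id, depth) for everything below root_id."""
    return db.execute("""
        WITH RECURSIVE tree(id, depth) AS (
            SELECT child_content_id, 1 FROM content2content
            WHERE parent_content_id = ?
            UNION ALL
            SELECT e.child_content_id, tree.depth + 1
            FROM content2content e JOIN tree ON e.parent_content_id = tree.id
        )
        SELECT id, depth FROM tree ORDER BY id
    """, (root_id,)).fetchall()


print(descendants(1))
```

Without recursive queries, the same walk needs one query per level, which is the limitation the nested set suggestion is meant to address.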

To make the above a little more robust -- to be able to search more
than a single relationship level in a single SQL query -- you could
use a nested set model in the content2content table...
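
For anyone unfamiliar with nested sets, a minimal Python sketch of the idea: each node gets a left and right number from a depth-first walk, and a node's descendants are exactly the nodes whose interval lies inside its own. The ids, the number_tree() helper and the in-memory dictionaries are made up for illustration; in a database the lookup would be a single range comparison on two stored columns.

```python
def number_tree(children, root):
    """Assign nested-set (lft, rgt) bounds by depth-first traversal."""
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        lft = counter
        for child in children.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (lft, counter)

    visit(root)
    return bounds


# Made-up ids: entry 1 has comments 2 and 4; comment 3 replies to 2.
children = {1: [2, 4], 2: [3]}
bounds = number_tree(children, 1)


def descendants(node):
    """Everything strictly inside the node's interval is a descendant."""
    lft, rgt = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if lft < l and r < rgt)


print(bounds)
print(descendants(1))
```

The trade-off is that reads become one range query at any depth, while inserting a new comment means renumbering the bounds of every node to its right.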

Anyway, just my thoughts on this thread. Cheers!

Jay

ringmaster

Mar 30, 2008, 3:06:58 PM
to habari-dev
On Mar 30, 2:05 pm, jaypipes <jaypi...@gmail.com> wrote:
> As I see it from looking at the current schema in system/schema/mysql/
> schema.sql, I'd go for something like this:

If we must integrate comment and post data, I support the change so
far only in the manner you've suggested. This change is non-trivial.

We could take this opportunity to rename our base entities/tables so
that they make more sense. Ala post->content, entry->post. An
interstitial class between QueryRecord and Post/Comment, such as
Content, would also be useful. We could also take the opportunity in
the database churn to reform the tags tables into a more useful
taxonomy system. All of the changes would need to be paralleled on
SQLite too, so we must be sure that whatever methods fetch and store
the data work there. The upgrade plan to implement all this without
data loss seems daunting.

Incidentally, this is almost exactly Drupal's database schema in
regard to multiple content types, but even they keep comments in a
separate, non-node table. It might be edifying to ask someone who
knows the Drupal DB schema why comments aren't also nodes.

> To make the above a little more robust -- to be able to search more
> than a single relationship level in a single SQL query -- you could
> use a nested set model in the content2content table...

I'd like to hear more about how this would work.

Thanks for peeking back in, Jay.

Owen

jaypipes

Mar 30, 2008, 9:34:39 PM
to habari-dev
On Mar 30, 3:06 pm, ringmaster <epit...@gmail.com> wrote:
> On Mar 30, 2:05 pm, jaypipes <jaypi...@gmail.com> wrote:
>
> > As I see it from looking at the current schema in system/schema/mysql/
> > schema.sql, I'd go for something like this:
>
> If we must integrate comment and post data, I support the change so
> far only in the manner you've suggested. This change is non-trivial.

Much agreed. This is a non-trivial change and would require
significant testing in order to make sure it doesn't go terribly
wrong. :) Then again, as others have pointed out, it's still early
in the game, and making this kind of a change later on would likely be
more complicated. That said, it's a major undertaking that probably
would deserve a code branch...

> We could take this opportunity to rename our base entities/tables so
> that they make more sense. Ala post->content, entry->post.

I don't see an "entry" table in the current schema. Is this new?

Also, I wouldn't necessarily rename post to content, as the content
table in the proposed schema above stores all attributes common to all
types of content, not just posts...

> An interstitial class between QueryRecord and Post/Comment, such as
> Content, would also be useful.

True, though at first I can foresee the database relationship between
the content and posts table to be pretty much hidden within the Post
class, no?

> We could also take the opportunity in
> the database churn to reform the tags tables into a more useful
> taxonomy system. All of the changes would need to be paralleled on
> SQLite too, so we must be sure that whatever methods fetch and store
> the data work there. The upgrade plan to implement all this without
> data loss seems daunting.

Agreed. :)

> Incidentally, this is almost exactly Drupal's database schema in
> regard to multiple content types, but even they keep comments in a
> separate, non-node table. It might be edifying to ask someone who
> knows the Drupal DB schema why comments aren't also nodes.

Hmm, I wasn't aware this was the way Drupal organizes things. I can
understand, however, why they wouldn't want comments being nodes.
AFAIU, nodes in Drupal have some special characteristics, notably in
regards to the permissions system. For comments, I'm not sure those
special attributes applied. But, I may be totally wrong about this...

> > To make the above a little more robust -- to be able to search more
> > than a single relationship level in a single SQL query -- you could
> > use a nested set model in the content2content table...
>
> I'd like to hear more about how this would work.

OK, but I'll leave that for a separate thread. :) Don't want to make
this one too much longer!

> Thanks for peeking back in, Jay.

Hopefully, I'll be peeking in a lot more now that the whole Sun
acquisition is over with! :)

-jay

> Owen

ringmaster

Mar 30, 2008, 11:49:58 PM
to habari-dev
On Mar 30, 9:34 pm, jaypipes <jaypi...@gmail.com> wrote:
> > We could take this opportunity to rename our base entities/tables so
> > that they make more sense. Ala post->content, entry->post.
>
> I don't see an "entry" table in the current schema. Is this new?
>
> Also, I wouldn't necessarily rename post to content, as the content
> table in the proposed schema above stores all attributes common to all
> types of content, not just posts...

I was referring to changing how we currently refer to discrete
units of data. Currently chronological "entries" and static "pages"
are both "posts" -- their data is both stored in the posts table and
retrieved via the Posts class. Changing the naming so that "posts"
and "pages" are types of "content" would make things clearer,
especially if there were tables named "content", "posts", and "pages".

> > An interstitial class between QueryRecord and Post/Comment, such as
> > Content, would also be useful.
>
> True, though at first I can foresee the database relationship between
> the content and posts table to be pretty much hidden within the Post
> class, no?

I'm sure it could be done either way, all depending on who approaches
it.

> Hmm, I wasn't aware this was the way Drupal organizes things. I can
> understand, however, why they wouldn't want comments being nodes.
> AFAIU, nodes in Drupal have some special characteristics, notably in
> regards to the permissions system. For comments, I'm not sure those
> special attributes applied. But, I may be totally wrong about this...

Is Habari going to include something similar to Drupal's node-level
permissions? If so, then we need to more carefully consider making
comments a subclass of content equal to a post.

Owen

Michael C. Harris

Mar 31, 2008, 12:36:30 AM
to habar...@googlegroups.com
On Sun, Mar 30, 2008 at 08:49:58PM -0700, ringmaster wrote:
>
> Is Habari going to include something similar to Drupal's node-level
> permissions? If so, then we need to more carefully consider making
> comments a subclass of content equal to a post.

Sorry to be thick but can you briefly explain what you mean by node
level permissions?

ringmaster

Mar 31, 2008, 2:52:09 AM
to habari-dev
On Mar 31, 12:36 am, "Michael C. Harris" <michael.twof...@gmail.com>
wrote:
> On Sun, Mar 30, 2008 at 08:49:58PM -0700, ringmaster wrote:
>
> > Is Habari going to include something similar to Drupal's node-level
> > permissions? If so, then we need to more carefully consider making
> > comments a subclass of content equal to a post.
>
> Sorry to be thick but can you briefly explain what you mean by node
> level permissions?

In Drupal, each node (a Drupal "node" is like our "post", or more like
the described "content" in that it's mostly an id that other data-
storing tables rely on as a key) is also referenced by a table that
caches read and edit permissions by role for every node. This allows
Drupal to run a query that fetches only nodes that a user of the
current role is allowed to view/edit without having to validate each
node after it is returned in a query.

To check permissions in Habari, there would need to be an associated,
joined row of cached permission data or a check of permissions for
each row accessed in the posts/content table. Including comment spam,
the number of potentially needless permission checks could be very
high. Whether this is necessary depends on whether discrete
permissions are required on each comment-stored-as-post and how
permissions are applied to these objects. Storing comments in the
same place as posts may imply a similar permission structure, which is
why I was advising careful consideration. Of course, we may
alternatively not include this level of permission, which would make
the issue moot.
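
A minimal sketch of the cached-permission idea, using Python's sqlite3 module: a side table holds precomputed view/edit flags per content row and role, so a single join returns only what a role may see. The `content_access` table, its columns, and the role ids are hypothetical; this is loosely modeled on the description above and is not Drupal's or Habari's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Hypothetical cache: one row per (content, role) with precomputed flags.
CREATE TABLE content_access (
  content_id INTEGER NOT NULL,
  role_id INTEGER NOT NULL,
  can_view INTEGER NOT NULL,
  can_edit INTEGER NOT NULL,
  PRIMARY KEY (content_id, role_id)
);
INSERT INTO content VALUES
  (1, 'Public entry'), (2, 'Staff-only page'), (3, 'Draft');
INSERT INTO content_access VALUES
  (1, 10, 1, 0),  -- role 10 (say, visitors) may view entry 1 only
  (1, 20, 1, 1),  -- role 20 (say, staff) may view and edit entry 1
  (2, 20, 1, 1),
  (3, 20, 1, 0);
""")


def viewable_ids(conn, role_id):
    """Fetch only the content a role may view, with no per-row check afterwards."""
    rows = conn.execute(
        """
        SELECT c.id
        FROM content AS c
        JOIN content_access AS a ON a.content_id = c.id
        WHERE a.role_id = ? AND a.can_view = 1
        ORDER BY c.id
        """,
        (role_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The cost shows up elsewhere: the cache table grows with content times roles, and it has to be kept in sync whenever the permission rules change.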

Owen

Arthus Erea

Mar 31, 2008, 6:40:22 PM
to habari-dev
Actually, I disagree with having different tables for each content
type. I say the database schema for the "master" content table should
be simplified to look something like this:

CREATE TABLE {$prefix}content (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
content_type SMALLINT UNSIGNED NOT NULL,
content LONGTEXT NOT NULL,
cached_content LONGTEXT NOT NULL,
pubdate DATETIME NOT NULL,
updated TIMESTAMP NOT NULL,
PRIMARY KEY (id)
);

However, instead of having a separate table for every content type
(which could potentially create far too many tables for reasonable
consumption) we could define the fields applicable to specific content
types (thus making it trivial for a plugin to add a new field) in a
table like this:

CREATE TABLE {$prefix}contentfields (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varbinary(255) NOT NULL,
`content_type` SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
)

Finally, actual post properties (including status and user) would be
stored in a table like this:

CREATE TABLE {$prefix}contentinfo (
`post_id` int(10) unsigned NOT NULL, /* Foreign Key to content.id */
`name` varchar(255) NOT NULL,
`type` smallint(5) unsigned NOT NULL default '0',
`value` text collate utf8_unicode_ci,
PRIMARY KEY (`post_id`,`name`)
)

With some strong classes built around it, this database schema could
be infinitely extendable for all forms of content without bothering
around with a plethora of repetitive and similar tables.
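
To make the idea concrete, here is a minimal sketch of how a class might assemble one content record from the master table plus its name/value rows, using Python's sqlite3 module. The column names `post_id` and `name` follow the PRIMARY KEY declaration above; the unprefixed table names, the sample data, and the `load_content` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (
  id INTEGER PRIMARY KEY,
  content_type INTEGER NOT NULL,
  content TEXT NOT NULL
);
CREATE TABLE contentinfo (
  post_id INTEGER NOT NULL,
  name TEXT NOT NULL,
  value TEXT,
  PRIMARY KEY (post_id, name)
);
INSERT INTO content VALUES (7, 2, 'Nice post!');
INSERT INTO contentinfo VALUES
  (7, 'email', 'reader@example.com'),
  (7, 'status', 'approved'),
  (7, 'url', NULL);
""")


def load_content(conn, content_id):
    """Return the master row plus its name/value pairs as one dict, or None."""
    row = conn.execute(
        "SELECT id, content_type, content FROM content WHERE id = ?",
        (content_id,),
    ).fetchone()
    if row is None:
        return None
    record = {"id": row[0], "content_type": row[1], "content": row[2]}
    info = conn.execute(
        "SELECT name, value FROM contentinfo WHERE post_id = ?",
        (content_id,),
    )
    # Fold the per-field rows into a single mapping.
    record["info"] = {name: value for name, value in info}
    return record
```

Note that every value comes back as text, and filtering or sorting on a field (for example, all comments with status "approved") needs a join against contentinfo per field, which is the usual price of this kind of layout.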

> > We could take this opportunity to rename our base entities/tables so
> > that they make more sense. Ala post->content, entry->post.

> I don't see an "entry" table in the current schema. Is this new?

> Also, I wouldn't necessarily rename post to content, as the content
> table in the proposed schema above stores all attributes common to all
> types of content, not just posts...

Actually, I think it could stay as posts if that is preferred, since
in the current system posts refer to all types of content, not just
chronological entries and pages. Still, for generic purposes, content
does make more sense.

Michael C. Harris

Mar 31, 2008, 6:48:09 PM
to habar...@googlegroups.com

So it's possible to have different permissions for a role/group on
individual posts of the same type (group can view entry 1 but not
entry 2), not just each type of post (group can view entries)? I
certainly was not envisioning such granularity.

Of course, however we tackle permissions, there are potential
pitfalls; let's call them interesting challenges.

Arthus Erea

Mar 31, 2008, 7:08:03 PM
to habari-dev
I think so. Here's how I envision permissions, in order of application

a) sitewide default roles/rules
b) group rules for entire content types
c) group rules for single posts
d) individual rules for single content types
e) individual rules for single posts
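
One possible reading of "order of application" is that each layer, when it has an opinion, overrides the layers before it. The sketch below assumes exactly that; the layer names and the `resolve` helper are hypothetical, and a real ACL might well combine the layers differently (for example, letting any deny win).

```python
from typing import Optional

# Least specific first. Assumption: a later layer overrides an earlier one.
LAYERS = [
    "site_default",  # sitewide default roles/rules
    "group_type",    # group rules for entire content types
    "group_post",    # group rules for single posts
    "user_type",     # individual rules for single content types
    "user_post",     # individual rules for single posts
]


def resolve(rules: dict) -> Optional[bool]:
    """rules maps a layer name to True (allow) or False (deny).

    Layers missing from the dict express no opinion. Returns the decision
    of the most specific layer that has one, or None if no layer does.
    """
    decision = None
    for layer in LAYERS:
        if layer in rules:
            decision = rules[layer]
    return decision
```

For example, `resolve({"site_default": False, "group_post": True})` gives True: the group's rule for that one post overrides the sitewide default.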

Actually, I was thinking of crazy awesomeness we could do with the
ACL: imagine being able to construct rules like iTunes smart folders.
(Group => developers, tag => code, type => page, view, edit, publish)

On Mar 31, 6:48 pm, "Michael C. Harris" <michael.twof...@gmail.com>

Michael C. Harris

Mar 31, 2008, 7:37:04 PM
to habar...@googlegroups.com
Please take this in the friendly, community minded way that it's
intended. Does anyone actually find it easier to read if replies are
at the top?

http://daringfireball.net/2007/07/on_top

Michael C. Harris

Mar 31, 2008, 7:41:15 PM
to habar...@googlegroups.com
On Mon, Mar 31, 2008 at 04:08:03PM -0700, Arthus Erea wrote:
>
> I think so.

Sorry, what was that in answer to?

> Here's how I envision permissions, in order of application
>
> a) sitewide default roles/rules
> b) group rules for entire content types

Agreed.

> c) group rules for single posts
> d) individual rules for single content types
> e) individual rules for single posts

I can't actually see the need in Habari for these. Who would want to
set permissions on a single post or for a single user? It also
increases the required complexity enormously. I'm happy to hear where
these would be useful.

> Actually, I was thinking of crazy awesomeness we could do with the
> ACL: imagine being able to construct rules like iTunes smart folders.
> (Group => developers, tag => code, type => page, view, edit, publish)

That's worth thinking about.

Arthus Erea

Mar 31, 2008, 8:33:58 PM
to habari-dev


On Mar 31, 7:41 pm, "Michael C. Harris" <michael.twof...@gmail.com>
wrote:
> On Mon, Mar 31, 2008 at 04:08:03PM -0700, Arthus Erea wrote:
>
> > I think so.
>
> Sorry, what was that in answer to?

Sorry, that was in response to your question about granularity
of post permissions:
> So it's possible to have different permissions for a role/group on
> individual posts of the same type (group can view entry 1 but not
> entry 2), not just each type of post (group can view entries)? I
> certainly was not envisioning such granularity.

---

> > c) group rules for single posts
> > d) individual rules for single content types
> > e) individual rules for single posts
>
> I can't actually see the need in Habari for these. Who would want to
> set permissions on a single post or for a single user? It also
> increases the required complexity enormously. I'm happy to hear where
> these would be useful.

I disagree. There are many cases where I want someone to only be able
to edit a single post, but nothing else. For instance, if I run a
group blog with bio pages, I would want to give authors the ability
to edit their own specific bio page, but not anybody else's. Also, in
professional environments, this would be quite useful. (You wouldn't
want that guy from HR to be able to edit the home page text, but he
should be able to change his contact info listing)

I'll agree we don't need rule sets for specific individuals, since a
new group can always be created with only them in it.

> > Actually, I was thinking of crazy awesomeness we could do with the
> > ACL: imagine being able to construct rules like iTunes smart folders.
> > (Group => developers, tag => code, type => page, view, edit, publish)
>
> That's worth thinking about.

I might do a mockup of this, if only for some interesting conversation.

Michael C. Harris

Mar 31, 2008, 9:31:31 PM
to habar...@googlegroups.com
On Mon, Mar 31, 2008 at 05:33:58PM -0700, Arthus Erea wrote:
> On Mar 31, 7:41 pm, "Michael C. Harris" <michael.twof...@gmail.com> wrote:
> > On Mon, Mar 31, 2008 at 04:08:03PM -0700, Arthus Erea wrote:
>
> > > c) group rules for single posts
> > > d) individual rules for single content types
> > > e) individual rules for single posts
> >
> > I can't actually see the need in Habari for these. Who would want to
> > set permissions on a single post or for a single user? It also
> > increases the required complexity enormously. I'm happy to hear where
> > these would be useful.
>
> I disagree. There are many cases where I want someone to only be able
> to edit a single post, but nothing else. For instance, if I run a
> group blog with bio pages, I would want to give authors the ability
> to edit their own specific bio page, but not anybody else's.

We add a bio content type in a plugin and rely on group permissions to
control their creation and editing. An admin can create and edit any
number of bio pages (and assign ownership to other users) but a
registered user can create a maximum of one and can edit bio posts for
which they are the author.
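
The rule described above can be sketched as two small checks. This is only an illustration in Python: the `User` and `Bio` classes and both function names are hypothetical and do not correspond to anything in Habari's ACL code.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    is_admin: bool = False


@dataclass
class Bio:
    id: int
    author_id: int


def can_create_bio(user, existing_bios):
    """Admins may create any number; other users at most one of their own."""
    if user.is_admin:
        return True
    return not any(bio.author_id == user.id for bio in existing_bios)


def can_edit_bio(user, bio):
    """Admins may edit any bio; other users only bios they are the author of."""
    return user.is_admin or bio.author_id == user.id
```

Because ownership is just the author field, an admin can hand a bio to another user by reassigning its author, with no per-post permission record involved.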

> Also, in professional environments, this would be quite useful. (You
> wouldn't want that guy from HR to be able to edit the home page
> text, but he should be able to change his contact info listing)

He's in a group that allows him to edit the contact info. The home
page is a theme template (to edit it he'd need access to the theme,
in which case he can do what he wants anyway) with much of its content
dynamically generated from other posts, which he may or may not have
permission to compose, depending on his group membership.

I'm still not entirely convinced, but feel free to keep trying.

> > > Actually, I was thinking of crazy awesomeness we could do with the
> > > ACL: imagine being able to construct rules like iTunes smart folders.
> > > (Group => developers, tag => code, type => page, view, edit, publish)
> >
> > That's worth thinking about.
>
> I might do a mockup of this, if only for some interesting conversation.

Please do, that would be great.

Arthus Erea

Mar 31, 2008, 9:51:06 PM
to habari-dev


On Mar 31, 9:31 pm, "Michael C. Harris" <michael.twof...@gmail.com>
wrote:
> We add a bio content type in a plugin and rely on group permissions to
> control their creation and editing. An admin can create and edit any
> number of bio pages (and assign ownership to other users) but a
> registered user can create a maximum of one and can edit bio posts for
> which they are the author.

I should have to find and install a plugin just to keep someone from
editing a page or allow them to only write their own? I think the
ability to restrict editing of a post should definitely be core
functionality, simply because there are a plethora of situations where
this would be useful. It can't possibly be expected that there will be
a specific plugin built for each use case, while having this as a core
feature would allow for any use case. I really don't think content
types should be used solely for permissions. If it's a page, it is a
page, regardless of who can edit it.

In all seriousness, when I think of excellent ACL I expect that the
permissions would be granular down to the post level... I really
shouldn't have to add a whole new plugin/content type just to protect
a specific page from being edited. If we do an ACL, we should do it
right.

> He's in a group that allows him to edit the contact info. The home
> page is a theme template (to edit it he'd need access to the theme,
> in which case he can do what he wants anyway) with much of its content
> dynamically generated from other posts, which he may or may not have
> permission to compose, depending on his group membership.
>
> I'm still not entirely convinced, but feel free to keep trying.
>
On many corporate sites, there is some home text. Wouldn't it make
sense for that text to be stored in a specific "home" page? In the
grand scheme of things, I still think it should be painlessly easy to
give a group access to only a specific page or deny access for a
specific page. (This would also make it effortlessly easy to "protect"
a page/entry)

> > I might do a mockup of this, if only for some interesting conversation.
>
> Please do, that would be great.

Okay, will do. But keep in mind that implementing this would
*definitely* require post-level rules.

ringmaster

Mar 31, 2008, 10:43:22 PM
to habari-dev
On Mar 31, 6:48 pm, "Michael C. Harris" <michael.twof...@gmail.com>
wrote:
>
> So it's possible to have different permissions for a role/group on
> individual posts of the same type (group can view entry 1 but not
> entry 2), not just each type of post (group can view entries)? I
> certainly was not envisioning such granularity.
>

There are two occasions I can think of where this granularity is not
as granular as it seems.

First, consider the occasion where one role has the ability to edit
posts created by another role or roles. Combine that with having the
ability to edit based on content type and the cached permissions start
to become attractive.

Second, remember that just because something is a post does not mean
that it must be something editable by a registered user. There may be
the occasion where visitor-created content employs permission
restrictions. For example, a Habari content type could be used to
store user-submitted issues into a tracking system. The permission to
view or edit that content could depend on the project the visitor
associated the content to on submission.

Anyway, there are some advantages to this approach, just as there are
some disadvantages, which I've not mentioned here at all, mainly that
when you change permission rules, the entire permissions cache needs
to be rebuilt for it to work effectively.

Owen

Arthus Erea

Mar 31, 2008, 11:52:28 PM
to habari-dev
Alright, here's a preliminary mockup of how I envision a rule-based
and highly granular permissions system might work.

The left panel has rule sets (copy recommendations welcome).

The top right has the rule properties. Every rule either filters:
1) what it is applied to (tag => downtime)
2) who it is applied to (group => IT)
3) or what they can do (allow => edit)

The bottom right is a live preview of what posts the people the rule
applies to would be able to edit.

http://www.flickr.com/photos/arthuserea/2379429992/
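
The three kinds of filter, plus the live preview, could be sketched roughly as follows. This is a toy model in Python, not a proposal for the real ACL classes: the `Rule` and `Post` shapes, the "empty filter matches anything" convention, and the sample data are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    id: int
    type: str
    tags: set
    author_group: str


@dataclass
class Rule:
    groups: set                                        # who it is applied to
    allow: set                                         # what they can do
    tags: set = field(default_factory=set)             # what it is applied to
    types: set = field(default_factory=set)            # (empty = no restriction)
    author_groups: set = field(default_factory=set)

    def applies_to_user(self, user_groups):
        return bool(self.groups & set(user_groups))

    def matches_post(self, post):
        if self.tags and not (self.tags & post.tags):
            return False
        if self.types and post.type not in self.types:
            return False
        if self.author_groups and post.author_group not in self.author_groups:
            return False
        return True


def preview(rule, user_groups, action, posts):
    """Ids of the posts this rule lets the given user perform `action` on."""
    if not rule.applies_to_user(user_groups) or action not in rule.allow:
        return []
    return [post.id for post in posts if rule.matches_post(post)]


# Sample rule built from the filters named above: tag => downtime,
# group => IT, allow => edit.
RULE = Rule(groups={"IT"}, allow={"edit"}, tags={"downtime"})
POSTS = [
    Post(1, "entry", {"downtime"}, "IT"),
    Post(2, "entry", {"news"}, "IT"),
    Post(3, "page", {"downtime", "status"}, "HR"),
]
```

With this data, `preview(RULE, ["IT"], "edit", POSTS)` lists posts 1 and 3, which is the kind of result the bottom-right panel would show.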

Scott Merrill

Apr 3, 2008, 8:27:28 AM
to habar...@googlegroups.com
This thread has kind of veered off from the original discussion about
comments-as-content-type. Maybe we should move the permissions aspect
of this thread into a new discussion.

Help me understand this picture. The rule applies to the group
"Information Technology", and the author must be in the "Information
Technology" group: isn't that redundant?

Arthus Erea

Apr 3, 2008, 8:57:27 AM
to habari-dev
The rule name is simply for organization purposes; it doesn't
actually establish any limits. The point of rules is that you can
limit them how you want to. (For instance, you could have all users
who have "xyz" in their username or in their group name) You could
also establish joint rules (like in this example you must be in both
groups to match the rule). In the end, I think this system would be
far more granular and interesting, especially if we added even more
rule choices.

On Apr 3, 8:27 am, "Scott Merrill" <ski...@skippy.net> wrote:
> This thread has kind of veered off from the original discussion about
> comments-as-content-type. Maybe we should move the permissions aspect
> of this thread into a new discussion.
>

Arthus Erea

Apr 3, 2008, 9:01:46 AM
to habari-dev
Also: the "group" filters filter who this rule applies to. The
"author" filters filter which posts it applies to. That is, users
from these groups can "edit, publish, and view" posts which have an
author from the group "information technology."

Scott Merrill

Apr 3, 2008, 9:18:21 AM
to habar...@googlegroups.com
On Thu, Apr 3, 2008 at 8:57 AM, Arthus Erea <arthu...@gmail.com> wrote:
> The rule name is simply for organization purposes.. it doesn't
> actually establish any limits. The point of rules is that you can
> limit them how you want to. (For instance, you could have all users
> who have "xyz" in their username or in their group name) You could
> also establish joint rules (like in this example you must be in both
> groups to match the rule). In the end, I think this system would be
> far more granular and interesting, especially if we added even more
> rule choices.

Have you looked at the existing Habari ACL code?
http://wiki.habariproject.org/en/Classes/ACL
http://wiki.habariproject.org/en/Permissions

I *think* our extant code is functionally similar to what you're
proposing, but it's still early in the morning and my brain is fuzzy.
Can you please review those two links and then let me know what you
think?

The notable difference I see between your rules and our ACLs is that
your rules support operations based on external criteria (username
contains XYZ). I imagine we could do something similar with ACLs, but
it might require some hoop-jumping. Finally, I'm not entirely sure
that such rules would be of sufficient use to most users to justify
the possible complexity they introduce.

Cheers,
Scott
