Scheduled posts ( http://trac.habariproject.org/habari/ticket/36 ) are
a feature that has garnered a fair amount of interest, and as Habari
becomes more popular the demand for it will grow.
There has been one patch submitted to enable this feature, but it is
fairly poor (I know; I put it in). It has been mentioned in IRC that
using a crontab may be the way to go.
My idea is to add a new post status to the poststatus table called
'future', and a new cronjob to the cron table to periodically update
the posts table, changing all posts whose status is 'future' to a
status of 'published' if the publication date of the post is
earlier than the time at which the cronjob runs. My first impression is
that the cronjob should run once a minute, since people will expect
their post to be published when they tell it to be published, but I
don't know what kind of load this would entail on a busy site. A less
frequent schedule may be more suitable.
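A minimal sketch of the periodic job described above, written in Python purely for illustration (Habari itself is PHP, and none of these names come from its code base). The post dictionaries, the `slug` and `pubdate` keys, and the function name `publish_due_posts` are all assumptions made for the example.

```python
from datetime import datetime

def publish_due_posts(posts, now):
    """Flip every 'future' post whose publication date has passed to 'published'.

    `posts` is a list of dicts standing in for rows of the posts table
    (hypothetical structure). Returns the slugs that were published.
    """
    published = []
    for post in posts:
        if post["status"] == "future" and post["pubdate"] <= now:
            post["status"] = "published"
            published.append(post["slug"])
    return published

posts = [
    {"slug": "early", "status": "future", "pubdate": datetime(2008, 4, 18, 9, 0)},
    {"slug": "later", "status": "future", "pubdate": datetime(2008, 4, 19, 9, 0)},
    {"slug": "draft", "status": "draft", "pubdate": datetime(2008, 4, 17, 9, 0)},
]

# Simulate one run of the job at noon on April 18.
due = publish_due_posts(posts, now=datetime(2008, 4, 18, 12, 0))
```

How often this runs only affects how late a post can appear: with an interval of N minutes, a post goes live at most N minutes after its publication date.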
When a post is saved, we would check to see if its status was set to
'published' by the user. If so, check the publication date. If it has
been set to a date or time in the future, change the post status to
'future' internally before saving the post. Using these mechanisms,
scheduled posts are enabled without changing the current interface
that users see.
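The save-time check can be sketched in a few lines. This is an illustrative Python sketch, not Habari code; `status_for_save` is a hypothetical helper name, and the status strings follow the names used in this message.

```python
from datetime import datetime

def status_for_save(requested_status, pubdate, now):
    """Return the status to store when a post is saved.

    A post the user marked 'published' but dated in the future is stored
    as 'future' instead; every other combination is left untouched.
    """
    if requested_status == "published" and pubdate > now:
        return "future"
    return requested_status

now = datetime(2008, 4, 17, 15, 28)
stored_future = status_for_save("published", datetime(2008, 4, 20, 8, 0), now)
stored_past = status_for_save("published", datetime(2008, 4, 16, 8, 0), now)
stored_draft = status_for_save("draft", datetime(2008, 4, 20, 8, 0), now)
```

Because the rewrite happens inside the save path, the user still just picks "published" and a date.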
The cronjob and new status can be added to the database when Habari is
installed. For people who are currently using Habari, a plugin can be
developed to add the cronjob and new status to their database.
Alternatively, the database version can be bumped and an upgrade
triggered to add the new data, though I'm not sure of the suitability
of that since this entails no change to the database format, only to
the data in it.
On Thu, Apr 17, 2008 at 03:28:07PM -0700, rick c wrote:
> Scheduled posts ( http://trac.habariproject.org/habari/ticket/36 ) are a feature that has garnered a fair amount of interest, and as Habari becomes more popular the demand for it will grow.
Agreed.
> There has been one patch submitted to enable this feature, but it is fairly poor (I know; I put it in). It has been mentioned in IRC that using a crontab may be the way to go.
> My idea is to add a new post status to the poststatus table called 'future', and a new cronjob to the cron table to periodically update the posts table, changing all posts whose status is 'future' to a status of 'published' if the publication date of the post is earlier than the time at which the cronjob runs. My first impression is that the cronjob should run once a minute, since people will expect their post to be published when they tell it to be published, but I don't know what kind of load this would entail on a busy site. A less frequent schedule may be more suitable.
Your idea sounds reasonable to me. I think running once an hour would be sufficient though, as long as authors know that's the case.
> When a post is saved, we would check to see if its status was set to 'published' by the user. If so, check the publication date. If it has been set to a date or time in the future, change the post status to 'future' internally before saving the post. Using these mechanisms, scheduled posts are enabled without changing the current interface that users see.
There has to be a change, because there'd have to be a date picker for the publish date to be in the future. That could be hidden in the splitter though.
> The cronjob and new status can be added to the database when Habari is installed. For people who are currently using Habari, a plugin can be developed to add the cronjob and new status to their database. Alternatively, the database version can be bumped and an upgrade triggered to add the new data, though I'm not sure of the suitability of that since this entails no change to the database format, only to the data in it.
All of this can and should be accomplished through a plugin. I don't think it should be core.
Since I am one of the people who suggested the use of cron, I like it! My only question is the same one you raise here, namely how often should the job run.
I think the interval between job runs really depends on how accurate we want scheduled posting to be. I am guessing an hour would be more than sufficient for this, maybe 2 hours?
Chris J. Davis wrote:
> Since I am one of the people who suggested the use of cron, I like it! My only question is the same one you raise here, namely how often should the job run.
Our cron is pretty efficient, I don't see the problem with running the publisher cron every 1-5 minutes.
I'm unclear why we'd want to do this with cron. If the publish date hasn't arrived yet, don't display the post. If it has, then do display it.
-- "Reality is that which, when you stop believing in it, doesn't go away." (Philip K. Dick)
The idea is to have a crontab auto-publish the post when it is due. That is, saving it as a draft and having Habari publish it at a set time, rather than having the post status as "published" while it is actually not.
> The idea is to have a crontab auto-publish the post when it is due. That is, saving it as a draft and having Habari publish it at a set time, rather than having the post status as "published" while it is actually not.
No. The status is "scheduled". Then, the first time a pageview happens after the publish date, it gets updated to "published". I fail to see why we'd need a cron job for that. You merely look for stuff that is scheduled, and for which the publish date is past. I don't see any value in making it more complex than that.
-- Patriotism is often an arbitrary veneration of real estate above principles. George Jean Nathan
On Fri, Apr 18, 2008 at 8:28 AM, Rich Bowen <rbo...@rcbowen.com> wrote:
> On Apr 18, 2008, at 08:26, Ali B. wrote:
>> The idea is to have a crontab auto-publish the post when it is due. That is, saving it as a draft and having Habari publish it at a set time, rather than having the post status as "published" while it is actually not.
> No. The status is "scheduled". Then, the first time a pageview happens after the publish date, it gets updated to "published". I fail to see why we'd need a cron job for that. You merely look for stuff that is scheduled, and for which the publish date is past. I don't see any value in making it more complex than that.
WordPress used the model you described for quite some time, and as I recall it introduced some pretty hefty performance penalties on big sites as you include the MySQL NOW function in each query. That means that the server needs to compute a new value for every query, which means that queries (and their results) cannot be effectively cached.
I think that the use of cron allows us to better leverage our API. In your model, Rich, a post's status changes from "scheduled" to "published", but it's not entirely clear whether the entire stack of actions triggered on such a status change would execute. If we use cron, we can unambiguously execute Post::publish() on the item in question, trigger ping notifications, pingbacks, and everything else: all before the theme collects the list of published posts, to ensure that the just-published post is now included in the current request.
> WordPress used the model you described for quite some time, and as I recall it introduced some pretty hefty performance penalties on big sites as you include the MySQL NOW function in each query. That means that the server needs to compute a new value for every query, which means that queries (and their results) cannot be effectively cached.
So don't use the mysql now function. Use php to generate a timestamp, give it a granularity of 30 minutes, and use that. That can be cached.
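The coarse-timestamp idea can be illustrated as follows. This is a Python sketch rather than the PHP being discussed, and the table name, column names, and both function names are assumptions for the example. The timestamp is rounded down to a 30-minute boundary, so every request within the same half hour produces an identical query and parameter, which a query cache can reuse.

```python
def bucketed_timestamp(unix_time, granularity_seconds=30 * 60):
    """Round a Unix timestamp down to the start of its bucket."""
    return unix_time - (unix_time % granularity_seconds)

def due_scheduled_posts_query(unix_time):
    """Build a (sql, params) pair that is identical for a whole bucket.

    The table and column names are hypothetical.
    """
    cutoff = bucketed_timestamp(int(unix_time))
    sql = "SELECT * FROM posts WHERE status = 'scheduled' AND pubdate <= ?"
    return sql, (cutoff,)

# Two requests ten minutes apart, both inside the same half-hour bucket.
query_a = due_scheduled_posts_query(1208522887)
query_b = due_scheduled_posts_query(1208522887 + 600)
```

The cost of the coarser granularity is that a post can go live up to 30 minutes after its publication date.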
> I think that the use of cron allows us to better leverage our API. In your model, Rich, a post's status changes from "scheduled" to "published", but it's not entirely clear whether the entire stack of actions triggered on such a status change would execute. If we use cron, we can unambiguously execute Post::publish() on the item in question, trigger ping notifications, pingbacks, and everything else: all before the theme collects the list of published posts, to ensure that the just-published post is now included in the current request.
Right. So use Post::publish() in my scenario too, obviously.
I dislike solutions that seem to introduce layers of complexity to solve simple problems.
-- The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents. (Nathaniel Borenstein)
> On Apr 18, 2008, at 08:26, Ali B. wrote:
>> The idea is to have a crontab auto-publish the post when it is due. That is, saving it as a draft and having Habari publish it at a set time, rather than having the post status as "published" while it is actually not.
> No. The status is "scheduled". Then, the first time a pageview happens after the publish date, it gets updated to "published". I fail to see why we'd need a cron job for that. You merely look for stuff that is scheduled, and for which the publish date is past. I don't see any value in making it more complex than that.
That is pretty much exactly how our cron works. On every page view, if the cron job hasn't run in this period, it runs and changes all matching future posts to published.
On Apr 18, 6:59 am, "Michael C. Harris" <michael.twof...@gmail.com>
wrote:
> On Thu, Apr 17, 2008 at 03:28:07PM -0700, rick c wrote:
> There has to be a change, because there'd have to be a date picker for
> the publish date to be in the future. That could be hidden in the
> splitter though.
On the settings tab of the publish page's splitter there is already an
editable control containing the post's time and date of publication.
This can be used to set the pubdate of the post without introducing
new controls.
> All of this can and should be accomplished through a plugin. I don't
> think it should be core.
Either way would be fine. A case could be made either way. If it is
made into a plugin, though, I think it is an important enough feature
to be a core plugin.
Matthias Bauer wrote:
> Rich Bowen wrote:
>> On Apr 18, 2008, at 08:26, Ali B. wrote:
>>> The idea is to have a crontab auto-publish the post when it is due. That is, saving it as a draft and having Habari publish it at a set time, rather than having the post status as "published" while it is actually not.
>> No. The status is "scheduled". Then, the first time a pageview happens after the publish date, it gets updated to "published". I fail to see why we'd need a cron job for that. You merely look for stuff that is scheduled, and for which the publish date is past. I don't see any value in making it more complex than that.
> That is pretty much exactly how our cron works. On every page view, if the cron job hasn't run in this period, it runs and changes all matching future posts to published.
Here's what I think should happen, although I know that it currently does not.
The primary reason you would set things up the following way is to avoid executing non-cacheable queries - queries in which a date changes in the query on every execution - on every page load. Not doing things as described below (or by taking some similar steps to avoid non-caching queries and synchronous cron execution) will cause significant, visible performance degradation.
When you publish an entry that is scheduled for the future, its status needs to be set to "scheduled". This simple switch allows for entries in the future to remain out of the list of posts that are published to the home page (unless you're running some crazy Post::get() call that gets posts of any status, which would not be standard). I imagine that the setting of the "scheduled" status could be transparent to the user - they'd set a future date and status of "published", and upon save Habari would transparently change the status to "scheduled" without user interaction.
When a scheduled post is saved, a query is performed to find the next scheduled post. It could be the one just entered, but it may be some other post. The scheduled date of the post is noted.
A specifically named cronjob is then created or updated and set to execute at the time of the next scheduled post. CronJobs can be set to execute at specific times; if no visitor triggers the job at that moment, it executes on the next visit.
The job runs a function specific to updating scheduled posts whose times have come to "published" status. At this time, if there are scheduled posts remaining, the CronJob is updated to execute at the time of the next scheduled post.
The advantage of this process is that there is only ever a single CronJob running for all scheduled posts. This CronJob runs at exactly the moment that the scheduled post is to be published (eliminating all of the "how frequently do we do this?" questions).
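The self-rescheduling job described in the last few paragraphs might look roughly like this. It is a Python sketch under stated assumptions, not Habari's CronJob API: posts are plain dicts, times are integer timestamps, and `next_run_time` and `run_publish_job` are hypothetical names.

```python
def next_run_time(posts):
    """Return the earliest pubdate among scheduled posts, or None if there are none."""
    pending = [post["pubdate"] for post in posts if post["status"] == "scheduled"]
    return min(pending) if pending else None

def run_publish_job(posts, now):
    """Publish every scheduled post that is due, then report when to run next.

    The caller would store the returned time on the single named job;
    None means no scheduled posts remain and the job can be dropped.
    """
    for post in posts:
        if post["status"] == "scheduled" and post["pubdate"] <= now:
            post["status"] = "published"
    return next_run_time(posts)

posts = [
    {"status": "scheduled", "pubdate": 100},
    {"status": "scheduled", "pubdate": 300},
    {"status": "published", "pubdate": 50},
]

# The job fires at time 150: the first post goes live and the job is
# rescheduled for the remaining scheduled post.
next_run = run_publish_job(posts, now=150)
```

Saving a scheduled post would call `next_run_time` in the same way, so the one job always points at the nearest pending publication date.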
To clarify for the benefit of the bulk of the thread topic, the Habari CronTab and CronJob classes do not execute like a unix cron. You can set any job to run once at a specific time, or multiple times at an interval, or even many times with an end time. It is much more flexible than unix cron. The only drawback with it is that it runs only when the site is visited, so tasks that should be automated even when there are no visitors will not run on their own. However, this can be accomplished by setting a unix cron that fetches the Habari cron URL periodically.
Most significant of all, this process allows us to run a single, cacheable query to return all CronJobs and loop through them, executing only those whose execution times have passed. The CronTab currently does not do this, and is causing an inefficiency. What should happen is the CronTab should query for ALL CronJobs with a single "SELECT * FROM {crontab}" and then loop through them in memory to detect which ones need to execute, updating as appropriate. This will yield significant performance gains.
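To make the difference concrete, here is a small Python sketch using an in-memory SQLite table. The table layout, column names, and `due_jobs` function are assumptions for illustration and do not reflect Habari's actual crontab schema. The SQL text contains no changing timestamp, so it is the same on every request; the time comparison happens in application code instead.

```python
import sqlite3

def due_jobs(conn, now):
    """Fetch every job with one constant query, then filter in memory."""
    # No NOW() or per-request timestamp appears in the SQL, so the query
    # text is identical on every page load.
    rows = conn.execute("SELECT name, next_run FROM crontab").fetchall()
    return [name for name, next_run in rows if next_run <= now]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crontab (name TEXT, next_run INTEGER)")
conn.executemany(
    "INSERT INTO crontab VALUES (?, ?)",
    [("publish_scheduled", 100), ("prune_logs", 500)],
)

# At time 200 only the first job is due.
ready = due_jobs(conn, now=200)
```

This works well while the number of cron jobs stays small, since every job is loaded on each check.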
One downside to this process is that because cron should execute asynchronously, the request that triggers the cron will likely not benefit from its execution. A user who triggers the cron on a publish should not need to wait for the process behind the cron to finish before their request is displayed, so we execute crons asynchronously.
What should happen is that when a visitor requests the page, Habari detects whether cron jobs should execute and, if so, issues itself another HTTP request to the cron URL via RemoteRequest. This allows execution of the visitor's request to continue without delay. The separate request to the cron URL does the actual work.
The reason for this is that publishing a post can be a time-consuming process since Habari may need to contact many external servers for pings, etc. As a result, since the cron execution actually happens in a separate/subsequent request, the user who triggered the cron won't see the result of the cron unless they reload. I think that this is a fair trade-off for smooth site running, and anyone uncomfortable with that delay should set up a unix cron to periodically poll Habari's cron URL, which will most likely obviate the issue.
If this is all still unclear, maybe I can work up some flowcharts later on that depict the process more clearly.
> Most significant of all, this process allows us to run a single, cacheable query to return all CronJobs and loop through them, executing only those whose execution times have passed. The CronTab currently does not do this, and is causing an inefficiency. What should happen is the CronTab should query for ALL CronJobs with a single "SELECT * FROM {crontab}" and then loop through them in memory to detect which ones need to execute, updating as appropriate. This will yield significant performance gains.
Whether now() is less, or more, efficient than fetching all the records is entirely a function of the number of records, but, yes, I think in general this approach makes more sense. Thanks for the clarification.
-- One of the advantages of being disorderly is that one is constantly making exciting discoveries. A. A. Milne
On Apr 18, 8:28 am, Rich Bowen <rbo...@rcbowen.com> wrote:
> On Apr 18, 2008, at 08:26, Ali B. wrote:
> > The idea is to have a crontab auto-publish the post when it is due.
> > That is, saving it as a draft and having Habari publish it at a set time,
> > rather than having the post status as "published" while it is actually not.
> No. The status is "scheduled". Then, the first time a pageview
> happens after the publish date, it gets updated to "published". I
> fail to see why we'd need a cron job for that. You merely look for
> stuff that is scheduled, and for which the publish date is past. I
> don't see any value in making it more complex than that.
My original attempt was something along the lines that Ali describes,
setting the date in the future and the status as published, then in
Posts::get() eliminating all published posts with a date in the
future. This is less efficient than using the cronjob and seems like
it would bog down a busy site, plus Post::publish was called too soon
on these posts, with things like pings sent out before the post was
actually visible on the site.
>> WordPress used the model you described for quite some time, and as I
>> recall it introduced some pretty hefty performance penalties on big
>> sites as you include the MySQL NOW function in each query. That means
>> that the server needs to compute a new value for every query, which
>> means that queries (and their results) cannot be effectively cached.
> So don't use the mysql now function. Use php to generate a timestamp,
> give it a granularity of 30 minutes, and use that. That can be cached.
I don't understand the workflow for what you're describing, Rich.
> > The cronjob and new status can be added to the database when Habari is installed. For people who are currently using Habari, a plugin can be developed to add the cronjob and new status to their database. Alternatively, the database version can be bumped and an upgrade triggered to add the new data, though I'm not sure of the suitability of that since this entails no change to the database format, only to the data in it.
> All of this can and should be accomplished through a plugin. I don't think it should be core.
I think future posting should be a core feature. It's a default option in many other weblog applications, and it provides a very useful feature.
Our invocation of this feature would provide a nice built-in demonstration of how to use our cron system, too.
Scott Merrill wrote:
>> > The cronjob and new status can be added to the database when Habari is installed. For people who are currently using Habari, a plugin can be developed to add the cronjob and new status to their database. Alternatively, the database version can be bumped and an upgrade triggered to add the new data, though I'm not sure of the suitability of that since this entails no change to the database format, only to the data in it.
>> All of this can and should be accomplished through a plugin. I don't think it should be core.
> I think future posting should be a core feature. It's a default option in many other weblog applications, and it provides a very useful feature.
> Our invocation of this feature would provide a nice built-in demonstration of how to use our cron system, too.
I think this was all covered before, but I'll try to recap quickly. First, WordPress did this, as mentioned, and it makes the queries uncacheable, which is a really bad thing for high traffic sites or low spec servers. Second, pingbacks, trackbacks, etc. all need to be sent out when the post is actually being displayed on the site, not before, which can cause problems. Technorati for example would try to access the post directly and get a 404, which if I remember correctly bumps your probability of being a spammer in their database pretty quickly.
I think I hit all the big stuff.
For all who are interested in testing it, I've posted a patch to
ticket #36 incorporating scheduled posts as a cronjob into Habari. The
patch incorporates adding a new 'scheduled' status to the poststatus
table during installation, a new table on the dashboard showing
scheduled posts, and appropriate log entries when the post goes live.
Scheduling a post for the future entails only putting a future date in
the publication date edit box on the publish screen and hitting the
publish button. The user is then notified the post has been scheduled.
When the time comes, it goes live.
On Thu, Apr 24, 2008 at 8:16 AM, rick c <rickcock...@gmail.com> wrote:
> For all who are interested in testing it, I've posted a patch to ticket #36 incorporating scheduled posts as a cronjob into Habari. The patch incorporates adding a new 'scheduled' status to the poststatus table during installation, a new table on the dashboard showing scheduled posts, and appropriate log entries when the post goes live. Scheduling a post for the future entails only putting a future date in the publication date edit box on the publish screen and hitting the publish button. The user is then notified the post has been scheduled. When the time comes, it goes live.
A cursory overview of the patch looks good. A few comments / questions:
In my opinion, the list of scheduled posts in the dashboard should only show up if there are, in fact, scheduled posts.
When publishing a future-dated item, we should probably indicate to the user that the future date has been preserved. Something along the lines of "Your post has been scheduled for publication at ...".
Have you tested this yourself, Rick? How has it worked for you?
> When publishing a future-dated item, we should probably indicate to the user that the future date has been preserved. Something along the lines of "Your post has been scheduled for publication at ...".
I've tested this a little bit on localhost, and it seems to be functional. Good work!
The only niggling complaint I have after minimal testing is that future-dated posts show up as "published" to the author, with no styling differentiation. One needs to look at the site without being logged in to confirm that the future-dated post is not visible to everyone.
This caused us a lot of grief with drafts, originally, as they, too, were displayed on the front page to the logged-in user without styling differentiation. We should find a way to visually disambiguate the future-dated posts to reduce confusion.
I think future-posted posts are considered drafts until they are due to be published and thus they can simply be displayed the way drafts are.
On Thu, Apr 24, 2008 at 3:37 PM, Scott Merrill <ski...@skippy.net> wrote:
> We should find a way to visually disambiguate the future-dated posts to reduce confusion.
Thanks. Yes, I have tested it. Quite a bit locally, and it is
currently live on my site. Posts appear on schedule. Changing their
status appears to be working properly.
On Apr 24, 9:37 am, "Scott Merrill" <ski...@skippy.net> wrote:
> > When publishing a future-dated item, we should probably indicate to
> > the user that the future-date has been preserved. Something along the
> > lines of "Your post has been scheduled for publication at ...".
> I've tested this a little bit on localhost, and it seems to be
> functional. Good work!
> The only niggling complaint I have after minimal testing is that
> future-dated posts show up as "published" to the author, with no
> styling differentiation. One needs to look at the site without being
> logged in to confirm that the future-dated post is not visible to
> everyone.
> This caused us a lot of grief with drafts, originally, as they, too,
> were displayed on the front page to the logged-in user without styling
> differentiation. We should find a way to visually disambiguate the
> future-dated posts to reduce confusion.
The solution for drafts was to give them a different style in the
theme css. I had assumed this would be the same process used with
scheduled posts. Add a scheduled class to the css, and style it
appropriately. I'm not sure how else to differentiate scheduled posts
from published posts unless they're ignored during Posts::get.
> On Thu, Apr 24, 2008 at 8:16 AM, rick c <rickcock...@gmail.com> wrote:
> > For all who are interested in testing it, I've posted a patch to
> > ticket #36 incorporating scheduled posts as a cronjob into Habari. The
> > patch incorporates adding a new 'scheduled' status to the poststatus
> > table during installation, a new table on the dashboard showing
> > scheduled posts, and appropriate log entries when the post goes live.
> > Scheduling a post for the future entails only putting a future date in
> > the publication date edit box on the publish screen and hitting the
> > publish button. The user is then notified the post has been scheduled.
> > When the time comes, it goes live.
> A cursory overview of the patch looks good. A few comments / questions:
> In my opinion, the list of scheduled posts in the dashboard should
> only show up if there are, in fact, scheduled posts.
Okay. The current display is copied from what is done with drafts.
> When publishing a future-dated item, we should probably indicate to
> the user that the future-date has been preserved. Something along the
> lines of "Your post has been scheduled for publication at ...".