I find the design decision to change the data type of the
posts.pubdate column (et al) to an unsigned integer bizarre in the
extreme.
In modern databases, a date is normally represented a date which makes
sense. For example, this change now makes using SQL to select posts
difficult.
I don't doubt PHP has built in functions to convert Unix epoch numbers
to dates and times in lots of formats.
Don't get me wrong - I am now on the latest SVN and like the timezone
support (and scheduled posts) but out of idle curiosity, what was the
rationale for this ?
> I find the design decision to change the data type of the
> posts.pubdate column (et al) to an unsigned integer bizarre in the
> extreme.
> In modern databases, a date is normally represented a date which makes
> sense. For example, this change now makes using SQL to select posts
> difficult.
> I don't doubt PHP has built in functions to convert Unix epoch numbers
> to dates and times in lots of formats.
> Don't get me wrong - I am now on the latest SVN and like the timezone
> support (and scheduled posts) but out of idle curiosity, what was the
> rationale for this ?
> It's quite simple, it's unsigned because Unix epoch can't be negative.
> Only problem with using Unix epoch is year 2038 when it will end...
> On Sep 14, 3:17 pm, Andy C <andyc...@gmail.com> wrote:
> > I find the design decision to change the data type of the
> > posts.pubdate column (et al) to an unsigned integer bizarre in the
> > extreme.
> > In modern databases, a date is normally represented a date which makes
> > sense. For example, this change now makes using SQL to select posts
> > difficult.
> > I don't doubt PHP has built in functions to convert Unix epoch numbers
> > to dates and times in lots of formats.
> > Don't get me wrong - I am now on the latest SVN and like the timezone
> > support (and scheduled posts) but out of idle curiosity, what was the
> > rationale for this ?
I think we abandoned DB-native date formats for an INT simply because of compatibility. It's easier to handle converting to and from INT and date on the DB side across platforms than handling timezones across platforms. At least, that's the impression I got and the only rationale I can think of. I could, of course, be totally wrong.
On Sun, Sep 14, 2008 at 3:30 PM, Andy C <andyc...@gmail.com> wrote:
> I wasn't asking why it is unsigned - I'm asking why a date/time is > being represented as a number.
> On Sep 14, 8:26 pm, Andrew da Silva <andrewdasi...@mac.com> wrote: > > It's quite simple, it's unsigned because Unix epoch can't be negative.
> > Only problem with using Unix epoch is year 2038 when it will end...
> > On Sep 14, 3:17 pm, Andy C <andyc...@gmail.com> wrote:
> > > I find the design decision to change the data type of the > > > posts.pubdate column (et al) to an unsigned integer bizarre in the > > > extreme.
> > > In modern databases, a date is normally represented a date which makes > > > sense. For example, this change now makes using SQL to select posts > > > difficult.
> > > I don't doubt PHP has built in functions to convert Unix epoch numbers > > > to dates and times in lots of formats.
> > > Don't get me wrong - I am now on the latest SVN and like the timezone > > > support (and scheduled posts) but out of idle curiosity, what was the > > > rationale for this ?
Reading back -dev threads, I converted DATETIME and TIMESTAMP to
integer.
Main to reasons I can think of:
- DateTime object needs an strtotime() compatible string.
- Unix epoch makes mathematical manipulation easier and trustable?
- Less code/processing than asking PHP to convert your current date/
time in UTC.
Of course, this was done in a branch at first, to be reviewed.
If people bring forth counter-arguments, no grudge! ;)
On Sep 14, 3:34 pm, "Chris Meller" <ch...@doesnthaveone.com> wrote:
> I think we abandoned DB-native date formats for an INT simply because of
> compatibility. It's easier to handle converting to and from INT and date on
> the DB side across platforms than handling timezones across platforms.
> At least, that's the impression I got and the only rationale I can think of.
> I could, of course, be totally wrong.
> On Sun, Sep 14, 2008 at 3:30 PM, Andy C <andyc...@gmail.com> wrote:
> > I wasn't asking why it is unsigned - I'm asking why a date/time is
> > being represented as a number.
> > On Sep 14, 8:26 pm, Andrew da Silva <andrewdasi...@mac.com> wrote:
> > > It's quite simple, it's unsigned because Unix epoch can't be negative.
> > > Only problem with using Unix epoch is year 2038 when it will end...
> > > On Sep 14, 3:17 pm, Andy C <andyc...@gmail.com> wrote:
> > > > I find the design decision to change the data type of the
> > > > posts.pubdate column (et al) to an unsigned integer bizarre in the
> > > > extreme.
> > > > In modern databases, a date is normally represented a date which makes
> > > > sense. For example, this change now makes using SQL to select posts
> > > > difficult.
> > > > I don't doubt PHP has built in functions to convert Unix epoch numbers
> > > > to dates and times in lots of formats.
> > > > Don't get me wrong - I am now on the latest SVN and like the timezone
> > > > support (and scheduled posts) but out of idle curiosity, what was the
> > > > rationale for this ?
> I think we abandoned DB-native date formats for an INT simply because of
> compatibility. It's easier to handle converting to and from INT and date on
> the DB side across platforms than handling timezones across platforms.
Right, it solves a bunch of compatibility issues with how various DB
engines store dates. If we simply store integers, then we don't need
to worry about DBs applying their own timezone rules which might be
different than the PHP DateTime rules.
> For example, this change now makes using SQL to select posts
> difficult.
I disagree. It is now easier to do many things in a way which does not
depend on DB vendor-specific syntax. Want all posts from a week ago?
Do something like:
You developers tend to think like, well, developers.
I appreciate that from a developer viewpoint, for a plethora of
different reasons, it may well make the implementation (and your
lives) much easier.
Ironically enough, the use of HabariDateTime::date function actually
masks the physical implementation of this object in the schema.
This encapsulation should have helped to implement postdate as a UTC
date and adjust for the configured timezone.
However, from a data modeling viewpoint, implementing a date as a 32-
bit number is horrible. In fact, for a so-called next generation piece
of software, it's embarassing.
You remind me of the Oracle developer who maintained that using a
VARCHAR for a timestamp field was 'no problem'. He could easily tell
me the month, the day, the year, the minute from any given date - in
fact, he could tell me everything.
Well - no problem until I asked him to write me a 'tomorrow()'
function :-)
PS. If I am a mere user using PhpAdmin to access my Habari tables, am
I able to use PHP functions ?
On Sep 15, 4:15 pm, Blake Johnson <blakejohnso...@gmail.com> wrote:
> > I think we abandoned DB-native date formats for an INT simply because of
> > compatibility. It's easier to handle converting to and from INT and date on
> > the DB side across platforms than handling timezones across platforms.
> Right, it solves a bunch of compatibility issues with how various DB
> engines store dates. If we simply store integers, then we don't need
> to worry about DBs applying their own timezone rules which might be
> different than the PHP DateTime rules.
> > For example, this change now makes using SQL to select posts
> > difficult.
> I disagree. It is now easier to do many things in a way which does not
> depend on DB vendor-specific syntax. Want all posts from a week ago?
> Do something like:
> This encapsulation should have helped to implement postdate as a UTC > date and adjust for the configured timezone.
Perhaps.
The current implementation doesn't have to deal with PDO returning text-based values for dates. Everything starts in an integer format that can be worked with. It shouldn't be incumbent upon the datetime class to account for idiosyncrasies between date storage and output of each database engine, that should be a function of the data layer.
> However, from a data modeling viewpoint, implementing a date as a 32- > bit number is horrible. In fact, for a so-called next generation piece > of software, it's embarassing.
It makes me sad that this email is so critical now. The schema has been on the list for months, and the branch has contained this code for weeks before the recent merge.
To me, it's more important that the software functions efficiently than it allows native engine function manipulation of date types. Still, if there's a way to have database engines return a usable format - one that isn't returned as a string we have to parse - and allows us to store the data in a native date format, it may yet be useful to explore it.
On Mon, Sep 15, 2008 at 1:34 PM, Owen Winkler <epit...@gmail.com> wrote:
> To me, it's more important that the software functions efficiently than > it allows native engine function manipulation of date types. Still, if > there's a way to have database engines return a usable format - one that > isn't returned as a string we have to parse - and allows us to store the > data in a native date format, it may yet be useful to explore it.
I like the idea of storing timestamp ints for all dates. It does look like the best bet for supporting different database platforms, and actually makes search/retrieval faster while pushing more computation out to the webserver.
I just launched a high-traffic site for a huge media company here in Times Square that gets ~40 million page views weekly. I fretted the entire time about application server performance, especially as it used a totally object-driven MVC framework. The thing launched and surprisingly enough, immediately became database bound, throughput-wise.
It is a LOT easier to throw in more webservers, than it is to throw in more database servers. Keeping the job on the database simple is a really smart approach.
On Mon, Sep 15, 2008 at 1:50 PM, Mitch Pirtle <mitch.pir...@gmail.com>wrote:
<snip>
It is a LOT easier to throw in more webservers, than it is to throw in
> more database servers. Keeping the job on the database simple is a > really smart approach.
I agree wholeheartedly. An INT field is fast, you don't have to worry about different database compatibility, and you don't have to constantly wrap your queries in some arcane DB-dependent date_format() wrapper, nor do you have to pick apart text strings.
It may not be the most usable if you're picking through an individual database manually, but then I don't think that's necessarily the highest priority here - the percentage of queries based on manual manipulation vs. automatic page generation is more than a little lopsided. Besides, normalization sucks for manual manipulation too, but we all take it for granted...
On Mon, Sep 15, 2008 at 08:25:37AM -0700, Andy Cowling wrote:
> However, from a data modeling viewpoint, implementing a date as a 32- > bit number is horrible. In fact, for a so-called next generation piece > of software, it's embarassing.
I know nothing of data modelling. Can you explain in words of two syllables or fewer why this is so bad ? And explain what you think is the right approach ?
On Tue, Sep 16, 2008 at 4:20 AM, Mitch Pirtle <mitch.pir...@gmail.com> wrote:
> I like the idea of storing timestamp ints for all dates. It does look > like the best bet for supporting different database platforms, and > actually makes search/retrieval faster while pushing more computation > out to the webserver.
The DB should be considered a raw data store-house. I'm not some leet developer, or a massively programatically knowledgeable sod, so let me put it in the terms that may best illustrate my humble point:
Consider a refrigerator - it keeps things cold and holds them in place. It stores food in reasonably defined but flexible places.
You don't typically use a fridge to heat your food, or mix it, or fold it into pretty shapes and serve it out for you. That takes other processes and tools. When you keep a fridge, a fridge, it means any chef can use anything else with it.
It remains highly compatible with pretty much any other kitchen tool or appliance- and, most importantly, remains highly efficient.
When you start building many things into the fridge, or make up rules about what can go in the fridge, how the fridge should store it, etc - the bottleneck and feature set becomes all about _the fridge_. Not the chef. Or his recipes. Or even the kitchen.
If the fridge then starts to become overloaded with tasks, it will eventually start to fail in doing what it does best. Acting as simple, reliable storage.
My point?
The DB should be just a store. When you start using the DB engine for computational work, rather than outright in/ out processing, it will slow the hell down.
Just because you can query every record with * over a 5 million row table, doesn't mean you should. Sometimes, less is more. Less is also much much faster. And as we all know - the single worst thing is a slow DB engine.
Everything _else_ falls apart as a result. The DB is key. Thus the fridge is important.
So keeping data like dates as generic as well as cross-db-independant as possible is going to mean Habari doesn't fall down the same endless pit Wordpress has.
This is a long post. My appologies on that, but Habari is at the point now where it can have all manner of 'features' that will absolutely KILL DB performance.
It's critical that, where possible, data (outside of post content, perhaps) is stored in as simple, un-processed state as possible.
Most web server engines have more than enough features and grunt, along with PHP, to generate all sorts of content. Keep the load where it can be scaled and leave the DB to be the fridge. :)
[Is this where I get kicked out for too-many-words illness? :)]
I am not a Habari or PHP developer. Surprisingly, I don't stay up all
night monitoring svn commits. Indeed I am not qualified to do so.
I just thought the design merited some discussion and I am glad I
aired the topic.
For the record, I would have favoured a UTC timestamp with a wrapper
function to adjust for the local timezone.
Unsigned integers work. The functionality works (and more). It doesn't
matter to the end user or most developers. It ain't worth reverting.
There are some arguments for unsigned integers and it is clear a vote
would have confirmed the current decision.
This snippet from the wonderful (and now broken) monthly archives
plugin illustrates the point beautifully
$q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month,
COUNT( p.id ) AS cnt
FROM ' . DB::table( 'posts' ) . ' p
WHERE p.content_type = ? AND p.status = ?
GROUP BY year, month
ORDER BY p.pubdate DESC ' . $limit
Self-documenting, captures all the semantics in five lines, portable
to all database platforms.
Now, obviously, this can be fixed by coding the equivalent in PHP -
it's only 1's and 0's after all but I doubt the solution would be as
concise and elegant.
Oh and please don't tell me integers are better for performance. I
work for Oracle. Databases are here to stay :-)
Habari - the next generation blogging platform (at least until 2038)
I don't know fully what's happening yet (not sure if this is relevant), currently latest svn copy broke a custom function (at a test location) to display posts 1, 2 and 3 years ago (using time() of server) so I'm not upping the real site yet until dust is settled.
On Wed, Sep 17, 2008 at 2:50 PM, Andy C <andyc...@gmail.com> wrote:
> I am not a Habari or PHP developer. Surprisingly, I don't stay up all > night monitoring svn commits. Indeed I am not qualified to do so.
> I just thought the design merited some discussion and I am glad I > aired the topic.
> For the record, I would have favoured a UTC timestamp with a wrapper > function to adjust for the local timezone.
> Unsigned integers work. The functionality works (and more). It doesn't > matter to the end user or most developers. It ain't worth reverting.
> There are some arguments for unsigned integers and it is clear a vote > would have confirmed the current decision.
> This snippet from the wonderful (and now broken) monthly archives > plugin illustrates the point beautifully
> $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month, > COUNT( p.id ) AS cnt > FROM ' . DB::table( 'posts' ) . ' p > WHERE p.content_type = ? AND p.status = ? > GROUP BY year, month > ORDER BY p.pubdate DESC ' . $limit
> Self-documenting, captures all the semantics in five lines, portable > to all database platforms.
> Now, obviously, this can be fixed by coding the equivalent in PHP - > it's only 1's and 0's after all but I doubt the solution would be as > concise and elegant.
> Oh and please don't tell me integers are better for performance. I > work for Oracle. Databases are here to stay :-)
> Habari - the next generation blogging platform (at least until 2038)
Monthly Archives has been updated. It has been tested with SQLite, but
not MySQL or Postgres. It should no longer be broken. If it is, please
reopen the ticket for it. Also note that no php was required in the
query.
As a side note, functions like MONTH, YEAR, UNIX_TIMESTAMP (which is
all that is needed to get the query to work), NOW, etc are *not*
portable to all database platforms. The minute you get beyond the very
basics, translations have to be made.
Rick
On Sep 17, 2:50 am, Andy C <andyc...@gmail.com> wrote:
> I am not a Habari or PHP developer. Surprisingly, I don't stay up all
> night monitoring svn commits. Indeed I am not qualified to do so.
> I just thought the design merited some discussion and I am glad I
> aired the topic.
> For the record, I would have favoured a UTC timestamp with a wrapper
> function to adjust for the local timezone.
> Unsigned integers work. The functionality works (and more). It doesn't
> matter to the end user or most developers. It ain't worth reverting.
> There are some arguments for unsigned integers and it is clear a vote
> would have confirmed the current decision.
> This snippet from the wonderful (and now broken) monthly archives
> plugin illustrates the point beautifully
> $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month,
> COUNT( p.id ) AS cnt
> FROM ' . DB::table( 'posts' ) . ' p
> WHERE p.content_type = ? AND p.status = ?
> GROUP BY year, month
> ORDER BY p.pubdate DESC ' . $limit
> Self-documenting, captures all the semantics in five lines, portable
> to all database platforms.
> Now, obviously, this can be fixed by coding the equivalent in PHP -
> it's only 1's and 0's after all but I doubt the solution would be as
> concise and elegant.
> Oh and please don't tell me integers are better for performance. I
> work for Oracle. Databases are here to stay :-)
> Habari - the next generation blogging platform (at least until 2038)
> Monthly Archives has been updated. It has been tested with SQLite, but
> not MySQL or Postgres. It should no longer be broken. If it is, please
> reopen the ticket for it. Also note that no php was required in the
> query.
> As a side note, functions like MONTH, YEAR, UNIX_TIMESTAMP (which is
> all that is needed to get the query to work), NOW, etc are *not*
> portable to all database platforms. The minute you get beyond the very
> basics, translations have to be made.
> Rick
> On Sep 17, 2:50 am, Andy C <andyc...@gmail.com> wrote:
> > I am not a Habari or PHP developer. Surprisingly, I don't stay up all
> > night monitoring svn commits. Indeed I am not qualified to do so.
> > I just thought the design merited some discussion and I am glad I
> > aired the topic.
> > For the record, I would have favoured a UTC timestamp with a wrapper
> > function to adjust for the local timezone.
> > Unsigned integers work. The functionality works (and more). It doesn't
> > matter to the end user or most developers. It ain't worth reverting.
> > There are some arguments for unsigned integers and it is clear a vote
> > would have confirmed the current decision.
> > This snippet from the wonderful (and now broken) monthly archives
> > plugin illustrates the point beautifully
> > $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month,
> > COUNT( p.id ) AS cnt
> > FROM ' . DB::table( 'posts' ) . ' p
> > WHERE p.content_type = ? AND p.status = ?
> > GROUP BY year, month
> > ORDER BY p.pubdate DESC ' . $limit
> > Self-documenting, captures all the semantics in five lines, portable
> > to all database platforms.
> > Now, obviously, this can be fixed by coding the equivalent in PHP -
> > it's only 1's and 0's after all but I doubt the solution would be as
> > concise and elegant.
> > Oh and please don't tell me integers are better for performance. I
> > work for Oracle. Databases are here to stay :-)
> > Habari - the next generation blogging platform (at least until 2038)
> This snippet from the wonderful (and now broken) monthly archives
> plugin illustrates the point beautifully
> $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month,
> COUNT( p.id ) AS cnt
> FROM ' . DB::table( 'posts' ) . ' p
> WHERE p.content_type = ? AND p.status = ?
> GROUP BY year, month
> ORDER BY p.pubdate DESC ' . $limit
> Self-documenting, captures all the semantics in five lines, portable
> to all database platforms.
I'd just like to point out that YEAR() and MONTH() are MySQL specific
funcitons. So, this query is not portable to all platforms. It
happened to work, previously, because we provided translations for
SQLite and PostgreSQL.
I will also mention that you don't have to write your own query to do
this. It is provided by the API with:
$counts = Posts::get( array( 'month_cts' => 1, 'status' =>
Post::status( 'published' ) ) );
If plugins use our API which abstracts DB details, upgrades will be
much less painful.
> > This snippet from the wonderful (and now broken) monthly archives
> > plugin illustrates the point beautifully
> > $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month,
> > COUNT( p.id ) AS cnt
> > FROM ' . DB::table( 'posts' ) . ' p
> > WHERE p.content_type = ? AND p.status = ?
> > GROUP BY year, month
> > ORDER BY p.pubdate DESC ' . $limit
> > Self-documenting, captures all the semantics in five lines, portable
> > to all database platforms.
> I'd just like to point out that YEAR() and MONTH() are MySQL specific
> funcitons. So, this query is not portable to all platforms. It
> happened to work, previously, because we provided translations for
> SQLite and PostgreSQL.
> I will also mention that you don't have to write your own query to do
> this. It is provided by the API with:
> $counts = Posts::get( array( 'month_cts' => 1, 'status' =>
> Post::status( 'published' ) ) );
> If plugins use our API which abstracts DB details, upgrades will be
> much less painful.
A quick lesson to me please, not capable of having my own sql query nor to use API yet I'm.
I wrote a little theme function to display posts either 1, 2 or 3 years ago (with parameters such as a time window, e.g. display posts between 1 year less 2 weeks to 1 year ...etc). How do I use the lastest API to do the same thing?
On Thu, Sep 18, 2008 at 2:14 AM, Blake Johnson <blakejohnso...@gmail.com>wrote:
> > This snippet from the wonderful (and now broken) monthly archives > > plugin illustrates the point beautifully
> > $q= 'SELECT YEAR( p.pubdate ) AS year, MONTH( p.pubdate ) AS month, > > COUNT( p.id ) AS cnt > > FROM ' . DB::table( 'posts' ) . ' p > > WHERE p.content_type = ? AND p.status = ? > > GROUP BY year, month > > ORDER BY p.pubdate DESC ' . $limit
> > Self-documenting, captures all the semantics in five lines, portable > > to all database platforms.
> I'd just like to point out that YEAR() and MONTH() are MySQL specific > funcitons. So, this query is not portable to all platforms. It > happened to work, previously, because we provided translations for > SQLite and PostgreSQL.
> I will also mention that you don't have to write your own query to do > this. It is provided by the API with: > $counts = Posts::get( array( 'month_cts' => 1, 'status' => > Post::status( 'published' ) ) );
> If plugins use our API which abstracts DB details, upgrades will be > much less painful.
On Sep 17, 8:11 pm, "Stanley FONG" <sfon...@gmail.com> wrote:
> A quick lesson to me please, not capable of having my own sql query nor to
> use API yet I'm.
> I wrote a little theme function to display posts either 1, 2 or 3 years ago
> (with parameters such as a time window, e.g. display posts between 1 year
> less 2 weeks to 1 year ...etc). How do I use the lastest API to do the
> same thing?
Hi Stanley,
Posts::get() accepts 'before' and 'after' parameters. So you can do
> On Sep 17, 8:11 pm, "Stanley FONG" <sfon...@gmail.com> wrote:
> > A quick lesson to me please, not capable of having my own sql query nor to
> > use API yet I'm.
> > I wrote a little theme function to display posts either 1, 2 or 3 years ago
> > (with parameters such as a time window, e.g. display posts between 1 year
> > less 2 weeks to 1 year ...etc). How do I use the lastest API to do the
> > same thing?
> Hi Stanley,
> Posts::get() accepts 'before' and 'after' parameters. So you can do
> What parameters it is possible to use with Posts::get(), and how to > use them, really should be documented in the wiki.
> Rick
> On Sep 21, 10:14 am, Blake Johnson <blakejohnso...@gmail.com> wrote: >> On Sep 17, 8:11 pm, "Stanley FONG" <sfon...@gmail.com> wrote:
>>> A quick lesson to me please, not capable of having my own sql >>> query nor to >>> use API yet I'm.
>>> I wrote a little theme function to display posts either 1, 2 or 3 >>> years ago >>> (with parameters such as a time window, e.g. display posts between >>> 1 year >>> less 2 weeks to 1 year ...etc). How do I use the lastest API to >>> do the >>> same thing?
>> Hi Stanley,
>> Posts::get() accepts 'before' and 'after' parameters. So you can do
> > What parameters it is possible to use with Posts::get(), and how to > > use them, really should be documented in the wiki.
> > Rick
> > On Sep 21, 10:14 am, Blake Johnson <blakejohnso...@gmail.com> wrote: > >> On Sep 17, 8:11 pm, "Stanley FONG" <sfon...@gmail.com> wrote:
> >>> A quick lesson to me please, not capable of having my own sql > >>> query nor to > >>> use API yet I'm.
> >>> I wrote a little theme function to display posts either 1, 2 or 3 > >>> years ago > >>> (with parameters such as a time window, e.g. display posts between > >>> 1 year > >>> less 2 weeks to 1 year ...etc). How do I use the lastest API to > >>> do the > >>> same thing?
> >> Hi Stanley,
> >> Posts::get() accepts 'before' and 'after' parameters. So you can do
> On Sep 17, 8:11 pm, "Stanley FONG" <sfon...@gmail.com> wrote: > > A quick lesson to me please, not capable of having my own sql query nor > to > > use API yet I'm.
> > I wrote a little theme function to display posts either 1, 2 or 3 years > ago > > (with parameters such as a time window, e.g. display posts between 1 year > > less 2 weeks to 1 year ...etc). How do I use the lastest API to do the > > same thing?
> Hi Stanley,
> Posts::get() accepts 'before' and 'after' parameters. So you can do