Nodular

Kiran Jonnalagadda

Mar 19, 2013, 7:01:40 AM

to hasgee...@googlegroups.com

Starting a separate thread for Nodular.

Code: https://github.com/hasgeek/nodular

Documentation: http://nodular.readthedocs.org/en/latest/

Test results: https://travis-ci.org/hasgeek/nodular

Except for the i18n bit, I'm mostly done with the base models. I'll now switch to working on the HTTP publisher and content types. UI after that.

Kiran

--

Kiran Jonnalagadda

+91-99452-35123

http://hasgeek.com

Kiran Jonnalagadda

Mar 21, 2013, 4:52:39 AM

to hasgee...@googlegroups.com

Today's update: I've removed language support from revisions and am writing tests for content revisioning.

Yesterday I checked in support for HTTP 410 Gone responses. When a node is deleted, a placeholder is left in its place indicating that the node is gone. This seemingly useless feature is required for data sync in the absence of site-wide version control. There is no other way to indicate that a node has been removed from the source repo and should be removed from everywhere else.
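A minimal sketch of the placeholder idea, not Nodular's actual code: `NodeStore`, `GONE` and `lookup` are hypothetical names, and a plain dict stands in for the database. Deleting a node leaves a tombstone behind, so a lookup can tell a path that was removed (410) from one that never existed (404).

```python
from http import HTTPStatus

# Sentinel marking a path whose node has been deleted (hypothetical name).
GONE = object()


class NodeStore:
    """Toy in-memory stand-in for the node table."""

    def __init__(self):
        self._nodes = {}  # path -> content, or GONE for a deleted node

    def add(self, path, content):
        self._nodes[path] = content

    def delete(self, path):
        # Leave a placeholder instead of dropping the entry, so that
        # sync clients can learn the node was removed at the source.
        if path in self._nodes:
            self._nodes[path] = GONE

    def lookup(self, path):
        """Return (status, content) for a path."""
        if path not in self._nodes:
            return HTTPStatus.NOT_FOUND, None
        content = self._nodes[path]
        if content is GONE:
            return HTTPStatus.GONE, None
        return HTTPStatus.OK, content
```

A sync client that sees 410 for a path knows to delete its local copy, whereas a 404 tells it nothing about whether the node ever existed.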

When testing this, I found that SQLite3 doesn't enforce foreign key constraints by default, so when a parent node is deleted, child nodes remain in the database, with null parents. SQLite3's support for foreign key constraints is optional and has to be turned on for each connection. I added this patch to Nodular: http://git.io/GYWETQ
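The per-connection behaviour can be seen with the standard library's `sqlite3` module alone. This is a sketch, not the linked patch: the `node` table here is a made-up two-column schema, and `ON DELETE CASCADE` is an assumption about how child rows should be cleaned up.

```python
import sqlite3

# Hypothetical schema: a self-referencing node table.
SCHEMA = """
CREATE TABLE node (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id) ON DELETE CASCADE
);
"""


def rows_left_after_parent_delete(enforce_foreign_keys):
    """Delete a parent row and report how many rows remain."""
    conn = sqlite3.connect(":memory:")
    if enforce_foreign_keys:
        # Must be issued on every new connection; the setting is not
        # stored in the database file.
        conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO node (id, parent_id) VALUES (1, NULL)")
    conn.execute("INSERT INTO node (id, parent_id) VALUES (2, 1)")
    conn.commit()
    conn.execute("DELETE FROM node WHERE id = 1")
    conn.commit()
    remaining = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
    conn.close()
    return remaining
```

Without the pragma the child row survives the parent's deletion; with it, the cascade removes the child as well. In a SQLAlchemy app the same pragma would typically be issued from a connection event hook.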

I'm not currently testing with MySQL, but I'm aware that MySQL's MyISAM engine (the default before MySQL 5.5) has the same problem. MySQL can be reconfigured to use InnoDB by default, and SQLAlchemy can be told to select InnoDB in each table's configuration.

I wonder if SQLAlchemy can be configured to prefer InnoDB by default globally.

Kiran

Devi

Mar 21, 2013, 6:29:59 AM

to HasGeek Code

> I wonder if SQLAlchemy can be configured to prefer InnoDB by default globally.

I don't see any direct way to do this.

How about this - have a class MySQLMixin with __table_args__ =
{'mysql_engine': 'InnoDB'} and add that to the __bases__ of BaseMixin
whenever the db connection is mysql.

BTW, do we use MySQL or Postgres for production?

kracekumar ramaraju

Mar 21, 2013, 6:32:15 AM

to hasgee...@googlegroups.com

Postgres in production. I think the MySQL engine details should live in the config file rather than in a mixin.

--
Thanks & Regards

"Talk is cheap, show me the code" -- Linus Torvalds

kracekumar
www.kracekumar.com

Devi

Mar 21, 2013, 7:51:04 AM

to hasgee...@googlegroups.com

Agreed. The mixin can pick 'mysql_engine' up from the config.
But the whole thing is hacky.

~Devi

Kiran Jonnalagadda

Mar 21, 2013, 8:20:59 AM

to hasgee...@googlegroups.com

Engine configuration happens way after all the models and tables have been defined so it's too late to add MySQL configuration then.

The problem with putting __table_args__ in the base class is that that field is also used for other bits of configuration like multi-column unique constraints. Every model that sets __table_args__ now also needs to check for an inherited value.

Next, __table_args__ supports dual syntax: a dictionary OR a tuple with the last item as a dictionary. This is too much to check for in the subclass:

    class MyModel(BaseMixin, db.Model):
        if isinstance(BaseMixin.__table_args__, dict):
            __table_args__ = (..., BaseMixin.__table_args__)
        else:
            __table_args__ = (...) + (BaseMixin.__table_args__)

I don't even know if an `if` statement directly in a class statement works, but as you can see, this approach is just ugly.
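For what it's worth, an `if` statement in a class body is valid Python. One way to keep the check out of every subclass is a small helper that normalises both `__table_args__` forms before combining them. The sketch below is plain Python with no SQLAlchemy import; `split_table_args` and `merge_table_args` are hypothetical names, not part of SQLAlchemy, Coaster or Nodular.

```python
def split_table_args(table_args):
    """Split a __table_args__ value into (positional, keyword) parts.

    Accepts None, a dict, or a tuple whose last item may be a dict.
    """
    if table_args is None:
        return (), {}
    if isinstance(table_args, dict):
        return (), dict(table_args)
    args = tuple(table_args)
    if args and isinstance(args[-1], dict):
        return args[:-1], dict(args[-1])
    return args, {}


def merge_table_args(*values):
    """Combine several __table_args__ values into one tuple.

    Positional items are concatenated in order; keyword dicts are
    merged, with later values overriding earlier ones. The result
    always uses the tuple form with a trailing dict.
    """
    positional = ()
    keywords = {}
    for value in values:
        args, kwargs = split_table_args(value)
        positional += args
        keywords.update(kwargs)
    return positional + (keywords,)
```

A subclass could then write `__table_args__ = merge_table_args(BaseMixin.__table_args__, (...))` on a single line, with the tuple holding its own constraints. This only addresses the merging; it does nothing about engine configuration happening after the tables are defined.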

Kiran

Kiran Jonnalagadda

Mar 21, 2013, 11:24:56 AM

to hasgee...@googlegroups.com

On Thursday, 21 March 2013 at 2:22 PM, Kiran Jonnalagadda wrote:

> Today's update: I've removed language support from revisions and am writing tests for content revisioning.

This is now in: https://github.com/hasgeek/nodular/commit/78362034899f4281f8a55cd8902510ab21cba75a#L1R138

I've replaced the integer workflow status with a text label, primarily so future content types have the leeway to add more workflow states than Draft/Pending/Published.

I made an interesting discovery with how SQLAlchemy orders statements for a commit.

In a revision, (node_id, workflow_label) is unique to ensure that a given label can be present only once per node. In the link above, when a new revision is created, the label is first removed from the existing revision, if any. The statements therefore need to run in the order UPDATE, INSERT. For some reason, however, SQLAlchemy's unit-of-work sorter emits them as INSERT, UPDATE, causing an IntegrityError.

I've thrown in a session.flush() to push out the UPDATE before INSERT, but I feel dirty about this. Session flush() does not commit, but in a non-transactional db like MySQL with MyISAM, it's as good as a commit.
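The ordering problem can be reproduced without SQLAlchemy. The sketch below uses the standard library's `sqlite3` with a cut-down `revision` table; the column set, the ids and the 'published' label are illustrative assumptions, not Nodular's real schema.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        " id INTEGER PRIMARY KEY,"
        " node_id INTEGER NOT NULL,"
        " workflow_label TEXT,"
        " UNIQUE (node_id, workflow_label))"
    )
    # Revision 1 of node 10 currently holds the 'published' label.
    conn.execute("INSERT INTO revision VALUES (1, 10, 'published')")
    conn.commit()
    return conn


CLEAR_OLD_LABEL = (
    "UPDATE revision SET workflow_label = NULL "
    "WHERE node_id = 10 AND workflow_label = 'published'"
)
ADD_NEW_REVISION = "INSERT INTO revision VALUES (2, 10, 'published')"


def run_in_order(statements):
    """Run statements in the given order; return True if all succeed."""
    conn = make_db()
    try:
        for statement in statements:
            conn.execute(statement)
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The unique constraint on (node_id, workflow_label) was violated.
        conn.rollback()
        return False
    finally:
        conn.close()
```

`run_in_order([CLEAR_OLD_LABEL, ADD_NEW_REVISION])` succeeds, while the reverse order trips the unique constraint. The manual flush forces the first ordering.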

Kiran

Kiran Jonnalagadda

Mar 21, 2013, 6:43:58 PM

to hasgee...@googlegroups.com

I just checked in support for "properties", basically {name: value} pairs that can be attached to any node. The value can be any JSON-serializable value, including strings, integers, lists and dictionaries, but notably not dates.

Each property is stored as a distinct row in the database.

Properties can ostensibly be used to create user-defined types where all the data is stored in properties, but I don't think that is appropriate. A user-defined type should store all data as a JSON blob in a text field, not in distinct rows.

The intended use for properties is to store configuration such as a Google Analytics code or a Typekit code. These are values that (a) should not be hardcoded into the template and (b) should be very easy to retrieve from the database (and hence live in separate rows).
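A rough sketch of the storage idea, assuming one row per (node, name) with the value kept as JSON text. The table layout and the `set_property`/`get_property` helpers are hypothetical and use the standard library's `sqlite3` rather than Nodular's models.

```python
import json
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE property ("
        " node_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (node_id, name))"
    )
    return conn


def set_property(conn, node_id, name, value):
    # json.dumps raises TypeError for values it cannot serialize,
    # such as datetime.date, which is why dates are excluded.
    encoded = json.dumps(value)
    conn.execute(
        "INSERT OR REPLACE INTO property (node_id, name, value) "
        "VALUES (?, ?, ?)",
        (node_id, name, encoded),
    )


def get_property(conn, node_id, name, default=None):
    row = conn.execute(
        "SELECT value FROM property WHERE node_id = ? AND name = ?",
        (node_id, name),
    ).fetchone()
    return default if row is None else json.loads(row[0])
```

Because each property is its own row, reading one setting is a single indexed lookup and never requires loading or parsing the others.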

I'm facing a strange issue with the test setup under PyPy. The test setup runs the properties tests twice, once against the Node base class and once against the TestType dummy content type. It seems that on the second run, the old Property rows are still in the db, leading to a FlushError (not the usual IntegrityError). This happens with both SQLite and PostgreSQL, but only with PyPy. CPython is fine.

I haven't encountered this one before. There should be no data in the database already because all contents are dropped and re-created for each test. Apparently it's possible for some data to remain in the session between tests, but it's not clear to me why this is only happening with PyPy.

These are the only references I could find in the docs:

Current: http://docs.sqlalchemy.org/en/latest/orm/session.html#merge-tips

Obsolete: http://osdir.com/ml/python.sqlelixir/2007-04/msg00005.html

Very odd.

Kiran

Kiran Jonnalagadda

Mar 22, 2013, 12:06:45 AM

to hasgee...@googlegroups.com

On Thursday, 21 March 2013 at 8:54 PM, Kiran Jonnalagadda wrote:

> I've thrown in a session.flush() to push out the UPDATE before INSERT, but I feel dirty about this. Session flush() does not commit, but in a non-transactional db like MySQL with MyISAM, it's as good as a commit.

According to the docs, it's normal for SQLAlchemy to do a flush before a query. http://docs.sqlalchemy.org/en/latest/orm/session.html#flushing

My code doesn't perform a query, which is why I need to manually flush. I guess I'm good.

Kiran

Kiran Jonnalagadda

Mar 24, 2013, 2:40:41 PM

to hasgee...@googlegroups.com

On Friday, 22 March 2013 at 4:13 AM, Kiran Jonnalagadda wrote:

> I'm facing a strange issue with the test setup under PyPy. The test setup runs the properties tests twice, once against the Node base class and once against the TestType dummy content type. It seems that on the second run, the old Property rows are still in the db, leading to a FlushError (not the usual IntegrityError). This happens with both SQLite and PostgreSQL, but only with PyPy. CPython is fine.

Adding session.remove() in tearDown() fixed this. I've lost a couple of days hitting this wall. Now to get back on track.

Kiran

Kiran Jonnalagadda

Apr 2, 2013, 2:59:58 AM

to hasgee...@googlegroups.com

I've started work on the node publisher:

Docs: http://nodular.readthedocs.org/en/latest/publisher.html

Tests: https://github.com/hasgeek/nodular/blob/master/tests/test_publisher.py

There's about a day's work left on the publisher and another pending module, the node registry, following which we can start moving into app space. TODO:

1. nodular.publisher -- handle 410 responses, handle permissions, handle HTTP views

2. nodular.registry -- provide a place to register nodes and their handlers

3. nodular.content.* -- common content types: data models and UI handlers

4. ...

5. PROFIT!

Kiran Jonnalagadda

Apr 7, 2013, 4:20:52 PM

to hasgee...@googlegroups.com

The publisher and registry are both done. Test coverage is at 97% and updated documentation is at http://nodular.readthedocs.org/en/latest/.

We are finally in content-and-application space! I will now start rewriting Eventframe atop Nodular.

Kiran

Kiran Jonnalagadda

Apr 8, 2013, 3:22:55 AM

to hasgee...@googlegroups.com

On Monday, 8 April 2013 at 1:50 AM, Kiran Jonnalagadda wrote:

> The publisher and registry are both done. Test coverage is at 97% and updated documentation is at http://nodular.readthedocs.org/en/latest/.
>
> We are finally in content-and-application space! I will now start rewriting Eventframe atop Nodular.

So at this point I have to wonder again if Nodular is just a waste of time since Pyramid already has a very malleable traversal framework. Here's the documentation: http://docs.pylonsproject.org/projects/pyramid/en/latest/narr/traversal.html

The consolation: Pyramid's lookup is one level of hierarchy at a time because it can't make assumptions about what is being looked up. This is somewhat inefficient when each lookup involves a database roundtrip.

Nodular's traversal only works with the Node model and it knows how to jump to any depth level, so it's capable of pulling out the entire hierarchy in one query. I suppose this and other such context-aware optimizations will justify the month+ I've sunk into this.
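To illustrate what a single-query lookup could look like, here is a sketch that assumes each node row stores its full path. That storage assumption, the `node(path, title)` table and both function names are hypothetical; Nodular's real query may differ.

```python
import sqlite3


def ancestor_paths(path):
    """Expand '/a/b/c' into ['/', '/a', '/a/b', '/a/b/c']."""
    paths = ["/"]
    current = ""
    for part in path.split("/"):
        if part:
            current += "/" + part
            paths.append(current)
    return paths


def traverse(conn, path):
    """Fetch every existing node along `path` with a single query.

    Rows come back ordered from the root down, so the last row is
    the deepest node that actually exists for the requested path.
    """
    paths = ancestor_paths(path)
    placeholders = ", ".join("?" for _ in paths)
    query = (
        "SELECT path, title FROM node "
        "WHERE path IN (%s) ORDER BY LENGTH(path)" % placeholders
    )
    return conn.execute(query, paths).fetchall()
```

A level-at-a-time traversal would issue one query per path segment, whereas this issues one query regardless of depth. If the deepest row's path is shorter than the requested path, the remainder can be handed to that node's view for further handling.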

Kiran

Kiran Jonnalagadda

May 30, 2013, 2:40:38 AM

to hasgee...@googlegroups.com

I just pushed out updates to Nodular:

Nodular now supports multiple root nodes. This means you can have multiple node trees in the same database.

This is a significant change to how apps can be built around Nodular. Previously, apps had to commit to all their data living in a single node tree. Now apps can have tables just like before, and can link a tree to wherever one is required.

The key change is that the "node" table now has a "root_id" column that points to the root node, and paths are unique in combination with the root. The value of this column is maintained automatically: a node's root is always its parent's root, or the node itself if it has no parent.
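The root rule can be sketched in plain Python. This toy `Node` class is hypothetical and computes the root on access instead of storing a `root_id` column, so it only illustrates the invariant, not how the column is kept in sync.

```python
class Node:
    """Toy tree node; not Nodular's model."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = None
        self.parent = parent

    @property
    def parent(self):
        return self._parent

    @parent.setter
    def parent(self, value):
        # Detach from the old parent before attaching to the new one.
        if self._parent is not None:
            self._parent.children.remove(self)
        self._parent = value
        if value is not None:
            value.children.append(self)

    @property
    def root(self):
        # Root is the parent's root, or the node itself at the top.
        return self if self._parent is None else self._parent.root
```

Moving a subtree under a node in another tree changes the root of every node in that subtree. A stored `root_id` column would have to be updated for all descendants in that case, which is what tests for moving nodes across roots need to cover.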

I'm writing tests for moving nodes across roots. Will push those out later today.

Kiran
