ODM Data Structure

7 views
Skip to first unread message

Saul Howard

unread,
Aug 21, 2010, 1:13:33 AM8/21/10
to oedipus-decision-maker-discuss
I've been considering the use of MySQL tables for storing the Drama
data for ODM.

Of course a web app will use SQL for much of its data, but I'm
thinking that maybe it's not the best fit for the actual drama data.

It also introduces a dependency on MySQL. If Oedipus were to include
non web-app components in the future this would be undesirable.

I have put some of my thoughts on the wiki, here:

http://wiki.drama-theory.com/ODMDataStructure

there is a JSON representation of the data there as well, to get some
idea of I'm talking about.

Alternatives to a MySQL store are limited to flat files, or a Key-
Value database, I suppose.

Saul

Robert Impey

unread,
Aug 22, 2010, 3:46:16 AM8/22/10
to oedipus-decision-maker-discuss
Is the problem one of efficiency? Nested SQL calls might be a bad
idea. Can they not be avoided by using joins? I've added a folder with
an example SELECT statement that fetches all the data that you show on
the wiki page to the project at

http://oedipus-decision-maker.googlecode.com/svn/sites/main/trunk/sql/useful-queries/

Many of the relationships between the tables look like they should be
many-to-many, rather than one-to-many. There seems to be some
duplication in the characters and frames tables in particular. I think
before we go to a different data storing system, we should try to fix
the one we have.

Rob

Robert Impey

unread,
Aug 22, 2010, 3:46:21 AM8/22/10
to oedipus-decision-maker-discuss
Is the problem one of efficiency? Nested SQL calls might be a bad
idea. Can they not be avoided by using joins? I've added a folder with
an example SELECT statement that fetches all the data that you show on
the wiki page to the project at

http://oedipus-decision-maker.googlecode.com/svn/sites/main/trunk/sql/useful-queries/

Many of the relationships between the tables look like they should be
many-to-many, rather than one-to-many. There seems to be some
duplication in the characters and frames tables in particular. I think
before we go to a different data storing system, we should try to fix
the one we have.

Rob

On Aug 21, 6:13 am, Saul Howard <saulhow...@gmail.com> wrote:

Saul Howard

unread,
Aug 22, 2010, 9:00:52 AM8/22/10
to oedipus-decision-maker-discuss
On Aug 22, 2:46 pm, Robert Impey <robert.im...@gmail.com> wrote:
> Is the problem one of efficiency? Nested SQL calls might be a bad
> idea. Can they not be avoided by using joins? I've added a folder with
> an example SELECT statement that fetches all the data that you show on
> the wiki page to the project at
>
> http://oedipus-decision-maker.googlecode.com/svn/sites/main/trunk/sql...
>

Efficiency was one of my concerns, yes, and almost certainly one that
can be fixed with better SQL code. I think it's certainly worth
perfecting a SQL storage system for the data. The SQL statements
you've posted are just the thing.

If you have time to rewrite the actual functions, then you should of
course just go ahead. The problem is most clear in the one I linked to
on the wiki:

http://code.google.com/p/oedipus-decision-maker/source/browse/sites/main/trunk/classes/helpers/Oedipus_FrameHelper.inc.php?r=117&spec=svn131#118

Which is a horribly written thing! It came about when I was first
experimenting with the table structure and I never got round to fixing
it. Not that it is slow with the current small data set.

> Many of the relationships between the tables look like they should be
> many-to-many, rather than one-to-many. There seems to be some
> duplication in the characters and frames tables in particular. I think
> before we go to a different data storing system, we should try to fix
> the one we have.
>

I certainly agree that more work on what we've got would be
illuminating. I think the problem I have is that I feel the data is
perhaps not represented as well as it could be. The tables haven't
been through any revisions since I started. I would like to hear more
on the duplications you've found. There was a problem with characters
- sometimes a character has the same name but is not the same etc. I
think it would be good to start work by re-designing the tables, (if
necessary of course). Get the data right and the rest will follow, as
it were.

I'll start by looking at the current SQL tables and see what I come up
with.

Saul

Robert Impey

unread,
Aug 22, 2010, 9:13:38 AM8/22/10
to oedipus-decisio...@googlegroups.com
I think that a lot of web applications can be thought of a UI wrappers
around a database. It should possible to do almost everything "by
hand" in SQL as well as with the web UI.

I think a good first step to improving the database structure is to
use MySQL work bench.

http://wb.mysql.com/

It's available on all the OS's and is very useful for schema comparisons.

I've added a work bench file to project. Unfortunately, the files are
binary so conflicts might be a bore to resolve. While the set of data
that we are working with is small, it's not difficult to have create
and import scripts under SVN as well as the work bench files. If we
start to have a larger set of data on a live system, dropping the
contents of the database ceases to be an option. Updating a live
database with a tool like MySQL WB is probably the best option.

> --
> You received this message because you are subscribed to the Google Groups "oedipus-decision-maker-discuss" group.
> To post to this group, send email to oedipus-decisio...@googlegroups.com.
> To unsubscribe from this group, send email to oedipus-decision-make...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/oedipus-decision-maker-discuss?hl=en.
>
>

--
Rob Impey

Robert Impey

unread,
Aug 22, 2010, 3:14:03 PM8/22/10
to oedipus-decision-maker-discuss
I've committed some changes to the database structure. I've separated
the database scripts for ODM and general Haddock CMS into separate
files for convenience. The entity relationship diagram is quite
interesting.

There are a number of areas where I think the database structure could
improve.

We add duplicate rows for characters each time we create a new frame.
Characters should be unique for a drama. The involvement of a
character in a frame should be stored in a linking table. We would
need to make updates to the stated intentions, positions and options
tables to accommodate this change.

The notes table is an example of the "one true lookup table" anti-
pattern.

http://www.projectdmx.com/dbdesign/lookup.aspx

I've not added foreign keys to the tables involved because I think
that they should be removed.

We should either have a table for notes on a scene and another for
notes on a frame, or (my preference) add a nullable column called
'notes' to the scenes and frames tables. We should not add the
instructions to the notes when we create the new frame or scene. "No
data" should be stored as null. The PHP UI should display instructions
if it finds a null in the notes column.

The frames tree table is like a doubly linked list. This is not a good
way to store sort order data in a table because it is not easy to
select the frames of a scene in order. A better approach might be to
add a sort order column to the frames table. The value would be unique
for each scene. This makes selecting in order the frames of a scene
trivial. We could also add such a column to scenes and acts. The
problem with this approach is that it means that you have to update
all the frames in a scene if you insert a new frame at the beginning
of a scene. As the UI only allows us to add frames to the end of a
scene, this shouldn't be a problem.

The doubt column in the stated intentions table should probably be an
enum or a look up table.

We should make greater use of views and stored routines.

Looking at the database of this project, I wonder if drama theory can
be expressed entirely in SQL. I believe that it can. This should
probably be the line of attack that we take before we do much more
work on the UI.

Saul Howard

unread,
Aug 23, 2010, 1:42:03 AM8/23/10
to oedipus-decision-maker-discuss
On Aug 23, 2:14 am, Robert Impey <robert.im...@gmail.com> wrote:
> We add duplicate rows for characters each time we create a new frame.
> Characters should be unique for a drama. The involvement of a
> character in a frame should be stored in a linking table. We would
> need to make updates to the stated intentions, positions and options
> tables to accommodate this change.
>

This approach sounds good, and it will also allow for more useful
analysis to be done (if we know that all instances of a character are
the same character).

There is a sense in which characters are unique to frames. Characters
may change name between frames ('Anonymous Stranger' -> 'John Smith')
and they may split up ('NATO' -> 'US' & 'Britain'). It looks as though
there might be a need for some meta-data which tracks aliases and
groupings.

I think it's best to proceed as you say, making the character unique
to the drama, then we can think of a way to link different character's
in the UI if it becomes desirable, I'm thinking of some kind of [This
character is an Alias of ....] drop-down, or a [this character is a
group composed of ...] dialog.

> We should either have a table for notes on a scene and another for
> notes on a frame, or (my preference) add a nullable column called
> 'notes' to the scenes and frames tables. We should not add the
> instructions to the notes when we create the new frame or scene. "No
> data" should be stored as null. The PHP UI should display instructions
> if it finds a null in the notes column.
>

I agree with adding a nullable column to the scenes and frames tables.
A much better solution. It's interesting to see how I fell into that
anti-pattern so neatly!


> The frames tree table is like a doubly linked list. This is not a good
> way to store sort order data in a table because it is not easy to
> select the frames of a scene in order. A better approach might be to
> add a sort order column to the frames table. The value would be unique
> for each scene. This makes selecting in order the frames of a scene
> trivial. We could also add such a column to scenes and acts. The
> problem with this approach is that it means that you have to update
> all the frames in a scene if you insert a new frame at the beginning
> of a scene. As the UI only allows us to add frames to the end of a
> scene, this shouldn't be a problem.
>

I should point out that the UI will in the future certainly support
adding intermediary frames. It will also support branching of frames.
In fact it will be necessary for the frame, act and scene
representations to support every kind of branching, forking, inserting
and copying and pasting of any arbitrary selection of nodes to other
points in the tree or indeed to other dramas. These features will be
powerful and necessary when creating fictional dramas and modelling
existing dramas. They are limited at the moment by the necessity of
creating a workable UI! I've always imagined a SVG/HTML5/JS solution
to the UI, the one at the moment is not representative of what I think
can be created.

I suppose a 'sort_order' column approach is out, as it won't support
branching frames (scenes/acts).

Bearing this in mind, what do you think will be the best
representation? I'm sure what we've got at the moment is not the best
method. It's true that selecting the frames (scenes/acts) in order
will be the most common operation, but all of the moving nodes about
will be quite common as well.

> The doubt column in the stated intentions table should probably be an
> enum or a look up table.
>

Yes. In fact the position columns in the stated_intentions and
positions tables should be an enum (or look-up table) also.

> We should make greater use of views and stored routines.
>

I'm not sure what you mean!

> Looking at the database of this project, I wonder if drama theory can
> be expressed entirely in SQL. I believe that it can. This should
> probably be the line of attack that we take before we do much more
> work on the UI.

Yes. I agree with this totally. I'm sure it can be expressed in SQL. I
unfortunately didn't bring the original spec for the project out with
me, but when I get back next month I will type it up and put it
online. As a mathematical specification of the oedipus engine it will
be a useful reference.

Robert Impey

unread,
Aug 23, 2010, 3:08:50 PM8/23/10
to oedipus-decisio...@googlegroups.com
On 23 August 2010 06:42, Saul Howard <saulh...@gmail.com> wrote:
> On Aug 23, 2:14 am, Robert Impey <robert.im...@gmail.com> wrote:
>> We add duplicate rows for characters each time we create a new frame.
>> Characters should be unique for a drama. The involvement of a
>> character in a frame should be stored in a linking table. We would
>> need to make updates to the stated intentions, positions and options
>> tables to accommodate this change.
>>
>
> This approach sounds good, and it will also allow for more useful
> analysis to be done (if we know that all instances of a character are
> the same character).
>
> There is a sense in which characters are unique to frames. Characters
> may change name between frames ('Anonymous Stranger' -> 'John Smith')
> and they may split up ('NATO' -> 'US' & 'Britain'). It looks as though
> there might be a need for some meta-data which tracks aliases and
> groupings.

That's a interesting point about the groups splitting up or the name
changing. I think that it will be useful to be able to navigate the UI
by a characters involvement in the drama.

>
> I think it's best to proceed as you say, making the character unique
> to the drama, then we can think of a way to link different character's
> in the UI if it becomes desirable, I'm thinking of some kind of [This
> character is an Alias of ....] drop-down, or a [this character is a
> group composed of ...] dialog.
>
>> We should either have a table for notes on a scene and another for
>> notes on a frame, or (my preference) add a nullable column called
>> 'notes' to the scenes and frames tables. We should not add the
>> instructions to the notes when we create the new frame or scene. "No
>> data" should be stored as null. The PHP UI should display instructions
>> if it finds a null in the notes column.
>>
>
> I agree with adding a nullable column to the scenes and frames tables.
> A much better solution. It's interesting to see how I fell into that
> anti-pattern so neatly!
>

I admit that I only saw it because I saw a similar thing at work on Friday.

>
>> The frames tree table is like a doubly linked list. This is not a good
>> way to store sort order data in a table because it is not easy to
>> select the frames of a scene in order. A better approach might be to
>> add a sort order column to the frames table. The value would be unique
>> for each scene. This makes selecting in order the frames of a scene
>> trivial. We could also add such a column to scenes and acts. The
>> problem with this approach is that it means that you have to update
>> all the frames in a scene if you insert a new frame at the beginning
>> of a scene. As the UI only allows us to add frames to the end of a
>> scene, this shouldn't be a problem.
>>
>
> I should point out that the UI will in the future certainly support
> adding intermediary frames. It will also support branching of frames.
> In fact it will be necessary for the frame, act and scene
> representations to support every kind of branching, forking, inserting
> and copying and pasting of any arbitrary selection of nodes to other
> points in the tree or indeed to other dramas. These features will be
> powerful and necessary when creating fictional dramas and modelling
> existing dramas. They are limited at the moment by the necessity of
> creating a workable UI! I've always imagined a SVG/HTML5/JS solution
> to the UI, the one at the moment is not representative of what I think
> can be created.
>
> I suppose a 'sort_order' column approach is out, as it won't support
> branching frames (scenes/acts).
>

We could probably make it work, but it would involve shifting the sort
order of all the frames in the scene after the branch down to start at
zero.

> Bearing this in mind, what do you think will be the best
> representation? I'm sure what we've got at the moment is not the best
> method. It's true that selecting the frames (scenes/acts) in order
> will be the most common operation, but all of the moving nodes about
> will be quite common as well.

There's no simple answer to the question about what data structure to
choose. Normally there is a compromise for any of the choices. My
guess would be that selecting will be the more frequently performed
operation.

>
>> The doubt column in the stated intentions table should probably be an
>> enum or a look up table.
>>
>
> Yes. In fact the position columns in the stated_intentions and
> positions tables should be an enum (or look-up table) also.
>
>> We should make greater use of views and stored routines.
>>
>
> I'm not sure what you mean!

We could use a view for the select statement for accessing the frames at:

https://oedipus-decision-maker.googlecode.com/svn/sites/main/trunk/sql/useful-queries/select-frame.sql

http://dev.mysql.com/doc/refman/5.1/en/create-view.html

We could use a stored routine for operations like adding a frame or branching.

http://dev.mysql.com/doc/refman/5.1/en/stored-routines.html

We can survive perfectly well without them, but they allow us to lock
down the spec of the system.

>
>> Looking at the database of this project, I wonder if drama theory can
>> be expressed entirely in SQL. I believe that it can. This should
>> probably be the line of attack that we take before we do much more
>> work on the UI.
>
> Yes. I agree with this totally. I'm sure it can be expressed in SQL. I
> unfortunately didn't bring the original spec for the project out with
> me, but when I get back next month I will type it up and put it
> online. As a mathematical specification of the oedipus engine it will
> be a useful reference.
>

I'll look forward to it.

I wonder how many other game theory-like systems have been codified in
SQL. SQL is a set based language. Can drama theory be expressed
completely in set theory?

Rob

Reply all
Reply to author
Forward
0 new messages