[PROPOSAL] Fix data structures for editing projects

Sander W G van der Waal

Apr 21, 2011, 11:15:32 AM
to simal-con...@googlegroups.com
Dear Simal-ers,

There's a nasty problem with the data structure of the
projects that I've run into again when working on editing
projects on the project page. It's documented in issue 289 [1]
but I'm proposing a different, more incremental solution
here than what's suggested there.

The core of the problem is that the data structure of projects
is such that each project is represented by a simal:Project,
linked to one or more sources (of type doap:Project) via
rdfs:seeAlso.

On the project detail page all of these sources are collected
and presented as one project. This poses problems when editing
the project, because when saving data back you need to keep
track of which attribute originates from which of the (possibly
related) source projects. This is currently completely
unsupported and very difficult to add. Also, deleting imported
data from e.g. JISC might not be what we want.

My suggested solution is:
- Keep a reference to each different source and a timestamp
so we know when and from where a source has been assigned
(proposed earlier and already in the wiki [2]).
- Initially keep the mechanism for displaying the project detail
page the same (ie. collate the information from all the sources)
- When editing a project for the first time, create a new source
reference with a designated simal:source, eg. 'web-interface'.
- Duplicate the information from all the source projects to this
new source project.
- Edit and save all changes to this new source project
- From now on always use the project with source 'web-interface'
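
As a rough sketch of the steps above (not Simal code): it assumes each
source is a plain dict with a "source" label, an "assigned" timestamp and
a dict of attribute values. The names collate, editable_source and
displayed are made up for illustration.

```python
from datetime import datetime, timezone

WEB_INTERFACE = "web-interface"  # designated simal:source value from the proposal


def collate(sources):
    """Merge attribute values from all sources into one view (the current display behaviour)."""
    merged = {}
    for src in sources:
        for attr, values in src["attributes"].items():
            bucket = merged.setdefault(attr, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged


def editable_source(sources):
    """Return the 'web-interface' source, creating it from the collated view on first edit."""
    for src in sources:
        if src["source"] == WEB_INTERFACE:
            return src
    local = {
        "source": WEB_INTERFACE,
        "assigned": datetime.now(timezone.utc),  # when this source was assigned
        "attributes": collate(sources),  # duplicate of everything imported so far
    }
    sources.append(local)
    return local


def displayed(sources):
    """Once a 'web-interface' source exists, always show it; otherwise collate."""
    for src in sources:
        if src["source"] == WEB_INTERFACE:
            return src["attributes"]
    return collate(sources)
```

Because collate builds fresh lists, changes saved to the 'web-interface'
copy leave the imported sources untouched.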

For now I feel this is the best combination of consistency in the
UI and a relatively easy solution.

Also, this enables us to keep historical information because we're
not directly editing imports but still displaying what's the 'current'
view. Further down the line I imagine something like several tabs
on the project detail page, one for each source, which would enable
keeping a more detailed view on the project.

Any thoughts or comments are more than welcome!

Sander

[1] http://code.google.com/p/simal/issues/detail?id=289
[2] http://code.google.com/p/simal/wiki/Schema

OSS Watch - supporting open source in education and research http://www.oss-watch.ac.uk

Steve Bennett

May 23, 2011, 10:15:29 PM
to simal-con...@googlegroups.com
On 22 April 2011 01:15, Sander W G van der Waal
<sander.v...@oucs.ox.ac.uk> wrote:
> - When editing a project for the first time, create a new source
>  reference with a designated simal:source, eg. 'web-interface'.
> - Duplicate the information from all the source projects to this
>  new source project.
> - Edit and save all changes to this new source project
> - From now on always use the project with source 'web-interface'

Does this mean that once a project is edited, it is effectively frozen
and automatic harvests will always be superseded by that original
duplication? In other words:

- Description: "My project", source JISC
- SVN: svn.myproject.com, source JISC

I edit, update SVN. We now have:

- Description: "My project", source web-interface
- Description: "My project", source JISC
- SVN: svn.newhost.com, source web-interface
- SVN: svn.myproject.com, source JISC

Are you sure you want the (unchanged) Description element duplicated?

Steve

Sander van der Waal

May 25, 2011, 10:41:23 AM
to simal-con...@googlegroups.com
> From: simal-con...@googlegroups.com [mailto:simal-
> contri...@googlegroups.com] On Behalf Of Steve Bennett
> Sent: 24 May 2011 03:15
> To: simal-con...@googlegroups.com
> Subject: Re: [Simal] [PROPOSAL] Fix data structures for editing
> projects

I don't see a better way (in terms of cost/benefits) of dealing with the
dynamic nature of editing projects. In your example, one could argue
that it's better to not duplicate the Description. When displaying the
project detail page you would just look for a Description in one of the
other sources if it's not in the web-interface source.

However, that makes it very difficult to remove attributes. If I wanted
to remove the Description (or one of the programming languages, as a more
likely use case), I would either have to remove it from the source or
leave it in place and find an alternative way of denoting that it had been
removed. I therefore thought that duplicating the whole source as a
web-interface specific source would be better, but I'd very much welcome
suggestions of how we can otherwise do this.
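
To illustrate the removal problem, here is a minimal sketch. It assumes a
project's attributes are held as one plain dict per source, and the two
lookup functions are hypothetical, not part of Simal.

```python
def lookup_with_fallback(attr, local, imported):
    """Per-attribute fallback: use the local value, else the imported one."""
    if attr in local:
        return local[attr]
    return imported.get(attr)


def lookup_full_copy(attr, local):
    """Full duplication: only the local (web-interface) copy is consulted."""
    return local.get(attr)


imported = {"description": "My project", "programming-language": "Java"}

# Fallback model: the local source holds only attributes that were edited.
local_sparse = {}
local_sparse.pop("programming-language", None)  # try to "remove" the attribute
# The lookup falls through to the imported source, so the value is still shown.
resurfaced = lookup_with_fallback("programming-language", local_sparse, imported)

# Full-copy model: the local source starts as a duplicate, so a delete sticks.
local_full = dict(imported)
del local_full["programming-language"]
removed = lookup_full_copy("programming-language", local_full)
```

With the fallback model the only ways out are deleting from the imported
source or storing some kind of "removed" marker locally, which is the
extra bookkeeping I would like to avoid.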

Sander


Ross Gardler

May 25, 2011, 12:13:48 PM
to simal-con...@googlegroups.com
On 21/04/2011 16:15, Sander W G van der Waal wrote:
> My suggested solution is:
> - Keep a reference to each different source and a timestamp
> so we know when and from where a source has been assigned
> (proposed earlier and already in the wiki [2]).


See also http://code.google.com/p/simal/issues/detail?id=195 and the
issues it is blocking

> - Initially keep the mechanism for displaying the project detail
> page the same (ie. collate the information from all the sources)

That's a simplification of what it does. In fact there is a "preference"
mechanism. So where, for example, multiple descriptions are provided, we
select a preferred one. I can't now recall which selection mechanism I
implemented, but the objective is described in
http://code.google.com/p/simal/issues/detail?id=190

> - When editing a project for the first time, create a new source
> reference with a designated simal:source, eg. 'web-interface'.
> - Duplicate the information from all the source projects to this
> new source project.
> - Edit and save all changes to this new source project
> - From now on always use the project with source 'web-interface'

I'd argue that http://code.google.com/p/simal/issues/detail?id=190 is
the way to go. Once that is implemented then we know which data source
we are editing. That being said, since the external data comes from some
third party we can't really edit it. Therefore we need some local
version of data as you propose.

I'm not convinced this should be RDF though. My experience of RDF is
that it does not scale well. Shouldn't we put a DBMS in front of this
data to both solve this problem and improve performance?

http://code.google.com/p/simal/issues/detail?id=267 suggests using a
cache, but it would equally be solved with a DBMS to "cache" preferred
and edited data. We would then only need to query the RDF data when
doing deep searches (or whatever the right term is for searching the
full tree rather than the displayed data).

Ross

Ross Gardler

May 25, 2011, 12:24:17 PM
to simal-con...@googlegroups.com

See http://code.google.com/p/simal/issues/detail?id=190 and my earlier
reply.

I believe my proposal solves this problem if we extend it to allow rules
like:

<sourcePreferences>
  <default>
    <pattern override="true">www.jisc.ac.uk/*</pattern>
    <pattern>pims.oss-watch.ac.uk/*</pattern>
    <pattern>prod.cetis.ac.uk/*</pattern>
  </default>
  <doap:description override="true">
    <pattern>pims.jisc.ac.uk/*</pattern>
    <pattern>prod.cetis.ac.uk/*</pattern>
  </doap:description>
</sourcePreferences>

This is the same as the example in the issue except I've added
override="true" to the first default pattern. What this is intended to
mean is that we use the local version of the attribute (if it exists)
unless the value from www.jisc.ac.uk/* is newer.

In the description element I'm saying override our local description if
any of the third party data has changed (I doubt we would use this, but
see the next example for a real use).

I believe this solves the problem Steve is concerned about, which is an
important problem.
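
A minimal sketch of how such rules might be evaluated for a single
attribute. This is an assumption-laden illustration, not the mechanism
from issue 190: the resolve function, the tuple shapes and the integer
timestamps are all hypothetical, and patterns are treated as shell-style
globs listed in order of preference.

```python
from fnmatch import fnmatch


def resolve(local, candidates, patterns):
    """Pick the value to display for one attribute.

    local      -- (value, timestamp) edited locally, or None
    candidates -- list of (source_url, value, timestamp) from third parties
    patterns   -- list of (glob, override) in order of preference
    """
    for glob, override in patterns:
        for url, value, stamp in candidates:
            if not fnmatch(url, glob):
                continue
            if local is None:
                return value  # no local edit: the first preferred source wins
            if override and stamp > local[1]:
                return value  # override="true" and the third-party value is newer
    return local[0] if local else None


# The <default> rules from the example: only www.jisc.ac.uk/* may override.
default_patterns = [
    ("www.jisc.ac.uk/*", True),
    ("pims.oss-watch.ac.uk/*", False),
    ("prod.cetis.ac.uk/*", False),
]
```

An element-level override attribute (as on doap:description) would
simply set the flag on every pattern in that element, and setting it to
false everywhere means a local edit always wins.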

>
> However, that makes it very difficult to remove attributes. If I wanted
> to remove the Description (or one of the programming languages, as a more
> likely use case), I would either have to remove it from the source or
> leave it in place and find an alternative way of denoting that it had been
> removed. I therefore thought that duplicating the whole source as a
> web-interface specific source would be better, but I'd very much welcome
> suggestions of how we can otherwise do this.

The above also solves this problem, e.g.

<programming-language override="false">
  <pattern>simal.oss-watch.ac.uk/*</pattern>
  <pattern>pims.jisc.ac.uk/*</pattern>
  <pattern>prod.cetis.ac.uk/*</pattern>
</programming-language>

Now we can ensure that the programming-language is set to whatever we
want, and whatever happens elsewhere is ignored.

Of course we probably don't want to ignore it entirely. We probably want
some reporting tools that tell us when our data is "out of date".

Ross
