Feedback to G4 Open Data Framework Project


jkonga

Nov 3, 2010, 4:57:37 PM
to DataTO
Hi folks,

Some of you may be aware I’m working on an Open Data Framework project
for the “G4”, being Vancouver, Toronto, Edmonton and Ottawa (in order
of site launches). This is another opportunity to identify challenges
and look for ways to make Open Data better in your community
and the larger community.

Here are some things I’d be interested in seeing discussed:
- What data standards/formats should be available in Open Data sites …
do you care?

- The Terms of Use agreements have some issues – this has been
recognized and they’re being reviewed … have you come across a really
good TOU?

- The Open Data website is meant to serve many groups from the Hack
community to ordinary citizens … what will make these sites more
useful?

- What are the major opportunities you see for Open Data moving
forward … time for some blue skying

- How about joining the Open Data communities together … not just in
hackathons but ongoing dialogue?

Looking forward to the discussion!

Richard Weait

Nov 3, 2010, 5:57:41 PM
to dat...@googlegroups.com
On Wed, Nov 3, 2010 at 4:57 PM, jkonga <jko...@sympatico.ca> wrote:
> Hi folks,
>
> Some of you may be aware I’m working an Open Data Framework project
> for the “G4” being, Vancouver, Toronto, Edmonton and Ottawa (in order
> of site launches).  This is another opportunity to identify challenges
> and look for opportunities to make Open Data better in your community
> and the larger community.
>
> Here’s some things I’d be interested in seeing discussed:
> -       What data standards/formats should be available in Open Data sites …
> do you care?
>
> -       The Terms of Use agreement have some issues – this has been
> recognized and they’re being reviewed … have you come across a really
> good TOU

Yes. The Public Domain Dedication and License (PDDL) [1] [2] is a
mature license, drafted and maintained by international open data
professionals, and vetted by Open Data communities worldwide. The
PDDL is written and maintained by the Open Data Commons[3] at the Open
Knowledge Foundation.[4]

License proliferation is a blight on the Open Data community and has
unintended consequences that prevent the use and distribution of data.

Publishing your data with prominent PDDL terms makes your data very
obviously available to the widest possible audience.

> -       The open Data website is meant to serve many groups from the Hack
> community to ordinary citizens … what will make these sites more
> useful

RSS feeds and announce mailing lists to announce updates and new data sets.
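
As a rough illustration of the RSS suggestion, here is a minimal sketch of generating an update feed with the Python standard library. It is an assumption-laden example, not something any of the G4 sites is known to publish: the feed title, the example.org URLs, the dataset names, and the `build_feed` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, updates):
    """Build a minimal RSS 2.0 document announcing dataset updates."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "New and updated data sets"
    for update in updates:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = update["title"]
        ET.SubElement(item, "link").text = update["link"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(update["updated"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical catalogue entries; a real site would pull these from its own records.
updates = [
    {
        "title": "Transit stops (updated)",
        "link": "https://example.org/data/transit-stops",
        "updated": datetime(2010, 11, 3, 12, 0, tzinfo=timezone.utc),
    },
    {
        "title": "Zoning boundaries (new)",
        "link": "https://example.org/data/zoning",
        "updated": datetime(2010, 11, 4, 9, 30, tzinfo=timezone.utc),
    },
]

feed_xml = build_feed("Open Data updates", "https://example.org/data", updates)
print(feed_xml)
```

Any feed reader could then poll the resulting document, so consumers would not have to re-check the catalogue by hand.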

[1] Summary http://www.opendatacommons.org/licenses/pddl/summary/
[2] License http://www.opendatacommons.org/licenses/pddl/1.0/
[3] http://www.opendatacommons.org/
[4] http://okfn.org/

Stephen van Egmond

Nov 3, 2010, 6:07:51 PM
to dat...@googlegroups.com
> Here’s some things I’d be interested in seeing discussed:
> - What data standards/formats should be available in Open Data sites …
> do you care?

The formats that are as close as possible to the original format, and yet are free of encumbrances. Being more specific depends on the subject matter of the original dataset.

Most developers can adapt to formats that are textual and compact. XML, JSON, even YAML are perfectly reasonable.

And I hope we don't go down a semantic-web rabbit hole. Bashing human taxonomies into formal computer languages is hopeless nonsense.

> - The Terms of Use agreement have some issues – this has been
> recognized and they’re being reviewed … have you come across a really
> good TOU

Not yet. For instance, I am using the TTC Next Vehicle Arrival data, and it is being served up by the vendor that provides their NVAS prediction service. They have a license agreement that I had to agree to, but I don't see why that need be so.

In my view, open data access must be built into vendor agreements from the outset. For instance, significant portions of various city datasets are stripped out before release because Stats Can attaches licensing strings to their data. That's nuts!

> - The open Data website is meant to serve many groups from the Hack
> community to ordinary citizens … what will make these sites more
> useful

I don't think this data is useful for citizens. I think that's a useless demographic to aim for.

Frequent, preferably automatic, data updates are very important.

> - What are the major opportunities you see for Open Data moving
> forward … time for some blue skying

This is infrastructure. The point is you don't know how it's going to be used. What is the opportunity made possible from building a MaRS Discovery District? Building a streetcar line? A water main? A hydro pole?


> - How about joining the Open Data communities together … not just in
> hackathons but ongoing dialogue

Twitter seems to work just fine.

How about a conference? I'd go.

Neil McEvoy

Nov 4, 2010, 6:43:52 AM
to dat...@googlegroups.com

A conference is a good idea. At the Cloud Camp I attended recently, Open
Data was a conversation that came to the fore quite a few times, and it was
suggested to run a separate event to cater for it.

I can help with organizing, logistics, corporate sponsors etc.

Neil.


Kieran Huggins

Nov 4, 2010, 12:56:30 PM
to DataTO
While a conference does sound all official and whatnot, why don't we
start with drinks before renting out a hall? The Rhino is a typical
nerdy meeting space with a good selection of beer.

Geoffrey Wiseman

Nov 4, 2010, 2:46:20 PM
to dat...@googlegroups.com
On Wed, Nov 3, 2010 at 4:57 PM, jkonga <jko...@sympatico.ca> wrote:
> Here’s some things I’d be interested in seeing discussed:
> -       What data standards/formats should be available in Open Data sites …
> do you care?

Yes and no; at this stage, I'm happy to see more and more data being rolled out in whatever format makes sense to the people providing the data, simply to ease the opening of data.  

In the long term, converging as much as possible on common data formats and common APIs will make it a lot easier to build an app that works with open data from more than one source.  

I'd also love to see more of this data available through an API rather than a big data dump; at the moment, it requires the data consumer to do much of the work to determine when the data has changed, what has changed, filter and subset the data, and so forth.  The more these are made available as sensible APIs, the easier it will be for people to build applications based on the current data.  That requires some effort on the part of the data providers, and may increase the load on the machines and network connections serving the data in some scenarios, but in the long term, I think this is a worthwhile tradeoff.
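
To make the "what has changed" burden concrete, here is a minimal sketch of the work a data consumer ends up doing against plain dumps: comparing two snapshots of the same dataset. It assumes the dump is CSV and that each row carries a stable `id` column; both are assumptions for illustration, as are the sample rows and the `load_rows` and `diff_dumps` helpers.

```python
import csv
import io


def load_rows(csv_text, key):
    """Index the rows of a CSV dump by the value in the `key` column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def diff_dumps(old_text, new_text, key="id"):
    """Return (added, removed, changed) sets of keys between two dumps."""
    old = load_rows(old_text, key)
    new = load_rows(new_text, key)
    added = set(new) - set(old)
    removed = set(old) - set(new)
    # A row counts as changed if it is in both dumps but any field differs.
    changed = {k for k in set(old) & set(new) if old[k] != new[k]}
    return added, removed, changed


# Hypothetical snapshots of the same dataset, downloaded on different days.
old_dump = "id,name,status\n1,Main St,open\n2,King St,closed\n3,Queen St,open\n"
new_dump = "id,name,status\n1,Main St,open\n2,King St,open\n4,Bay St,open\n"

added, removed, changed = diff_dumps(old_dump, new_dump)
print(sorted(added), sorted(removed), sorted(changed))
```

An API that exposed "changes since a given date" would let the provider do this once instead of every consumer repeating it, and it would not depend on each dataset happening to have a stable key.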
 
  - Geoffrey
--
Geoffrey Wiseman
http://www.geoffreywiseman.ca/

Geoffrey Wiseman

Nov 4, 2010, 2:49:33 PM
to dat...@googlegroups.com
Yes, a "camp" style regular meeting is probably a better starting place than a Conference, IMO, although I'm not opposed to either.

Being an east-ender, I'm a little irritated by the sheer volume of technical events that happen in the west end, so the Rhino wouldn't be my first choice, but I recognize that I may be in the minority on that point.  ;)

Kimberly Silk

Nov 4, 2010, 3:05:21 PM
to dat...@googlegroups.com

Hi everyone,

I’m interested in finding out if the Open Data sites are tracking what data sets are downloaded; I’d like to create an index that measures the effectiveness of open data sites, and one of the measures (IMHO) is how frequently the catalogues are accessed.

Do any of you know if any Open Data projects (Cdn or int’l) are measuring this, and/or making their measurements public?

Thanks,
Kim

------------------------------------------------------------
Kimberly Silk, MLS
Data Librarian, The Martin Prosperity Institute
Joseph L. Rotman School of Management, University of Toronto
President, Faculty of Information Alumni Association

Office: 416-673-8586
Mobile: 416-721-8955
Email: kimber...@martinprosperity.org
Twitter: @kimberlysilk

What REALLY goes on at a think tank: blog.martinprosperity.org
Twitter: @martinprosperiT

**********************************************
SAVE THE CENSUS LONG FORM 2011
http://capdu.wordpress.com
**********************************************

Mark Kuznicki

Nov 4, 2010, 4:03:01 PM
to dat...@googlegroups.com
I think that an open data monthly meetup is a great idea! Any ideas on a particular day of the month that doesn't conflict with the many dev-friendly meetups that are already out there?

Thoughts?


On Thu, Nov 4, 2010 at 12:56 PM, Kieran Huggins <kieran....@gmail.com> wrote:

Richard Weait

Nov 4, 2010, 5:21:24 PM
to dat...@googlegroups.com
On Thu, Nov 4, 2010 at 4:03 PM, Mark Kuznicki <ma...@remarkk.com> wrote:
> I think that an open data monthly meetup is a great idea! Any ideas on a
> particular day of the month that doesn't conflict with the many dev-friendly
> meetups that are already out there?
> Thoughts?

The OpenStreetMap meetup is mostly-monthly and definitely OpenData
friendly. Next one is Monday 15 November, at C'est What - 67 Front
Street East, http://osm.org/go/ZX6BrdQbe-?m

http://www.meetup.com/OpenStreetMap-Toronto/

Kieran Huggins

Nov 4, 2010, 5:26:53 PM
to DataTO
Hmm - that conflicts with Ruby Pub Night, which is very popular among
creative devs.

I'd like to suggest the first Monday of the month, though we should
probably get the ball rolling and hold the first this coming Monday.

C'est What works for me, and it's a nice east-west compromise.

So unless there are any major issues, Kevin and I will be there Monday
(Nov 8) @ 6:30 - hope to meet a bunch of you fine folks there too!

Cheers,
Kieran

Brian Gilham

Nov 4, 2010, 5:36:35 PM
to dat...@googlegroups.com, DataTO
I'll try my best to attend!

Typos by iPhone.

Marcel Fortin

Nov 5, 2010, 8:45:57 AM
to dat...@googlegroups.com
From the academic standpoint here are my comments, concerns and issues.
- formats:
Yes, I care very much. While I understand the need to provide access to
users in a variety of formats, a format as close to the original as
possible is preferable if only one is to be provided. An API or a markup
format may be good for the hacker, but the raw data are what we need for
cartographic use and analysis. And by raw, I am mostly referring to GIS
formats such as Shapefile or AutoCAD formats such as DXF or DWG.
Therefore, I would suggest making the data available in several formats
if possible. If not, there are many tools out there for converting data;
let's not strip out datasets from the source. Let's maintain the
integrity of the data from the originator as much as possible.

- License. I think Richard has done a good job of exposing the pitfalls
of the licensing and I think his comments should be incorporated into
any changes to the license agreements.

- In terms of what I'd like to see, especially for Toronto, there are
datasets you know exist and that would be incredible resources. For
instance, the orthophotos for Toronto from 2003 have a resolution of
7cm or so; that is, one pixel is 7cm. Let's make those and any newer
ones available. These are crucial for academic research and for
documenting many things map data miss. There are also the cities' land
use and zoning data. While zoning is available via a web page in
Toronto, the data are what are needed. Viewing a web page is one thing;
incorporating it into your other data is another.

Planning departments need to open up a bit and release data that are
crucial for analysis. Land use is one of the standard things cities
formerly released to the public via maps and reports. Trying to find
such information now is often futile and mostly an exercise in
frustration. One day a PDF map (locked, no less) of land use will be on
the toronto.ca web pages and the next day gone. Good luck trying to find
the previous official plans or parcel data as well. We at the U of
Toronto have land use and parcel maps for many major cities in Canada
throughout the 60s, 70s and 80s, but good luck trying to find
any maps or data anywhere that replaced these maps in
the 90s and 2000s. These are all gone. Which leads me to my next point
that I feel very strongly about.

The data should be archived. Too much data and other digital information
are disappearing. Those Toronto parcels for the 1990s that were in
digital format are now available in paper format only. We also have to
remember that versioning and backing up are not archiving. The Ontario
Ministry of Natural Resources now has an archiving plan for all Ontario
Geospatial Data. A model therefore exists. Vancouver used to archive its
data (and I know they are looking into doing this again); there's
another model. I think it is imperative that all governments back up
all their data and make the plans available to the public. And this
stands for all data, not just the open datasets.

Marcel Fortin
GIS and Map Librarian
University of Toronto
http://mdl.library.utoronto.ca

Stephen van Egmond

Nov 5, 2010, 10:46:34 AM
to dat...@googlegroups.com

On 2010-11-04, at 5:26 PM, Kieran Huggins wrote:

> C'est What works for me, and it's a nice east-west compromise.
>
> So unless there are any major issues, Kevin and I will be there Monday
> (Nov 8) @ 6:30 - hope to meet a bunch of you fine folks there too!

Works for me!

Mark Kuznicki

Nov 5, 2010, 12:02:51 PM
to dat...@googlegroups.com
Looks like a good turnout. Philip Smith put it up on Plancast: http://plancast.com/p/2vfc