Data Warehouse idea, using GitHub.com with an index

Will Newby

May 26, 2013, 2:22:32 AM
to wikihouse...@googlegroups.com
Hello,

I'm interested in building the "Data Warehouse" for the "Wikipedia of Things" that Alastair mentioned. Here's my idea: use GitHub to store the data (plans, revisions, etc.) and build a searchable index of these projects, essentially by repository, so 1 GitHub repo == 1 set of wikihouse (or general wikiobject?) plans. We could build the index as a Google App Engine (GAE) app and make it query GitHub for raw data.

I'm taking inspiration from Homebrew, with their GitHub-based hosting. When you install software via Homebrew, it clones the github.com repo in order to build the new software. We could do the same, except with wikihouse data. To create new houses, we would have users create a repository for each individual design. As long as we enforce a naming schema (i.e. wikihouse-<design name>), we can use the GitHub API to query for them and create an index. Once we have this, and as more people start creating their own versions, it becomes possible to search the index, attribute plans to people, and let people fork repositories in order to make their own changes.
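
To make the naming-schema idea a bit more concrete, here's a rough sketch in Python. It assumes GitHub's repository search endpoint with the `in:name` qualifier, and that each result item carries `name` and `full_name`. The helper names (`build_search_url`, `design_names`) are made up for illustration, and the actual HTTP request is left out.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"
PREFIX = "wikihouse-"  # the naming schema proposed above


def build_search_url(prefix=PREFIX, per_page=100):
    """Return a repository-search URL matching `prefix` in repo names."""
    query = {"q": f"{prefix} in:name", "per_page": per_page}
    return f"{SEARCH_ENDPOINT}?{urlencode(query)}"


def design_names(search_response, prefix=PREFIX):
    """Map 'owner/repo' -> design name for repos that follow the schema."""
    designs = {}
    for repo in search_response.get("items", []):
        name = repo["name"]
        # Search matching is loose, so re-check the prefix ourselves
        # and skip repos with nothing after it.
        if name.startswith(prefix) and len(name) > len(prefix):
            designs[repo["full_name"]] = name[len(prefix):]
    return designs
```

The prefix re-check matters because a name search can also return repos that merely contain "wikihouse-" somewhere in the name.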

We get the stability and openness of GitHub, while creating an easy index which is searchable and displayable. We can even display the "Top 5" wikihouses, which can be the "top starred" repos (wikihouse plans).
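
The "Top 5" part could be as small as the sketch below. It assumes each repo dict has the `full_name` and `stargazers_count` fields that the GitHub API returns for repositories; `top_designs` is a hypothetical helper name.

```python
def top_designs(repos, limit=5):
    """Return (full_name, stars) for the `limit` most-starred repos."""
    ranked = sorted(
        repos,
        key=lambda repo: repo.get("stargazers_count", 0),
        reverse=True,  # most stars first
    )
    return [(r["full_name"], r.get("stargazers_count", 0)) for r in ranked[:limit]]
```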

TL;DR Here's what we would need:
- A Google App Engine app (index.wikihouse.cc, for example)
- GitHub.com

In GAE:
- Create a cron job to query the GitHub API (hourly? daily?) for "wikihouse-" and parse the results into a Google Cloud Datastore table for easy querying.
- Create a query page where users can search for designs and see them displayed in a reasonable fashion.
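
Here's a sketch of what the cron job might store and how the query page might search it. A plain Python list stands in for the Datastore table, so nothing here is GAE-specific. `DesignRecord`, `to_record` and `search` are hypothetical names, and the fields are only an assumption about what we'd want to keep from each GitHub result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignRecord:
    full_name: str    # e.g. "alice/wikihouse-barn"
    design: str       # repo name with the "wikihouse-" prefix removed
    description: str
    stars: int


def to_record(repo, prefix="wikihouse-"):
    """Turn one GitHub repo dict into an index record."""
    name = repo["name"]
    return DesignRecord(
        full_name=repo["full_name"],
        design=name[len(prefix):] if name.startswith(prefix) else name,
        description=repo.get("description") or "",  # description may be null
        stars=repo.get("stargazers_count", 0),
    )


def search(records, term):
    """Case-insensitive substring search, most-starred hits first."""
    needle = term.lower()
    hits = [
        r for r in records
        if needle in r.design.lower() or needle in r.description.lower()
    ]
    return sorted(hits, key=lambda r: r.stars, reverse=True)
```

In the real app the list would be replaced by Datastore queries, but the record shape and the search behaviour could stay roughly the same.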

In GitHub.com:
1) Enforce a naming schema for GitHub repos (i.e. wikihouse-<user defined name>)
2) Require a wikihouse.yaml (or other name/format) file in the root of the repository for any other metadata we may want (e.g. object type, approx size, amount and type of materials required, etc.)
Bonus: Use GitHub OAuth to log in to the WebGL "create your WikiHouse" interface for a super-clean UI/UX which can auto-setup repositories for users with the appropriate files.
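
As a sketch of the metadata check, assuming the wikihouse.yaml file has already been parsed into a dict by some YAML library. The field names (`object_type`, `approx_size_m`, `materials`) are placeholders I made up from the examples above, not an agreed format.

```python
# Hypothetical required fields and the Python types we'd accept for each.
REQUIRED_FIELDS = {
    "object_type": str,
    "approx_size_m": (int, float),
    "materials": list,
}


def validate_metadata(meta):
    """Return a list of problems; an empty list means the metadata looks usable."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in meta:
            problems.append(f"missing field: {field}")
        elif not isinstance(meta[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems
```

The indexer could skip or flag any repo whose metadata comes back with problems, rather than failing the whole cron run.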

What do we think? I'm especially curious for feedback, brokenness, or improvements!

Cheers!
Will

thruflo

Jul 9, 2013, 10:15:25 AM
to wikihouse...@googlegroups.com
Hi Will,

Great to hear from you and thanks for sharing the idea.

I think this kind of architecture would be great.  We've had similar thoughts and have been looking at how to do the same thing for OpenDesk as well.

The key thing we've not yet worked out is a strategy for handling large files, ideally with meaningful changesets.  Tools like git annex work for large files but place quite a barrier on top of the usual GitHub workflow, and the text / source code formats for 3D files all seem to change all over the shop in response to the smallest design change (as opposed to producing a clean source code changeset).

It may be that there are solutions to this, and it may well be that there's value in repos + index without solving them.  Certainly, we could integrate it into the wikihouse site with a bit of work, e.g.: to publish via github and to show forks alongside a library design, etc.

James.

Will Newby

Jul 9, 2013, 1:00:58 PM
to thruflo, wikihouse...@googlegroups.com
Hi James,

Cool! I'm glad you like it! 

Yeah, that is a tricky thing. The cool thing about GitHub is that they've built an STL viewer, so we shouldn't be strictly limited to text outputs of the files (https://github.com/blog/1465-stl-file-viewing). I've emailed their support to see if they plan on supporting any other 3D formats. If they do, I don't see the file changes being an actual issue (although large files still could be).

I'm still familiarizing myself with Python on App Engine, but it's looking pretty cool. Would this sit in its own GAE project, or would you want to integrate it into the existing wikihouse.cc codebase? I'd say it's likely *much* cleaner to make it its own codebase.

Thoughts?
Cheers!
Will


---------------
Will Newby
Freelance SysAdmin/Coder/Brain
Phone: 612.208.3806
Email: will...@gmail.com

James Arthur

Jul 10, 2013, 8:56:50 AM
to Will Newby, wikihouse...@googlegroups.com
Hi Will,

Yup, it could run either as a separate app or as a parallel WSGI app hosted on the main wikihouse-cc.appspot.com instance, i.e. it would have nothing to do with the current app except that it shares the same app.yaml.
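
A minimal sketch of the "parallel WSGI app" idea in plain Python. On App Engine the routing would normally be done by URL handlers in app.yaml instead of a dispatcher function. The app names and the `/index` path are assumptions for illustration only.

```python
def index_app(environ, start_response):
    """Hypothetical stand-in for the design index app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"wikihouse index"]


def main_app(environ, start_response):
    """Hypothetical stand-in for the existing site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"main site"]


def dispatch(environ, start_response):
    """Send /index and anything below it to the index app, the rest to the main app."""
    path = environ.get("PATH_INFO", "")
    if path == "/index" or path.startswith("/index/"):
        return index_app(environ, start_response)
    return main_app(environ, start_response)
```

The two apps share nothing but the routing, which is the point: the index can be developed and tested on its own.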

James.

Tom Kluyskens

Jul 30, 2013, 10:33:16 PM
to wikihouse...@googlegroups.com, Will Newby
Hey Will, James,

FWIW, Spoke Creator boasts the following features, some of them git-like:

- versioning
- per-version comments & thumbnails
- archiving
- permissions (private, public and selection of users)

And our asset structure is as follows:

- spokes: contain all the parametric logic
- presets: only contain the parameters of a spoke

Both spokes and presets can be versioned, made public, etc.
Users can 'branch' off other users' (or their own) spokes and presets: save their own copy and iterate from there.

Presets can be very lightweight and versatile vehicles of information exchange, instead of having to pass the whole parametric logic back and forth.  Presets also carry over well across versions of the originating spoke, in a pretty robust way.  So you can change the underlying parametric logic, and older presets will attempt to 'mould' themselves around the changes.


This is to accommodate a situation like this:

WikiHouse New Zealand build a 'spoke' for their flavour of the WikiHouse using Spoke Creator in editor mode.  This spoke builds a house configurator that contains all the parametric logic, how it is presented online, and defines which parameters can be controlled by a user of the configurator.  They make the spoke public, and post a URL on their site, linking to the configurator.
A user clicks on the URL (is asked to log in to Spoke Creator if not yet done), configures the house, and saves a preset under his Spoke Creator account.

WikiHouse NZ people update the parametric logic, add an extra module.

User re-opens his preset saved a few weeks ago.  He has two options:

- open the preset using the exact spoke it came from (we never delete or overwrite anything)
- open the preset using the latest version of the WikiHouse NZ spoke: the preset applies as well as possible to the updated spoke
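
To illustrate the second option, here is one way a preset could 'mould' itself around an updated spoke. This is only a guess at the general idea, not how Spoke Creator actually does it. The parameter model (name mapped to minimum, maximum, default) and the `apply_preset` helper are assumptions.

```python
def apply_preset(spoke_params, preset):
    """Apply an older preset to a (possibly newer) spoke.

    spoke_params maps parameter name -> (minimum, maximum, default).
    Returns (applied values, names in the preset the spoke no longer has).
    """
    applied = {}
    for name, (minimum, maximum, default) in spoke_params.items():
        if name in preset:
            # Keep the saved value, clamped to the spoke's current range.
            applied[name] = min(max(preset[name], minimum), maximum)
        else:
            # Parameter added after the preset was saved: use the default.
            applied[name] = default
    dropped = sorted(name for name in preset if name not in spoke_params)
    return applied, dropped
```

For example, a preset saved with `width` 7.5 against a spoke that now caps width at 6.0 and has gained a `modules` parameter would come back with `width` 6.0 and `modules` at its default.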

Tom

Will Newby

Jul 31, 2013, 11:23:57 PM
to Tom Kluyskens, wikihouse...@googlegroups.com
Hi Tom,

Interesting. It looks like you already have the save and display interfaces sorted out too. It's starting to sound as if we don't really need to build anything to support this. 

What do you think, James et al?


thruflo

Aug 2, 2013, 5:43:31 AM
to wikihouse...@googlegroups.com, Tom Kluyskens
Hey,

I think this is really interesting and Spoke is indeed awesome!  That said, it would be good to understand:

* accessibility of designs mastered as spoke logic
* impact on current designs / extension / habits / docs all being SketchUp based

We have good links with SketchUp, who are looking at how to improve the authoring and integration experience.  If Spoke can output 3D models that can then be manually hacked in SketchUp, that would be amazing and we could perhaps feed this into the SketchUp devs.

As with Spoke, SketchUp is also proprietary software.  I know also that Tav had some suggestions around an open JSON + WebGL approach to Spoke.  If we're going to base an open product library on the tech, a library that's intended to be a public domain bedrock of universally accessible designs, then the restrictions or otherwise of the data structures and software tools are important.

James.