DataLFT


Bryan Fuselier

Oct 20, 2009, 11:03:34 AM
to lftd...@googlegroups.com
Good Morning Everyone,

I wanted to start out by thanking everyone who attended last night's meeting about the city-wide open access database, which we have termed 'DataLFT'.

In summary:
The group decided that the end goal of this group should be to provide a standardized means of retrieving/storing data related to the Lafayette Area.
Many questions were raised on how to achieve that technologically, with many "I don't know's" and "WTH???"... j/k...not really but that's the gist.
We all determined that first and foremost, we have to learn what data is available in what formats and where the largest needs lie in terms of type of data sets.
Basically, we agreed to mimic the datasf.org site for now to gather these resources with some analysis to determine which data sets to focus on first.
The outline of our stripped down datasf.org clone is:
  • Home page
    • Listing of datasets and descriptions
    • Simple search
  • DataSet submission
    • Name of data set
    • URL to data set
    • Description of data set
    • Type of data set
    • (Optional) Instructions on how to use it as a feed
  • Dataset suggestion
    • Listing of dataset suggestions
      • Ability for other members to vote on whether they would like to have that data as well
    • Entry of new dataset suggestion
      • Suggested Name
      • Description of data the user would like to see
  • User Management
    • Sign up form
      • email
      • password
      • job title??? - I'm thinking to use for logistics of what type of person wants what type of data? Open for suggestions.
      • bot prevention
  • Admin Section
    • Ban/Delete abusive users
    • Approval of DataSet submission
      • 3 Administrators should review and agree the data set is valid
Please feel free to update anything I may have missed or forgotten.
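
As a rough illustration only (nothing here was decided at the meeting), the DataSet submission fields and the "3 Administrators" approval rule from the outline could map onto two tables. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and helper functions are all assumptions made up for the example.

```python
import sqlite3

# From the outline: 3 administrators should agree a data set is valid.
REQUIRED_APPROVALS = 3

# Hypothetical schema; column names mirror the submission fields in the outline.
SCHEMA = """
CREATE TABLE datasets (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    url         TEXT NOT NULL,
    description TEXT NOT NULL,
    type        TEXT NOT NULL,
    feed_howto  TEXT                -- optional instructions for use as a feed
);
CREATE TABLE approvals (
    dataset_id  INTEGER NOT NULL REFERENCES datasets(id),
    admin_email TEXT NOT NULL,
    PRIMARY KEY (dataset_id, admin_email)  -- each admin counts once per data set
);
"""


def open_db(path=":memory:"):
    """Open a database and create the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def submit_dataset(conn, name, url, description, type_, feed_howto=None):
    """Store a submission and return its id; it stays hidden until approved."""
    cur = conn.execute(
        "INSERT INTO datasets (name, url, description, type, feed_howto) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, url, description, type_, feed_howto),
    )
    return cur.lastrowid


def approve(conn, dataset_id, admin_email):
    """Record one admin's approval; repeat approvals by the same admin are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO approvals VALUES (?, ?)", (dataset_id, admin_email)
    )


def search(conn, term):
    """Simple search: names of approved data sets whose name or description match."""
    like = f"%{term}%"
    rows = conn.execute(
        """SELECT d.name FROM datasets d
           JOIN approvals a ON a.dataset_id = d.id
           WHERE d.name LIKE ? OR d.description LIKE ?
           GROUP BY d.id
           HAVING COUNT(*) >= ?
           ORDER BY d.name""",
        (like, like, REQUIRED_APPROVALS),
    )
    return [r[0] for r in rows]
```

Dataset suggestions and member votes could likely follow the same pattern as the approvals table, with one row per (suggestion, member) pair.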

Today I'll be setting up a virtual machine with CentOS installed and Mercurial. I'll be setting up the repository on http://bitbucket.org. Register now if you want or I'll send out the final information when I'm done. If you want to follow me on there I'm bryanfuselier. You will receive notification that I've created a repo when I do.

Which leads me to a small challenge I want to offer:
Last night, when the question "What is the platform our website will be in?" was raised, naturally everyone brought up their specialties, which were:
Coldfusion (with 4 votes)
PHP/Python (with 2 votes)

Everyone seemed to agree on MySQL as the database backend since everyone had worked with it. I started thinking after I went home last night that we're the only developer group that doesn't focus on a specific platform internally. So the challenge becomes:

Let's try something none of us have done. Why not? What's wrong with learning something different when we have each other to lean on? With such a small project, we won't be looking at large amounts of time to figure out what we want to do. If you all are up to this challenge, send back your feedback on what you would like to work with and why. Here's mine:

Tornado - www.tornadoweb.org
 - We are discussing building a user-generated content site. While we don't expect to have 10,000 users, why not go ahead and build it in a service that solves the C10K Problem? One thing that was mentioned was preparing this for usage in other areas. What if Los Angeles adopts it and they end up with over 10,000 users?
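
For anyone unfamiliar with the C10K idea, here is a loose sketch of the underlying technique: one single-threaded event loop keeps thousands of waiting clients in flight at once instead of dedicating a thread to each. To be clear, this is NOT Tornado's API. It uses Python's standard asyncio module as a stand-in, and `asyncio.sleep` simulates waiting on a slow network client; every function name below is made up for the example.

```python
import asyncio


async def handle_client(client_id, wait_seconds):
    # Stand-in for waiting on a slow client. While this coroutine waits,
    # the event loop is free to service the other clients.
    await asyncio.sleep(wait_seconds)
    return client_id


async def serve_many(n_clients, wait_seconds=0.05):
    # Start every client at once on a single thread and collect the results
    # in the order the clients were created.
    tasks = [handle_client(i, wait_seconds) for i in range(n_clients)]
    return await asyncio.gather(*tasks)


def run_demo(n_clients=10_000):
    """Run the simulated clients to completion and return their ids."""
    return asyncio.run(serve_many(n_clients))
```

Because the waits overlap, the total run time stays close to a single client's wait rather than growing with the number of clients, which is the property an event-loop server relies on.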

SQLite3 - www.sqlite.org
 - Small footprint because it runs only in script when necessary. Actually, I could re-list this: http://www.sqlite.org/features.html but it pretty much speaks for itself. While it definitely doesn't work for our "end product" of storing massive amounts of data, it would serve well for this small site and perform extremely well. Plus, being file based, it can easily be handed over to people who want a copy of it (to protect people's privacy we can create two db files: one for user management and the other for site data).
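
A minimal sketch of the two-file idea, assuming Python's built-in sqlite3 module: the site data lives in one file that can be handed out, and the user accounts live in a second file that is attached at run time and never shared. The file names, table names, and columns are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_site(data_path, users_path):
    """Open the shareable site-data file and attach the private users file."""
    conn = sqlite3.connect(data_path)
    # ATTACH makes the second file reachable through the "users." prefix
    # on the same connection, so queries can still span both files.
    conn.execute("ATTACH DATABASE ? AS users", (users_path,))
    conn.execute("CREATE TABLE IF NOT EXISTS main.datasets (name TEXT, url TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users.accounts (email TEXT, password_hash TEXT)"
    )
    conn.commit()
    return conn


def table_names(path):
    """List the tables stored in a single database file."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        return [r[0] for r in rows]
    finally:
        conn.close()
```

Handing someone a copy then means copying only the site-data file; the accounts table is simply not in it.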

Let's keep the conversation going. I want suggestions and ideas. Thanks everyone!!!

Bryan F.

Ryan Letulle

Oct 20, 2009, 11:19:45 AM
to lftd...@googlegroups.com
I have no problem personally with SQLite.  It seems to be accepted across the board, in open source and commercially (e.g. Django, Adobe AIR).  I have personally never used it in production though.

I have never heard of Tornado.  That in itself doesn't make it bad ;).  After a brief reading it seems to be a Python framework with benefits?  I have also never used Python in production but I think it's been proven by many to be one of the best languages for the web.

IMO I am not sure that we will gain anything by using either of these.  So why do it?  The cons are that I am guessing none of us or one of us has used them.

Someone give me some pros please.  I'd like to hear more guru talk on scalability which seems to be the Tornado shtick.

If decided that these are optimal I am as always excited to learn.

--
Ryan LeTulle

Geoff Daily

Oct 20, 2009, 11:26:24 AM
to lftd...@googlegroups.com
Bryan - Great stuff, my friend! Both in terms of the meeting and this writeup.

I'll leave discussions over what language to use up to you all, but what I'm wondering is have you guys scheduled your next meeting yet?

I'm going to be in town the first week of Nov, getting in no later than Tues and staying through the weekend. Was wondering if we wanted to try and schedule another meeting while I'm around.

In particular, did we want to try and schedule something on Nov 7th to coincide with SF's Hack the City event? I still haven't been able to connect with the organizers of that but I'll try again this week.

Would that be a good day to shoot for having a mini-codeathon to either start building the site for this datastore or to build a starter site for LFTDevNet? How long do we think it'd take to have something to show for our work? Would we need a whole day or would a 4-hour block of time be enough?

If we do decide we want to do something on the 7th, then what I may suggest is we also have a small reception at the end of it where we could invite in community leaders to see what we've done and to spark the discussion around getting LFT to open up its data communitywide and what benefits that would provide.

Looking forward to people's thoughts!

G

John St. Julien

Oct 20, 2009, 11:47:15 AM
to lftd...@googlegroups.com
Just wanted to weigh in with an attaboy: good meeting, loosely guided, lots of substantive discussion...and a decisive closing. (That last is hard!)

Thanks Bryan, all.

John


401 St. Charles St.
Lafayette, La.  70501

Land Line: (337) 269-0150
Cell Phone: (337) 739-6118

jo...@johnstjulien.com

The best way to predict the future is to invent it. (Alan Kay)


Crawford Comeaux

Oct 20, 2009, 12:06:30 PM
to lftd...@googlegroups.com
There were a couple of discussions after the close pertaining to language/platform with v2.0 in mind. The consensus seems to be that v1.0 should be "disposable", so as to keep from getting too heavily invested in a framework that may wind up not fulfilling requirements we currently can't foresee, with v2 requiring a rewrite based on the usage that emerges. 

I think something written on Tornado might be nice if this winds up on Slashdot, but since we're not really talking about a lot of code on the front or back ends, I think it may also be overkill, even for that situation. 

My vote is for whatever framework allows us to easily mimic datasf.org, including their tag cloud, recent comments, etc. I'm thinking Joomla/Drupal may be the way to go, especially since there are already numerous plugins developed for those platforms. Let's minimize the coding effort for v1.0 so that it isn't too painful to scrap all of it, if necessary, when it's time to roll out v2.0.

Geoff Daily

Oct 20, 2009, 12:09:41 PM
to lftd...@googlegroups.com
One other thought: did we want to see if the DataSF folks would give us their site's code and we could just modify it to our purposes? Might save us some time, and I'm guessing they'd be open to considering sharing it with us.

G

Ryan Letulle

Oct 20, 2009, 12:12:50 PM
to lftd...@googlegroups.com
Who was in that group?

--
Ryan LeTulle

Bryan Fuselier

Oct 20, 2009, 12:19:06 PM
to lftd...@googlegroups.com
Corey Bordelon, Myself, Crawford and Ryan.

I did agree last night that we could use some framework that could easily be scrapped in case of a re-write or re-structuring. My idea for the challenge was just to try and introduce some new technology in our area. Definitely not something I'm saying is necessary, just a suggestion.

If no one wants to devote the time to learn something new, then I suggest we take the road Geoff suggested in getting the datasf.org code from them and simply redesigning the interface.

If that is not possible, we'll just look into a quick easy framework such as joomla, drupal or wordpress.

Bryan F.

Ryan Letulle

Oct 20, 2009, 12:20:24 PM
to lftd...@googlegroups.com
Ryan?

--
Ryan LeTulle

Raymond Camden

Oct 20, 2009, 12:21:52 PM
to lftd...@googlegroups.com
Woah - so when did we agree that 1.0 should be disposable? I remember
it coming up and arguing _against_ that.

I do not think it is a good idea to build something we are just going
to scrap. I think it's a horrible idea actually.

Bryan Fuselier

Oct 20, 2009, 12:21:59 PM
to lftd...@googlegroups.com
Ryan Deville

Ryan Letulle

Oct 20, 2009, 12:21:45 PM
to lftd...@googlegroups.com
IMO Joomla, Drupal or Wordpress discredit the entire project.

Just my 2 cents.

--
Ryan LeTulle

Bryan Fuselier

Oct 20, 2009, 12:23:52 PM
to lftd...@googlegroups.com
That's 3 against (since I'm starting to lean towards having our own product) and 3 for having a scrappable v1. We need a tie breaker...

Geoff Daily

Oct 20, 2009, 12:26:22 PM
to lftd...@googlegroups.com
don't know if my vote should count, but why would we want to have a scrappable v.1? i'd rather have a simple v.1 that's extensible, if that's possible. would hate to waste effort.

conversely, if we wanted to just get something up that's simple, tweaking datasf.org's code might be an easy way to get us started. i've sent an email to SF's CIO asking to be put in touch with the person who could share that code with us, and asking him if they'd be willing to do that. will let you guys know what i hear.

G

Bryan Fuselier

Oct 20, 2009, 12:31:10 PM
to lftd...@googlegroups.com
Geoff,

The argument was that since we have no idea what route we are going to go with version 2... why put forth a large effort in creating a custom version 1? What if we spend time and energy on v1 and later decide that we have to go a different route.

My argument against it is: I don't see v2 getting rid of these links and suggestion areas. I think that finding new data sources will always be a goal of the group no matter how we end up presenting it in the long run.

Using datasf.org's source as a base also locks us into their format and what they have used. I would like to see what they have written theirs in and let everyone decide if it's something they want to use.

Bryan F.

Jim Schmehil

Oct 20, 2009, 12:34:48 PM
to lftd...@googlegroups.com
Unfortunately I wasn't able to attend last night, so this might be out of turn, but I don't think a going-in assumption of something disposable is the best approach.  I think that something simple and extensible provides more options later...allowing for growth, change, or simply disposal if need be.

Jim

Bryan Fuselier

Oct 20, 2009, 12:58:20 PM
to lftd...@googlegroups.com
Sounds like the idea that a disposable v1 is necessary is waning... so we raise the question once again... what do we use?
So far from what I can tell:
Dev.
Coldfusion 4
PHP/Python 1
Tornado 1

Data
MySQL 5
SQLite 2

dtagert

Oct 20, 2009, 1:18:54 PM
to lftd...@googlegroups.com
Just repeating what's already been said but I like the idea of getting datasf's codebase to see how they are storing their data as it may give us some ideas, not that this group lacks any.  Since v1 will be something very simple, just a few fields, why not throw up a custom app online quickly?

Then, when we decide where we're heading for v2, we can still use the data we have while adding on to the codebase and if absolutely necessary, start new.  IMO, the sooner we get something online, the less likely the project will lose momentum.


Darrell Tagert

On Tue, Oct 20, 2009 at 11:34 AM, Jim Schmehil <schm...@gmail.com> wrote:

Crawford Comeaux

Oct 20, 2009, 2:24:54 PM
to lftd...@googlegroups.com
I apologize for implying that any kind of decision had been made on whether or not v1 should be scrappable; I simply meant to put the idea out there, along with its merits.

After talking to Bryan further, though, I'm no longer for that idea. I think we do want to focus on a solution that allows for rapid development, but also scales well. I'm currently checking out Tornado to see what all it has to offer, but don't have a vote to register for any certain framework/data storage solutions. Agree with Ryan Letulle that Joomla/Drupal/Wordpress could be discrediting and also think they would likely open us up to nasty vulnerabilities.

Corey Bordelon

Oct 20, 2009, 2:48:06 PM
to lftd...@googlegroups.com
Since my name came up on the "scrappable v1" list, I wanted to defend what I said. When we discussed it after the meeting, I was more focused on making sure that we didn't invest too much effort into making v1, since it would probably be a while before we made a decision on the technology stack for v2.

Assuming it would take a lot of discussion to agree on what would be used for v2, I said that there should be as little investment in v1 as possible, just in case the technology stack decided for v2 is different from what v1 uses.  If we did decide to go in a different direction, I didn't want the choice made for v1 to hamper what we can do for v2, so I said it should be disposable (or as close to it as possible).  I was thinking this would leave the group with plenty of leeway to accomplish v2 while getting v1 out quickly.

Bryan already summed up this position, but I wanted to explain the thought process (or lack thereof) behind the idea, since I think the word "scrappable" first came from my mouth.

Sorry for the confusion to those that didn't hear this conversation after the meeting. 

Ryan Letulle

Oct 20, 2009, 3:08:29 PM
to lftd...@googlegroups.com
2 cents is not enough for me.  Here's some more random thoughts.

IMO Unless something drastic changes (which is entirely possible if not probable) we will always need the list-like offering similar to what datasf currently has.  Why not?

I see no logical reason why a phase 2 would have to be developed with the exact same tools used for the first.  In fact I would like to think we could get a little more creative than that.  Everybody else does it.

By all means, let's continue this discussion.  I love where it has been and is going. 

IMO We certainly want to be able to handle the highest volume of 'simultaneous' visitors using the least amount of server resources possible that doesn't break the back (too much) of the programmers.

Much of this phase's programming (if we use the datasf model) will be javascript/AJAX.  I just don't see much server side code to get this done.

My vote is for open source first.  The rest is just syntax.

--
Ryan LeTulle

Matthew Turland

Oct 21, 2009, 9:50:55 PM
to LFTDevNet
Sorry I missed the meeting, but I wanted to weigh in. I agree with
aforementioned concerns that SQLite won't scale to the amount of data
that we would ideally have; I have firsthand experience with that. If
we're going to plan for C10K, let's plan for C10K. There are a number
of newer data sources that we could try using and would scale better
to the ideal problem size. A few of these are search solutions that
could be backed by either SQLite or MySQL.

MemcacheDB - http://memcachedb.org/
Tokyo Tyrant - http://1978th.net/tokyotyrant/
MongoDB - http://www.mongodb.org
Sphinx - http://www.sphinxsearch.com/
Solr - http://lucene.apache.org/solr/

Just my two cents.

Regards,

Matthew Turland

Ryan Letulle

Oct 21, 2009, 10:50:02 PM
to lftd...@googlegroups.com
Mongodb Coldfusion blog post.  ( for all the cf geeks ;)

http://blog.mxunit.org/2009/10/look-ma-no-sql-mongodb-and-coldfusion.html

--
Ryan LeTulle