Calagator use outside Portland


Ian Forrester

Jun 16, 2009, 9:39:52 AM
to PDX Tech Calendar, ian.fo...@bbc.co.uk
Hi All,

I hope this is the right place to post this.

I have been watching the calagator project grow and expand for the
last few months. It seems like a great project and I was over the moon
to find the code is also available for others to build their own
community around.

In the North of England, UK there are tons of small events but no
central place to find out about them. So generally people don't know
what's happening in the same town or city they live in. Part of my job
for the BBC is to encourage grassroots movements like usergroups and
barcamps. So we're looking to run Calagator on one of our research servers
and see how it goes.

I've proposed this plan internally and externally and a couple of
questions have come back which I'm hoping someone could maybe answer.

1. Has anyone attempted to connect Calagator to another authentication
server/system, such as LDAP, OpenID, Google login, etc.?
2. Will there be the ability to browse and search via tags?
3. Can people add tags to existing events, a bit like Upcoming?
4. Is there a way to browse and arrange by geolocation? And is it
possible to aggregate a couple of areas instead of everything?
5. Can Calagator pull and push from/to Upcoming groups?
6. Can Calagator pull from and push to the Facebook and Google Calendar
APIs?

The last two are very important for adoption, if they're possible.

Thanks,

Ian Forrester - backstage.bbc.co.uk

Igal Koshevoy

Jun 16, 2009, 10:31:54 AM
to pdx-tech...@googlegroups.com, ian.fo...@bbc.co.uk
Ian Forrester wrote:
> I hope this is the right place to post this.
It is.

> In the North of England, UK there are tons of small events but no
> central place to find out about them. So generally people don't know
> what's happening in the same town or city they live in. Part of my job
> for the BBC is to encourage grassroots movements like usergroups and
> barcamps. So we're looking to run Calagator on one of our research servers
> and see how it goes.
>

Great! I think Calagator could be a good fit for what you're describing.
We'd be glad to work with you and your team.

> 1. Has anyone attempted to connect Calagator to another authentication
> server/systems, such as LDAP, OpenID, Google login, etc
>

Yes. There's an old fork that includes OpenID, along with some unwanted
code available at
<http://github.com/igal/calagator/tree/with_my_events>. The OpenID
functionality could be extracted into its own branch and freshened up
with some new features and bug fixes from the code at OpenConferenceWare
<http://github.com/igal/openconferenceware/tree/master>.

This Calagator with_my_events branch also features a mechanism for
publishing events the logged-in user is going to. The functionality's
been done for a long time, but we've had a lot of debate over whether
it's doing the right thing conceptually and thus never merged it into
the official copy. Unlike Upcoming, there isn't an authoritative owner
for a Calagator event -- it's like a wiki, where everyone owns
everything -- so you can't have authoritative reservations. We'll
probably scrap this branch, but keep the OpenID, and try to reuse the
"favorites" system from OpenConferenceWare, e.g.
<http://opensourcebridge.org/users/1/favorites>.

> 2. Will there be the ability to browse and search via tags?
>

We're storing tags and displaying them, but provide no navigation to
find things by tag. Adding a way to browse and click on tags should be
very easy.

> 3. Can people add tags to existing events, a bit like Upcoming?
>

Every Calagator event is publicly editable, so anyone can add tags.

> 4. Is there a way to browse and arrange by geolocation? And is it
> possible to aggregate a couple of areas instead of everything?
>

This is something we'd *really* like to do but haven't had time to
implement. There are some fairly detailed posts describing approaches
for this in the mail archive. With that in place, we could have a single
calagator.org site handle events anywhere. If you or others can help
add this, we'd be grateful because most of the team working on
Calagator's been swamped with other projects, such as
OpenSourceBridge.org, the past few months.

> 5. Can Calagator pull and push from/to Upcoming groups?
>

Calagator can import Upcoming events, but not create them.

Pushing events to Upcoming is tricky because you must have a user
account to own Upcoming events. Thus the Calagator instance would need
to have its own login to Upcoming under which it creates and updates
Upcoming events, and uses some kind of asynchronous queue to push
updates to cope with Upcoming's downtime and support retries.
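
The queue-with-retries idea above could be sketched roughly as follows. This is not code from Calagator, and it does not call any real Upcoming API: `send` is a hypothetical callable that pushes one event and raises on failure, and the names `PushJob` and `PushQueue` are made up for illustration.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class PushJob:
    event_id: int
    attempts: int = 0
    not_before: float = 0.0  # earliest time the job may be tried again


class PushQueue:
    """In-memory queue that retries failed pushes with exponential backoff."""

    def __init__(self, send, max_attempts=5, base_delay=60.0, clock=time.time):
        self.send = send              # hypothetical: send(event_id), raises on failure
        self.max_attempts = max_attempts
        self.base_delay = base_delay  # seconds before the first retry
        self.clock = clock
        self.jobs = deque()
        self.failed = []              # jobs that used up all their attempts

    def enqueue(self, event_id):
        self.jobs.append(PushJob(event_id))

    def run_once(self):
        """Try every job that is due; requeue failures with a longer delay."""
        now = self.clock()
        for _ in range(len(self.jobs)):
            job = self.jobs.popleft()
            if job.not_before > now:
                self.jobs.append(job)  # not due yet, keep it queued
                continue
            try:
                self.send(job.event_id)
            except Exception:          # any failure counts as "remote is down"
                job.attempts += 1
                if job.attempts >= self.max_attempts:
                    self.failed.append(job)
                else:
                    delay = self.base_delay * 2 ** (job.attempts - 1)
                    job.not_before = now + delay
                    self.jobs.append(job)
```

A real deployment would presumably persist the queue in the database and call `run_once()` from a periodic task, so pending pushes survive a restart.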

> 6. Can Calagator pull from and push to the Facebook and Google Calendar
> APIs?
>

Google Calendar events can be imported as iCalendar. Calagator could
export events to Google Calendar, but we received a ticket recently
alerting us that this functionality broke because Google decided to
change the API for describing dates.
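
To give a feel for what importing an iCalendar feed involves, here is a minimal sketch using only the Python standard library. It is not Calagator's importer; the function names are made up, and it only reads SUMMARY, DTSTART and DTEND from VEVENT blocks. It ignores time zone parameters, escaped text and nested components such as VALARM, which a real importer would need to handle.

```python
from datetime import datetime, timezone


def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_when(value):
    """Handle three common shapes: date only, local date-time, UTC date-time."""
    if len(value) == 8:
        return datetime.strptime(value, "%Y%m%d").date()
    if value.endswith("Z"):
        parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
        return parsed.replace(tzinfo=timezone.utc)
    return datetime.strptime(value, "%Y%m%dT%H%M%S")  # naive local time


def parse_events(text):
    """Return one dict per VEVENT with 'summary', 'dtstart', 'dtend' when present."""
    events, current = [], None
    for line in unfold(text):
        name, _, value = line.partition(":")
        prop = name.split(";")[0].upper()  # drop parameters such as TZID=...
        if prop == "BEGIN" and value.upper() == "VEVENT":
            current = {}
        elif prop == "END" and value.upper() == "VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None:
            if prop == "SUMMARY":
                current["summary"] = value
            elif prop in ("DTSTART", "DTEND"):
                current[prop.lower()] = parse_when(value)
    return events
```

The three branches in `parse_when` show why date handling is the touchy part: the same property can carry a plain date, a local time or a UTC time, and an exporter has to emit whichever form the receiving service expects.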

As for Facebook, I don't know and don't use it. However, I won't object
to seeing an importer/exporter for it.

-igal

PS: I regularly enjoy the wonderful news and programming provided by the
BBC. Thanks!

Ryan Aslett

Jun 17, 2009, 4:29:14 PM
to pdx-tech...@googlegroups.com
Ian: you may find fusecal.com useful in addition to Calagator.

Everybody else: Does anybody know if there are any plans to add
features to calagator similar to what fusecal.com has created? They
offer a website scraping service for calendar information that takes
most any unstructured format and turns it into iCal. The advantage is
I now have about 21 sources of events related to my topic of interest,
and only four of them use any sort of calendaring standard
(iCal/hCal/upcoming/meetup etc), and now they are neatly formatted in
a subscribable iCal feed.

The disadvantages are that I have to rely on a proprietary third party
to distill this unformatted data into something useful, and their
revenue model seems to be "add text ads to your event summaries" in
its parsing process. Additionally I have to rely on their refresh
schedule, when I'd like to dictate my own rules for refresh frequency.

Is anybody in the open source community working on screen scraping
poorly formatted calendars, and proprietary data silos (like
facebook), or anything similar for aggregation purposes?

Thanks,
Ryan Aslett

lucia...@gmail.com

Sep 6, 2012, 4:00:34 AM
to pdx-tech...@googlegroups.com, ryana...@ryanaslett.com
Hey Ryan Aslett,  
    please tell me if you found anything for scraping events from other sites. I'm looking to set up something very similar to Calagator for Davis, CA. I need something that can scrape events from Meetup and a school HTML site. Any word from you would be greatly appreciated.
-Lucian

Ryan Aslett

Sep 6, 2012, 11:19:31 AM
to lucia...@gmail.com, pdx-tech...@googlegroups.com
Unfortunately I haven't looked at calendar scraping in a while. If it's for a one-off, you might be able to pull it off with something like QueryPath (http://querypath.org/). Hope that helps.

R

Igal Koshevoy

Sep 6, 2012, 12:07:10 PM
to pdx-tech...@googlegroups.com, ryana...@ryanaslett.com
On Thu, Sep 6, 2012 at 1:00 AM, <lucia...@gmail.com> wrote:
> please tell me if you found anything for scraping events from other sites. I'm looking to set up something very similar to Calagator for Davis, CA. I need something that can scrape events from Meetup and a school HTML site. Any word from you would be greatly appreciated.

You can scrape events from HTML, but that's a path fraught with peril:
  • Scraping is the wrong way to do this. The right way is to convince the event publishers to offer their event information as hCalendar or iCalendar so you and other people can just import the data without needing to write a scraper. Pitch this to them as a big step forward because it makes it so much easier for people to make use of their event data.
  • HTML scrapers are extremely fragile, and you will constantly have to check and fix them. Every time the publisher edits their HTML, they're likely to break your scraper. You'll need to test periodically to make sure the scraper works, or set up some kind of integration test that runs regularly to ensure that it still works. That sucks.
  • Writing HTML scrapers is tricky. Lots of HTML in the wild is malformed and you'll need a very smart parser library to make sense of it (Calagator uses a couple such libraries, so if you use it, just follow the approach we used in our other importers). Once the data is parsed, you need to carefully write code that's as forgiving and flexible as possible so that it does the right thing, strips out junk, etc.
So in a nutshell: you can scrape events from HTML, but you really shouldn't.
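
For comparison, if a publisher does mark up events as hCalendar, pulling them out can be fairly short. Below is a rough sketch using only Python's standard-library `html.parser`, which tolerates a fair amount of malformed markup. It is not Calagator's importer; the class and function names are made up, and it only reads the `summary`, `dtstart` and `location` classes inside `class="vevent"` elements.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never get an end tag


class HCalendarParser(HTMLParser):
    FIELDS = ("summary", "dtstart", "location")

    def __init__(self):
        super().__init__()
        self.events = []
        self._event = None     # dict for the vevent being read, or None
        self._depth = 0        # open tags inside the current vevent
        self._field = None     # field currently collecting text
        self._field_depth = 0

    def _flush(self):
        """Store the current event (if it has any fields) with tidy whitespace."""
        if self._event:
            self.events.append(
                {k: " ".join(v.split()) for k, v in self._event.items()})
        self._event = self._field = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vevent" in classes:
            self._flush()      # forgiving: an unclosed vevent ends here
            self._event, self._depth = {}, 1
            return
        if self._event is None:
            return
        self._depth += 1
        if self._field is None:
            for name in self.FIELDS:
                if name in classes:
                    if tag == "abbr" and attrs.get("title"):
                        # abbr pattern: machine-readable value is in title
                        self._event[name] = attrs["title"]
                    else:
                        self._event[name] = ""
                        self._field, self._field_depth = name, self._depth
                    break

    def handle_data(self, data):
        if self._field is not None:
            self._event[self._field] += data

    def handle_endtag(self, tag):
        if self._event is None or tag in VOID_TAGS:
            return
        if self._field is not None and self._depth == self._field_depth:
            self._field = None
        self._depth -= 1
        if self._depth <= 0:
            self._flush()


def extract_events(html):
    parser = HCalendarParser()
    parser.feed(html)
    parser.close()
    parser._flush()            # keep a trailing vevent that was never closed
    return parser.events
```

A regularly scheduled check along the lines described above could fetch the page, call `extract_events`, and raise an alert when no event with both a `summary` and a `dtstart` comes back.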

-igal 