web api


Witold Karpeta

Oct 2, 2012, 3:33:18 PM
to gcd-...@googlegroups.com
As I said before, I'm starting a new subject on this, but including the last answer from Henry since it's important.

To answer your question: no, I didn't have any library in mind yet, but I'll certainly look at the ones you listed. I think the first thing to do should be to specify the API; technical matters such as which library to use, or whether the API should be RESTful or SOAP (I think the former is better), should come after that (or concurrently). So I'd like to hear some opinions, whether or not you have specific thoughts about it yet.

The first thing that comes to mind is that the API should be read-only, since the GCD approval process makes a write API hard to use. E.g., if some external application were to write data to the GCD, what should its identity be? And where should the editor direct his comments? If one could log in with e.g. OpenID then it could be done, but without that I believe we should focus on data access.
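
To make the read-only idea concrete, here is a minimal sketch in plain Python, not tied to any framework. The `dispatch` helper, the `list_series` handler and its sample data are all hypothetical names introduced for illustration; the only point is that anything other than a safe HTTP method is refused before a handler runs.

```python
from http import HTTPStatus

# Methods that never modify data; everything else is rejected.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def dispatch(method, handler):
    """Run the handler for safe methods only; otherwise answer 405."""
    if method.upper() not in SAFE_METHODS:
        status = HTTPStatus.METHOD_NOT_ALLOWED.value
        return status, {"detail": f"Method {method} not allowed."}
    return HTTPStatus.OK.value, handler()


def list_series():
    # Hypothetical handler returning made-up sample data.
    return [{"id": 1, "name": "Example Series"}]


if __name__ == "__main__":
    print(dispatch("GET", list_series))
    print(dispatch("POST", list_series))
```

A real implementation would presumably rely on whatever the chosen library offers for restricting methods; this only illustrates the behaviour being proposed.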

On that note, is there anything in the public database that you think shouldn't be accessible via the API? If not, then after becoming more familiar with your database format I'll propose a format for the API, in a ticket maybe?

Witek


Witold, did you have a library in mind for helping to build the API?  We never really solved this issue in the past, but I've since looked into REST API libraries for Django.  Here's a chart comparing several options:
The main ones (as of about four months ago) were:

Piston:  Old, stable and broadly used, but no longer a really good fit with current REST thinking.
TastyPie:  Newer, very popular, but sometimes very limiting.  The group I spoke with who had tried to use it found that there was a bit too much "magic" going on.
Django-Rest-Framework (DRF):  Newer, so not as broadly used, but under very active development and again based on some discussions with folks using it, quite reliable and flexible.  The self-documenting features seem quite nice.

Based on this, I would recommend DRF, but would be happy to hear of alternative suggestions/arguments.

As for the API itself, once you're ready could you explain a bit about how you see the API mapping to the GCD?  We'll want it to be set up to handle the changes that we already know are coming as well.  For instance, while it's only currently possible for a series to have one master publisher, we know that we want to change that.  It would make sense to make the API look a little closer to where we want the database to be, when such a thing is possible.
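
One way to read the point about master publishers, as a rough sketch only: the API could already expose a list of publishers even while each series row still stores a single one, so clients would not need to change when the database does. The function name `series_to_api` and the field names (`publisher_id`, `publishers`) are assumptions for illustration, not the actual GCD schema.

```python
def series_to_api(series_row):
    """Convert a series row (a dict holding one publisher id) into an API
    representation that exposes a list of publishers."""
    return {
        "id": series_row["id"],
        "name": series_row["name"],
        # Always a list, even though today it holds exactly one entry.
        "publishers": [series_row["publisher_id"]],
    }


if __name__ == "__main__":
    # Made-up sample row.
    print(series_to_api({"id": 7, "name": "Example Series", "publisher_id": 3}))
```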

thanks,
-henry

Henry Andrews

Oct 2, 2012, 3:49:47 PM
to gcd-...@googlegroups.com
Just a quick reply now- more later.  All prior discussions of an API that were at all specific about the mechanism have been in terms of a REST API.  To the best of my recollection, no one has ever suggested or endorsed SOAP.  All of the frameworks I noted were for REST.

thanks,
-henry



--
GCD-Tech mailing list - gcd-...@googlegroups.com
To unsubscribe send email to gcd-tech+u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/gcd-tech


Clay Mitchell

Dec 21, 2012, 6:02:10 PM
to gcd-...@googlegroups.com
Is there an API that's being worked on?

I'm currently working on a project that could really benefit from a good API (the comicvine api doesn't quite cut it)

If so, I might even be able to assist.

Thanks
-Clay

Witold Karpeta

Dec 21, 2012, 6:20:54 PM
to gcd-...@googlegroups.com
My personal priorities were explained in later posts, and when the my.comics.org project came up I began working on that instead of the API. Looking at it now, though, it shouldn't be very difficult to write one.
As I see it, for now you would have to write it yourself, with assistance from Jochen, Henry and others. Maybe that would be even better for you, since you could design it the way you like?

Witek


Don Kelly

Dec 21, 2012, 7:04:01 PM
to gcd-...@googlegroups.com
I'm also interested in the API project for a personal project that could benefit from NOT scraping the web frontend. I'd started looking at working something up in Python/Flask, considering the broader project's use of Python.

If you're interested, perhaps we can throw something together?

(longtime lurker on the tech list)
Don/


Clay Mitchell

Dec 21, 2012, 7:29:00 PM
to gcd-...@googlegroups.com
Isn't GCD built on Django already? Not sure you'd want to throw another framework into the mix, unless it was an entirely separate site, and I'm not sure it would be a good idea even then. And I'm a fan of Flask :)

Here's a list of django API packages: http://www.djangopackages.com/grids/g/api/

It seems that Piston is the most popular - https://bitbucket.org/jespern/django-piston/wiki/Home

We'd of course need to figure out what is acceptable to expose, whether anything but GET requests is allowed, things like that. Since I'm extremely new, I'd appreciate it if somebody with more tenure chimed in.

I'll start getting things up and running on my end. I thought there would already be some work going on, but if things are from scratch, I guess that's ok too.

-Clay

Henry Andrews

Dec 21, 2012, 8:44:10 PM
to gcd-...@googlegroups.com
Having evaluated python REST packages in my day job within the last year, I would recommend the one that's just called "Django REST Framework" (a.k.a DRF).  While Piston is more broadly adopted, that has more to do with it being older.  It has not kept up with developments in the REST world.  TastyPie is another viable alternative, but I and others I work with found DRF to be more appealing.  TastyPie does have direct oauth support, but that's something we'll need to integrate directly into the GCD anyway ourselves.  I also consulted with another professional web development team that tried TastyPie but found it difficult to work with, particularly due to lack of flexibility, once they got going.

The chart illustrates some other issues with Piston, which include having only one major maintainer, infrequent recent commits, and the last release back in March.  DRF and TastyPie are both supported by large communities and have released new versions within the last four months.

DRF has also recently released a major revamp, version 2.0.0, which cleans up a lot of the rough edges I noticed six months ago.  And DRF APIs are immediately browsable in HTML form, which is a tremendous plus: rather than needing painstakingly detailed documentation, you can simply point your browser at the API, click around, and immediately see exactly how it behaves.

So obviously, I'm casting a strong vote for DRF.  Everyone else is encouraged to make their case for Piston or TastyPie or one of the other Django-based libraries.

Doing the work through another framework such as Flask will be a much harder sell here, assuming you want the API hosted as the GCD's official API on comics.org.  Our infrastructure is all centered around Django, so you'd need to convince the entire tech team of the validity of supporting a whole separate application stack.  I'm not saying that's impossible, just that you'll need to be very convincing.

thanks,
-henry



Jochen G.

Dec 22, 2012, 1:17:46 AM
to gcd-...@googlegroups.com
I go with Henry here in the preference of possible packages. As he said,
you can make your case for other solutions, but it needs to be
convincing :-)

Jochen

michael Savarese

Dec 22, 2012, 12:31:19 PM
to gcd-...@googlegroups.com
I am also very interested in helping with making a RESTful web API. I am not overly familiar with Django or Python, as most of my experience with creating APIs has been using C# or PHP. That being said, I can learn it quickly. So let's pick a framework and get started! DRF is my vote since it has two already.

Don Kelly

Dec 23, 2012, 1:21:09 PM
to gcd-...@googlegroups.com
I really have no preference for the framework package. Flask just happens to be the only thing from the Python universe that I've used in the past. Frankly, anything that gets the project off the ground would be completely acceptable. All I need is a remote API (rather than importing the db directly).

What's the next step to start hacking?

Jochen G.

Dec 23, 2012, 5:40:50 PM
to gcd-...@googlegroups.com
Two aspects, I would say.

One is to figure out what an API should do. There are at least three of
you from the 'user' side who want to use an API, so it would be good if you
can discuss what the API should do. We from the GCD side would also have
some requirements. Let's call this the design part.

The other is how to do it, i.e. look into the frameworks discussed and
start on a dev server locally so that you understand how it could be
done, once we figure out what should be done.

Jochen

michael Savarese

Dec 24, 2012, 11:26:50 PM
to gcd-...@googlegroups.com
I am putting together a document, which I will share shortly, describing my vision for the API. These are of course just my thoughts and open to discussion, but it will cover what I need to be able to do to use the data for my other project.

Henry Andrews

Dec 24, 2012, 11:58:19 PM
to gcd-...@googlegroups.com
Awesome!  We really need a starting point, and having someone come in with requirements is a great way to do that.

thanks,
-henry
(currently failing to sleep east coast hours due to a west coast internal clock)



Clay Mitchell

Dec 28, 2012, 8:51:16 PM
to gcd-...@googlegroups.com
Just an update from me: I just had back surgery, so I'm somewhat out of it, but I hope to start working on this in the next week. The doctor says I'm not to sit at a 90 degree angle for a bit, which makes the computer a difficult proposition :)

Anyway, getting what everyone would like to see in it would be a great start.

Off the top of my head, I think that the Comicvine API does a lot of things well - however it has a few gaping holes, the largest of which is the ability to pull issue information in bulk.

As a starting point, do we have a consolidated list of the topics the GCD database / system holds (or is capable of storing)?

-Clay

Lionel English

Dec 28, 2012, 9:03:44 PM
to GCD tech

If you're asking about the main types of objects stored, that would include publisher, indicia publisher, publisher brands, series, issues, sequences (stories, covers, ads, etc), characters, and creators (sub-divided into writers, pencillers, inkers, letterers, colorists, and editors).


Clay Mitchell

Dec 28, 2012, 9:15:49 PM
to gcd-...@googlegroups.com
Yes, that's exactly what I'm asking about :)

Probably a hard question to answer, but how complete is this data?

Henry Andrews

Dec 28, 2012, 9:49:07 PM
to gcd-...@googlegroups.com
I have no idea how to answer the completeness question in general.  U.S. superheroes tend to be well indexed.  UK work is spotty (some is well covered, but other long-running comics like The Dandy are almost entirely absent, with series/issue skeletons only).  Less popular U.S. genres such as funny animals (1940s/50s) are less well indexed than superheroes.  And I just picked three things more or less at random.

As far as what objects are more complete than others:

* Series and issue skeleton data (complete series object, issue objects with just numbers and maybe dates, but many other fields blank and no sequence/story objects) are the most common

* We do have plenty of issues with story objects.  Typically, all comic story sequences and the front cover are indexed.  Whether other sequences such as ads, text stories, "filler" (whatever that meant to the indexer and approver), etc. are present is highly variable.  Going by our front page stats, 20% or so of our issues are "fully indexed", which (I think) means that they have more than half of their pages accounted for by sequence objects.

* We have enough "master publishers" a.k.a. "publishers" to handle all of the series.  Whether or not the set of master publishers is correct in some sense of the word is hotly debated.  The definition of "correct" in this usage is also hotly debated.

* indicia publisher and brand are newer concepts designed to sidestep some of the problems with master publisher.  See http://docs.comics.org/wiki/User%27s_Guide_to_Publishers for more explanations.

* creators and characters currently do not exist as real database objects (because the database started as flat text files, and we've gradually imposed a relational structure on that).  Those are big projects on the medium-term frontier.  The Who's Who project is essentially about properly handling creators.  Characters will probably come after that.
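
For API consumers who want to apply the "fully indexed" rule of thumb from the list above themselves, a minimal sketch could look like the following. It assumes the tentative definition given there (more than half of an issue's pages accounted for by sequence objects); the function name and parameters are hypothetical, and the real front-page statistic may be computed differently.

```python
def is_fully_indexed(issue_page_count, sequence_page_counts):
    """Return True when the sequences account for more than half of the
    issue's pages. An unknown or zero page count is treated as not indexed."""
    if not issue_page_count or issue_page_count <= 0:
        return False
    return sum(sequence_page_counts) > issue_page_count / 2


if __name__ == "__main__":
    # Made-up example: a 36-page issue with a cover and two stories.
    print(is_fully_indexed(36, [1, 10, 8]))
    # The same issue with only the cover indexed.
    print(is_fully_indexed(36, [1]))
```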

thanks,
-henry



Henry Andrews

Dec 28, 2012, 11:15:44 PM
to gcd-...@googlegroups.com
I hope your recovery goes swiftly and smoothly!

What do you mean by "in bulk"?  The Comicvine API looks like it's paginated to 100 elements at a time.  We would probably do something similar and/or implement throttling ( http://django-rest-framework.org/api-guide/throttling.html ) to prevent overloading the server.  We don't have the bandwidth to support a bunch of really large bulk transfers.  We do have the SQL dump downloads, but only a few people actually download those so there's not much overall performance impact there.
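
As a rough illustration of the two mechanisms mentioned above, here is a plain-Python sketch of capped pagination and a sliding-window throttle. It is not DRF's implementation (DRF ships its own pagination and throttling support); the names `paginate` and `Throttle`, the response keys, and the specific limits are assumptions chosen for the example.

```python
import time
from collections import defaultdict, deque

MAX_PAGE_SIZE = 100  # assumed cap, mirroring the 100-element pages mentioned above


def paginate(items, page=1, page_size=MAX_PAGE_SIZE):
    """Return one page of items plus enough metadata to request the next."""
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    page = max(1, page)
    start = (page - 1) * page_size
    end = start + page_size
    return {
        "count": len(items),
        "page": page,
        "next_page": page + 1 if end < len(items) else None,
        "results": items[start:end],
    }


class Throttle:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.history = defaultdict(deque)

    def allow(self, client_id):
        now = self.clock()
        seen = self.history[client_id]
        # Forget requests that have fallen out of the window.
        while seen and now - seen[0] >= self.window_seconds:
            seen.popleft()
        if len(seen) >= self.max_requests:
            return False
        seen.append(now)
        return True


if __name__ == "__main__":
    issues = list(range(250))  # stand-in for 250 issue records
    print(paginate(issues, page=3)["next_page"])
    throttle = Throttle(max_requests=2, window_seconds=60)
    print([throttle.allow("client-a") for _ in range(3)])
```

A client that wants everything would walk `next_page` until it is `None`, while the throttle keeps any single client from doing that too quickly.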

Taking a *very* casual glance at the Comicvine API structure, it seems fairly reasonable.  I would tend to lean towards fewer sub-lists in the details and more URLs to allow clients to fetch those sub-lists if we want them (once the first version of the API settles, perhaps we can add read-ahead options to preemptively follow those URLs and assemble the larger response on the server).
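
A small sketch of the "URLs instead of sub-lists" idea, with the read-ahead option modelled as an `expand` argument. Everything here is hypothetical: the base URL is a placeholder, and `series_detail`, `issues_url` and `expand` are names made up for illustration rather than a proposed final design.

```python
BASE_URL = "https://example.invalid/api"  # placeholder, not a real endpoint


def series_detail(series, issues, expand=()):
    """Build a series representation that links to its issues by URL and
    embeds them only when the client explicitly asks for read-ahead."""
    data = {
        "id": series["id"],
        "name": series["name"],
        "issues_url": f"{BASE_URL}/series/{series['id']}/issues/",
    }
    if "issues" in expand:
        # Read-ahead: follow the link on the server and inline the result.
        data["issues"] = [i for i in issues if i["series_id"] == series["id"]]
    return data


if __name__ == "__main__":
    # Made-up sample data.
    series = {"id": 7, "name": "Example Series"}
    issues = [
        {"id": 1, "series_id": 7, "number": "1"},
        {"id": 2, "series_id": 8, "number": "1"},
    ]
    print(series_detail(series, issues))
    print(series_detail(series, issues, expand=("issues",)))
```

The default response stays small, and the larger assembled response is only built when a client opts in.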

I might also have some slightly different ideas on resource naming and query parameters, but would prefer to hear a bit more from others before getting into that (translation: I'm feeling too lazy to write it up right now :-)

thanks,
-henry



michael Savarese

Jan 12, 2013, 1:11:20 AM
to gcd-...@googlegroups.com
Sorry about the delay in getting the document finalized, but I've been working two jobs lately. I have the doc started and will be finalizing it this weekend for everyone to review.