Complex queries / reporting

246 views
Skip to first unread message

James McGinty

unread,
Nov 7, 2012, 3:50:08 AM11/7/12
to api-...@googlegroups.com
Hi API-ers, we are working on moving a large & established ERP system to a more modular, rest API driven system.
This is raising questions on the subject of how to address reporting needs in a way which is loosely coupled, but powerful enough to be useful.
At the moment, most of our reports are crystal reports driven by sql, but we don't want to provide direct sql access to our database anymore.

There are several problems as I see it:
1. The query language for resources is usually quite basic, and may not be powerful enough to support complex queries on a single resource.
i.e. Tell me all customers who bought less from us this Year To Date than last YTD., etc etc .

2. Queries which must aggregate across resources are problematic - what would normally be a "join" on a database is not possible.

I am sure there is no one answer, but I would like to hear how others have approached this problem.

Have you created some special 'reporting' resources which contain 'flattened' data for reporting?

Do you have a special 'query' type resource like facebook's FQL which lets you run queries directly? https://developers.facebook.com/docs/reference/fql/
This seems like it is a sort of cop out, but very powerful.

James

Dietrich Schulten

unread,
Nov 7, 2012, 7:12:41 AM11/7/12
to api-...@googlegroups.com

Maybe Sparql could be the right approach for this kind of queries. You would have to add rdf based meta info in order to use it,  e.g. Json-ld. Schema.org and goodrelations seem to have vocabularies ready for use with product and customer data. A sparql request extracts this info to aggregate the data. See the tutorial on linkeddata.org for basics. Certainly a huge hammer, but maybe it fits your nail.
Best regards
Dietrich

--
You received this message because you are subscribed to the Google Groups "API Craft" group.
To unsubscribe from this group, send email to api-craft+...@googlegroups.com.
Visit this group at http://groups.google.com/group/api-craft?hl=en.
 
 

Mike Kelly

unread,
Nov 7, 2012, 8:15:58 AM11/7/12
to api-...@googlegroups.com
Hi James,

fwiw, I take an iterative approach. I use URI templates for queries
but don't involve a query language or anything like that.
I tend the introduce the various resources gradually, exposing the
most common/important information first and going from there.

Cheers,
M
> --
> You received this message because you are subscribed to the Google Groups
> "API Craft" group.
> To unsubscribe from this group, send email to
> api-craft+...@googlegroups.com.
> Visit this group at http://groups.google.com/group/api-craft?hl=en.
>
>



--
Mike

http://twitter.com/mikekelly85
http://github.com/mikekelly
http://linkedin.com/in/mikekelly123

James McGinty

unread,
Nov 7, 2012, 9:21:01 PM11/7/12
to api-...@googlegroups.com
Hi Mike, can you expand on how you use the URI templates for queries?
James

Romain Bessuges-Meusy

unread,
Dec 14, 2012, 6:46:15 PM12/14/12
to api-...@googlegroups.com
Hi James !

I assume your ERP system defines a lot of related but quite stand-alone resources : orders, invoices, bills, customers. Cross-querying theses resources via the QUERY_STRING, in a restful and stateless way is, IMO, absolutely impossible.

Instead, i would definitely take a deep look at data warehousing. Here's a simplistic picture : every action in your business processes emits an event that is stored in a flattened state, in big tables full of columns. Those tables are easily queried in SQL, and can be too in a REST way. To do so, IMO again, you provide a specific API that manipulates queries resources. Each resources is composed by fliters and filters sub-sets.

POST http://api.data.myerp.com/queries => returns an identifier (i.e. 2323)
DELETE http://api.data.myerp.com/queries/2323 => deletes the query if you don't plan to reuse it.

Hope you'll find those lines interesting ;-)

Regards,

Romain 

James McGinty

unread,
Dec 15, 2012, 5:04:38 AM12/15/12
to api-...@googlegroups.com
Merci Romain, for your thoughts!
I must admit, I had never considered what you are proposing. If I understand correctly, all actions in the system end up in an 'event' / 'audit log' resource.
Have you applied this screnario in practice?

James

Romain Bessuges-Meusy

unread,
Dec 15, 2012, 8:39:52 AM12/15/12
to api-...@googlegroups.com
Avec plaisir !

I haven't even tried it yet, but I'm planning to do so for my enterprise proprietary ERP. 
My approach is inspired by data mining and OLAP cubes which manipulates dimensions (time, population, space) to analyse facts (checkin, checkouts, etc.)
You may check Saiku, an OLAP analytics solution which provides a RESTful API : 


Romain

davidE

unread,
Oct 10, 2013, 7:55:12 PM10/10/13
to api-...@googlegroups.com
This thread is pretty dead, but I thought I would try my luck.

James, did you ever end up doing anything like FQL? What was your experience and how did you find use-cases to design toward?

Kevin Swiber

unread,
Oct 10, 2013, 8:24:00 PM10/10/13
to api-...@googlegroups.com
Hey David,

I've been working in this space lately.  I have a query language that can be mapped to custom server logic. It supports field selection, filtering, and sorting of collections.

When providing an API over a collection, I find this pattern to be quite useful.

Complex ad-hoc reporting is a slightly different requirement than this use case. Basic query features over a single collection is pretty useful, however. Once the parser is built, the rest is fairly easy.  I'm in the process of open sourcing this code if you're interested.

For more advanced querying, I might expose a separate reporting API and use something like a stored procedure model with async results and proper restrictions on access/execution time.

Sent from my iPhone
--
You received this message because you are subscribed to the Google Groups "API Craft" group.
To unsubscribe from this group and stop receiving emails from it, send an email to api-craft+...@googlegroups.com.
Visit this group at http://groups.google.com/group/api-craft.
For more options, visit https://groups.google.com/groups/opt_out.

davidE

unread,
Oct 11, 2013, 3:32:19 PM10/11/13
to api-...@googlegroups.com
Hey Kevin,

Thanks for the response. What language are you using? We are open sourcing a similar parsing library in Java for Maven dependencies next week (it's public on GitHub now if anyone is interested).

Are you backing the queries with MySQL? It sounds like it is internal so I am guessing that you can be a little more lax about security, but how do you handle SQL injects? 

How is it exposed in your API(s)? Internal, external? I can imagine performance concerns exposing it externally to customers. Do you have any permissions concerns? If so, do you just flatten the permissions in the data model and include a conditional in the query?

@All,

Do you think FQL-like query languages are the future of flexible REST APIs or is it too difficult to implement, document and maintain?

Kevin Swiber

unread,
Oct 11, 2013, 4:46:19 PM10/11/13
to api-...@googlegroups.com
On Fri, Oct 11, 2013 at 3:32 PM, davidE <esposit...@gmail.com> wrote:
Hey Kevin,

Thanks for the response. What language are you using? We are open sourcing a similar parsing library in Java for Maven dependencies next week (it's public on GitHub now if anyone is interested).

I'm using JavaScript with Node.js.  I'd love to see your Java project, as well!
 
Are you backing the queries with MySQL? It sounds like it is internal so I am guessing that you can be a little more lax about security, but how do you handle SQL injects? 

The backing data store is actually configurable with drivers.  I have drivers for CouchDB, MongoDB, and Usergrid (open source backend-as-a-service) at the moment.  In Java, you have JDBC, but this doesn't exist in the Node world.  I usually put security at the action level (i.e., Does the user have access to the operation mapped to POST /collection?).  Field-level security, in general, is more complicated.

I created a parser that generates an abstract syntax tree with the query parts.  Doing it this way means it's easier to sanitize.  For sanitizing input, I'm following the advice given by OWASP here: https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet#MySQL_Escaping
 
How is it exposed in your API(s)? Internal, external? I can imagine performance concerns exposing it externally to customers. Do you have any permissions concerns? If so, do you just flatten the permissions in the data model and include a conditional in the query?

I've only deployed it for low-use special purpose at this time.  It's still pretty young.  These days, most of my work is in the open source or research and development space.  Performance concerns should be evaluated, unless you INDEX ALL THE THINGS, which is not doable for many implementations (though I believe it is exactly what Usergrid does using some nifty Cassandra tricks).  This can be somewhat mitigated by setting an upper threshold on the number of query results and being smart about indexes.  Queries could be auto-reworked to be more optimized prior to hitting the backing data store.

There are a number of concerns around permissions.  Some can be handled at the API server layer.  Others can be handled by the domain model.  Finally, the backing data store has the last say.  As complex as this all can get, taking advantage of all these security layers can be helpful.
 

@All,

Do you think FQL-like query languages are the future of flexible REST APIs or is it too difficult to implement, document and maintain?

FQL is somewhat more complex than what I'm using.  The pattern I'm using mostly follows Usergrid's: http://server/books?ql=select+*+where+title+contains+'breakfast' where "books" is the collection and the "ql" query parameter contains the query language for filtering that collection.  There's no cross-collection support via the API at this time, and I'm not even sure the cost-benefit breakdown would make it a viable option for most cases.



--
Kevin Swiber
Projects: https://github.com/kevinswiber
Twitter: @kevinswiber
Reply all
Reply to author
Forward
0 new messages