Quickstart: Using BigML API with ScraperWiki & Yahoo! Pipes


mendicott.com

May 9, 2012, 11:37:28 AM
to BigML
I would like to request any "quickstart" documentation for using the
BigML API, especially videos, something like "Hello World". If this
already exists, please send a quick pointer.

In particular, my first reaction is: how would I automagically flip
ScraperWiki CSV data into BigML (with the BigML API)?

Secondly, how can I use the BigML API to access results via Yahoo!
Pipes (Web Service Module)?

TIA, Marcus Endicott http://meta-guide.com

PS For example, here is Yahoo! Pipes documentation from AlchemyAPI
http://t.co/7M3NVzwa and OpenCalais http://t.co/NUiKNBaX

Jose A. Ortega Ruiz

May 9, 2012, 12:30:02 PM
to mendicott.com, BigML

Hi Marcus,

On Wed, May 09 2012, mendicott.com wrote:

> I would like to request any "quickstart" documentation for using the
> BigML API, especially videos, something like "Hello World". If this
> already exists, please send a quick pointer.

We don't have API-specific videos yet, but the API is fully documented
in

https://bigml.com/developers/

and, in particular, it contains a quick-start:

https://bigml.com/developers/quick_start

In addition, the bash and Python bindings READMEs contain several
simple usage examples:

https://github.com/bigmlcom/bigml-bash/blob/master/README.md
https://github.com/bigmlcom/python/blob/master/README.md

> In particular, my first reaction is, how would I automagically flip
> ScraperWiki CSV data into BigML (with the BigML API) ?

If you can grab the data and put it in a CSV file, you'll see in the
docs that you're in business: you upload the file to create a
datasource, and build from there.
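
Very roughly, and just as an untested sketch (the CSV filename and the
environment-variable handling of credentials below are placeholders, not
official BigML code), that first step could look like this in Python
with the requests library:

    import os
    import requests

    # BigML authenticates via username/api_key query parameters.
    auth = "username=%s;api_key=%s" % (os.environ["BIGML_USERNAME"],
                                       os.environ["BIGML_API_KEY"])

    # Upload the CSV (e.g. one exported from ScraperWiki) to create a source.
    with open("scraperwiki-export.csv", "rb") as handle:
        response = requests.post("https://bigml.io/source?" + auth,
                                 files={"file": handle})

    source = response.json()
    print(source["resource"])  # something like "source/4fa9..."

From the resulting source you would then create a dataset, a model, and
predictions, as the quick-start walks through.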

[We are working on providing other ways of making data available to our
services, so stay tuned for even easier ways of pushing data to us.]

> Secondly, how can I use the BigML API to access results via Yahoo!
> Pipes (Web Service Module) ?

As you'll see in the API documentation, our API is based on RESTful JSON
requests over HTTPS. My understanding is that Yahoo! Pipes operates
over RSS/Atom sources, so one would need to translate our JSON responses
to, say, Atom feeds before being able to use them in Pipes.

Currently, we don't provide out-of-the-box support for that; in
particular, it's not immediately obvious what data to offer in the feeds
for a predictive model, which by itself doesn't evolve and requires input
before it can make predictions.

For instance, it shouldn't be difficult to write a little script that
sends JSON requests to BigML.io and publishes the resulting predictions
in a feed, or one that periodically uploads data, creates a new model,
and publishes some aspects of the resulting model (our trees are
white-box and served as JSON documents). But one first needs to decide
what aspects of the tree to publish, what kind of predictions to make,
and so on.
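
As a rough, hypothetical sketch of such a glue script (the model id and
the input field id are placeholders, and the exact response fields are
described in the predictions documentation):

    import os
    import requests

    auth = "username=%s;api_key=%s" % (os.environ["BIGML_USERNAME"],
                                       os.environ["BIGML_API_KEY"])

    # Ask BigML.io for a prediction against an existing model.
    payload = {"model": "model/REPLACE_WITH_YOUR_MODEL_ID",
               "input_data": {"000000": 42}}
    prediction = requests.post("https://bigml.io/prediction?" + auth,
                               json=payload).json()

    # Wrap the result in a minimal Atom-style entry that a feed could carry.
    entry = ("<entry>\n"
             "  <title>BigML prediction</title>\n"
             "  <id>https://bigml.io/%s</id>\n"
             "  <content type=\"text\">%s</content>\n"
             "</entry>") % (prediction["resource"], prediction.get("output"))
    print(entry)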

We'll definitely consider the possibility of offering a standardized set
of RSS/Atom feeds... needless to say, any further feedback or ideas from
you or other users would be immensely useful and welcome!

Thanks!

Cheers,
jao

mendicott.com

May 10, 2012, 10:22:42 AM
to BigML
> As you'll see in the API documentation, our API is based on RESTful JSON
> requests over HTTPS.  My understanding is that Yahoo!  Pipes operates
> over RSS/Atom sources, so one would need to translate our JSON responses
> to, say, Atom feeds before being able to use them in Pipes.

In fact, Yahoo! Pipes can handle JSON input via the "Fetch Data" module
and can also output JSON.

I've now got working Pipes for the four basic steps using the sample
source "credit-application.csv":

1) source http://youtu.be/Nv-xTJmp1-M

2) dataset http://youtu.be/9EP-_R8nbXE

3) model http://youtu.be/k7_jDrBsjzY

4) prediction http://youtu.be/O74NRvtdyVE

If there were a public (sandbox) API key, then I could provide working
examples.

I'm not yet quite sure how to manipulate them and put them to good
use....

I guess my theoretical question is: is it possible to input data
dynamically and then get predictions dynamically (perhaps based on an
existing, pre-trained model)?

In terms of a tutorial, maybe someone could suggest a next step: what to
do with them?

The next video, "BigML - How to interact with a Model" (http://youtu.be/1xFfHzx5AJA),
doesn't show how to interact programmatically via the API....

Jose A. Ortega Ruiz

May 10, 2012, 12:34:33 PM
to mendicott.com, BigML
On Thu, May 10 2012, mendicott.com wrote:

>> As you'll see in the API documentation, our API is based on RESTful JSON
>> requests over HTTPS.  My understanding is that Yahoo!  Pipes operates
>> over RSS/Atom sources, so one would need to translate our JSON responses
>> to, say, Atom feeds before being able to use them in Pipes.
>
> In fact YahooPipes can handle JSON input via the "Fetch Data" module
> and also does JSON output.
>
> I've now got working Pipes for the four basic steps using the sample
> source "credit-application.csv":

Excellent!

>
>
> 1) source http://youtu.be/Nv-xTJmp1-M
>
> 2) dataset http://youtu.be/9EP-_R8nbXE
>
> 3) model http://youtu.be/k7_jDrBsjzY
>
> 4) prediction http://youtu.be/O74NRvtdyVE
>
> If there were a public API key (sandbox) then I could provide working
> examples.
>
> I'm not yet quite sure how to manipulate them and put them to good
> use....
>
> I guess my theoretical questions are, is it possible to input data
> dynamically and then get predictions dynamically (perhaps based on an
> existing, pre-trained, model)??

Yes, obtaining predictions from new data is definitely one of the main
uses of the generated models. Currently, one has to ask for predictions
one at a time by providing the input fields in the corresponding JSON
POST request, directed at the corresponding model, as detailed here:

https://bigml.com/developers/predictions

(There's also a short example at the end of the quick-start.) Once you've
made a prediction, it gets a unique resource id, so it can be retrieved
afterwards as many times as you wish. It's also possible to retrieve a
per-model listing of all predictions.
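
A hedged sketch of that cycle against the raw HTTP API (the model id,
field id and value below are placeholders; see the predictions docs for
the exact request and response layout):

    import os
    import requests

    BIGML = "https://bigml.io/"
    auth = "?username=%s;api_key=%s" % (os.environ["BIGML_USERNAME"],
                                        os.environ["BIGML_API_KEY"])
    model_id = "model/REPLACE_WITH_YOUR_MODEL_ID"

    # 1. Create a prediction by POSTing the input fields to /prediction.
    created = requests.post(BIGML + "prediction" + auth,
                            json={"model": model_id,
                                  "input_data": {"000001": "some value"}}).json()
    resource = created["resource"]  # unique id, e.g. "prediction/4fa9..."

    # 2. Retrieve the same prediction again later via its resource id.
    again = requests.get(BIGML + resource + auth).json()

    # 3. List the predictions made against that model (filter syntax per the docs).
    listing = requests.get(BIGML + "prediction" + auth +
                           ";model=" + model_id).json()
    print(len(listing.get("objects", [])))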

> In terms of tutorial, maybe someone could suggest a next step, what to
> do with them??

Generating predictions from new data is, I would say, one of the main
programmatic uses of a model. Since the models and datasets themselves
are plain JSON, one can also explore them and put the useful information
they contain to good use, but the API does not currently go beyond giving
you access to the generated models and using them for making
predictions.
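
For instance, here's a small illustrative sketch of poking at a model's
JSON; the nested key names follow the models documentation, so please
double-check them there rather than taking this as authoritative:

    import os
    import requests

    auth = "?username=%s;api_key=%s" % (os.environ["BIGML_USERNAME"],
                                        os.environ["BIGML_API_KEY"])

    model = requests.get("https://bigml.io/model/REPLACE_WITH_YOUR_MODEL_ID"
                         + auth).json()

    # The white-box tree lives inside the "model" section of the resource;
    # each node carries its predicate, its output and its children.
    root = model["model"]["root"]
    print(model["name"])
    print(root.get("output"), "with", len(root.get("children", [])), "children")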

> The next video "BigML - How to interact with a Model"
> http://youtu.be/1xFfHzx5AJA doesn't show how to interact
> programmatically via the API....

Yes, the videos are more focused on interacting with the system through
the web interface. We'll look into adding more API tutorials as time
permits.

Cheers,
jao