Prediction API v1.1 Now Available


Travis Green

Sep 15, 2010, 12:01:06 PM
to prediction-...@googlegroups.com
Today, we're announcing v1.1 of the Prediction API, which adds new features to make your apps even smarter, including some of the top requests from our Moderator page. We look forward to seeing the ways you can put the API to use, and thank you for your outstanding response to the API so far.

More details about ways to use v1.1 can be found on the Google Codeblog, in our revised documentation, and in the description of changes from v1 (which remains completely available) below.

Thank you for your feedback and for already having made some fantastic apps. We look forward to seeing what you can do with these new features, and good luck!

-The Prediction API Team



Prediction API v1.1 New Features and Usage

Overview

This document outlines new functionality to be found in v1.1 of the Prediction API, including multiple category output, regression input/output, model deletion, and mixed text/numeric feature input.

The primary change to commands is the replacement of v1 with v1.1 within each request. Authentication remains the same. Example model training, training status, and prediction requests are shown below.

The Prediction API’s v1 remains completely available, and its commands are reviewed at the end of this document. We recommend making v1.1 calls as they give significantly enhanced functionality, but note that they are not backward compatible in two ways:
  • training data submitted with only numbers to v1.1 in the leftmost column will be treated as a continuous output problem (unlike v1)
  • the JSON returned from prediction calls has different syntax in v1.1 to accommodate the enhanced functionality

Model Training Request
curl -X POST  -H "Content-Type:application/json"  -d "{\"data\":{}}"  -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ" https://www.googleapis.com/prediction/v1.1/training?data=mybucket%2Fmydata

Training Status Request
curl -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ" https://www.googleapis.com/prediction/v1.1/training/mybucket%2Fmydata

Prediction Request
curl -X POST    -H "Content-Type:application/json"  -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ"  -d "{\"data\" : { \"input\" : { \"text\" : [ \"La idioma mas fina\" ] }}}" https://www.googleapis.com/prediction/v1.1/training/mybucket%2Fmydata/predict
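The three requests above differ only in HTTP method, path, and body. As a rough sketch (not part of the API itself), the URLs could be assembled in Python as shown here. The helper names model_id, training_url, status_url, and predict_url are hypothetical, and the sketch assumes the bucket/object path is percent-encoded exactly as in the curl examples.

```python
from urllib.parse import quote

BASE = "https://www.googleapis.com/prediction"


def model_id(bucket, data_object):
    """Percent-encode "bucket/object" so the slash becomes %2F, as in the curl examples."""
    return quote(f"{bucket}/{data_object}", safe="")


def training_url(bucket, data_object, version="v1.1"):
    """URL for the model training request (sent with POST)."""
    return f"{BASE}/{version}/training?data={model_id(bucket, data_object)}"


def status_url(bucket, data_object, version="v1.1"):
    """URL for the training status request (sent with GET)."""
    return f"{BASE}/{version}/training/{model_id(bucket, data_object)}"


def predict_url(bucket, data_object, version="v1.1"):
    """URL for the prediction request (sent with POST)."""
    return status_url(bucket, data_object, version) + "/predict"


print(training_url("mybucket", "mydata"))
print(predict_url("mybucket", "mydata", version="v1"))
```

Passing version="v1" yields the corresponding v1 URLs, which matches the note that the version string is the primary change between the two sets of commands.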

More details about using each new feature can be found below.

Multiple Category Output

In addition to the most likely category, v1.1 returns all categories and their relative, un-normalized scores by default when a prediction request is made (see above). The following is an example return from v1.1 of the API:

{"data":
 {"kind": "prediction#output",
  "outputLabel": "spam",
  "outputMulti": [ {"label": "spam", "score": 14.1},
                   {"label": "ham", "score": 2.3} ]}}
*Note that outputLabel has replaced v1’s output_label field.

The outputLabel field continues to contain the most likely category for easy extraction, while the outputMulti field contains a list of tuples, each of which contains a label and its relative score.  The range of scores is not bounded, and the category with the highest score is considered to be most likely.
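As an illustration of how a client might consume these fields, here is a minimal Python sketch that ranks the categories by score. The function name ranked_labels is hypothetical, and the sketch assumes the response is well-formed JSON shaped like the example above.

```python
import json


def ranked_labels(response_text):
    """Return (label, score) pairs from outputMulti, highest score first."""
    output = json.loads(response_text)["data"]
    pairs = [(item["label"], item["score"]) for item in output["outputMulti"]]
    # Scores are unbounded and un-normalized, so only their order is used here.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


example = (
    '{"data": {"kind": "prediction#output", "outputLabel": "spam", '
    '"outputMulti": [{"label": "spam", "score": 14.1}, '
    '{"label": "ham", "score": 2.3}]}}'
)

ranking = ranked_labels(example)
print(ranking[0][0])  # top-ranked label; should agree with outputLabel
```

Because the scores are not normalized, a client should compare them only within a single response rather than treat them as probabilities.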

Regression

For uses requiring values along a continuous scale, v1.1 supports regression.  To submit regression training data, all output values should be numerical.  For example:

5, "I really like this product."
3, "This product is okay."
1, "I hate this product. It does not work."

*Note that if every row has a numerical value in the leftmost column, the data will automatically be treated as a regression problem. If you intend to do classification, we recommend enclosing those values in double quotes. For example, 5 indicates a regression value of 5, while "5" indicates a category labeled "5".
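One way to keep the quoting distinction explicit when generating training files is to let a CSV writer quote every non-numeric field. The following Python sketch uses the standard csv module for that; the function name to_training_csv is hypothetical. Note that the writer emits no space after the comma, unlike the rows shown above, and this post does not say whether that spacing matters to the API.

```python
import csv
import io


def to_training_csv(rows):
    """Serialize rows so that strings are double quoted and numbers are left bare."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Numeric output column: bare numbers, i.e. a regression problem.
regression_csv = to_training_csv([
    (5, "I really like this product."),
    (1, "I hate this product. It does not work."),
])

# The same labels passed as strings are quoted, i.e. category labels.
classification_csv = to_training_csv([
    ("5", "I really like this product."),
    ("1", "I hate this product. It does not work."),
])

print(regression_csv)
print(classification_csv)
```

Keeping the label's Python type (int versus str) as the single source of truth avoids accidentally switching a classification data set into regression.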

Once regression training data (in CSV format) has been sent to the Prediction API, users can make predictions on new data and get real-valued output in the outputValue field.  For example,

{"data":
 {"kind": "prediction#output",
  "outputValue": "0.99"}}

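Since the example response carries outputValue as a quoted string, a client will probably want to convert it before doing arithmetic. A minimal Python sketch follows; the function name regression_value is hypothetical, and accepting an unquoted number as well is a defensive assumption rather than documented behavior.

```python
import json


def regression_value(response_text):
    """Extract outputValue from a regression response as a float."""
    output = json.loads(response_text)["data"]
    # float() accepts both the string "0.99" and the number 0.99.
    return float(output["outputValue"])


example = '{"data": {"kind": "prediction#output", "outputValue": "0.99"}}'
value = regression_value(example)
print(value)
```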
Mixed Feature

Training data can now mix text and numeric features.  For example,

spam, "spam subject", 100, "spam body"
ham, "ham subject", 1, "ham body"

To send a mixed feature prediction request, a new field "mixture" has been added. For example:

{"data":
 {"input":
   {"mixture": [ "subject", 50, "body"]}}}

Note that text features inside the mixture field should remain double quoted.  The following JSON input is thus NOT correct:

{"data":
 {"input":
   {"mixture": [ subject, 50, body]}}}
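Building the body with a JSON serializer rather than by hand avoids the unquoted-text mistake shown above. Here is a small Python sketch; the function name mixture_request_body is hypothetical, and restricting features to strings and numbers is an assumption based on the examples in this post.

```python
import json


def mixture_request_body(features):
    """Build the JSON body for a mixed text/numeric prediction request."""
    for feature in features:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(feature, bool) or not isinstance(feature, (str, int, float)):
            raise TypeError(f"unsupported feature type: {type(feature).__name__}")
    return json.dumps({"data": {"input": {"mixture": list(features)}}})


body = mixture_request_body(["subject", 50, "body"])
print(body)
```

The serializer quotes the text features and leaves the number bare, which matches the correct form of the request.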

Model Deletion

To delete a previously trained model, use the following command:

curl -X DELETE \
-H "Authorization: GoogleLogin auth=${AUTH}" \
https://www.googleapis.com/prediction/v1.1/training/mybucket%2Fmydata

An empty response is returned if the model exists and is successfully deleted; a 400 error is returned if the model does not exist.
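For clients that do not shell out to curl, the same request can be sketched with Python's standard library. The helper names build_delete_request and describe_delete_result are hypothetical. This post does not state the exact status code for a successful deletion, so treating any 2xx status with an empty body as success is an assumption. The sketch only builds the request and interprets a result; it does not send anything.

```python
import urllib.request


def build_delete_request(model, auth_token, version="v1.1"):
    """Build (but do not send) the DELETE request for a trained model.

    `model` is the already percent-encoded name, e.g. "mybucket%2Fmydata".
    """
    url = f"https://www.googleapis.com/prediction/{version}/training/{model}"
    headers = {"Authorization": f"GoogleLogin auth={auth_token}"}
    return urllib.request.Request(url, method="DELETE", headers=headers)


def describe_delete_result(status, body):
    """Interpret the outcome described in this post."""
    if status == 400:
        return "model not found"
    # Assumption: success is any 2xx status with an empty body.
    if 200 <= status < 300 and not body:
        return "deleted"
    return "unexpected response"


request = build_delete_request("mybucket%2Fmydata", "PLACEHOLDER_TOKEN")
print(request.get_method(), request.full_url)
```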

v1 Commands

The basic commands for accessing v1 are shown below.

Model Training Request
curl -X POST  -H "Content-Type:application/json"  -d "{\"data\":{}}"  -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ" https://www.googleapis.com/prediction/v1/training?data=mybucket%2Fmydata

Training Status Request
curl -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ" https://www.googleapis.com/prediction/v1/training/mybucket%2Fmydata

Prediction Request
curl -X POST    -H "Content-Type:application/json"  -H "Authorization: GoogleLogin auth=0123456789abcdefghijklmnopqrstuvwxyzPQRSTUVWXYZ0123456789...XYZ"  -d "{\"data\" : { \"input\" : { \"text\" : [ \"La idioma mas fina\" ] }}}" https://www.googleapis.com/prediction/v1/training/mybucket%2Fmydata/predict

While the responses to training and training status requests remain the same, the response to a v1 prediction call differs from that of v1.1. An example v1 response is shown below:

Prediction Response
{"data":{
"output":{
"output_label":"spanish"}}}