Is it possible to use WIT (WitConfigBuilder) with Sklearn, H2O, SparkML?


vsing...@gmail.com

Aug 19, 2019, 4:13:53 PM
to What-If Tool
Hi, 
I am wondering whether WitConfigBuilder can be used with ML libraries other than TensorFlow. 
The API seems to have methods like set_estimator_and_feature_spec() that accept only TF-style classifiers and feature specs. 

Regards, 
Vaibhav Singh. 

Tolga Bolukbasi

Aug 19, 2019, 4:21:21 PM
to vsing...@gmail.com, What-If Tool
Hi Vaibhav,
Yes, you can use WIT with any Python predictor. The API you are looking for is "set_custom_predict_fn()". Here is an example notebook that uses this API to plug Keras models into the tool for text prediction: WIT text demo. Please let me know if you have any further questions!
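In outline, the contract is just a Python callable that takes a list of examples and returns per-example scores. A minimal sketch with a stand-in binary classifier; the stub model and all names here are illustrative assumptions, not from the linked demo:

```python
import numpy as np

# Stand-in for any non-TF model (sklearn, H2O, SparkML, ...).
class StubModel:
    def predict_proba(self, X):
        # Toy binary classifier: positive-class score grows with feature 0.
        p = np.clip(np.asarray(X, dtype=float)[:, 0] / 100.0, 0.0, 1.0)
        # One [P(class0), P(class1)] row per input example.
        return np.stack([1.0 - p, p], axis=1)

model = StubModel()

def custom_predict(examples):
    # In a real WIT setup, `examples` arrive as tf.Examples and must first
    # be converted to the 2-D feature matrix the model expects.
    return model.predict_proba(examples)

# Wiring (requires witwidget; shown for shape only):
# config_builder = WitConfigBuilder(examples).set_custom_predict_fn(custom_predict)
# WitWidget(config_builder)
```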

Best

--
You received this message because you are subscribed to the Google Groups "What-If Tool" group.
To unsubscribe from this group and stop receiving emails from it, send an email to what-if-tool...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/what-if-tool/62ebba49-24e4-48b5-b76a-ae01fcd27c77%40googlegroups.com.


--
Tolga Bolukbasi
Software Engineer, Google Brain PAIR
tol...@google.com

vsing...@gmail.com

Aug 20, 2019, 12:34:53 PM
to What-If Tool
Hi, 
Thanks for the prompt response. 
Per the example and the method signature, set_custom_predict_fn() still seems to accept only tf.Examples as the data input. 
Is there a way to pass conventional (2-D) feature vectors, or will we have to convert to tf.Examples and then back to the original 2-D form before passing to sklearn's model.predict() in the custom predict fn? Hope I make sense. 

Vaibhav Singh



vsing...@gmail.com

Aug 20, 2019, 12:37:29 PM
to What-If Tool
This is what I am referring to when I say it accepts only tf.Examples. 

WitConfigBuilder(examples[:num_datapoints]).set_custom_predict_fn(
  custom_predict_1)

James Wexler

Aug 20, 2019, 12:54:14 PM
to vsing...@gmail.com, What-If Tool
You are correct that currently this method only accepts tf.Examples. So you will need to convert your data to tf.Examples to pass to WitConfigBuilder, then convert them back from tf.Examples in your custom_predict_fn to send to your model. I know it's not ideal. In the future, I could imagine a change where if you provide JSON objects to WitConfigBuilder instead of tf.Examples, then the custom_predict_fn is provided the examples in JSON format as opposed to tf.Example. But currently that isn't how it works.
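A hedged sketch of that round trip, using tf.train.Example directly. The feature names and type handling are illustrative assumptions, not WIT utilities; numpy scalar types, multi-valued features, etc. are not handled:

```python
import tensorflow as tf

def row_to_example(row):
    # Pack one plain feature dict (name -> int/float/str) into a tf.Example.
    feats = {}
    for name, val in row.items():
        if isinstance(val, int):
            feats[name] = tf.train.Feature(
                int64_list=tf.train.Int64List(value=[val]))
        elif isinstance(val, float):
            feats[name] = tf.train.Feature(
                float_list=tf.train.FloatList(value=[val]))
        else:
            feats[name] = tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[val.encode('utf-8')]))
    return tf.train.Example(features=tf.train.Features(feature=feats))

def example_to_row(example, columns):
    # Unpack a tf.Example back into the flat row a sklearn model expects.
    # Check value lengths rather than hasattr(): a Feature proto always has
    # all three *_list attributes, so hasattr() is always True.
    f = example.features.feature
    row = []
    for col in columns:
        if len(f[col].int64_list.value):
            row.append(f[col].int64_list.value[0])
        elif len(f[col].float_list.value):
            row.append(f[col].float_list.value[0])
        else:
            row.append(f[col].bytes_list.value[0].decode('utf-8'))
    return row
```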


vsing...@gmail.com

Aug 20, 2019, 4:33:17 PM
to What-If Tool
I was able to get it working with sklearn. Thank you for the prompt responses.

Vaibhav Singh



vsing...@gmail.com

Aug 20, 2019, 5:21:08 PM
to What-If Tool
Hi again, 
However, unlike the Keras example, I am not getting the 'Inference label/score' option in the filter dropdown. 
I have tried SVM and SGDClassifier. What might I be missing? 


Relevant code snippets:

# Converts a list of tf.Example protos into a dataframe.
def examples_to_df(examples, columns=None):
    records = []
    if columns is None:
        raise Exception('Columns must be supplied.')
        
    for example in examples:
        record = []
        for col in columns:
            if col in example.features.feature:
                if hasattr(example.features.feature[col], 'int64_list'):
                    record.append(example.features.feature[col].int64_list.value[0])
                elif hasattr(example.features.feature[col], 'float_list'):
                    record.append(example.features.feature[col].float_list.value[0])
                else:
                    record.append(example.features.feature[col].bytes_list.value[0])
        records.append(record)
    examples_df = pd.DataFrame(records, columns=columns)
    return examples_df

def custom_predict(examples_to_infer):
    model_ins = examples_to_df(examples_to_infer)
    preds = clf.predict_proba(model_ins)
    return preds

# Setup the tool with the test examples and the trained classifier
test_csv_path = 'adult.test'
test_df = pd.read_csv(test_csv_path, names=csv_columns, skipinitialspace=True, skiprows=1)
label_encode(test_df)
x_test = test_df.drop([y_col], axis=1)
y_test = test_df[y_col]
test_examples = df_to_examples(x_test)

from witwidget.notebook.visualization import WitWidget, WitConfigBuilder
num_datapoints = 1000  #@param {type: "number"}
tool_height_in_px = 720  #@param {type: "number"}

# Setup the tool with the test examples and the trained classifier
config_builder = WitConfigBuilder(test_examples[0:num_datapoints]).set_custom_predict_fn(
    custom_predict).set_label_vocab(['Under 50K', 'Over 50K'])
WitWidget(config_builder, height=tool_height_in_px)

Regards,
Vaibhav Singh

James Wexler

Aug 20, 2019, 5:29:09 PM
to vsing...@gmail.com, What-If Tool
Do you see filled in inference results in the bottom left when you click on an example dot in the visualization? Do you see any error text in the top-right of the visualization?


vsing...@gmail.com

Aug 21, 2019, 11:00:52 AM
to What-If Tool
The answer to both is no. 
I also tried using the predict_proba() and decision_function() predictors on the model. Still no luck. 

[Screenshot attached: Screen Shot 2019-08-21 at 9.59.06 AM.png]




James Wexler

Aug 21, 2019, 11:11:03 AM
to vsing...@gmail.com, What-If Tool
The top-right shows that an error occurred while calling your function: "Columns must be supplied". It's possible that your custom predict function isn't converting the tf.Examples to the appropriate format before sending them to the model. Perhaps print out your examples in your custom prediction function after converting from tf.Example, but before sending them to the model, to see if the format has an issue? As for what the custom prediction function should return: it should be a 2-D array of numbers, such as:
[[example0Class0Score, example0Class1Score], [example1Class0Score, example1Class1Score], ...]


vsing...@gmail.com

Aug 21, 2019, 1:09:57 PM
to What-If Tool
Alright, so I noticed an error message when clicking the Run Inference button. I will work on it and get back if needed. Thanks. 

vsing...@gmail.com

Aug 22, 2019, 10:42:22 AM
to What-If Tool
Hi James, 
Thank you again for the prompt help. 
So I have been able to resolve the bug in custom_predict(). I have also tested it separately by passing all of test_examples, the same argument that is passed to 

WitConfigBuilder(test_examples[:num_datapoints]).set_custom_predict_fn(
    custom_predict)

However, now I am getting the following error from WIT.

object of type 'numpy.float64' has no len()

Here's the new example to df converter and custom_predict() for reference.

# Converts a list of tf.Example protos into a dataframe.
def examples_to_df(examples, columns=None):
    records = []
    if columns is None:
        raise Exception('Columns must be supplied.')
        
    # Handle single example case
    if not isinstance(examples, list):
        examples = [examples]

    for example in examples:
        record = []
        for col in columns:
            if col in example.features.feature:
                if len(example.features.feature[col].int64_list.value) == 1:
                    record.append(example.features.feature[col].int64_list.value[0])
                elif len(example.features.feature[col].float_list.value) == 1:
                    record.append(example.features.feature[col].float_list.value[0])
                elif len(example.features.feature[col].bytes_list.value) == 1:
                    record.append(example.features.feature[col].bytes_list.value[0].decode('utf-8'))
                else:
                    raise Exception('No value found in example for: ' + col)
        records.append(record)

    examples_df = pd.DataFrame(records, columns=columns)
    return examples_df

def custom_predict(examples_to_infer):
    model_ins = examples_to_df(examples_to_infer, csv_columns[:-1])
    preds = clf.predict(model_ins)
    return preds

Snapshot of standalone test:

[Screenshot attached: Screen Shot 2019-08-22 at 9.39.58 AM.png]

How do I debug this issue? Do we have any stack traces/logs generated by WIT somewhere?

Regards, 
Vaibhav Singh


James Wexler

Aug 22, 2019, 10:52:49 AM
to vsing...@gmail.com, What-If Tool
Thanks for the update! Glad that column issue is taken care of.

From the snapshot you sent of the standalone test, it seems your custom_predict_fn is still not returning its data in the right format. It will always be passed an array of examples, much like your first call "custom_predict(test_examples)". But your custom predict looks to be returning its results as a 1-D array of numbers as shown in your screenshot. Instead, it needs to return a 2-D array of numbers. The outer array should have one entry for each example in the provided test_examples.

So if test_examples was of length 2 (meaning it contains two examples to run through the model), and your model was binary classification, and for the first example it returned a score for the positive class of .3 and for the second it returned a score for the positive class of .9, then the returned result should be formatted as [[0.7, 0.3], [0.1, 0.9]].

If instead, your model is a regression model (so it only returns a single number for each example, not a set of class scores), then your 1-D array is the right format for the return of the custom_predict_fn but you must call .set_model_type('regression') on WitConfigBuilder to set it to regression model mode, instead of the default classification model mode.
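For the binary-classification case, expanding a 1-D array of positive-class scores into the required 2-D format can be sketched like this (the helper name is mine, not a WIT API):

```python
import numpy as np

def to_classification_scores(pos_scores):
    # Turn per-example positive-class scores, e.g. from the second column of
    # predict_proba() or a calibrated decision function, into the
    # [[P(class0), P(class1)], ...] shape WIT's classification mode expects.
    p = np.asarray(pos_scores, dtype=float)
    return np.stack([1.0 - p, p], axis=1)

# Matches the worked example above: positive scores 0.3 and 0.9
# become [[0.7, 0.3], [0.1, 0.9]].
to_classification_scores([0.3, 0.9])
```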


vsing...@gmail.com

Aug 22, 2019, 12:31:18 PM
to What-If Tool
Got it working with LogisticRegression and predict_proba(), which returns probabilities in the desired format. 
Thank you!

Regards, 
Vaibhav Singh

