Is it possible to use WIT (WitConfigBuilder) with Sklearn, H2O, SparkML?


vsing...@gmail.com

Aug 19, 2019, 4:13:53 PM
to What-If Tool
Hi, 
I am wondering whether WitConfigBuilder can be used with ML libraries other than TensorFlow. 
The API seems to have methods like set_estimator_and_feature_spec() that accept only TF-style classifiers and feature specs. 

Regards, 
Vaibhav Singh. 

Tolga Bolukbasi

Aug 19, 2019, 4:21:21 PM
to vsing...@gmail.com, What-If Tool
Hi Vaibhav,
Yes, you can use WIT with any Python predictor. The API you are looking for is "set_custom_predict_fn()". Here is an example notebook that uses this API to plug Keras models into the tool for text prediction: WIT text demo. Please let me know if you have any further questions!
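In outline, the contract is just a Python callable that takes a list of examples and returns per-example scores. A minimal sketch with a stand-in binary classifier; the stub model and all names here are illustrative assumptions, not from the linked demo:

```python
import numpy as np

# Stand-in for any non-TF model (sklearn, H2O, SparkML, ...).
class StubModel:
    def predict_proba(self, X):
        # Toy binary classifier: positive-class score grows with feature 0.
        p = np.clip(np.asarray(X, dtype=float)[:, 0] / 100.0, 0.0, 1.0)
        # One [P(class0), P(class1)] row per input example.
        return np.stack([1.0 - p, p], axis=1)

model = StubModel()

def custom_predict(examples):
    # In a real WIT setup, `examples` arrive as tf.Examples and must first
    # be converted to the 2-D feature matrix the model expects.
    return model.predict_proba(examples)

# Wiring (requires witwidget; shown for shape only):
# config_builder = WitConfigBuilder(examples).set_custom_predict_fn(custom_predict)
# WitWidget(config_builder)
```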

Best

--
You received this message because you are subscribed to the Google Groups "What-If Tool" group.
To unsubscribe from this group and stop receiving emails from it, send an email to what-if-tool...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/what-if-tool/62ebba49-24e4-48b5-b76a-ae01fcd27c77%40googlegroups.com.


--
Tolga Bolukbasi
Software Engineer, Google Brain PAIR
tol...@google.com

vsing...@gmail.com

Aug 20, 2019, 12:34:53 PM
to What-If Tool
Hi, 
Thanks for the prompt response. 
Per the example and the method signature, set_custom_predict_fn() still seems to accept only tf.Examples as the data input. 
Is there a way to pass conventional (2-D) feature vectors, or will we have to convert to tf.Examples and then back to the original 2-D form before passing to sklearn's model.predict() in the custom predict fn? Hope I make sense. 

Vaibhav Singh



vsing...@gmail.com

Aug 20, 2019, 12:37:29 PM
to What-If Tool
This is what I am referring to when I say it accepts only tf.Examples. 

WitConfigBuilder(examples[:num_datapoints]).set_custom_predict_fn(
  custom_predict_1)

James Wexler

Aug 20, 2019, 12:54:14 PM
to vsing...@gmail.com, What-If Tool
You are correct that currently this method only accepts tf.Examples. So you will need to convert your data to tf.Examples to pass to WitConfigBuilder, then convert them back from tf.Examples in your custom_predict_fn to send to your model. I know it's not ideal. In the future, I could imagine a change where if you provide JSON objects to WitConfigBuilder instead of tf.Examples, then the custom_predict_fn is provided the examples in JSON format as opposed to tf.Example. But currently that isn't how it works.
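A hedged sketch of that round trip, using tf.train.Example directly. The feature names and type handling are illustrative assumptions, not WIT utilities; numpy scalar types, multi-valued features, etc. are not handled:

```python
import tensorflow as tf

def row_to_example(row):
    # Pack one plain feature dict (name -> int/float/str) into a tf.Example.
    feats = {}
    for name, val in row.items():
        if isinstance(val, int):
            feats[name] = tf.train.Feature(
                int64_list=tf.train.Int64List(value=[val]))
        elif isinstance(val, float):
            feats[name] = tf.train.Feature(
                float_list=tf.train.FloatList(value=[val]))
        else:
            feats[name] = tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[val.encode('utf-8')]))
    return tf.train.Example(features=tf.train.Features(feature=feats))

def example_to_row(example, columns):
    # Unpack a tf.Example back into the flat row a sklearn model expects.
    # Check value lengths rather than hasattr(): a Feature proto always has
    # all three *_list attributes, so hasattr() is always True.
    f = example.features.feature
    row = []
    for col in columns:
        if len(f[col].int64_list.value):
            row.append(f[col].int64_list.value[0])
        elif len(f[col].float_list.value):
            row.append(f[col].float_list.value[0])
        else:
            row.append(f[col].bytes_list.value[0].decode('utf-8'))
    return row
```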


vsing...@gmail.com

Aug 20, 2019, 4:33:17 PM
to What-If Tool
I was able to get it working with sklearn. Thank you for the prompt responses.

Vaibhav Singh



vsing...@gmail.com

Aug 20, 2019, 5:21:08 PM
to What-If Tool
Hi again, 
However, unlike the Keras example, I am not getting the 'Inference label/score' option in the filter dropdown. 
I have tried SVM and SGDClassifier. What might I be missing? 


Relevant code snippets:

# Converts a list of tf.Example protos into a dataframe.
def examples_to_df(examples, columns=None):
    records = []
    if columns is None:
        raise Exception('Columns must be supplied.')
        
    for example in examples:
        record = []
        for col in columns:
            if col in example.features.feature:
                if hasattr(example.features.feature[col], 'int64_list'):
                    record.append(example.features.feature[col].int64_list.value[0])
                elif hasattr(example.features.feature[col], 'float_list'):
                    record.append(example.features.feature[col].float_list.value[0])
                else:
                    record.append(example.features.feature[col].bytes_list.value[0])
        records.append(record)
    examples_df = pd.DataFrame(records, columns=columns)
    return examples_df

def custom_predict(examples_to_infer):
    model_ins = examples_to_df(examples_to_infer)
    preds = clf.predict_proba(model_ins)
    return preds

# Setup the tool with the test examples and the trained classifier
test_csv_path = 'adult.test'
test_df = pd.read_csv(test_csv_path, names=csv_columns, skipinitialspace=True, skiprows=1)
label_encode(test_df)
x_test = test_df.drop([y_col], axis=1)
y_test = test_df[y_col]
test_examples = df_to_examples(x_test)

from witwidget.notebook.visualization import WitWidget, WitConfigBuilder
num_datapoints = 1000  #@param {type: "number"}
tool_height_in_px = 720  #@param {type: "number"}

# Setup the tool with the test examples and the trained classifier
config_builder = WitConfigBuilder(test_examples[0:num_datapoints]).set_custom_predict_fn(
    custom_predict).set_label_vocab(['Under 50K', 'Over 50K'])
WitWidget(config_builder, height=tool_height_in_px)

Regards,
Vaibhav Singh

James Wexler

Aug 20, 2019, 5:29:09 PM
to vsing...@gmail.com, What-If Tool
Do you see filled in inference results in the bottom left when you click on an example dot in the visualization? Do you see any error text in the top-right of the visualization?


vsing...@gmail.com

Aug 21, 2019, 11:00:52 AM
to What-If Tool
The answer to both is no. 
I also tried using the predict_proba() and decision_function() predictors on the model. Still no luck. 

[Screenshot attached: Screen Shot 2019-08-21 at 9.59.06 AM.png]




James Wexler

Aug 21, 2019, 11:11:03 AM
to vsing...@gmail.com, What-If Tool
The top-right shows that an error occurred while calling your function: "Columns must be supplied". It's possible that your custom predict function isn't converting the tf.Examples to the appropriate format before sending them to the model. Perhaps print out your examples in your custom prediction function after converting from tf.Example, but before sending them to the model, to see if the format has an issue? As for what the custom prediction function should return: it should be a 2-D array of numbers, such as:
[[example0Class0Score, example0Class1Score], [example1Class0Score, example1Class1Score], ...]


vsing...@gmail.com

Aug 21, 2019, 1:09:57 PM
to What-If Tool
Alright, so I noticed an error message when clicking the Run Inference button. I will work on it and get back if needed. Thanks. 

vsing...@gmail.com

Aug 22, 2019, 10:42:22 AM
to What-If Tool
Hi James, 
Thank you again for the prompt help. 
So I have been able to resolve the bug in custom_predict(). I have also tested it separately by passing all of test_examples, the same argument that is passed to 

WitConfigBuilder(test_examples[:num_datapoints]).set_custom_predict_fn(
    custom_predict)

However, now I am getting the following error from WIT.

object of type 'numpy.float64' has no len()

Here's the new example to df converter and custom_predict() for reference.

# Converts a list of tf.Example protos into a dataframe.
def examples_to_df(examples, columns=None):
    records = []
    if columns is None:
        raise Exception('Columns must be supplied.')
        
    # Handle single example case
    if not isinstance(examples, list):
        examples = [examples]

    for example in examples:
        record = []
        for col in columns:
            if col in example.features.feature:
                if len(example.features.feature[col].int64_list.value) == 1:
                    record.append(example.features.feature[col].int64_list.value[0])
                elif len(example.features.feature[col].float_list.value) == 1:
                    record.append(example.features.feature[col].float_list.value[0])
                elif len(example.features.feature[col].bytes_list.value) == 1:
                    record.append(example.features.feature[col].bytes_list.value[0].decode('utf-8'))
                else:
                    raise Exception('No value found in example for: ' + col)
        records.append(record)

    examples_df = pd.DataFrame(records, columns=columns)
    return examples_df

def custom_predict(examples_to_infer):
    model_ins = examples_to_df(examples_to_infer, csv_columns[:-1])
    preds = clf.predict(model_ins)
    return preds

Snapshot of standalone test:

[Screenshot attached: Screen Shot 2019-08-22 at 9.39.58 AM.png]

How do I debug this issue? Do we have any stack traces/logs generated by WIT somewhere?

Regards, 
Vaibhav Singh


James Wexler

Aug 22, 2019, 10:52:49 AM
to vsing...@gmail.com, What-If Tool
Thanks for the update! Glad that column issue is taken care of.

From the snapshot you sent of the standalone test, it seems your custom_predict_fn is still not returning its data in the right format. It will always be passed an array of examples, much like your first call "custom_predict(test_examples)". But your custom predict looks to be returning its results as a 1-D array of numbers as shown in your screenshot. Instead, it needs to return a 2-D array of numbers. The outer array should have one entry for each example in the provided test_examples.

So if test_examples was of length 2 (meaning it contains two examples to run through the model), and your model was binary classification, and for the first example it returned a score for the positive class of .3 and for the second it returned a score for the positive class of .9, then the returned result should be formatted as [[0.7, 0.3], [0.1, 0.9]].

If instead, your model is a regression model (so it only returns a single number for each example, not a set of class scores), then your 1-D array is the right format for the return of the custom_predict_fn but you must call .set_model_type('regression') on WitConfigBuilder to set it to regression model mode, instead of the default classification model mode.
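For the binary-classification case, expanding a 1-D array of positive-class scores into the required 2-D format can be sketched like this (the helper name is mine, not a WIT API):

```python
import numpy as np

def to_classification_scores(pos_scores):
    # Turn per-example positive-class scores, e.g. from the second column of
    # predict_proba() or a calibrated decision function, into the
    # [[P(class0), P(class1)], ...] shape WIT's classification mode expects.
    p = np.asarray(pos_scores, dtype=float)
    return np.stack([1.0 - p, p], axis=1)

# Matches the worked example above: positive scores 0.3 and 0.9
# become [[0.7, 0.3], [0.1, 0.9]].
to_classification_scores([0.3, 0.9])
```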


vsing...@gmail.com

Aug 22, 2019, 12:31:18 PM
to What-If Tool
Got it working with LogisticRegression and predict_proba(), which returns probabilities in the desired format. 
Thank you!

Regards, 
Vaibhav Singh

