Issue with Instant BQML Prediction Pipeline

Alena Shambaleva

Mar 12, 2025, 6:31:24 AM
to instant-bqml...@googlegroups.com

Hello,

I hope you're doing well. I'm reaching out regarding an issue I encountered while running the prediction pipeline using the BQML v1.5 templates on Instant BQML.

I used the Churn Prediction templates from Instant BQML (version 1.5). The training pipeline finished successfully and created a training dataset and a model. However, the prediction pipeline failed at the Format Scores step with the following error:

Prediction Pipeline Error:

  • Job: Format Scores
  • Worker Class: BQWaiter
  • Execution failed: WorkerException: Query error: Name client_id not found inside mp at [89:6]

I’ve reviewed my setup but haven't identified any missing steps that could have caused this issue. Could you please help me troubleshoot this error, or let me know if any additional configuration is required? I'm attaching the generated template and the SQL that failed.

Thanks in advance for your assistance!

Best regards,
Alena

FormatScores.sql
prediction_pipeline_msg_churn_prob.json

Pat Grady

Mar 12, 2025, 10:06:55 AM
to Alena Shambaleva, instant-bqml...@googlegroups.com
At first glance, for some reason, the query aliases the predict.user_pseudo_id as app_instance_id, which makes little sense to me.  

I'd change that line from:
Predict.user_pseudo_id AS app_instance_id,
to
Predict.user_pseudo_id AS client_id,

The part that throws the error looks in the `_measurement_protocol_formatted_ios` table for `client_id` and does not find it, because that column is not there. In the GA4 BigQuery export, `user_pseudo_id` is the field that holds the GA4 client ID, so the fix above will likely solve your problem.




Alena Shambaleva

Mar 12, 2025, 12:22:06 PM
to Pat Grady, instant-bqml...@googlegroups.com
Hello Pat!
Thanks a lot for your response. I changed app_instance_id to client_id as you suggested, and the Format Scores step passed. However, the Events to GA4 job is now failing:
Job: Events to GA4, Worker Class: BQToMeasurementProtocolProcessorGA4
Unexpected error
Traceback (most recent call last):
  File "/workspace/jobs_app.py", line 73, in start_task
    workers_to_enqueue = worker_inst.execute()
  File "/workspace/jobs/workers/worker.py", line 119, in execute
    self._execute()
  File "/workspace/jobs/workers/bigquery/bq_to_measurement_protocol_ga4.py", line 168, in _execute
    self._stream_rows(first_page, url_param)
  File "/workspace/jobs/workers/bigquery/bq_to_measurement_protocol_ga4.py", line 147, in _stream_rows
    payload = template.substitute(dict(row.items()))
  File "/layers/google.python.runtime/python/lib/python3.9/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
  File "/layers/google.python.runtime/python/lib/python3.9/string.py", line 114, in convert
    return str(mapping[named])
KeyError: 'app_instance_id'
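
For context, the traceback ends inside Python's standard `string.Template.substitute`, which raises `KeyError` whenever the template contains a placeholder that the supplied mapping lacks. The sketch below reproduces that behavior in isolation. The shortened template and the sample row are assumptions made up for illustration; they are not taken from the pipeline.

```python
from string import Template

# Hypothetical, shortened payload template; only the placeholder names matter.
template = Template('{"app_instance_id": "${app_instance_id}", "value": "${e_value}"}')

# Hypothetical row, shaped like the table after the column was renamed to client_id.
row = {"client_id": "123.456", "e_value": "0.87"}

missing = None
try:
    template.substitute(row)
except KeyError as err:
    # substitute() looks up every placeholder in the mapping and fails on the first gap.
    missing = err.args[0]

print(f"placeholder without a matching column: {missing}")
```

In other words, renaming the column in Format Scores only moves the mismatch one step further down, to the job that fills in the payload template.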


Meanwhile, which version of the model templates would you suggest using? I used v1.5 (https://instant-bqml.appspot.com).
Thanks in advance!

Pat Grady

Mar 12, 2025, 1:02:13 PM
to Alena Shambaleva, instant-bqml...@googlegroups.com
I was afraid of a downstream dependency. I had assumed that the app_instance_id alias was a fluke in your example. I'm unsure about the reasoning behind using app_instance_id in the iBQML code; IMO, it should not have been used. To resolve the issue, flip the approach: revert the previous change, then change client_id to what is expected downstream, 'app_instance_id':

CREATE OR REPLACE TABLE
  `{{ CRMINT_PROJECT_ID }}.{{ BQ_DATASET }}.{{ BQ_NAMESPACE }}_measurement_protocol_formatted_session_attribution_ios` AS
SELECT
  mp.parameter_name,
  -- mp.client_id,  (replaced by the line below)
  mp.app_instance_id,
  mp.e_value,
  mp.last_event_timestamp,
  mp.session_id


I think the downstream consumer is described in your prediction pipeline JSON:
{
  "description": null,
  "value": "{\n  \"app_instance_id\": \"${app_instance_id}\",\n  \"timestamp_micros\": \"${last_event_timestamp}\",\n  \"consent\": {\n      \"ad_user_data\": \"GRANTED\",\n      \"ad_personalization\": \"GRANTED\"\n  },\n  \"events\": [\n    {\n      \"name\": \"${parameter_name}_iBQML\",\n      \"params\": {\n        \"value\": \"${e_value}\",\n        \"session_id\": \"${session_id}\"\n      }\n    }\n  ]\n}",
  "label": "GA4 Measurement Protocol JSON template",
  "is_required": false,
  "type": "text",
  "name": "template"
}
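
One way to catch this kind of mismatch before the pipeline runs is to compare the placeholders in the template with the columns the SQL step produces. The following is a rough sketch of that idea, not part of Instant BQML or CRMint: the helper names `placeholders` and `missing_columns` are hypothetical, and the template text is the unescaped `value` string from the JSON above.

```python
from string import Template

def placeholders(template_text):
    """Collect the $name and ${name} placeholders used in a string.Template text."""
    names = set()
    # Template.pattern is the compiled regex the standard library uses for substitution.
    for match in Template.pattern.finditer(template_text):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return names

def missing_columns(template_text, columns):
    """Return placeholders that have no matching column, sorted for stable output."""
    return sorted(placeholders(template_text) - set(columns))

# The "value" field of the template parameter, with the JSON escaping removed.
MP_TEMPLATE = """{
  "app_instance_id": "${app_instance_id}",
  "timestamp_micros": "${last_event_timestamp}",
  "consent": {
      "ad_user_data": "GRANTED",
      "ad_personalization": "GRANTED"
  },
  "events": [
    {
      "name": "${parameter_name}_iBQML",
      "params": {
        "value": "${e_value}",
        "session_id": "${session_id}"
      }
    }
  ]
}"""

# Columns as selected with mp.client_id (the first attempted fix).
columns_with_client_id = [
    "parameter_name", "client_id", "e_value", "last_event_timestamp", "session_id",
]
print(missing_columns(MP_TEMPLATE, columns_with_client_id))
```

With `client_id` in the column list, the check reports `app_instance_id` as unmatched; swapping the column back to `app_instance_id` leaves nothing missing.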


PS. And Google, cover your ears if this offends you... but setting `session_id` in an offline prediction over measurement protocol is always "Nope" in our book of business. So, IMO, be ready to hose your GA4 measurement if you proceed with that MP template... That's all I'll say about that.

Instant BQML and Vertex Users

Mar 12, 2025, 1:28:09 PM
to Instant BQML and Vertex Users

Hi everyone,

Thank you for bringing the issue with iOS and Android environment handling to our attention.

We've identified a bug in Instant BQML/Vertex related to this, and we're pleased to announce that a fix will be released in v1.6.

We'll share an update as soon as the release is available.

Thank you for your patience and for helping us improve Instant BQML/Vertex.


Instant BQML and Vertex Users

Mar 12, 2025, 1:49:21 PM
to Instant BQML and Vertex Users

Hi everyone,

We're pleased to announce that version 1.6 of Instant BQML/Vertex has been released!

This update addresses the issue with iOS and Android environment handling that some of you reported. The key change is:

  • Dynamic Client/App Instance ID in Prediction Pipeline Jobs: The Format Scores and Scored Users Log jobs in the Prediction Pipeline have been updated to dynamically set client_id or app_instance_id based on the environment. This ensures accurate user identification across different deployment scenarios (web, iOS app, Android app).
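
As a rough illustration of what choosing the identifier by environment can look like, the sketch below picks the payload key from a platform label. This is not the v1.6 implementation: the platform labels, `ID_FIELD_BY_PLATFORM`, `identifier_field`, and `build_payload` are all hypothetical names. The only fact it relies on is that the GA4 Measurement Protocol identifies web streams by `client_id` and app streams by `app_instance_id`.

```python
# Hypothetical mapping from a platform label to the Measurement Protocol identifier key.
ID_FIELD_BY_PLATFORM = {
    "web": "client_id",
    "ios": "app_instance_id",
    "android": "app_instance_id",
}

def identifier_field(platform):
    """Return the identifier key for a platform label, rejecting unknown labels."""
    try:
        return ID_FIELD_BY_PLATFORM[platform.lower()]
    except KeyError:
        raise ValueError(f"unknown platform: {platform!r}") from None

def build_payload(platform, user_pseudo_id, event_name, value):
    """Build a minimal Measurement Protocol style payload keyed by the right identifier."""
    return {
        identifier_field(platform): user_pseudo_id,
        "events": [{"name": event_name, "params": {"value": value}}],
    }

ios_payload = build_payload("ios", "ABC123", "churn_iBQML", 0.87)
web_payload = build_payload("web", "123.456", "churn_iBQML", 0.87)
print(sorted(ios_payload), sorted(web_payload))
```

Whichever key is chosen has to be used consistently in the SQL column alias and in the payload template placeholder, which is the mismatch this thread ran into.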

Upgrade Recommendation:

If you were experiencing issues, we strongly recommend that you migrate/upgrade your pipelines. Please consult the upgrade guides for both Instant BQML and Instant Vertex to ensure a smooth transition:

We encourage all users to review the full changelog for a complete list of changes and improvements in v1.6.

Thank you again for your valuable feedback, which helps us continuously improve Instant BQML and Vertex.
