I have been working on a pipeline that pushes data from App Engine into BigQuery. I define my mapper pipeline class the same way as in this article, except that I pass a "filters" parameter to the input_reader. Specifically, I am passing this:
params = {
    "input_reader": {
        "entity_kind": entity_type,
        "filters": [('pid', '=', pid), ('tags', '=', '2000')]
    },
    "output_writer": {
        "filesystem": "gs",
        "gs_bucket_name": settings.GOOGLE_CLOUD_STORAGE_BUCKET_NAME,
        "output_sharding": "input"
    }
}
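
As a rough sketch, the same dict could be built by a small helper that checks each filter is a (property, operator, value) tuple before it is handed to the mapper. The helper name build_mapper_params and its arguments are hypothetical and not part of the mapreduce library; the check only validates the shape of each filter and says nothing about which operators the input reader actually accepts.

```python
def build_mapper_params(entity_kind, filters, bucket_name):
    """Hypothetical helper: build a params dict shaped like the one above."""
    checked = []
    for f in filters:
        # Each filter is expected to be a 3-item (property, operator, value) tuple.
        if len(f) != 3:
            raise ValueError("filter must be (property, operator, value): %r" % (f,))
        prop, op, value = f
        if not isinstance(prop, str) or not isinstance(op, str):
            raise ValueError("property and operator must be strings: %r" % (f,))
        checked.append((prop, op, value))

    return {
        "input_reader": {
            "entity_kind": entity_kind,
            "filters": checked,
        },
        "output_writer": {
            "filesystem": "gs",
            "gs_bucket_name": bucket_name,
            "output_sharding": "input",
        },
    }
```

Calling build_mapper_params(entity_type, [('pid', '=', pid), ('tags', '=', '2000')], settings.GOOGLE_CLOUD_STORAGE_BUCKET_NAME) would then give the same structure as the literal dict, with a malformed filter raising a ValueError up front.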