DirectRunner vs DataflowRunner


Luis Chinchilla-Garcia

Jun 8, 2020, 12:28:48 PM
to TensorFlow Extended (TFX)
I've been having issues running a TFX pipeline with DataflowRunner. The same pipeline works perfectly with DirectRunner, but DataflowRunner fails with a "module not found" error, whether the module file is local or in a GCS bucket. Here is a very simple example in which DirectRunner runs fine and DataflowRunner errors out: https://colab.research.google.com/drive/1MZKvjXixKRX3JcLfkFPtUhfjm-w43J5c?usp=sharing.

This brings up two questions: Is this a problem with my setup? And if not, how much can we expect a successful DirectRunner run to indicate that the same pipeline will run on DataflowRunner?

Charles Chen

Jun 10, 2020, 4:48:38 PM
to TensorFlow Extended (TFX), Luis Chinchilla-Garcia
Hey Luis, could you provide the error message (with stacktrace if possible) on DataflowRunner?

Luis Chinchilla-Garcia

Jun 13, 2020, 11:49:51 PM
to TensorFlow Extended (TFX), Charles Chen, Luis Chinchilla-Garcia
Just in case anyone also has this issue, I'll put the errors here as well:

{
  "textPayload": "Error message from worker: Traceback (most recent call last):\n  File \"/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py\", line 279, in loads\n    return dill.loads(s)\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 317, in loads\n    return load(file, ignore)\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 305, in load\n    obj = pik.load()\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 474, in find_class\n    return StockUnpickler.find_class(self, module, name)\nModuleNotFoundError: No module named 'transform'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py\", line 650, in do_work\n    work_executor.execute()\n  File \"/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py\", line 150, in execute\n    test_shuffle_sink=self._test_shuffle_sink)\n  File \"/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py\", line 116, in create_operation\n    is_streaming=False)\n  File \"apache_beam/runners/worker/operations.py\", line 932, in apache_beam.runners.worker.operations.create_operation\n  File \"apache_beam/runners/worker/operations.py\", line 766, in apache_beam.runners.worker.operations.create_pgbk_op\n  File \"apache_beam/runners/worker/operations.py\", line 822, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__\n  File \"/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py\", line 283, in loads\n    return dill.loads(s)\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 317, in loads\n    return load(file, ignore)\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 305, in load\n    obj = pik.load()\n  File \"/usr/local/lib/python3.7/site-packages/dill/_dill.py\", line 474, in find_class\n    return StockUnpickler.find_class(self, module, name)\nModuleNotFoundError: No module named 'transform'\n",
  "insertId": "170d7tpc820",
  "resource": {
    "type": "dataflow_step",
    "labels": {
      "job_id": "2020-05-28_18_10_06-2507353518970581886",
      "region": "us-central1",
      "job_name": "lgc-from-20200401-to-20200401-run-at-2020-05-29-004632",
      "project_id": "633268511115",
      "step_id": ""
    }
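
For anyone trying to read the traceback above: the failure happens while the worker is unpickling the pipeline (dill.loads, ending in find_class), which suggests the pickled payload refers to a module named transform by name and the worker cannot import it. Here is a minimal sketch of that failure mode. It is not TFX or Beam code, and it makes some assumptions: it uses the standard library pickle as a stand-in for dill, a temporary file called transform.py as a hypothetical stand-in for the module file, and the function names preprocessing_fn and demo_missing_module_on_unpickle are made up for illustration.

```python
import importlib
import os
import pickle
import sys
import tempfile


def demo_missing_module_on_unpickle():
    """Pickle a function by reference, then unpickle it without its module."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical stand-in for a user module file named transform.py.
        with open(os.path.join(tmp_dir, "transform.py"), "w") as f:
            f.write("def preprocessing_fn(inputs):\n    return inputs\n")

        # "Launcher" side: the module is importable, so pickling succeeds.
        # pickle stores only the names "transform" and "preprocessing_fn",
        # not the function's code.
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        transform = importlib.import_module("transform")
        payload = pickle.dumps(transform.preprocessing_fn)

        # "Worker" side: simulate a machine where transform.py is absent.
        sys.path.remove(tmp_dir)
        del sys.modules["transform"]
        importlib.invalidate_caches()
        try:
            pickle.loads(payload)
        except ModuleNotFoundError as err:
            return str(err)
    return None


if __name__ == "__main__":
    print(demo_missing_module_on_unpickle())
```

If this is what is going on, it would also explain why DirectRunner succeeds: everything runs in the one process where the module was already imported, so the lookup by name works. It does not by itself say how the module should be shipped to the Dataflow workers.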

