Increasing Bazel timeouts during a build?


wdi...@us.ibm.com

Sep 24, 2018, 4:59:19 PM
to SIG Build
This change (https://github.com/tensorflow/tensorflow/commit/7229d08f0b25e24e6dd4833a94a27f404b27a350) unintentionally broke the ppc64le build by causing a timeout.

What happens is that it reads https://github.com/tensorflow/tensorflow/blob/master/tensorflow/requirements.txt, figures out the dependencies of those packages (numpy, keras, h5py, scipy, six, pyyaml), and downloads the whl file for each package from PyPI. If no whl file is available, it builds one.

For x86, the only package that doesn't have a whl file is pyyaml. That wheel builds pretty quickly.

For ppc64le, it has to build whl files for numpy, h5py, scipy, and pyyaml. The build rule times out after 10 minutes, before it finishes building the whl file for scipy.


Any ideas how I can resolve this? Is there a command line option I can pass to bazel build?



Here is the condensed output:

Skipping keras-applications, due to already being wheel.
Skipping keras-preprocessing, due to already being wheel.
Skipping keras, due to already being wheel.
Skipping six, due to already being wheel.
Building wheels for collected packages: numpy, h5py, scipy, pyyaml
  Running setup.py bdist_wheel for numpy: started
  Running setup.py bdist_wheel for numpy: finished with status 'done'
  Running setup.py bdist_wheel for h5py: started
  Running setup.py bdist_wheel for h5py: finished with status 'done'
  Running setup.py bdist_wheel for scipy: started
ERROR: error loading package '': Encountered error while reading extension file 'requirements.bzl': no such package '@pip_deps//': pip_import failed:  (Timed out)
ERROR: error loading package '': Encountered error while reading extension file 'requirements.bzl': no such package '@pip_deps//': pip_import failed:  (Timed out)
INFO: Elapsed time: 600.940s
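
For longer logs, the package that was still building when the timeout hit can be picked out mechanically. A minimal Python sketch, assuming pip's "Running setup.py bdist_wheel for NAME: started/finished" lines look like the ones above; `unfinished_builds` is a hypothetical helper, not part of pip or Bazel:

```python
import re

# Sample copied from the condensed output above.
PIP_LOG = """\
Building wheels for collected packages: numpy, h5py, scipy, pyyaml
  Running setup.py bdist_wheel for numpy: started
  Running setup.py bdist_wheel for numpy: finished with status 'done'
  Running setup.py bdist_wheel for h5py: started
  Running setup.py bdist_wheel for h5py: finished with status 'done'
  Running setup.py bdist_wheel for scipy: started
"""

# Matches one build step line and captures the package name and its state.
STEP = re.compile(r"Running setup\.py bdist_wheel for (\S+): (started|finished)")


def unfinished_builds(log):
    """Return packages whose wheel build started but never finished, in log order."""
    pending = []
    for match in STEP.finditer(log):
        name, state = match.groups()
        if state == "started":
            pending.append(name)
        elif name in pending:
            pending.remove(name)
    return pending


print(unfinished_builds(PIP_LOG))
```

On the output above this reports scipy as the only build that started without finishing.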

Gunhan Gulsoy

Sep 24, 2018, 5:01:43 PM
to wdi...@us.ibm.com, Michael Case, bu...@tensorflow.org
+cc mikecase, author of the change.

We included these in our workspace to ensure that the build environment can satisfy all the requirements.
What happens if you preinstall all of these in your environment?


wdi...@us.ibm.com

Sep 24, 2018, 5:21:18 PM
to SIG Build
I'm doing a build using the Dockerfile, so everything should be preinstalled.

./tensorflow/tools/ci_build/ci_build.sh gpu --dockerfile tensorflow/tools/ci_build/Dockerfile.gpu.ppc64le pip list

shows I have (among other packages):

h5py (2.8.0)
Keras-Applications (1.0.5)
Keras-Preprocessing (1.0.3)
numpy (1.15.2)
scipy (1.1.0)
six (1.11.0)

I don't have keras or pyyaml installed yet, so I can try installing them, but I don't think the logic takes into account what is already installed.
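
One way to check what the interpreter itself can see, independent of what pip_import decides to do, is to look each requirement up by name. A rough sketch, assuming Python 3.8 or newer for `importlib.metadata`; `missing_requirements` is a hypothetical helper and the package list is copied from the first message:

```python
from importlib import metadata


def missing_requirements(names, lookup=metadata.version):
    """Return the names for which `lookup` raises PackageNotFoundError."""
    missing = []
    for name in names:
        try:
            lookup(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Package names taken from the requirements discussed in this thread.
    wanted = ["numpy", "keras", "h5py", "scipy", "six", "pyyaml"]
    print(missing_requirements(wanted))
```

This only reports what is installed in the current environment; it does not change whether the Bazel rule downloads or rebuilds the wheels.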

wdi...@us.ibm.com

Sep 25, 2018, 9:28:31 AM
to SIG Build
For the record, doing a pip install of keras (which installs pyyaml) in the Dockerfile didn't make a difference.

Also, when you pip install keras, it downgrades keras-applications and keras-preprocessing. For this reason, the install-from-source instructions (https://www.tensorflow.org/install/source) have you use the --no-deps flag when doing the pip install of keras-applications and keras-preprocessing. That parameter is not valid in the requirements.txt file.

Is there a timeout I can add to the bazel command line so this step of generating the whl files would not time out?

Gunhan Gulsoy

Sep 25, 2018, 5:48:09 PM
to Michael Case, wdi...@us.ibm.com, bu...@tensorflow.org
Michael, looks like this is not working as we intended it to.
Should we roll it back?

On Tue, Sep 25, 2018 at 11:04 AM Michael Case <mike...@google.com> wrote:
Hmm, I don't have a solution for you off the top of my head. Will look into this, bug the Bazel team, and get back to you. Sorry for the breakage.

Gunhan Gulsoy

Sep 26, 2018, 4:33:02 AM
to Michael Case, wdi...@us.ibm.com, bu...@tensorflow.org
Also, just to check: the installation of keras_applications and friends is rather involved.
I had to first install liblapack-dev, libblas-dev, gfortran and libhdf5-dev just to be able to pip install keras_applications and friends.
Is it possible the installation of these pip packages somehow failed?
I ask because on my x86 workstation (admittedly non-docker) it does not try to reinstall these packages.

On Tue, Sep 25, 2018 at 2:57 PM Michael Case <mike...@google.com> wrote:
Sounds good to me. I'm afk for a few hours. Feel free to roll it back in the meantime.

Koan-Sin Tan

Sep 26, 2018, 4:37:10 AM
to Gunhan Gulsoy, Michael Case, wdi...@us.ibm.com, bu...@tensorflow.org
FYI, this also caused problems on other platforms, e.g., a native build on an RPi 3B.

wdi...@us.ibm.com

Sep 26, 2018, 7:44:41 AM
to SIG Build
Thanks for rolling back the change.


Gunhan, to check what is happening on your system, I should have mentioned how I got the extra logging:

[This is from my debugging on x86 on Monday]

After running it, you should have the python rules in the bazel cache:
wdirons@xxxxx:~/tensorflow/bazel-ci_build-cache/.cache/bazel/_bazel_wdirons/eab0d61a99b6696edb3d2aff87b585e8/external/io_bazel_rules_python/python$ ls -l
total 20
-rw-rw-r-- 1 wdirons wdirons  760 Apr 16 17:07 BUILD
-rw-rw-r-- 1 wdirons wdirons 3005 Sep 24 14:56 pip.bzl
-rw-rw-r-- 1 wdirons wdirons 1243 Apr 16 17:07 python.bzl
-rw-rw-r-- 1 wdirons wdirons   71 Apr 16 17:07 requirements.txt
-rw-rw-r-- 1 wdirons wdirons 2567 Apr 16 17:07 whl.bzl


You can modify line 32 of pip.bzl to include the quiet=False parameter. That will cause additional logging that shows what it is doing.
wdirons@xxxxx:~/tensorflow/bazel-ci_build-cache/.cache/bazel/_bazel_wdirons/eab0d61a99b6696edb3d2aff87b585e8/external/io_bazel_rules_python/python$ vi pip.bzl


  # To see the output, pass: quiet=False
  result = repository_ctx.execute([
    "python", repository_ctx.path(repository_ctx.attr._script),
    "--name", repository_ctx.attr.name,
    "--input", repository_ctx.path(repository_ctx.attr.requirements),
    "--output", repository_ctx.path("requirements.bzl"),
    "--directory", repository_ctx.path(""),
  ], quiet=False)
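
The 10-minute limit most likely comes from this same call: as far as I know, repository_ctx.execute takes a timeout argument in seconds that defaults to 600, which lines up with the 600.940s elapsed time reported earlier. The Python sketch below is only an analogy for that behavior, not Bazel code; `run_with_timeout` is a hypothetical helper and the sleeping child process stands in for a slow wheel build:

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv; return (finished, return_code). return_code is None on timeout."""
    try:
        # subprocess.run kills the child and raises once the deadline passes.
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return False, None
    return True, completed.returncode


if __name__ == "__main__":
    # Stand-in for a slow build: a child that sleeps longer than the limit.
    slow_build = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_build, timeout_seconds=1))
```

If the assumption about the default holds, a larger limit would have to be set where execute is called in pip.bzl, in the same way quiet=False is added above.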



You'll also find the whl files it downloaded or built in the cache after the build.
wdirons@xxxxx:~/tensorflow/bazel-ci_build-cache/.cache/bazel/_bazel_wdirons/eab0d61a99b6696edb3d2aff87b585e8/external/pip_deps$ ls -l
total 46716
-rwxr-xr-x 1 wdirons wdirons        0 Sep 24 14:57 BUILD
-rw-r--r-- 1 wdirons wdirons  2725575 Sep 24 14:57 h5py-2.8.0-cp27-cp27mu-manylinux1_x86_64.whl
-rw-r--r-- 1 wdirons wdirons   299262 Sep 24 14:57 Keras-2.2.2-py2.py3-none-any.whl
-rw-r--r-- 1 wdirons wdirons    44214 Sep 24 14:57 Keras_Applications-1.0.5-py2.py3-none-any.whl
-rw-r--r-- 1 wdirons wdirons    28326 Sep 24 14:57 Keras_Preprocessing-1.0.3-py2.py3-none-any.whl
-rw-r--r-- 1 wdirons wdirons 13835039 Sep 24 14:57 numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl
-rw-r--r-- 1 wdirons wdirons    44230 Sep 24 14:57 PyYAML-3.13-cp27-cp27mu-linux_x86_64.whl
-rw-r--r-- 1 wdirons wdirons     2934 Sep 24 14:57 requirements.bzl
-rw-r--r-- 1 wdirons wdirons 30828433 Sep 24 14:57 scipy-1.1.0-cp27-cp27mu-manylinux1_x86_64.whl
-rw-r--r-- 1 wdirons wdirons    10702 Sep 24 14:57 six-1.11.0-py2.py3-none-any.whl
-rw-r--r-- 1 wdirons wdirons      103 Sep 24 14:57 WORKSPACE
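
The platform tag at the end of each file name hints at where the wheel came from. A small sketch, assuming the standard wheel naming scheme (name-version-python-abi-platform.whl) and the heuristic that a plain linux_* tag means the wheel was built locally rather than downloaded; `platform_tag` and `classify` are hypothetical helpers:

```python
def platform_tag(wheel_filename):
    """Return the platform tag, the last dash-separated field before '.whl'."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + wheel_filename)
    stem = wheel_filename[: -len(".whl")]
    return stem.rsplit("-", 1)[1]


def classify(wheel_filename):
    """Guess the origin of a wheel from its platform tag (heuristic only)."""
    tag = platform_tag(wheel_filename)
    if tag == "any":
        return "pure Python"
    if tag.startswith("manylinux"):
        return "prebuilt binary (manylinux)"
    return "likely built locally"


# File names copied from the directory listing above.
for name in [
    "numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl",
    "PyYAML-3.13-cp27-cp27mu-linux_x86_64.whl",
    "six-1.11.0-py2.py3-none-any.whl",
]:
    print(name, "->", classify(name))
```

In the listing above, only PyYAML carries a plain linux_x86_64 tag, which matches pyyaml being the one package built from source on x86.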

Gunhan Gulsoy

Sep 26, 2018, 6:48:15 PM
to wdi...@us.ibm.com, bu...@tensorflow.org
Thanks for the additional debugging information.
I investigated a little more, and it looks like these pip packages are not supposed to be build-time dependencies of ours.
So I rolled back the change that added the pip dependencies.