Updating LLVM/MLIR in TF for local testing


Uday Bondhugula

May 25, 2020, 10:31:17 AM
to MLIR
Hi all,

Is there a typical workflow for updating the LLVM/MLIR version used by a TensorFlow tree cloned from git? This is mainly for local testing, where one would like to apply commits on top of the LLVM/MLIR commit ID currently used rather than waiting for such MLIR commits to appear upstream and for TensorFlow to bump its LLVM version. I notice that tensorflow/workspace.bzl holds the LLVM commit ID and the URLs to download from, and that it's updated by commits with a title like "Bump open source llvm revision to ...". But the git commit history has other commits with titles like "Integrate LLVM at https://github.com/llvm/llvm-project/commit/2e499eee5884"; these don't update commit references anywhere in the build config (example appended below), and their commit IDs differ from, and appear to be ahead of, the one in workspace.bzl. Is the bazel build config just getting its commit ID from workspace.bzl? In that case, what do the commit IDs in "Integrate LLVM at ..." mean?

Back to the question of patching the LLVM/MLIR version being used, I can see the source tree in the bazel cache under external/llvm-project/mlir/, but that isn't a git checkout and so one can't apply a commit and just rebuild. So just patching the tree over there was the best I could figure out, but this automatically generated directory isn't meant for modification (the README just above external/ warns against this). Even if it's modified, I couldn't tell how to trigger the LLVM/MLIR rebuild. 

Thanks,
Uday


----------------------------
$ git show  3245c2f87e4347347542f3f8181d2024ced68287
commit 3245c2f87e4347347542f3f8181d2024ced68287
Author: A. Unique TensorFlower <gard...@tensorflow.org>
Date:   Tue May 19 09:44:36 2020 -0700

    
    PiperOrigin-RevId: 312297705
    Change-Id: I0487894138d9a80b9e0d288808bedd7fc9ba6780

diff --git a/third_party/mlir/BUILD b/third_party/mlir/BUILD
index 1ad94212dc..5ebcbb6e3d 100644
--- a/third_party/mlir/BUILD
+++ b/third_party/mlir/BUILD
@@ -2680,6 +2680,7 @@ cc_binary(
     srcs = ["tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp"],
     linkshared = True,
     deps = [
+        ":mlir_c_runner_utils",
         "//third_party/gpus/cuda:cuda_headers",
         "//third_party/gpus/cuda:cuda_runtime",
         "//third_party/gpus/cuda:libcuda",
-------------------------------------------------------

Alex Zinenko

May 25, 2020, 10:50:46 AM
to Uday Bondhugula, MLIR
Hi Uday,

you should be able to change workspace.bzl to point to a different commit and even a different repository. You'll need to update LLVM_COMMIT and LLVM_SHA256, and possibly LLVM_URLS if you want a different repository. LLVM_SHA256 should contain the sha256sum of the .tar.gz containing the LLVM code; just download the file yourself and hash it. This is exactly what "bump" commits do.
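
The manual bump described above can be sketched in a few lines of Python. This is only an illustration, not how the actual "bump" commits are produced: the helper names `sha256_of_file` and `bump_llvm` are made up here, and the sketch assumes that workspace.bzl contains plain assignments of the form `LLVM_COMMIT = "..."` and `LLVM_SHA256 = "..."` and that the archive has already been downloaded to a local file.

```python
import hashlib
import re


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bump_llvm(workspace_text, commit, sha256):
    """Return workspace.bzl text with LLVM_COMMIT and LLVM_SHA256 replaced.

    Assumption: each name is assigned exactly once, as NAME = "value".
    """
    for name, value in (("LLVM_COMMIT", commit), ("LLVM_SHA256", sha256)):
        pattern = r'(%s\s*=\s*")[^"]*(")' % name
        workspace_text, count = re.subn(
            pattern,
            lambda m, v=value: m.group(1) + v + m.group(2),
            workspace_text,
        )
        if count != 1:
            raise ValueError(
                "expected exactly one %s assignment, found %d" % (name, count)
            )
    return workspace_text
```

With the tarball for the desired commit saved locally, one would call `sha256_of_file` on it and pass the result, together with the commit ID, to `bump_llvm` on the contents of tensorflow/workspace.bzl. LLVM_URLS is left untouched, so pointing at a different repository still needs a manual edit.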

The "integrate" commits happen when some version of LLVM is pulled into our central repository, which also contains TensorFlow; they usually include API and build fixes that reflect LLVM's current state _other_ than updating workspace.bzl. Some of those don't require a change to TensorFlow, so they are not necessarily followed by a "bump" commit.

--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/3d81f133-8304-402d-8fff-4d5e68c776a6%40tensorflow.org.


--
Alex

Uday Kumar Reddy Bondhugula

May 25, 2020, 11:06:54 AM
to Alex Zinenko, Uday Bondhugula, MLIR
Hi Alex,

Thanks, but this way, if you are working on a tree to update LLVM/MLIR the way it's needed, you'd have to keep pushing it to the repo, update commit ID/hash and the bazel build will download/untar/rebuild (I hope the compilation cache helps here - it's otherwise impractical in the dev cycle). But would there be a way to avoid this roundtrip overhead and do this update/rebuild locally? Perhaps by not using a tf_http_archive but setting it up to use something local?

- Uday

Mehdi AMINI

May 25, 2020, 1:43:28 PM
to Uday Kumar Reddy Bondhugula, Alex Zinenko, Uday Bondhugula, MLIR
Can you try --override_repository=llvm-project=<path> ?

-- 
Mehdi

Uday Bondhugula

May 25, 2020, 2:08:55 PM
to MLIR
Thanks very much - this is almost perfect, except that another tree won't work without the WORKSPACE and BUILD files that tensorflow/workspace.bzl specifies. This can be circumvented, though, by first copying the tree out of the bazel cache, basing changes on that copy, and then using the override you suggest. bazel build does build correctly using files from this override.

- Uday
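
For reference, the copy-then-override workflow described above could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe for any particular TensorFlow revision: the helper names are hypothetical, the extracted tree is assumed to live under `external/llvm-project` inside the directory reported by `bazel info output_base`, and an absolute path is passed to `--override_repository` to be on the safe side.

```python
import os
import shutil
import subprocess


def bazel_output_base(workspace_dir):
    """Ask Bazel for its output base; external repositories are extracted under it."""
    result = subprocess.run(
        ["bazel", "info", "output_base"],
        cwd=workspace_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def copy_llvm_tree(output_base, dest):
    """Copy the extracted llvm-project tree, including the WORKSPACE and BUILD
    files that TensorFlow added to it, to `dest` (which must not exist yet)."""
    src = os.path.join(output_base, "external", "llvm-project")
    shutil.copytree(src, dest)
    return os.path.abspath(dest)


def override_build_command(llvm_path, target):
    """Build the bazel command line that uses the local tree, not the download."""
    return [
        "bazel",
        "build",
        "--override_repository=llvm-project=" + os.path.abspath(llvm_path),
        target,
    ]
```

A typical sequence would be to run a normal build once so that the archive is fetched, call `copy_llvm_tree(bazel_output_base(tf_dir), some_local_dir)`, apply the LLVM/MLIR changes in that copy, and then run the command returned by `override_build_command` from the TensorFlow checkout (for example via `subprocess.run(cmd, cwd=tf_dir, check=True)`). Here `tf_dir`, `some_local_dir`, and the build target are placeholders.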
 


Yong Tang

May 25, 2020, 11:45:58 PM
to MLIR
Hi Uday,

Bazel has the `new_local_repository` (see https://docs.bazel.build/versions/master/be/workspace.html#new_local_repository) which allows you to use a local directory as a repo.

You will need to change the `tf_http_archive(name = "llvm-project",...)` section in tensorflow/workspace.bzl into new_local_repository, and add the necessary BUILD files for the llvm, mlir, and mlir/test directories in your local llvm directory.

The BUILD files could be copied from the tensorflow repo (see `tf_http_archive(name = "llvm-project"...)`).

Thanks
Yong

Mehdi AMINI

May 25, 2020, 11:59:59 PM
to Yong Tang, MLIR
Hey Yong,



Is there a practical advantage to modifying the workspace to use local_repository instead of using `--override_repository=llvm-project=<path>` on the command line? (I'm no bazel expert, so I'm curious about it.)

-- 
Mehdi
 


Uday Kumar Reddy Bondhugula

May 26, 2020, 12:40:35 AM
to Mehdi AMINI, Yong Tang, MLIR

Thanks, Yong! It's perhaps a recent feature. I have bazel 3.0.0, and new_local_repository isn't supported (error below). But it looks like that would work as well without needing a cmd-line override. The caveat is that removing tf_http_archive would mean that part wouldn't get updated during subsequent upstream pulls, and will have to be manually adjusted in case the vanilla repo is to be tried.

----------------
...
ERROR: /data/tensorflow/tensorflow/workspace.bzl:676:5: name 'new_local_repository' is not defined
ERROR: error loading package '': Extension 'tensorflow/workspace.bzl' has errors
...

- Uday

 

Mehdi AMINI

May 26, 2020, 12:49:14 AM
to Uday Kumar Reddy Bondhugula, Yong Tang, MLIR


I think you need to use `native.` as a prefix: so `native.new_local_repository` (don't ask me why :))

-- 
Mehdi

Uday Bondhugula

May 26, 2020, 12:55:09 AM
to MLIR




Unfortunately, this doesn't work either (error below). I checked the manual but found nothing to the contrary. With:

    native.new_local_repository(
        name = "llvm-project",
        path = "/data/tensorflow/third_party/llvm-project"
    )

=====================
ERROR: /data/tensorflow/tensorflow/workspace.bzl:676:5: Traceback (most recent call last):
File "/data/tensorflow/WORKSPACE", line 19
tf_repositories()
File "/data/tensorflow/tensorflow/workspace.bzl", line 676, in tf_repositories
native.new_local_repository(<2 more arguments>)
new_local_repository rule //external:llvm-project's name field must be a legal workspace name; workspace names may contain only A-Z, a-z, 0-9, '-', '_' and '.'
ERROR: error loading package '': Encountered error while reading extension file 'repositories/repositories.bzl': no such package '@bazel_toolchains//repositories': error loading package 'external': Could not load //external package
INFO: Elapsed time: 2.228s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
=============

I do have a WORKSPACE file in llvm-project/ and the BUILD files in llvm, mlir, and mlir/test/.

Yong Tang

May 26, 2020, 1:11:48 AM
to Mehdi AMINI, Uday Kumar Reddy Bondhugula, Yong Tang, MLIR
Hi Mehdi,

I am not a bazel expert either, and I haven’t used override_repository before. 😅 new_local_repository does allow the user to specify an external workspace and build file if the other repo does not have bazel build files natively. This might help if the build and workspace files are located somewhere else. (Though tf+llvm could not use this feature directly, because tf patches several build files into llvm.)

Thanks
Yong



Yong Tang

May 26, 2020, 1:21:52 AM
to Uday Kumar Reddy Bondhugula, Mehdi AMINI, MLIR
Ah, that might be because new_local_repository is supposed to be used in the WORKSPACE file, but tensorflow places the llvm repo inside tensorflow/workspace.bzl and loads the def from WORKSPACE. In Bazel, a .bzl file behaves differently from the WORKSPACE file (I don't know the exact difference). I think if you place the repo declaration in the WORKSPACE file, similar to `http_archive(name = "speech_commands",`, then it will work.

Thanks
Yong



Uday Kumar Reddy Bondhugula

May 26, 2020, 1:38:31 PM
to Yong Tang, Mehdi AMINI, MLIR

Thanks again, but this isn't entirely clear to me, and it's complex enough that I'll just use the override_repository workflow. :) That works for now and also lets me pick up changes to the LLVM commit ID used, since tensorflow/workspace.bzl stays unmodified.
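
Because tensorflow/workspace.bzl stays unmodified in this workflow, the pinned commit can be read back from it after each pull to see whether the local override tree needs to be refreshed. A minimal sketch, assuming the file contains a single `LLVM_COMMIT = "<hex hash>"` assignment (the function name is made up for illustration):

```python
import re


def pinned_llvm_commit(workspace_text):
    """Return the commit hash assigned to LLVM_COMMIT in workspace.bzl text.

    Assumption: the assignment looks like LLVM_COMMIT = "<hex hash>".
    """
    match = re.search(r'LLVM_COMMIT\s*=\s*"([0-9a-fA-F]+)"', workspace_text)
    if match is None:
        raise ValueError("no LLVM_COMMIT assignment found")
    return match.group(1)
```

Comparing this value before and after a pull shows whether TensorFlow moved to a newer LLVM revision, in which case the local copy would have to be rebuilt from the newly extracted tree.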

~ Uday

Yong Tang

May 26, 2020, 1:49:11 PM
to Uday Kumar Reddy Bondhugula, Mehdi AMINI, MLIR
Hi Uday,

override_repository is a good choice if it already works. Bazel is not exactly an easy tool, and it's probably not worth much effort to explore alternative options inside bazel. If one solution works, then that is all we need, I think.

Thanks
Yong
