Extending file copy functionality

Roger Vaughn

Feb 20, 2015, 4:41:01 PM
to gradl...@googlegroups.com
I'm currently trying to implement some builds in Gradle in which I need to do some binary file transforms (digital signing, etc.) during a file copy. Though Gradle includes several text file transforms, I can't seem to find a good way to do general transforms during a copy. Basically, what I'm trying to do is this:

    foreach file f in some set S
      copy f to destination, creating f'
      transform f'
      if transform failed
          delete f'
          fail the build

I've previously coded up a Rake extension to do this, similar to Gradle's Copy task, but I can't see how to make Gradle do it out of the box. Nor do I see how it could be added via plugin. CopySpec.eachFile() is almost the right solution, but it executes before the file copy, rather than after.
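For illustration only, the copy-then-transform loop in the pseudocode above could be hand-rolled in a build script roughly as follows. This is a sketch under assumptions, not a Gradle feature: the `copyThenTransform` helper, the `copyAndBrand` task, the `build/bin` and `build/staging` paths, and the `brandtool` command are all hypothetical, and the sketch bypasses CopySpec's include/rename support entirely.

```groovy
import java.nio.file.Files
import java.nio.file.StandardCopyOption

// Hypothetical helper: copy src to dest, transform the copy in place,
// and delete the copy again if the transform throws.
void copyThenTransform(File src, File dest, Closure transform) {
    dest.parentFile.mkdirs()
    Files.copy(src.toPath(), dest.toPath(), StandardCopyOption.REPLACE_EXISTING)
    try {
        transform(dest)
    } catch (Exception e) {
        dest.delete()   // never leave an untransformed f' behind
        throw new GradleException("Transform failed for ${dest}", e)
    }
}

task copyAndBrand {
    doLast {
        File destRoot = file('build/staging')      // assumed destination
        fileTree('build/bin').visit { details ->   // assumed source tree
            if (details.directory) return
            File dest = details.relativePath.getFile(destRoot)
            copyThenTransform(details.file, dest) { File f ->
                // exec fails on a non-zero exit code by default;
                // 'brandtool' is a made-up command name
                exec { commandLine 'brandtool', f.absolutePath }
            }
        }
    }
}
```

Because the copy is deleted whenever its transform fails, any file that remains in the destination has been transformed successfully.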

So what I propose to do is add a CopySpec.afterEachFile() method to the Gradle core, following the same model as CopySpec.eachFile(). I would also want to alias CopySpec.eachFile() as CopySpec.beforeEachFile() just for the parallelism and clarity.

Does this seem like a useful and reasonable addition to anybody else? Am I missing anything obvious that might make this unnecessary?

roger

Sterling Greene

Feb 20, 2015, 5:12:30 PM
to gradl...@googlegroups.com


Roger Vaughn

Feb 21, 2015, 3:55:51 PM
to gradl...@googlegroups.com
No, because that operates on the contents of the file as a text stream - it doesn't allow you to operate on the file as a whole, for instance passing its path to an external utility. That's really what I need to do in this case - call external utils on the destination file.

Sterling Greene

Feb 22, 2015, 9:16:03 PM
to gradl...@googlegroups.com
Ah, yeah.  If you're going out to an external program, FilterReader is probably harder to do.  I think using Copy is a little weird for this once you get beyond just filtering the content of a file.

Usually, you see this done with a custom task like:

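import org.gradle.api.tasks.OutputDirectory
import org.gradle.api.tasks.SourceTask
import org.gradle.api.tasks.TaskAction

class TransformerTask extends SourceTask {
   @OutputDirectory
   File outputDir

   @TaskAction
   void transform() {
       // SourceTask collects its input files via source/include/exclude
       getSource().getFiles().each { file ->
           // do something to file (e.g., you could do project.exec{} now)
       }
   }
}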

This has some differences with Copy since it makes it harder to keep the same file structure in the outputDir.  Would that work for you?
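If the relative layout matters, one way to keep it is to visit the source tree instead of flattening it with getFiles(). The following is a sketch only: the class name `StructuredTransformerTask` is illustrative, the transform itself is left as a comment, and it assumes the sources were added as directories so that each file carries a path relative to its root.

```groovy
import org.gradle.api.file.FileVisitDetails
import org.gradle.api.tasks.OutputDirectory
import org.gradle.api.tasks.SourceTask
import org.gradle.api.tasks.TaskAction

class StructuredTransformerTask extends SourceTask {
    @OutputDirectory
    File outputDir

    @TaskAction
    void transform() {
        File destRoot = outputDir
        // visit() walks the tree and exposes each file's path relative to its root
        getSource().visit { FileVisitDetails details ->
            if (details.directory) return
            File target = details.relativePath.getFile(destRoot)
            details.copyTo(target)   // copy first, so the source file stays untouched
            // transform target in place here (e.g. project.exec { ... })
        }
    }
}
```

This mirrors the input directory structure under outputDir, but it does not offer the renaming and restructuring that a CopySpec provides.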

-Sterling


Roger Vaughn

Feb 23, 2015, 12:03:27 PM
to gradl...@googlegroups.com
Hi Sterling,

Thanks for sticking with me here. :) This *might* work in concert with a CopyTask, but separating the two concerns raises issues of its own. (I'll explain below.) What I'm trying to do fits naturally in with the copy operation, even if it sort of changes the semantics of a copy. I suppose the best approach is to define a new TransformTask that piggybacks on the file copy code. I still need that copy in there and it wouldn't make a whole lot of sense to duplicate the copy code for this.

Let me explain why I want to modify CopyTask, and the unique challenges I run into doing this.

* I'm copying entire trees of files at a time, using wildcard selectors.
* The output directory structure is different than the input structure.
* For hysterical reasons, our binaries are compiled in a previous step, but get branded and signed just before creating installers.
* We don't want to modify the input binaries because of the possibility of failure during modification. That would require recompiling the whole thing again. (These are big, slow legacy builds with very poor incremental build support.)
* Only some of the copied files get branded and signed. This fits into the DSL nicely with a few minor additions to CopySpec (via conventions).
* It's possible to split the problem in two: do the tree copy, and then modify files in the destination tree. The problem comes in when a modification fails. How can you (or Gradle) tell what has been processed and what has not? My best solution to this is to delete the whole destination tree and start over. That's ugly.
* It's possible to work around the above by adding an intermediate copy - i.e. copy the source files to the destination directory structure in an intermediate location. Then one by one copy the intermediate files to the final destination, transforming them on the way. Because the second copy doesn't have to filter or restructure the directory tree, it's much simpler and doesn't need the power of CopyTask, so simple copy code will do. But this approach adds a third copy of each file and doubles the number of copy operations, so it doesn't feel right either.
* CopyTask and all of its internals don't seem to be very amenable to customization (e.g. to make a TransformTask plugin). CopySpec can be extended via conventions, but the processing sequence is baked into the class hierarchy. In order to add processing after each file, it looks like I'm going to have to modify core classes and interfaces.

Basically, every approach I've tried with existing primitives feels clunky and compromises incremental build times. Adding that hook into the copy just feels like the best way to do this. As I mentioned, I've already done it in Rake and it works out beautifully - a little bit of DSL sugar in my copy specs gets the job done. Unfortunately I have not yet had the time to actually implement and test this change in Gradle.

Gary Hale

Feb 23, 2015, 3:36:03 PM
to gradl...@googlegroups.com
Roger,

What's not clear to me from your explanation is why this particular transformation is different than every other transformation in your build up to that point.  That is, why does it make more sense to treat it as part of the Copy task rather than as just another transformation task prior to the copy?

I mean, although Copy has some limited transformation capabilities, I'm not sure I like the idea of adding additional non-copy-related stuff onto Copy.  If this problem could be extracted into a general purpose Transform task (or some such) that simplifies the "transformation" capability, I'm sure it would at the very least be a nice plugin if not considered for being a core task.  

But that's just my opinion, others may have a different perspective.

Gary

Gary Hale

Feb 23, 2015, 4:06:24 PM
to gradl...@googlegroups.com
Although reading your original email again, you were proposing an afterEachFile hook, and not necessarily a transformation-specific action.  I guess the question is, what actions (other than transformation) might require a post-copy hook like that (as opposed to just using eachFile)?  (Not suggesting there aren't any, just wondering what they might be)

Gary

Roger Vaughn

Feb 23, 2015, 6:10:13 PM
to gradl...@googlegroups.com
Hi Gary,

I've been anticipating these questions - they're good ones. :) Let me take them in order:

1. Why not a separate transformation task prior to the copy?

Mostly because I need to modify the destination files *after* the copy, rather than the original files. So why not a separate transform after the copy? That's possible, but raises headaches with failure modes and incremental builds. I'm copying and modifying whole file trees, so if the (separate) transform task fails in the middle of the tree, it will not know where to resume later on - I have to delete the whole destination tree and rerun the entire transform *and* copy tasks. This is solved by combining the copy and transform inside a single tree walk - then any files left in the destination tree are guaranteed to be successfully modified, making incremental builds easy.

This is always going to be an issue with in-place transformations. Writing outputs to a different destination or filename is always preferable, but in my case I don't have that option. One of the tools I'm dealing with is Microsoft signtool, which only does in-place signing. As I explained to Sterling, I could do (and have done) this with intermediate files, but that just feels ugly - it creates a third copy of every file and a second copy operation for each one.

2. I'm not sure I like the idea of adding additional non-copy-related stuff onto Copy.

I understand the objection and to a degree, I agree. I might counter that the same could be said of the few text transforms already built into the task, but I think the real answer is: it's convenient. In essence, I'm combining copy + transform to simulate the "write the output to a different location" functionality that's missing from my external tools. Ideally the tree-walking behavior (CopySpec) embedded in the Copy task would be reusable outside the context of an AbstractCopyTask, but that's not the current reality. It may make more sense for me to create all new tasks instead of modifying copy, but I still need CopySpec-like functionality and the actual file copy operation, so it seems a real shame to duplicate all of that existing code. I'm leaning towards thinking we should really have a FileTreeTask instead, with Copy simply being one of the actions that can be applied to it.

3. You were proposing an afterEachFile hook, and not necessarily a transformation-specific action.

Right. I prefer generic and reusable when possible. :) I'm looking for minimal required modification to the Gradle core here. My build-specific work would be encapsulated in an Action. As for examples, I've already mentioned signtool. Another example is icon modification. We compile a binary once, but then inject different icons into it depending on what product it's going into, what OEM we're white-labelling it for, etc. (We really do this, yes.) In all these cases, I need to modify a destination (copied) file, but leave the original compiled binary pristine. eachFile is unfortunately unsuitable, since it fires before the copy has happened.

One other point I've neglected to mention up until now, because I'm shooting for generic, is that the builds I'm dealing with aren't Java. They're Windows binaries that were compiled from C++ or C# in an earlier, separate build step. (Not with Gradle - yet.) I'm trying to get to a point where we can use Gradle across the whole company, and this is just the first roadblock I'm trying to get past.

roger

Gary Hale

Feb 24, 2015, 1:29:17 PM
to gradl...@googlegroups.com
I think the salient point is that the transformation tool can only process files in place and this is why you want to do this as part of a Copy task as opposed to doing this in another task.  Outside of an in-place transformation, though, I'm still struggling to think of other scenarios where you would need a post-copy hook instead of just using a subsequent task.  I mean, I can think of lots of things that I might want to do after a copy, but not many that would require it to be done as part of the copy task itself.

Even though eachFile occurs before the copy, it's still an action that is fired for each file and you can still take advantage of the copy capability.  Your in-place transformation seems like it could still be accomplished with eachFile, you just have to do the copy explicitly.  For example:

task copyAndTransformSomeFiles(type: Copy) {
  ...
  eachFile { details ->
    // Only transform certain files; someCondition() stands in for your own test
    if (someCondition(details)) {
      def copiedFile = details.relativePath.getFile(destinationDir) // get a File object for the destination file
      details.exclude() // inhibit the "normal" copy
      details.copyTo(copiedFile) // explicitly copy the file to the destination
      doTransformation(copiedFile) // in-place transformation (your own helper)
    }
  }
}

That's obviously not a robust solution, but it is a way to do what you are describing with eachFile rather than an additional hook.

Gary 

Roger Vaughn

Feb 24, 2015, 2:15:31 PM
to gradl...@googlegroups.com
Thank you. The "details.exclude()" call is the bit I was missing. I actually considered using eachFile like you described, but assumed that the FileCopyAction would later overwrite anything I did. That exclude call is the key... but it still feels a little hackish to me. A Copy task where we explicitly disable the copy?

Anyway, I realized last night that I can derive a custom task from AbstractCopyTask and provide my own CopyAction, sidestepping the whole mess. I think that's the way to go - it doesn't require core changes, doesn't sully the Copy task, and makes it clear that my task is different. That's a win to me.
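A rough sketch of what such a task might look like follows. It is not the code from the plugin mentioned in this thread. It relies on classes under `org.gradle.api.internal`, which are not public API and may change between releases, and it assumes a Gradle version that provides `WorkResults` and `@Internal` (both newer than the Gradle current at the time of this thread). The names `CopyAndTransform` and `afterEachFile` are hypothetical.

```groovy
import org.gradle.api.internal.file.copy.CopyAction
import org.gradle.api.internal.file.copy.CopyActionProcessingStream
import org.gradle.api.internal.file.copy.CopyActionProcessingStreamAction
import org.gradle.api.internal.file.copy.FileCopyDetailsInternal
import org.gradle.api.tasks.AbstractCopyTask
import org.gradle.api.tasks.Internal
import org.gradle.api.tasks.OutputDirectory
import org.gradle.api.tasks.WorkResults

class CopyAndTransform extends AbstractCopyTask {
    @OutputDirectory
    File destinationDir

    // Hypothetical hook, called with each file after it has been copied
    @Internal
    Closure afterEachFile = { File copied -> }

    @Override
    protected CopyAction createCopyAction() {
        File destRoot = destinationDir
        Closure hook = afterEachFile
        return { CopyActionProcessingStream stream ->
            boolean didWork = false
            stream.process({ FileCopyDetailsInternal details ->
                File target = details.relativePath.getFile(destRoot)
                if (details.copyTo(target)) {
                    didWork = true
                }
                if (!details.directory) {
                    try {
                        hook(target)      // in-place transform of the copy
                    } catch (Exception e) {
                        target.delete()   // do not leave an untransformed copy behind
                        throw e
                    }
                }
            } as CopyActionProcessingStreamAction)
            WorkResults.didWork(didWork)
        } as CopyAction
    }
}

task brandBinaries(type: CopyAndTransform) {
    from 'build/bin'                          // assumed source directory
    destinationDir = file('build/staging')    // assumed destination directory
    afterEachFile = { File f -> /* call the external tool on f */ }
}
```

In this sketch the destination is set through `destinationDir` rather than `into`, since a root-level `into` on a bare AbstractCopyTask would be treated as a path prefix rather than as the target directory.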

Tobias Schulte

Feb 25, 2015, 4:04:50 PM
to gradl...@googlegroups.com
Hi,

I do have one example, where I have the same problem as the OP: https://github.com/tschulte/gradle-jnlp-plugin/blob/master/gradle-jnlp-plugin/src/main/groovy/de/gliderpilot/gradle/jnlp/SignJarsTask.groovy

In that task I copy all dependencies into the destination. During copy the jars are being unsigned (removing files from the jar) and the manifest.mf is altered (e.g. adding entries required for webstart). After the files are copied, they are signed using ant.signJar.

The first part could be done using a FilterReader, and the signing could be done using a doLast on the task. But for jar signing, using incremental build capabilities is important, especially for huge projects with lots of dependencies (that's the reason I use gpars for signing). But in my tests eachFile() was always called with every file, even with the up-to-date ones.

Tobias

Sterling Greene

Feb 25, 2015, 6:00:44 PM
to gradl...@googlegroups.com
Yes, there are two types of 'incremental'.  One that is incremental for the whole task (if none of the inputs/outputs have changed, we do nothing).  The other is incremental for the inputs of a task (if a subset of inputs have changed, we only have to reprocess those).  Copy and most other tasks are the first type.  Some of the new compiler tasks are the second type.  If you look for tasks that use IncrementalTaskInputs, they usually have the smarts to handle subsets of inputs changing.

So for now, if any inputs change to a Copy, we recopy everything.  
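As a sketch of the second kind of incremental task, a custom task can take an IncrementalTaskInputs parameter and handle only the inputs that changed. The class name `IncrementalCopyTask` and the `targetFor` helper are illustrative, and the transform step is left as a comment. IncrementalTaskInputs is the API named above; newer Gradle releases replaced it with InputChanges, so treat this as matching the Gradle versions of this thread's era.

```groovy
import java.nio.file.Files
import java.nio.file.StandardCopyOption
import org.gradle.api.DefaultTask
import org.gradle.api.tasks.InputDirectory
import org.gradle.api.tasks.OutputDirectory
import org.gradle.api.tasks.TaskAction
import org.gradle.api.tasks.incremental.IncrementalTaskInputs

class IncrementalCopyTask extends DefaultTask {
    @InputDirectory
    File inputDir

    @OutputDirectory
    File outputDir

    @TaskAction
    void run(IncrementalTaskInputs inputs) {
        if (!inputs.incremental) {
            // No usable history: every input is reported as out of date,
            // so start from an empty output directory
            project.delete(outputDir.listFiles())
        }
        inputs.outOfDate { change ->
            if (change.file.directory) return
            File target = targetFor(change.file)
            target.parentFile.mkdirs()
            Files.copy(change.file.toPath(), target.toPath(),
                       StandardCopyOption.REPLACE_EXISTING)
            // transform target in place here; only changed inputs reach this point
        }
        inputs.removed { change ->
            targetFor(change.file).delete()
        }
    }

    // Maps an input file to the same relative location under outputDir
    File targetFor(File source) {
        String relative = inputDir.toPath().relativize(source.toPath()).toString()
        return new File(outputDir, relative)
    }
}
```

With this shape, an expensive per-file step such as signing runs only for inputs that were added or modified since the last successful execution.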

Roger Vaughn

Feb 25, 2015, 6:12:03 PM
to gradl...@googlegroups.com
That's an interesting wrinkle that I hadn't really thought about yet, but would have bumped into eventually - I also only want to copy new or changed files. Windows code signing can be expensive, particularly since we're calling out to an external timestamp server for every signature.

I'm going to try to tackle this in a plugin over the next few days. I'll report back on how it ends up.
...

Julia Willson

Oct 12, 2018, 10:34:40 PM
to gradle-dev
So, couple things here (yeah years later, but this will help others).

You will have to create a binary plugin to do something that acts like a diff-merge (and maybe tack on a transform as an option of this).
This is because the runtime api does not support incremental inputs.  While in the runtime api you can define an upToDateWhen, it will not provide the details needed to implement something like this.

Also, generally it's not a good idea to perform the copy+transform the way it is done in this thread, which has the Copy task performing the copy at configuration time and not at execution time.  I'd recommend looking for examples that do things like preserve modified times for files (without using the ant.copy task) to see how to operate on each file at execution time.

Why is this important?  Although eachFile operates at execution time, that isn't obvious to the average Gradle script maintainer.  So for maintainability's sake, one should put all actions in the doLast block.

It's very easy to get the inputs.files of a task in the doLast.
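A minimal sketch of that suggestion, assuming a plain Copy task with made-up `build/bin` and `build/staging` paths and a hypothetical task name:

```groovy
task stageBinaries(type: Copy) {
    from 'build/bin'          // assumed source directory
    into 'build/staging'      // assumed destination directory

    doLast {
        // Runs at execution time, after the copy itself has finished
        inputs.files.each { File src ->
            logger.lifecycle("copied input: ${src}")
        }
        fileTree(destinationDir).each { File copied ->
            // in-place post-processing of each copied file would go here
        }
    }
}
```

Note that a failure inside doLast leaves the already copied files in place, which is the partial-failure situation discussed earlier in the thread.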