The actual permissions change occurs in the Utils.fetchFile() function (run `git grep chmod` to find it); I'm not sure what use case originally motivated it.
The motivation behind the SparkFiles API was to avoid polluting the driver's current working directory with downloaded files and to fix a bug where calling addFile() on certain files could cause the original files to be deleted after the job completed (see https://github.com/mesos/spark/pull/394 for the relevant discussions).
My motivation was to leave original files untouched by addFile(), so this behavior is a bug that I'd like to fix.
It looks like the problem is that Utils.addFile() symlinks local files into the target directory, so a permissions change on the target file is applied to the original file through the symlink. I think the idea behind the symlink was to avoid an expensive local copy when adding a large file. We could probably just perform the extra copy for safety, since users should be storing large files in HDFS (which should also work with addFile()). If users do add large local files, the driver will have to broadcast those files to all workers anyway, so the cost of one extra copy shouldn't have a significant relative performance impact. If the file is small, the impact will be negligible. I can take a pass at fixing this later.
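
The symlink issue described above can be reproduced outside Spark. The following is a minimal sketch using plain `java.nio.file` calls, not Spark's actual Utils code: the class name `SymlinkVsCopy`, both method names, and the `rwxr-xr-x` mode are hypothetical choices made for illustration, and it assumes a POSIX filesystem that allows creating symlinks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Spark.
class SymlinkVsCopy {
    // Assumed mode for the demo; the mode Spark actually applies may differ.
    static final Set<PosixFilePermission> EXECUTABLE =
        PosixFilePermissions.fromString("rwxr-xr-x");

    /** Symlink the file into targetDir, chmod the link, return the original's permissions. */
    static Set<PosixFilePermission> chmodViaSymlink(Path original, Path targetDir)
            throws IOException {
        Path target = targetDir.resolve(original.getFileName());
        Files.createSymbolicLink(target, original.toAbsolutePath());
        // setPosixFilePermissions follows symlinks, so this modifies the original file.
        Files.setPosixFilePermissions(target, EXECUTABLE);
        return Files.getPosixFilePermissions(original);
    }

    /** Copy the file into targetDir, chmod the copy, return the original's permissions. */
    static Set<PosixFilePermission> chmodViaCopy(Path original, Path targetDir)
            throws IOException {
        Path target = targetDir.resolve(original.getFileName());
        Files.copy(original, target);
        // The copy is an independent file, so the original is left untouched.
        Files.setPosixFilePermissions(target, EXECUTABLE);
        return Files.getPosixFilePermissions(original);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("addfile-sketch");
        Path original = Files.createFile(dir.resolve("data.txt"));
        Files.setPosixFilePermissions(original, PosixFilePermissions.fromString("rw-------"));
        Path copied = Files.createDirectory(dir.resolve("copied"));
        Path linked = Files.createDirectory(dir.resolve("linked"));

        System.out.println("original after copy + chmod:    "
            + PosixFilePermissions.toString(chmodViaCopy(original, copied)));
        System.out.println("original after symlink + chmod: "
            + PosixFilePermissions.toString(chmodViaSymlink(original, linked)));
    }
}
```

The copy variant costs one extra local write but keeps the permissions change confined to the file in the target directory, which is the trade-off proposed above.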