I've found a bug in emacs 22.1 and posted a bug report with my distro:
https://bugs.gentoo.org/show_bug.cgi?id=189627
The Gentoo devs figured they didn't patch that part of the sources, so
this is an issue with the official 21.1 release sources as well.
The bug occurs when there is a backup file FILE~ for some FILE the user
modified and wants to save. If the user doesn't have the privileges to
unlink FILE~, it will try over and over again, resulting in an infinite
loop. Pressing C-g gets it out of the loop, but the modifications still
haven't been saved. Backtraces always mention backup-buffer-copy, I
guess that's where the actual loop happens.
Steps to reproduce this:
1. mkdir emacs-bug
2. cd emacs-bug
3. echo foo > foo.txt
4. echo bar > foo.txt~
5. chmod u-w . foo.txt~
6. emacs foo.txt
7. Change contents from "foo" to "baz"
8. C-x C-s to save
Backtrace looks like this:
copy-file(".../foo.txt" ".../foo.txt~" nil t)
byte-code("..." [from-name to-name nil (delete-file to-name)
((file-error)) copy-file t] 5)
backup-buffer-copy(".../foo.txt" ".../foo.txt~" 420)
byte-code("..." [file-precious-flag backup-by-copying modes
real-file-name backup-by-copying-when-linked
backup-by-copying-when-mismatch 0 logand 3072 file-writable-p
file-name-directory file-nlinks 1 file-attributes 2 9
file-ownership-preserved-p backup-buffer-copy rename-file t
backup-by-copying-when-privileged-mismatch attr backupname setmodes] 4)
byte-code("..." [targets delete-old-versions real-file-name
buffer-file-name modes buffer-backed-up t nil y-or-n-p format "Delete
excess backup versions of %s? " file-modes (byte-code "..."
[file-precious-flag backup-by-copying modes real-file-name
backup-by-copying-when-linked backup-by-copying-when-mismatch 0 logand
3072 file-writable-p file-name-directory file-nlinks 1 file-attributes 2
9 file-ownership-preserved-p backup-buffer-copy rename-file t
backup-by-copying-when-privileged-mismatch attr backupname setmodes] 4)
((file-error ...)) (byte-code "..." [targets delete-file] 2)
((file-error)) setmodes] 5)
backup-buffer()
basic-save-buffer-2()
basic-save-buffer-1()
basic-save-buffer()
save-buffer(1)
call-interactively(save-buffer)
Greetings,
Martin von Gagern
Posted some more info there. Keep an eye on that page.
To me the cause seems clear: backup-buffer-copy catches and discards the
file-error caused by delete-file, probably assuming it is due to a
non-existing backup file. Perhaps the file should only be deleted if it
does really exist, and any error to delete it be taken serious.
There are some people who have not been able to reproduce the issue.
Reason unknown so far, but have a look at the bug report for updates.
Greetings,
Martin von Gagern
I suppose the `file-error' handler of `backup-buffer-copy'
(while (condition-case ()
(progn
(condition-case nil
(delete-file to-name)
(file-error nil))
(copy-file from-name to-name nil t)
nil)
(file-already-exists t))
;; The file was somehow created by someone else between
;; `delete-file' and `copy-file', so let's try again.
nil))
will ignore any error in `delete-file'. The subsequent `copy-file' will
trigger a `file-already-exists' error and Emacs continues to loop. In
practice, the `file-already-exists' error is always due to a failure to
delete `to-name' and hardly ever to some strange power creating files in
between.
Likely the `file-error' handler of `backup-buffer' will encounter a
similar fate when `backupname' can't be deleted by `backup-buffer-copy'.
There now is a patch to fix this, provided by Ulrich Mueller.
Original URL: http://bugs.gentoo.org/attachment.cgi?id=128742
=================== emacs-22.1-backup-buffer.patch ===================
--- emacs-22.1.orig/lisp/files.el 2007-05-25 14:43:31.000000000 +0200
+++ emacs-22.1/lisp/files.el 2007-08-21 08:26:36.000000000 +0200
@@ -3119,9 +3119,8 @@ backup-buffer-copy (from-name to-name modes)
(set-default-file-modes ?\700)
(while (condition-case ()
(progn
- (condition-case nil
- (delete-file to-name)
- (file-error nil))
+ (and (file-exists-p to-name)
+ (delete-file to-name))
(copy-file from-name to-name nil t)
nil)
(file-already-exists t))
=============== end of emacs-22.1-backup-buffer.patch ===============
Greetings,
Martin von Gagern
Did you apply that patch? Suppose the file to-name exists but cannot be
deleted. `copy-file' will raise its `file-already-exists' error and you
remain trapped in that loop.
You have to either change the backup file's permissions from within the
`condition-case' or mandate error handling up to `backup-buffer' where
it attempts to do the (convert-standard-filename "~/%backup%~") stuff.
I can't test these solutions here since my file system doesn't provide
permissions.
Yes, I did, and even got it included in the relevant byte-compiled code.
Seems to work well enough here.
> Suppose the file to-name exists but cannot be
> deleted. `copy-file' will raise its `file-already-exists' error and you
> remain trapped in that loop.
No, if the file cannot be deleted, the delete-file will signal a
file-error so copy-file doesn't even get a chance to signal a
file-already-exists error because it doesn't get called at all. As the
outer condition-case doesn't catch file-error, the signal will propagate
up the call stack.
> You have to either change the backup file's permissions from within the
> `condition-case'
A possible solution in my simple test case, but not a solution for the
real world case where user A wants to edit a file in a dir belonging to
B, where B granted write permission to A only for that single file, not
for its backup and neither for the directory. Nothing A can da about it.
> or mandate error handling up to `backup-buffer' where
> it attempts to do the (convert-standard-filename "~/%backup%~") stuff.
Already happens like this.
> I can't test these solutions here since my file system doesn't provide
> permissions.
Tough luck.
Which system? Isn't even the MS-DOS readonly flag enough for this?
Greetings,
Martin von Gagern
I stand corrected. The file-error is caught in `backup-buffer' and
further file-errors don't get trapped.
*** files.el 8 Aug 2007 14:06:01 -0000 1.896.2.15
--- files.el 21 Aug 2007 19:25:34 -0000
***************
*** 3120,3126 ****
(file-error nil))))))
(defun backup-buffer-copy (from-name to-name modes)
! (let ((umask (default-file-modes)))
(unwind-protect
(progn
;; Create temp files with strict access rights. It's easy to
--- 3120,3131 ----
(file-error nil))))))
(defun backup-buffer-copy (from-name to-name modes)
! (let ((umask (default-file-modes))
! (dir (or (file-name-directory to-name)
! default-directory)))
! ;; Can't delete or create files in a read-only directory.
! (unless (file-writable-p dir)
! (signal 'file-error (list "Directory is not writable" dir)))
(unwind-protect
(progn
;; Create temp files with strict access rights. It's easy to
***************
*** 3129,3142 ****
(set-default-file-modes ?\700)
(while (condition-case ()
(progn
! (condition-case nil
! (delete-file to-name)
! (file-error nil))
(copy-file from-name to-name nil t)
nil)
(file-already-exists t))
;; The file was somehow created by someone else between
;; `delete-file' and `copy-file', so let's try again.
nil))
;; Reset the umask.
(set-default-file-modes umask)))
--- 3134,3149 ----
(set-default-file-modes ?\700)
(while (condition-case ()
(progn
! ;; Failure to delete an existing file is an error.
! (if (file-exists-p to-name)
! (delete-file to-name))
(copy-file from-name to-name nil t)
nil)
(file-already-exists t))
;; The file was somehow created by someone else between
;; `delete-file' and `copy-file', so let's try again.
+ ;; FIXME does that every actually happen in practice?
+ ;; This is a potential infloop, which seems bad...
nil))
;; Reset the umask.
(set-default-file-modes umask)))
I'm too silly to understand this. Why can't we use
(copy-file from-name to-name t t)
here as in Emacs 21? What was the rationale for this loop?
> Why can't we use
>
> (copy-file from-name to-name t t)
>
> here as in Emacs 21? What was the rationale for this loop?
I know no more than it says in the comment. rms added the loop
20050423, copied from make-temp-file I think. I think looping makes
more sense in that context, not sure it makes any sense in this
context. I'll ask on emacs-devel. But if what it says in the existing
comment is possible, then I guess we would actually need something
like this:
*** files.el 8 Aug 2007 14:06:01 -0000 1.896.2.15
--- files.el 21 Aug 2007 21:26:02 -0000
***************
*** 3120,3126 ****
(file-error nil))))))
(defun backup-buffer-copy (from-name to-name modes)
! (let ((umask (default-file-modes)))
(unwind-protect
(progn
;; Create temp files with strict access rights. It's easy to
--- 3120,3131 ----
(file-error nil))))))
(defun backup-buffer-copy (from-name to-name modes)
! (let ((umask (default-file-modes))
! (dir (or (file-name-directory to-name)
! default-directory)))
! ;; Can't delete or create files in a read-only directory.
! (unless (file-writable-p dir)
! (signal 'file-error (list "Directory is not writable" dir)))
(unwind-protect
(progn
;; Create temp files with strict access rights. It's easy to
***************
*** 3129,3134 ****
--- 3134,3144 ----
(set-default-file-modes ?\700)
(while (condition-case ()
(progn
+ ;; If we allow for the possibility of something
+ ;; creating the file between delete and copy
+ ;; (below), we must also allow for the
+ ;; possibility of something deleting it between
+ ;; a file-exists-p check and a delete.
(condition-case nil
(delete-file to-name)
(file-error nil))
***************
*** 3137,3142 ****
--- 3147,3154 ----
(file-already-exists t))
;; The file was somehow created by someone else between
;; `delete-file' and `copy-file', so let's try again.
+ ;; FIXME does that every actually happen in practice?
+ ;; This is a potential infloop, which seems bad...
nil))
Good question. Especially since this recreates FILE~ every time.
I can think of at least two scenarios where this could be a problem:
1. The dir is not writable, but the backup file is.
Here the current behaviour will loop and even with the suggested fix
it will fall back to ~/%backup%~ unnecessarily.
2. The backup file is a hard link at should remain such.
This could be wanted in cases where the primary file is a hard link
as well. Don't know how backup-by-rename would handle this.
I originally assumed that emacs would try backup-buffer-copy only after
figuring out that it could not write to the existing backup file, but it
seems I was wrong there, at least if I read backup-buffer correctly.
So I think we want both, first try to reuse the backup file, which
copy-file with ok-if-exists set to t seems to do well. If that fails, we
can assume the file exists, but we are not allowed to write it, so maybe
we can delete it and create it anew. For this we ned the delete followed
by a copy.
Whether we should do any looping in case something goes wrong is another
question. Another process touching the same file just at the critical
moment should be rare situations. I think having the backup fall back
instead of risking a loop would be acceptable in these cases. An
alternative might be to retry a fixed number of times, say 10, and
assume some permanent problem in the logic if we still don't succeed.
Problem is that this approach might bugs go unnoticed more easily, but
with all those different systems out there, there might always be a
combination that we didn't foresee, so a sane default there might be
worth it.
Greetings,
Martin von Gagern
This seems a good idea, as deleting a backup file we won't be able to
recreate would be a bad move. However I guess there are filesystems out
there where a file might be undeletable even if its directory is
writable. So be careful about assumptions. You should still be careful
about exception handling later on.
> + ;; If we allow for the possibility of something
> + ;; creating the file between delete and copy
> + ;; (below), we must also allow for the
> + ;; possibility of something deleting it between
> + ;; a file-exists-p check and a delete.
> (condition-case nil
> (delete-file to-name)
> (file-error nil))
You left the possible cause for the loop in place, again relying on
catching an error in the normal course of events when there is no
backup. I can see your point, but I still think this is dangerous.
One reason is given above, and the second reason is this:
If we keep thinking about other processes creating or deleting files in
the middle of the operation, we might as well consider other processes
changing permissions as well. So who says that the directory will still
be writable once we are here?
Is there some reliable way by which we could discern a file-error
because the file does not exist from a file-error because we can't
delete it? Because we can recover from one, but not from the other.
> + ;; FIXME does that every actually happen in practice?
> + ;; This is a potential infloop, which seems bad...
The more I think about it, the rarer this seems to me. In my last mail I
voted for a fixed maximum loop count, but by now I would even drop all
loops; they are simply not worth the effort I guess.
Greetings,
Martin von Gagern
We could check (file-writable-p to-name) here too.
>>+ ;; FIXME does that every actually happen in practice?
>>+ ;; This is a potential infloop, which seems bad...
>
>
> The more I think about it, the rarer this seems to me. In my last mail I
> voted for a fixed maximum loop count, but by now I would even drop all
> loops; they are simply not worth the effort I guess.
More so because saving and backing up can be emergency operations.
And how would you combine the results? There are filesystems where I can
delete a file even if I can't write it. Most FS I know behave this way.
You really don't want filesystem specific code.
No, I believe file-writable-p should be checked to determine whether you
want to delete and recreate the backup, or write to an existing backup.
Once you are determined that you want to delete the backup, you should
simply try to do so, as I guess that's the only truly portable way of
figuring out whether you are allowed to.
> I can think of at least two scenarios where this could be a problem:
> 1. The dir is not writable, but the backup file is. Here the current
> behaviour will loop and even with the suggested fix it will fall
> back to ~/%backup%~ unnecessarily.
> [...]
> I originally assumed that emacs would try backup-buffer-copy only
> after figuring out that it could not write to the existing backup
> file, but it seems I was wrong there, at least if I read
> backup-buffer correctly.
It seems that your assumption is the intended behaviour. At least the
problem was noticed and a fix was installed in March 2005:
<http://lists.gnu.org/archive/html/emacs-devel/2005-03/msg00842.html>
However, it no longer works due to the introduction of the loop in
backup-buffer-copy one month later (this change was in CVS revision
1.757 of files.el). The relevant part is:
(defun backup-buffer-copy (from-name to-name modes)
- (condition-case ()
- (copy-file from-name to-name t t)
- (file-error
- ;; If copying fails because file TO-NAME
- ;; is not writable, delete that file and try again.
- (if (and (file-exists-p to-name)
- (not (file-writable-p to-name)))
- (delete-file to-name))
- (copy-file from-name to-name t t)))
+ (let ((umask (default-file-modes)))
+ (unwind-protect
+ (progn
+ ;; Create temp files with strict access rights. It's easy to
+ ;; loosen them later, whereas it's impossible to close the
+ ;; time-window of loose permissions otherwise.
+ (set-default-file-modes ?\700)
+ (while (condition-case ()
+ (progn
+ (condition-case nil
+ (delete-file to-name)
+ (file-error nil))
+ (write-region "" nil to-name nil 'silent nil 'excl)
+ nil)
+ (file-already-exists t))
+ ;; the file was somehow created by someone else between
+ ;; `make-temp-name' and `write-region', let's try again.
+ nil)
+ (copy-file from-name to-name t t 'excl))
+ ;; Reset the umask.
+ (set-default-file-modes umask)))
(and modes
(set-file-modes to-name (logand modes #o1777))))
with the following ChangeLog entry:
2005-04-23 Richard M. Stallman <r...@gnu.org>
* files.el [...]
(backup-buffer-copy, basic-save-buffer-2): Take care against
writing thru an unexpected existing symlink.
[...]
There should be easier ways to achieve protection against writing
through symlinks.
Ulrich
Glenn's patch already throws a file-error when the directory is not
writable. Hence throwing a file-error when an backup file is not
writable doesn't seem to harm much. Ulrich mentions a change where the
user wants to back-up to a writable file in a non-writable directory.
Thus it might make sense to first check if the backup file is writable
and check whether the directory is writable iff the backup file is not.
> martin rudalics wrote:
>> We could check (file-writable-p to-name) here too.
>
> And how would you combine the results? There are filesystems where I can
> delete a file even if I can't write it.
And, on lots of filesystems there are directories where you can write a
file and its directory but still not delete the file. At least on GNU/Linux
(I don't know which "half" is responsible for it), you can set a directory
to "sticky" (commonly used on /tmp) which means that only the owner (of the
file) can delete files in there.
| STICKY DIRECTORIES
| When the sticky bit is set on a directory, files in that directory may
| be unlinked or renamed only by root or their owner. Without the sticky
| bit, anyone able to write to the directory can delete or rename files.
| The sticky bit is commonly found on directories, such as /tmp, that are
| world-writable.
Michael
--
#!/usr/bin/perl -I' # tekscribble.pl - start in an xterm and scribble with mouse
$|=1;$g="\35";sub g{getc}sub p{print@_}system"stty -icanon";p"\233?38h";for(;;){
p"$g\33\32";$_=g;$x=g;$X=g;$y=g;$Y=g;last if/q/;$k=$y.chr((ord$Y)+64).$x.chr((
ord$X)+32);p"\33\14"if/c/;p$g.(/ě/?$l:$k).$k;$l=$k;}p"\33\3";system"stty icanon"
I think that fix is actually correct.
It is copy-file that detects the race condition.
As long as copy-file is inside a loop, creation of
the file by another process will be dealt with.
So this change should be installed.
I think there is no need to explicitly check whether the directory
is writable. What would be the purpose of that?
> I think that fix is actually correct.
> So this change should be installed.
It has been installed, but ...
> I think there is no need to explicitly check whether the directory
> is writable. What would be the purpose of that?
... the previous change was not completely reverted, variable "dir"
still gets assigned but is not used.
So the following patch should be applied in addition:
--- files.el 24 Aug 2007 03:03:52 -0000 1.922
+++ files.el 13 Aug 2007 13:40:58 -0000 1.919
@@ -3172,9 +3172,7 @@
(file-error nil))))))
(defun backup-buffer-copy (from-name to-name modes)
- (let ((umask (default-file-modes))
- (dir (or (file-name-directory to-name)
- default-directory)))
+ (let ((umask (default-file-modes)))
(unwind-protect
(progn
I didn't write that patch, but I guess there could be a purpose for it.
Namely it could help deleting a file in vain. If there are Filesystems
that let you delete a writable file even though you can't create new
ones in a non-writable dir, then without those lines, emacs would delete
an old backup only to find out that it can't create a new one.
Most likely these would be cases where the old backup file was writable,
so if you had some logic to write to existing files and only delete them
if they are not writable, then the check whether the directory is
writable should really be superfluous. Otherwise I'm not so sure.
Greetings,
Martin von Gagern
That sounds like a good reason.