I ran into a problem with copying files. It turns out I was copying all
files in a particular directory into another directory that was a
subdirectory of the original one, like:
Directory contents:
    .
    ..
    work/
    example.inp
Script:
    foreach f [glob *] {
        file copy $f work
    }
which led to an almost endless loop (it only stopped because the path
became too long).
So I wonder how I can best avoid this situation. For instance: if I
have the strings "example" and "../example/work", and I know that the
directory "example" exists (but not necessarily the other one), how can
I tell that the second is a subdirectory of the first?
Is normalizing both and then examining the result via [file split] the
most robust way?
Something along these lines:
    set dir1 [file normalize "example"]
    set dir2 [file normalize "../example/work"]
    foreach p1 [file split $dir1] p2 [file split $dir2] {
        if { $p1 != $p2 && $p1 != "" && $p2 != "" } {
            puts "Subdirectories!"
        }
    }
(Well, the above does not quite work, but I hope my question is clear
enough. It could probably serve as a basis.)
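For what it's worth, a working version of that prefix test might look like
this (a sketch only, not code from the thread; the name `isSubdir` is made
up). The idea is that dir2 lies inside dir1 exactly when dir1's path
components are a strict prefix of dir2's components after normalization:

```tcl
proc isSubdir {dir1 dir2} {
    set p1 [file split [file normalize $dir1]]
    set p2 [file split [file normalize $dir2]]
    set n [llength $p1]
    # dir2 must be strictly longer, and its first n components
    # must match dir1's components exactly.
    expr {[llength $p2] > $n && [lrange $p2 0 [expr {$n - 1}]] eq $p1}
}
```

Comparing component lists rather than raw strings avoids the false match
between, say, /tmp/foo and /tmp/foobar.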
Regards,
Arjen
I don't think you need to split the components.
Given variables sourcedir and destdir:
    if { [string match [file normalize $sourcedir]/* \
              [file normalize $destdir]] } {
        return -code error "Cannot copy into a subdirectory of source dir"
    }
Right! That seems quite useful and certainly concise.
Thanks,
Arjen
Comparing the normalized source and target (each with a slash appended!)
for a prefix match should suffice, unless you're on a platform that
supports symlinks, where you'd first have to resolve all symlinks along
the path, which is a non-trivial task.
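A sketch of that slash-appended comparison (the proc name `insideOf` is
hypothetical). The trailing slash is what keeps /tmp/foo from matching
/tmp/foobar; note that identical paths also match, which is desirable here
since copying a directory into itself is just as bad:

```tcl
proc insideOf {src dst} {
    set s [file normalize $src]/
    set d [file normalize $dst]/
    # dst is inside src iff "src/" is a leading substring of "dst/"
    expr {[string first $s $d] == 0}
}
```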
Linux's "cp" detects such situations:
cp: cannot copy a directory, `work', into itself, `work/work'
(but still creates an empty directory `work/work')
Solaris's "cp" runs into the same trouble as tcl.
So, if you're on linux, you could [exec cp -r ...]
Perhaps a TIP to get this protection into Tcl's file command might
be appropriate (as I'd expect symlinks to be another possible cause of
trouble).
I'm not sure the problem is solvable in principle at all.
Perhaps by keeping an array of the created inodes and refuse to
re-copy any of these?
A quick check does indeed show that with symlinks the discussed
algorithm fails miserably (that is: [file normalize] does not
return the name of the file being linked to, so you end up with
two different names for the same thing).
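To make the failure concrete (hypothetical paths; assumes the symlink has
been set up beforehand): [file normalize] resolves symlinks in the
intermediate components only, so a trailing link survives and the same
directory ends up with two names:

```tcl
# Assume beforehand: exec ln -s /tmp/real /tmp/example/work
puts [file normalize /tmp/example/work]
# -> /tmp/example/work   (the trailing link is left as-is)
puts [file normalize /tmp/example/work/x]
# -> /tmp/real/x         (as an intermediate component, it is resolved)
```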
The usecase I have in mind should not be this involved, but it
would be nice to crush this problem before it really becomes
a problem.
Regards,
Arjen
If you only want to find files: glob -type f -- *
--
Glenn Jackman
Write a wise saying and your name will live forever. -- Anonymous
Ah, no, the problem arose with the application I put on the
Wiki: http://wiki.tcl.tk/22011. It is definitely copying the
contents of a directory (and perhaps subdirectories).
Regards,
Arjen
Appropriate use of [file type] and [file link] (or [catch] and [file link])
as well as normalize will get you the "true" absolute paths to the source
and destination.
That being said, the subdirectories of either could contain links to
elsewhere in the tree.
--
+------------------------------------------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+
the (partially pseudo-)code would be something like this:
    set fnl [file split $fname]
    set ready 0
    while {!$ready} {
        set ready 1
        for {set i 0} {$i < [llength $fnl]} {incr i} {
            set sfn [file join {*}[lrange $fnl 0 $i]]
            if {![catch {file type $sfn} t] && $t eq "link"} {
                # Splice the link target in for this prefix and rescan;
                # [file dirname] anchors a relative link target.
                set fnl [file split [file normalize [file join \
                    [file dirname $sfn] [file link $sfn] \
                    {*}[lrange $fnl [expr {$i + 1}] end]]]]
                set ready 0
                break
            }
        }
    }
    set fname [file join {*}$fnl]
Rather than splitting, one could also successively search for
occurrences of file separator characters.
At some point one should also check for possible cycles...
But even without loops, one can face exponential (in directory
depth) effort to resolve all symlinks in worst case.
Here's a little testcase:
    set aux {8 9 7 8 6 7 5 6 4 5 3 4 2 3 1 2 0 1}; set dir ""
    for {set i 0} {$i < $N} {incr i} {
        file mkdir 9; foreach {l t} $aux {exec ln -s $dir$t $l}
        cd "9"; set dir "../${dir}0/"
    }
# btw., tcl's [file link] normalizes the target, thus is not
# usable for this misuse (and other (less mis-)uses).
> Arjen Markus wrote:
>>...
>> A quick check does indeed show that with symlinks the discussed
>> algorithm fails miserably (that is: [file normalize] does not
>> return the name of the file being linked to, so you end up with
>> two different names for the same thing).
The problem here is that 'file normalize' doesn't touch the last
segment in the path, i.e. the 'file', only the 'directories', right?
>> The usecase I have in mind should not be this involved, but it
>> would be nice to crush this problem before it really becomes
>> a problem.
> Appropriate use of [file type] and [file link] (or [catch] and [file
> link]) as well as normalize will get you the "true" absolute paths to
> the source and destination.
A simpler solution is to add a dummy path segment to the path in
question, normalize, then strip the dummy off again. The dummy takes
the part of the untouched 'file', causing the real 'file' to be
resolved should it be a symlink.
    proc fullnormalize {path} {
        # SNARFED from tcllib, fileutil.
        return [file dirname [file normalize [file join $path __dummy__]]]
    }
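Usage, under a hypothetical layout where /tmp/link is a symlink to
/tmp/real (created with ln -s /tmp/real /tmp/link): the dummy segment
takes the "untouched last segment" role, so the real trailing link gets
resolved:

```tcl
puts [file normalize /tmp/link]   ;# /tmp/link  (link left untouched)
puts [fullnormalize /tmp/link]    ;# /tmp/real  (link resolved)
```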
--
So long,
Andreas Kupries <akup...@shaw.ca>
<http://www.purl.org/NET/akupries/>
Developer @ <http://www.activestate.com/>
-------------------------------------------------------------------------------