when files are marked as 'removed' in working context:
- entry in 'dirstate' is dropped, and
- if there is no other files in target directory:
- entry in 'dirstate._dirs' is dropped, and
- directory itself is removed from filesystem
so, 'f in dirstate.dirs()' can't examine whether specified pattern is
related to removed largefiles or not correctly, if it is 'directory
pattern'.
this patch adds 'dirstate.rdirs()' (removed-dirs) to know what
directories are related to removed files.
'dirstate.rdirs()' is not for serious purpose, but for checking of
directory matching to removed files, so it is updated only in
'_droppath()'.
in almost all cases, 'dirstate.rdirs()' is probably invoked after
examination with 'dirstate.dirs()', so 'dirstate._rdirs' property is
built in 'dirstate._dirs()' to prevent manifest from being scanned twice.
diff -r d09da50da3b2 -r 200e12e1d1f1 mercurial/dirstate.py
--- a/mercurial/dirstate.py Thu Mar 22 23:58:47 2012 +0900
+++ b/mercurial/dirstate.py Thu Mar 22 23:58:47 2012 +0900
@@ -46,6 +46,15 @@
return
del dirs[base]
+def _addrdirs(dirs, path):
+ pos = path.rfind('/')
+ while pos != -1:
+ path = path[:pos]
+ if path in dirs:
+ break # dirs already contains this and above
+ dirs.add(path)
+ pos = path.rfind('/')
+
class dirstate(object):
def __init__(self, opener, ui, root, validate):
@@ -113,9 +122,13 @@
@propertycache
def _dirs(self):
dirs = {}
+ rdirs = set()
for f, s in self._map.iteritems():
if s[0] != 'r':
_incdirs(dirs, f)
+ else:
+ _addrdirs(rdirs, f)
+ self._rdirs = rdirs
return dirs
def dirs(self):
@@ -261,8 +274,9 @@
self._pl = p
def invalidate(self):
- for a in ("_map", "_copymap", "_foldmap", "_branch", "_pl", "_dirs",
- "_ignore"):
+ for a in ("_map", "_copymap", "_foldmap", "_branch", "_pl",
+ "_dirs", "_rdirs",
+ "_ignore"):
if a in self.__dict__:
delattr(self, a)
self._lastnormaltime = 0
@@ -285,8 +299,10 @@
return self._copymap
def _droppath(self, f):
- if self[f] not in "?r" and "_dirs" in self.__dict__:
- _decdirs(self._dirs, f)
+ if self[f] not in "?r":
+ if "_dirs" in self.__dict__:
+ _decdirs(self._dirs, f)
+ _addrdirs(self._rdirs, f)
def _addpath(self, f, check=False):
oldstate = self[f]
@@ -425,8 +441,9 @@
def clear(self):
self._map = {}
- if "_dirs" in self.__dict__:
- delattr(self, "_dirs")
+ for a in ("_dirs", "_rdirs"):
+ if a in self.__dict__:
+ delattr(self, a)
self._copymap = {}
self._pl = [nullid, nullid]
self._lastnormaltime = 0
@@ -742,3 +759,14 @@
return (lookup, modified, added, removed, deleted, unknown, ignored,
clean)
+
+ @propertycache
+ def _rdirs(self):
+ '''not for strict purpose, but for checking of directory
+ matching to removed files'''
+
+ self._dirs
+ return self._rdirs
+
+ def rdirs(self):
+ return self._rdirs
_______________________________________________
Mercurial-devel mailing list
Mercuri...@selenic.com
http://selenic.com/mailman/listinfo/mercurial-devel
original implementation uses '__contains__()' and 'rdirs()' of
dirstate to examine whether specified pattern is related to largefiles
in target context.
but the directory removed recursively by 'hg remove' is not recognized
in both, so 'directory pattern' for it can't be recognized correctly.
this patch uses 'dirstate.rdirs()' for 'directory pattern' matched
against removed directory.
diff -r 200e12e1d1f1 -r dac978818d9b hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
+++ b/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
@@ -140,7 +140,8 @@
if working:
sf = lfutil.standin(file)
dirstate = repo.dirstate
- if sf in dirstate or sf in dirstate.dirs():
+ if (sf in dirstate or sf in dirstate.dirs() or
+ sf in dirstate.rdirs()):
return sf
return file
@@ -173,7 +174,8 @@
def sfindirstate(f):
sf = lfutil.standin(f)
dirstate = repo.dirstate
- return sf in dirstate or sf in dirstate.dirs()
+ return (sf in dirstate or sf in dirstate.dirs() or
+ sf in dirstate.rdirs())
match._files = [f for f in match._files
if sfindirstate(f)]
# Don't waste time getting the ignored and unknown
diff -r 200e12e1d1f1 -r dac978818d9b tests/test-largefiles.t
--- a/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
+++ b/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
@@ -1014,3 +1014,26 @@
$ cd ..
+
+test of 'hg status' with 'directory pattern' for recursively removed
+directory.
+
+ $ hg init removeddirpat
+ $ cd removeddirpat
+
+ $ mkdir -p sub/sub/sub
+ $ echo a > sub/sub/sub/a.large
+ $ echo b > b.large
+ $ hg add --large sub/sub/sub/a.large b.large
+ $ hg commit -m '#0'
+ Invoking status precommit hook
+ A b.large
+ A sub/sub/sub/a.large
+
+ $ hg remove sub/sub/sub/a.large
+ $ test ! -d sub/sub/sub
+ $ hg status -A b.large sub/sub/sub
+ R sub/sub/sub/a.large
+ C b.large
+
+ $ cd ..
original implementation examines whether specified pattern is related
to largefiles in target context only with dirstate.
but this can't recognize patterns related to largefiles only known in
non-working context.
this patch checks also another context for such case.
this patch defines 'finctx()' and 'dinctx()' separately, because
building up 'dirs()' and 'rdirs()' of dirstate/changectx can be
avoided, if all specified patterns match against largefiles directly.
this patch treats '.' as special character, because:
- '.' is not recognized in dirstate/changectx:
this patch avoids unnecessary examination by finctx/dinctx.
- it means 'matches all files in working directory':
this patch chooses(or avoids) 'performance boost' route correctly
(also related to issue 3327)
diff -r 0218251e818b -r b2eb2c3bfd88 hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
+++ b/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
@@ -120,14 +120,31 @@
if match is None:
match = match_.always(self.root, self.getcwd())
+ dirstate = repo.dirstate
+ if parentworking:
+ finctx = lambda f: f in dirstate
+ dinctx = lambda d: (d in dirstate.dirs() or
+ d in dirstate.rdirs())
+ elif working:
+ finctx = lambda f: f in dirstate or f in ctx1
+ dinctx = lambda d: (d in dirstate.dirs() or
+ d in dirstate.rdirs() or
+ d in ctx1.dirs())
+ else:
+ finctx = lambda f: f in ctx2 or f in ctx1
+ dinctx = lambda d: d in ctx2.dirs() or d in ctx1.dirs()
+
# First check if there were files specified on the
# command line. If there were, and none of them were
# largefiles, we should just bail here and let super
# handle it -- thus gaining a big performance boost.
lfdirstate = lfutil.openlfdirstate(ui, self)
if match.files() and not match.anypats():
- for f in lfdirstate:
- if match(f):
+ for f in match.files():
+ if f == '.':
+ break
+ sf = lfutil.standin(f)
+ if finctx(sf) or dinctx(sf):
break
else:
return super(lfiles_repo, self).status(node1, node2,
@@ -137,11 +154,9 @@
# Create a copy of match that matches standins instead
# of largefiles.
def tostandin(file):
- if working:
+ if working and file != '.':
sf = lfutil.standin(file)
- dirstate = repo.dirstate
- if (sf in dirstate or sf in dirstate.dirs() or
- sf in dirstate.rdirs()):
+ if finctx(sf) or dinctx(sf):
return sf
return file
diff -r 0218251e818b -r b2eb2c3bfd88 tests/test-largefiles.t
--- a/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
+++ b/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
@@ -1059,3 +1059,62 @@
C b.large
$ cd ..
+
+Test 'hg status' for largefiles known only in another revision.
+
+ $ hg init knownanother
+ $ cd knownanother
+
+ $ echo a > a.normal
+ $ hg add a.normal
+ $ hg commit -m '#0'
+ Invoking status precommit hook
+ A a.normal
+ $ echo b > b.normal
+ $ hg add b.normal
+ $ hg commit -m '#1'
+ Invoking status precommit hook
+ A b.normal
+ $ mkdir -p sub/sub/sub
+ $ echo c > sub/sub/sub/c.large
+ $ hg add --large sub/sub/sub/c.large
+ $ hg commit -m '#2'
+ Invoking status precommit hook
+ A sub/sub/sub/c.large
+
+ $ hg update 1
+ 0 files updated, 0 files merged, 1 files removed, 0 files unresolved
+ getting changed largefiles
+ 0 largefiles updated, 1 removed
+ $ test ! -d sub/sub/sub
+
+'file pattern' for the file known also in working context should be
+specified for certain test, because it prevents dirstate from full
+scan which may show correct status unexpectedly.
+
+ $ hg status --rev 2 sub/sub/sub/c.large a.normal
+ R sub/sub/sub/c.large
+ $ hg status --rev 0 --rev 2 sub/sub/sub/c.large a.normal
+ A sub/sub/sub/c.large
+
+ $ hg status --rev 2 sub/sub/sub a.normal
+ R sub/sub/sub/c.large
+ $ hg status --rev 0 --rev 2 sub/sub/sub a.normal
+ A sub/sub/sub/c.large
+
+ $ cd ..
+
+treat '.' as the pattern (maybe) related to largefiles (issue3327)
+
+ $ hg init issue3327
+ $ echo a > issue3327/large
+ $ hg -R issue3327 add --large issue3327/large
+ $ hg -R issue3327 commit -m '#0' issue3327/large
+ Invoking status precommit hook
+ A large
+ $ hg --cwd issue3327 -R . status --all
+ C large
+ $ hg --cwd issue3327 -R . status --all .
+ C large
+ $ hg --cwd issue3327 -R . status --all -I '*' .
+ C large
refactoring/new-features are minimized than previous post, so this is
posted with STABLE flag, too.
original implementation queries whether specified pattern is related
or not to largefiles, to target context.
but changectx/workingctx returns False about relationship with files
marked as removed.
So, 'hg status' with 'file pattern' for removed file shows unexpected
warning message in below process:
1. 'tostandin()' returns non-STANDIN filename for removed file,
because changectx/workingctx returns False about relationship
with it
2. 'match.files()' contains non-STANDIN filename, which is already
removed from working directory
3. 'dirstate.walk()' invoked via 'localrepository.status()' treats
non-STANDIN filename as bad filename, because there is no entry
for it in dirstate: only STANDIN is managed in dirstate
4. 'dirstate.walk()' invokes 'match.bad()', which is defined in
'localrepository.status()' as 'bad()'
5. 'bad()' shows warning message for non-STANDIN, because it is
not related to source context: only STANDIN is related to it
this patch queries to dirstate instead of changectxt/workingctx,
because dirstate returns expected result for removed files.
'match.files()' is used by 'localrepository.status()' only in
'working' case, so this patched code also works correctly in
non-'working' case.
diff -r 2338ab19b236 -r 447a62ee6b76 hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py Sun Mar 18 18:21:32 2012 -0500
+++ b/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
@@ -137,7 +137,7 @@
# Create a copy of match that matches standins instead
# of largefiles.
def tostandin(file):
- if inctx(lfutil.standin(file), ctx2):
+ if working and lfutil.standin(file) in repo.dirstate:
return lfutil.standin(file)
return file
diff -r 2338ab19b236 -r 447a62ee6b76 tests/test-largefiles.t
--- a/tests/test-largefiles.t Sun Mar 18 18:21:32 2012 -0500
+++ b/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
@@ -68,6 +68,8 @@
Remove both largefiles and normal files.
$ hg remove normal1 large1
+ $ hg status large1
+ R large1
$ hg commit -m "remove files"
Invoking status precommit hook
R large1
original implementation queries whether specified pattern is related
or not to largefiles in target context by 'dirstate.__contains__()'.
but this can't recognize 'directory pattern' correctly, so this patch
uses 'dirstate.dirs()' for it.
this patch uses dirstate instead of lfdirstate in 'working' route
(second patch hunk for 'hgext/largefiles/reposetup.py'), because
'dirs()' information may be already built for dirstate but not yet for
lfdirstate at this point. this prevents lfdirstate from building up
and having 'dirs()' information.
diff -r 447a62ee6b76 -r d09da50da3b2 hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
+++ b/hgext/largefiles/reposetup.py Thu Mar 22 23:58:47 2012 +0900
@@ -137,8 +137,11 @@
# Create a copy of match that matches standins instead
# of largefiles.
def tostandin(file):
- if working and lfutil.standin(file) in repo.dirstate:
- return lfutil.standin(file)
+ if working:
+ sf = lfutil.standin(file)
+ dirstate = repo.dirstate
+ if sf in dirstate or sf in dirstate.dirs():
+ return sf
return file
# Create a function that we can use to override what is
@@ -167,8 +170,12 @@
orig_ignore = lfdirstate._ignore
lfdirstate._ignore = _ignoreoverride
- match._files = [f for f in match._files if f in
- lfdirstate]
+ def sfindirstate(f):
+ sf = lfutil.standin(f)
+ dirstate = repo.dirstate
+ return sf in dirstate or sf in dirstate.dirs()
+ match._files = [f for f in match._files
+ if sfindirstate(f)]
# Don't waste time getting the ignored and unknown
# files again; we already have them
s = lfdirstate.status(match, [], False,
diff -r 447a62ee6b76 -r d09da50da3b2 tests/test-largefiles.t
--- a/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
+++ b/tests/test-largefiles.t Thu Mar 22 23:58:47 2012 +0900
@@ -251,6 +251,13 @@
A sub2/large6
A sub2/large7
+Test "hg status" with combination of 'file pattern' and 'directory
+pattern' for largefiles:
+
+ $ hg status sub2/large6 sub2
+ A sub2/large6
+ A sub2/large7
+
Config settings (pattern **.dat, minsize 2 MB) are respected.
$ echo testdata > test.dat
Ok, let's hit pause here.
This patch just further confirms my impression that something is deeply
wrong with the largefiles implementation. And the rule is: extensions
are not allowed to infect the core with their broken design. The core
doesn't need or want any of this extra complexity.
Obviously, this could be done in the largefiles code. For instance, we
could subclass dirstate. But it really seems to me there ought to be a
way to fix the abstraction to avoid the need for this sort of hack
entirely.
For instance, I think we should never have to ask the question "is this
a largefile?" outside a very small set of functions. But we're checking
this or converting largefile paths to standin paths or vice-versa in
hundreds of places. That means there are probably dozens of places where
we've done it wrong or forgotten to do it.
--
Mathematics is the supreme nostalgia of our time.