Before I get my own hands dirty I wanted to ask if there are already
solutions out there. I have problems to find useful search terms for
google. Tried 'differences nested lists trees' and stuff like
that. Hints for more specific search terms are welcome.
Jens
i don't have answer for you, but here's some thoughts.
if your xml is valid xml, then the problem is threotecially trivial.
Basically, you just have tree A and B. It is easy to compare how 2
trees differ. You can implement this by using a xml parser, then just
decent on the tree ...
This solution doesn't deal with xml “attributes”...
another simple, practical, semi solution is simply to insert a newline
char to all tags so that each tag is on a line. Then, just use diff on
them.
depending on how many such files you need to compare or whether you
need this as algorithm implemented in a program ... but if just a few
files or only need manual process, then emacs can easily handle it.
i'm suppossing you need this as a algorithm in a program... i think
asking in XML parsing communities will help.
reading Wikipedia on XML and all associated articles about XML
transformation will probably turn up helpful leads.
Xah
∑ http://xahlee.org/
☄
You might try to find `mytrie.lisp' for a prototype of how I tried
dealing with this problem by using a prefix tree datastructure.
Specifically the COMM-TRIE function. I use this for implementing
functions that compare trees for display and manipulation, especially
directory trees.
--
Madhu
> I have different versions of nested lists generated from XML files and
> want to find ('highlight') the differences. The problem is similar to
> the history tag on wikipedia where differences of different versions of
> the same article are displayed.
Try this:
(defmacro let1 (defvar defparameter &body progn) "Shortcut for (let
((a b)) c) "
`(let ((,defvar ,defparameter)) ,@progn))
(defun struct-to-alist (struct)
(error
"If you want compare structures, you need to write one using your
implementation's introspection. Consult SLIME source"))
(defun tree-to-atree-2 (tree) "let tree contain lists, alists and
structs. tree-to-atree-2 makes alist where: structures are converted
to alist with keys=slot names. Lists are converted to alist with
keys=item indices. For proper alists, keys are kept and values are
processed recursively"
(cond
((typep (class-of tree) 'structure-class)
(tree-to-atree-2 (struct-to-alist tree)))
((alistp tree) (mapcar (lambda (x) `(,(car x) . ,(tree-to-atree-2
(cdr x)))) tree))
((consp tree) (tree-to-atree-2 (list-to-alist-2 tree)))
((numbers-in-string-p tree) (cons '(:nums . :nuums) (tree-to-
atree-2 (read-from-string (str+ "(" tree ")")))))
(t tree)))
(defun tree-weight (tree)
(cond
((atom tree) 0)
(t (+ 1 (tree-weight (car tree)) (tree-weight (cdr tree))))))
(defvar *compare-atrees-context* nil)
(defun compare-atrees (left right &key (test #'equalp))
"return nil if atrees are 'equal' (keys are the same and corresponding
values satisfy the test). Otherwise, returns mismatch-map"
(cond
((and (atom left) (atom right))
(let1 the-same (funcall test left right)
(if the-same nil
(if (and (numberp left) (numberp right) (/= 0 (* left right)))
(let1 err
(let ((left (abs left)) (right (abs right))) (- (/
(max left right) (min left right)) 1))
(setf *max-error* (max err *max-error*))
`(:l ,left :r ,right :err ,err))))))
(t
(let*
((lkeys (mapcar #'car left))
(rkeys (mapcar #'car right))
(keys-missing-on-right
(set-difference lkeys rkeys :test #'equalp))
(keys-missing-on-left
(set-difference rkeys lkeys :test #'equalp))
(kvs-different
(loop for key in (intersection lkeys rkeys :test #'equalp)
for lv = (cdr (assoc key left :test #'equalp))
for rv = (cdr (assoc key right :test #'equalp))
for mismatch = (compare-atrees lv rv :test test)
when (and (eq key 'cllib::name) ;
(equalp lv "Value")) ;
do (setf *compare-atrees-context* (mapcar (lambda (x) `
(,(car x) ,(cdadr x))) (cdr (assoc 'cllib::args left)))) ;
if mismatch
if (> (tree-weight mismatch) 10) ;
collect `(,key ,mismatch)
else ;
collect `(,*compare-atrees-
context* ,key ,mismatch))) ;
(result nil)
)
(when keys-missing-on-left
(push `(:miss-left ,@keys-missing-on-left) result))
(when keys-missing-on-right
(push `(:miss-right ,@keys-missing-on-right) result))
(when kvs-different
(push `(:diff ,@kvs-different) result))
(nreverse result)))))
This is cutted from "production" code, some bits may be missing. Write
if you have problems.
-----------------
$5/hour cl freelancer
| I have different versions of nested lists generated from XML files and
| want to find ('highlight') the differences. The problem is similar to
| the history tag on wikipedia where differences of different versions of
| the same article are displayed.
|
| Before I get my own hands dirty I wanted to ask if there are already
| solutions out there.
Or, you may be able to just use
<URL:http://www.foldr.org/~michaelw/lisp/diff-sexp.lisp>
--
Madhu
> | I have different versions of nested lists generated from XML files and
> | want to find ('highlight') the differences. The problem is similar to
> | the history tag on wikipedia where differences of different versions of
> | the same article are displayed.
> |
> | Before I get my own hands dirty I wanted to ask if there are already
> | solutions out there.
>
> Or, you may be able to just use
>
> <URL:http://www.foldr.org/~michaelw/lisp/diff-sexp.lisp>
This does exactly what I was looking for, thanx!
Jens
> This is cutted from "production" code, some bits may be missing. Write
> if you have problems.
Thanks for your code. I get some undefined variables:
*COMPARE-ATREES- *MAX-ERROR* ATREE-2
Jens
That is the same as the link already posted. But thanks anyway.
Jens
... (tree-to-<PASTE>
atree-2 (read-from-string (str+ "(" tree ")")))))
collect `(,*compare-atrees-<PASTE>
context*
You can set *MAX-ERROR* to 0 prior to comparing and then
it will return maximum mismatch between corresponding numeric values
(this code was used to compare results of numeric calculations which
were
loaded from XML)