Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

compare nested lists

17 views
Skip to first unread message

Jens Teich

unread,
Dec 21, 2008, 5:09:36 PM12/21/08
to
I have different versions of nested lists generated from XML files and
want to find ('highlight') the differences. The problem is similar to
the history tag on wikipedia where differences of different versions of
the same article are displayed.

Before I get my own hands dirty I wanted to ask if there are already
solutions out there. I have problems to find useful search terms for
google. Tried 'differences nested lists trees' and stuff like
that. Hints for more specific search terms are welcome.

Jens

Xah Lee

unread,
Dec 21, 2008, 6:41:11 PM12/21/08
to

i don't have answer for you, but here's some thoughts.

if your xml is valid xml, then the problem is threotecially trivial.
Basically, you just have tree A and B. It is easy to compare how 2
trees differ. You can implement this by using a xml parser, then just
decent on the tree ...
This solution doesn't deal with xml “attributes”...

another simple, practical, semi solution is simply to insert a newline
char to all tags so that each tag is on a line. Then, just use diff on
them.

depending on how many such files you need to compare or whether you
need this as algorithm implemented in a program ... but if just a few
files or only need manual process, then emacs can easily handle it.

i'm suppossing you need this as a algorithm in a program... i think
asking in XML parsing communities will help.

reading Wikipedia on XML and all associated articles about XML
transformation will probably turn up helpful leads.

Xah
http://xahlee.org/


Madhu

unread,
Dec 21, 2008, 10:09:28 PM12/21/08
to

* Jens Teich <m2wsdtt...@jensteich.de> :
Wrote on Sun, 21 Dec 2008 23:09:36 +0100:

You might try to find `mytrie.lisp' for a prototype of how I tried
dealing with this problem by using a prefix tree datastructure.
Specifically the COMM-TRIE function. I use this for implementing
functions that compare trees for display and manipulation, especially
directory trees.

--
Madhu

budden

unread,
Dec 22, 2008, 7:47:37 AM12/22/08
to
Hi!

> I have different versions of nested lists generated from XML files and
> want to find ('highlight') the differences. The problem is similar to
> the history tag on wikipedia where differences of different versions of
> the same article are displayed.

Try this:

(defmacro let1 (defvar defparameter &body progn) "Shortcut for (let
((a b)) c) "
`(let ((,defvar ,defparameter)) ,@progn))

(defun struct-to-alist (struct)
(error
"If you want compare structures, you need to write one using your
implementation's introspection. Consult SLIME source"))

(defun tree-to-atree-2 (tree) "let tree contain lists, alists and
structs. tree-to-atree-2 makes alist where: structures are converted
to alist with keys=slot names. Lists are converted to alist with
keys=item indices. For proper alists, keys are kept and values are
processed recursively"
(cond
((typep (class-of tree) 'structure-class)
(tree-to-atree-2 (struct-to-alist tree)))
((alistp tree) (mapcar (lambda (x) `(,(car x) . ,(tree-to-atree-2
(cdr x)))) tree))
((consp tree) (tree-to-atree-2 (list-to-alist-2 tree)))
((numbers-in-string-p tree) (cons '(:nums . :nuums) (tree-to-
atree-2 (read-from-string (str+ "(" tree ")")))))
(t tree)))


(defun tree-weight (tree)
(cond
((atom tree) 0)
(t (+ 1 (tree-weight (car tree)) (tree-weight (cdr tree))))))


(defvar *compare-atrees-context* nil)

(defun compare-atrees (left right &key (test #'equalp))
"return nil if atrees are 'equal' (keys are the same and corresponding
values satisfy the test). Otherwise, returns mismatch-map"
(cond
((and (atom left) (atom right))
(let1 the-same (funcall test left right)
(if the-same nil
(if (and (numberp left) (numberp right) (/= 0 (* left right)))
(let1 err
(let ((left (abs left)) (right (abs right))) (- (/
(max left right) (min left right)) 1))
(setf *max-error* (max err *max-error*))
`(:l ,left :r ,right :err ,err))))))
(t
(let*
((lkeys (mapcar #'car left))
(rkeys (mapcar #'car right))
(keys-missing-on-right
(set-difference lkeys rkeys :test #'equalp))
(keys-missing-on-left
(set-difference rkeys lkeys :test #'equalp))
(kvs-different
(loop for key in (intersection lkeys rkeys :test #'equalp)
for lv = (cdr (assoc key left :test #'equalp))
for rv = (cdr (assoc key right :test #'equalp))
for mismatch = (compare-atrees lv rv :test test)
when (and (eq key 'cllib::name) ;
(equalp lv "Value")) ;
do (setf *compare-atrees-context* (mapcar (lambda (x) `
(,(car x) ,(cdadr x))) (cdr (assoc 'cllib::args left)))) ;
if mismatch
if (> (tree-weight mismatch) 10) ;
collect `(,key ,mismatch)
else ;
collect `(,*compare-atrees-
context* ,key ,mismatch))) ;
(result nil)
)
(when keys-missing-on-left
(push `(:miss-left ,@keys-missing-on-left) result))
(when keys-missing-on-right
(push `(:miss-right ,@keys-missing-on-right) result))
(when kvs-different
(push `(:diff ,@kvs-different) result))
(nreverse result)))))

This is cutted from "production" code, some bits may be missing. Write
if you have problems.

-----------------
$5/hour cl freelancer

budden

unread,
Dec 22, 2008, 7:52:51 AM12/22/08
to
One missing thing is
(defun str+ (&rest args) (apply 'concatenate 'string (mapcar 'string
args))) (export 'str+)

Madhu

unread,
Dec 22, 2008, 10:43:50 AM12/22/08
to
* Jens Teich <m2wsdtt...@jensteich.de> :
Wrote on Sun, 21 Dec 2008 23:09:36 +0100:

| I have different versions of nested lists generated from XML files and


| want to find ('highlight') the differences. The problem is similar to
| the history tag on wikipedia where differences of different versions of
| the same article are displayed.
|
| Before I get my own hands dirty I wanted to ask if there are already
| solutions out there.

Or, you may be able to just use

<URL:http://www.foldr.org/~michaelw/lisp/diff-sexp.lisp>

--
Madhu

Albert Krewinkel

unread,
Dec 22, 2008, 5:01:11 PM12/22/08
to

Jens Teich

unread,
Dec 22, 2008, 6:25:36 PM12/22/08
to
Madhu <eno...@meer.net> writes:


> | I have different versions of nested lists generated from XML files and
> | want to find ('highlight') the differences. The problem is similar to
> | the history tag on wikipedia where differences of different versions of
> | the same article are displayed.
> |
> | Before I get my own hands dirty I wanted to ask if there are already
> | solutions out there.
>
> Or, you may be able to just use
>
> <URL:http://www.foldr.org/~michaelw/lisp/diff-sexp.lisp>

This does exactly what I was looking for, thanx!

Jens

Jens Teich

unread,
Dec 22, 2008, 6:47:35 PM12/22/08
to
budden <budde...@mail.ru> writes:

> This is cutted from "production" code, some bits may be missing. Write
> if you have problems.

Thanks for your code. I get some undefined variables:

*COMPARE-ATREES- *MAX-ERROR* ATREE-2

Jens

Jens Teich

unread,
Dec 22, 2008, 6:52:43 PM12/22/08
to
Albert Krewinkel <krew...@gmx.net> writes:
> http://lemonodor.com/archives/001226.html

Hallo Albert !

That is the same as the link already posted. But thanks anyway.

Jens

budden

unread,
Dec 23, 2008, 3:16:00 AM12/23/08
to
*Compare-atrees-
and
atree-2
are due to line auto-wrapping. It is easy to fix just by removing some
extra newlines:

... (tree-to-<PASTE>


atree-2 (read-from-string (str+ "(" tree ")")))))

collect `(,*compare-atrees-<PASTE>
context*

You can set *MAX-ERROR* to 0 prior to comparing and then
it will return maximum mismatch between corresponding numeric values
(this code was used to compare results of numeric calculations which
were
loaded from XML)

Jens Teich

unread,
Dec 23, 2008, 7:24:08 AM12/23/08
to
budden <budde...@mail.ru> writes:

I see, thank you again.

0 new messages