I have different versions of nested lists generated from XML files and want to find ('highlight') the differences. The problem is similar to the history tab on Wikipedia, where the differences between versions of the same article are displayed.
Before I get my own hands dirty, I wanted to ask whether there are already solutions out there. I am having trouble finding useful search terms for Google; I tried 'differences nested lists trees' and the like. Hints for more specific search terms are welcome.
On Dec 21, 2:09 pm, Jens Teich <spamt...@jensteich.de> wrote:
> I have different versions of nested lists generated from XML files and want to find ('highlight') the differences. The problem is similar to the history tag on wikipedia where differences of different versions of the same article are displayed.
> Before I get my own hands dirty I wanted to ask if there are already solutions out there. I have problems to find useful search terms for google. Tried 'differences nested lists trees' and stuff like that. Hints for more specific search terms are welcome.
> Jens
I don't have an answer for you, but here are some thoughts.
If your XML is valid, then the problem is theoretically trivial. Basically, you just have trees A and B, and it is easy to compare how two trees differ. You can implement this by using an XML parser and then just descending the tree ... This solution doesn't deal with XML “attributes”...
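A minimal sketch of that descent in Common Lisp, assuming the XML has already been parsed into nested proper lists. The function name TREE-DIFF, the :MISSING marker and the (PATH OLD NEW) result format are made up for illustration and do not come from any library.

```lisp
;; Sketch only: TREE-DIFF and its output format are hypothetical.
;; Assumes both arguments are atoms or proper (non-dotted) nested lists.
(defun tree-diff (a b &optional path)
  "Return a list of (PATH OLD NEW) entries where nested lists A and B differ.
PATH is the list of zero-based child indices leading to the differing node."
  (cond ((equal a b) nil)              ; identical subtrees: nothing to report
        ((and (consp a) (consp b))     ; both are lists: compare child by child
         (loop for i from 0 below (max (length a) (length b))
               append (tree-diff (if (< i (length a)) (nth i a) :missing)
                                 (if (< i (length b)) (nth i b) :missing)
                                 (append path (list i)))))
        (t (list (list path a b)))))   ; leaf mismatch, or leaf versus list
```

Children are matched purely by position here, so an element inserted in the middle makes every later sibling show up as changed. A Wikipedia-style display would first need to align the children, for example with a longest-common-subsequence pass.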
Another simple, practical semi-solution is to insert a newline character at every tag so that each tag is on its own line. Then just use diff on them.
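The preprocessing step could look roughly like this in Common Lisp. ONE-TAG-PER-LINE is a hypothetical name, and the sketch works on plain characters without parsing, so a `<' inside a comment or CDATA section is split as well. That is harmless for diffing as long as both files get the same treatment.

```lisp
;; Sketch only: ONE-TAG-PER-LINE is a hypothetical helper, not a library function.
(defun one-tag-per-line (xml)
  "Return a copy of the string XML with a newline inserted before each `<'
that does not already start a line."
  (let ((prev #\Newline))              ; pretend the text starts on a fresh line
    (with-output-to-string (out)
      (loop for ch across xml
            do (when (and (char= ch #\<) (char/= prev #\Newline))
                 (write-char #\Newline out))
               (write-char ch out)
               (setf prev ch)))))
```

Write the result for each version to a file and run the usual diff on the two files.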
It depends on how many such files you need to compare, and on whether you need this as an algorithm implemented in a program ... but if it is just a few files, or you only need a manual process, then Emacs can easily handle it.
I'm supposing you need this as an algorithm in a program... I think asking in XML parsing communities will help.
Reading Wikipedia on XML and the associated articles about XML transformation will probably turn up helpful leads.
* Jens Teich <m2wsdttc2n....@jensteich.de> : Wrote on Sun, 21 Dec 2008 23:09:36 +0100:
| I have different versions of nested lists generated from XML files and want to find ('highlight') the differences. The problem is similar to the history tag on wikipedia where differences of different versions of the same article are displayed.
|
| Before I get my own hands dirty I wanted to ask if there are already solutions out there. I have problems to find useful search terms for google. Tried 'differences nested lists trees' and stuff like that. Hints for more specific search terms are welcome.
You might try to find `mytrie.lisp' for a prototype of how I tried dealing with this problem by using a prefix tree data structure, specifically the COMM-TRIE function. I use this for implementing functions that compare trees for display and manipulation, especially directory trees.
> I have different versions of nested lists generated from XML files and want to find ('highlight') the differences. The problem is similar to the history tag on wikipedia where differences of different versions of the same article are displayed.
Try this:
(defmacro let1 (defvar defparameter &body progn)
  "Shortcut for (let ((a b)) c)"
  `(let ((,defvar ,defparameter))
     ,@progn))

(defun struct-to-alist (struct)
  (declare (ignore struct))
  (error "If you want to compare structures, you need to write one using your implementation's introspection. Consult SLIME source"))

(defun tree-to-atree-2 (tree)
  "Let TREE contain lists, alists and structs. TREE-TO-ATREE-2 makes an alist where:
structures are converted to alists with keys = slot names;
lists are converted to alists with keys = item indices;
for proper alists, keys are kept and values are processed recursively."
  (cond ((typep (class-of tree) 'structure-class)
         (tree-to-atree-2 (struct-to-alist tree)))
        ((alistp tree)
         (mapcar (lambda (x) `(,(car x) . ,(tree-to-atree-2 (cdr x)))) tree))
        ((consp tree)
         (tree-to-atree-2 (list-to-alist-2 tree)))
        ((numbers-in-string-p tree)
         (cons '(:nums . :nuums)
               (tree-to-atree-2 (read-from-string (str+ "(" tree ")")))))
        (t tree)))
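To illustrate the conversion the docstring describes, here is a self-contained, simplified sketch. SKETCH-ALISTP and SKETCH-TREE-TO-ATREE are hypothetical stand-ins written for this illustration and are only guesses at the behaviour of the real helpers. The rule that an alist is a non-empty proper list of conses with symbol keys, and the zero-based indices, are both assumptions. Structures and numbers in strings are left out.

```lisp
;; Hypothetical stand-ins, not the helpers used by TREE-TO-ATREE-2.
(defun sketch-alistp (x)
  "Guess: a non-empty proper list whose elements are all conses with symbol keys."
  (and (consp x)
       (null (cdr (last x)))           ; reject dotted lists before calling EVERY
       (every (lambda (e) (and (consp e) (symbolp (car e)))) x)))

(defun sketch-tree-to-atree (tree)
  "Alists keep their keys; other proper lists become alists keyed by zero-based index."
  (cond ((sketch-alistp tree)
         (mapcar (lambda (x) (cons (car x) (sketch-tree-to-atree (cdr x))))
                 tree))
        ((consp tree)
         (loop for item in tree
               for i from 0
               collect (cons i (sketch-tree-to-atree item))))
        (t tree)))
```

For example, `(sketch-tree-to-atree '((name . "n") (vals 1 2)))` gives `((name . "n") (vals (0 . 1) (1 . 2)))`. Once both versions are in this keyed form, they can be compared key by key. Note that under the guessed rule a list of symbol-headed lists such as `((p "x") (p "y"))` counts as an alist with the duplicate key P, which is one reason a real predicate may need to be stricter.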
Madhu <enom...@meer.net> writes:
> | I have different versions of nested lists generated from XML files and want to find ('highlight') the differences. The problem is similar to the history tag on wikipedia where differences of different versions of the same article are displayed.
> |
> | Before I get my own hands dirty I wanted to ask if there are already solutions out there.
The stray breaks after *compare-atrees- and before atree-2 in the code as posted are due to line auto-wrapping. It is easy to fix just by removing the extra newlines:
... (tree-to-<PASTE> atree-2 (read-from-string (str+ "(" tree ")")))))
collect `(,*compare-atrees-<PASTE> context*
You can set *MAX-ERROR* to 0 prior to comparing, and then it will return the maximum mismatch between corresponding numeric values. (This code was used to compare results of numeric calculations that were loaded from XML.)
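The comparison code itself is not shown in this part of the thread, so the following is only a guess at how such a switch might behave. The variable name is taken from the post. COMPARE-NUMERIC-TREES and the exact semantics (NIL means numbers must be equal, a number means differences are accumulated instead of failing) are assumptions made for illustration.

```lisp
;; Sketch only: COMPARE-NUMERIC-TREES and these semantics are assumptions.
(defvar *max-error* nil
  "When a number, COMPARE-NUMERIC-TREES records the largest absolute
difference between corresponding numeric leaves here.")

(defun compare-numeric-trees (a b)
  "Return true if A and B have the same shape and equal non-numeric leaves.
When *MAX-ERROR* is a number, differing numeric leaves do not fail the
comparison; their absolute difference is folded into *MAX-ERROR* instead."
  (cond ((and (numberp a) (numberp b))
         (if *max-error*
             (progn (setf *max-error* (max *max-error* (abs (- a b))))
                    t)
             (= a b)))
        ((and (consp a) (consp b))
         (and (compare-numeric-trees (car a) (car b))
              (compare-numeric-trees (cdr a) (cdr b))))
        (t (equal a b))))
```

Usage under those assumptions: bind the variable to 0 around the call and read it afterwards, for example `(let ((*max-error* 0)) (compare-numeric-trees old new) *max-error*)`. Because AND stops at the first structural mismatch, the recorded value only covers the part of the trees visited up to that point.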
budden <budden-l...@mail.ru> writes:
> *Compare-atrees- and atree-2 are due to line auto-wrapping. It is easy to fix just by removing some extra newlines:
> ... (tree-to-<PASTE> atree-2 (read-from-string (str+ "(" tree ")")))))
> collect `(,*compare-atrees-<PASTE> context*
> You can set *MAX-ERROR* to 0 prior to comparing and then it will return maximum mismatch between corresponding numeric values (this code was used to compare results of numeric calculations which were loaded from XML)