On Sun, Apr 05, 2026 at 04:26:25PM +0200, Grégory Vanuxem wrote:
I have now implemented new matching code. You can find it at:
http://wiki.fricas.org/public/match.spad
After compiling you can do things like:
a := parse_pattern("a*l or b*d", true)
LISP output:
(0 0 or (1 UNPRINTABLE . UNPRINTABLE) (1 UNPRINTABLE . UNPRINTABLE))
Type: Union(LogicalMatchingAutomaton,...)
(2) -> do_match(a, "arl")
(2) true
Type: Boolean
(3) -> do_match(a, "brd")
(3) true
Type: Boolean
(4) -> do_match(a, "brl")
(4) false
Type: Boolean
Note: I did not implement coercion to OutputForm, that is why there
is "LISP output" and UNPRINTABLE above.
Matching uses finite automation. Core automation code is general,
that is it supports arbitrary nondeterministic finite automaton.
However, there is parser for patterns and it supports only
kinds of patterns previously available in HyperDoc. In principle
we could add more features, like character ranges and general
regular expressions. Efficient support for character ranges
probably will need some change to core, other things are mostly
matter of appropriate parser and construction functions.
This code uses nondeterministic finite automaton, so can
support true regular expressions, but some things offered
by "regular expressions" in pupular libraries can not be
done by finite automaton and require more powerful
(and more expensive) paradigm like backtracking search.
Such extentions are beyond scope of current implementation.
As nondeterministic finite automatons go, this implementation
spends substantial time in SPADCALL-s, so probably is
significantly less efficient than lower level code. Still,
on nontrivial patterns it seem to be several times faster
than matching code in 'match.boot' that is currently used
by HyperDoc.
I have also patch to BRINFO, it is at:
http://wiki.fricas.org/public/BRINFO.diff
After compiling patched BRINFO things like:
search_operations("search_constructors")$BrowserInformation
search_operations("__*")$BrowserInformation
search_operations("_and")$BrowserInformation
should give sensible results. Underscore works as escape in the
pattern, so that following character have literal meaning. This
allows searching for multiplication and possibly (if we add them)
for other characters that have special meaning in patterns.
--
Waldek Hebisch