thinking about regexp-tokenizer

Ben Hyde

Aug 31, 2006, 11:00:02 AM
to montez...@googlegroups.com
I'm finding this useful.

(defclass new-regexp-tokenizer (regexp-tokenizer)
  ((pattern :initarg :pattern)   ; regex source, supplied per instance
   (scanner? :initform nil)))    ; compiled scanner, built on first use

(defmethod token-regexp ((self new-regexp-tokenizer))
  ;; Compile the pattern once and cache the scanner on the instance.
  (with-slots (pattern scanner?) self
    (if scanner?
        scanner?
        (setf scanner?
              (cl-ppcre:create-scanner pattern :multi-line-mode t)))))

I suspect that many of the subclasses of regexp-tokenizer could
be replaced by this, and that the labors standard-analyzer goes through
to limit calls to create-scanner would become unnecessary.

Of course it makes one a bit sad that the resulting tokenizer
instances have a pattern, a scanner, and a string-scanner; so maybe
this isn't exactly what's best.

But it lets me casually create analyzers for things like CamelCase,
or add or remove dashes, colons, underscores, etc., and then
implement a Swiss Army knife analyzer.
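
To give a rough idea of the kind of patterns I mean, here is a small sketch that uses only CL-PPCRE, so it can be tried without Montezuma loaded. The two pattern strings, the variable names, and the helper function tokens are made up for illustration; none of them come from Montezuma, and real analyzers would likely want more careful patterns.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; All names and patterns below are hypothetical, for illustration only.

(defparameter *camel-case-pattern* "[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+"
  "One token per CamelCase word, run of capitals, or run of digits.")

(defparameter *dashed-word-pattern* "[A-Za-z0-9]+(?:[-_:][A-Za-z0-9]+)*"
  "Keeps dashes, underscores and colons inside a token.")

(defun tokens (pattern string)
  "Return every non-overlapping match of PATTERN in STRING as a list of strings."
  (cl-ppcre:all-matches-as-strings
   (cl-ppcre:create-scanner pattern :multi-line-mode t)
   string))

(tokens *camel-case-pattern* "newRegexpTokenizer")
;; => ("new" "Regexp" "Tokenizer")

(tokens *dashed-word-pattern* "regexp-tokenizer create_scanner")
;; => ("regexp-tokenizer" "create_scanner")
```

With the class above, a pattern like one of these would be handed over through the :pattern initarg, presumably alongside whatever other initargs regexp-tokenizer itself expects.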

- ben

"the narcissism of small differences."
