> Since you're using the term gesture, I assume a clockwise circle IS
> different from a CCW circle. To me, gesture implies dynamics.
Yes, in my case, that's true. (Note that the term
"gesture" has many meanings if you search the literature.
That's why I tried to present "The Backstory".)
Note, however, that my implementation doesn't *require*
these to be different! E.g., I could define "circle"
and "square" to mean "cut" and "paste".
"Does it matter *how* I draw the circle/square?"
"No. As long as it is recognizable as a circle/square"
This gives the user freedom in how he issues the gestures.
I.e., with the above command binding, why should circleCW
be different from circleCCW? Should one of these be
"correct" while the other signals an error -- "unrecognized
gesture"?
OTOH, there are times when there *is* some difference that
wants to be highlighted by the different *way* the gesture
was issued. E.g., imagine controlling an inspection
system with these gestures. There, circleCW might mean
"rotate the object/image 90 degrees in a clockwise direction
(so I can get better look at some part of it)" while the
circleCCW gesture causes a rotation in the opposite direction.
(like "horizontal stroke right-to-left" can mean "move to
the next page" while the mirror image gesture can mean
"move to the previous page")
You might also want to provide some flexibility to the
user in *where* the gesture begins. E.g., I tend to draw
circles beginning from the 12-o'clock position. But,
I've often seen folks who start at 6-o'clock. So,
two different starting points, two different directions.
Four different gestures??
I define the "paths" that define the gesture -- i.e.,
circle, square, slash, cup, box, etc. In a sense,
these are "shapes" but since they also have some
sense of "time" embodied in them (i.e., the direction
in which it is drawn), I call them "paths".
Each path is named (internally and externally).
A path has certain "features" that can be computed
(at compile time) from that static data. E.g., its
aspect ratio, center of mass, total angle, curviness,
start-to-end vector, etc. These features greatly
distill the essence of the path in an attempt to
quickly differentiate "circle" from "stroke".
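
To make the idea of "features" concrete, here is a minimal
Python sketch. It is an illustration only, not the actual
implementation: the function name path_features, the dictionary
keys, and the two sample paths are all hypothetical, and
"curviness" is left out because its definition isn't given here.
A path is assumed to be a list of (x, y) points in drawing order.

```python
import math

def path_features(points):
    """Coarse features of a path given as [(x, y), ...] in drawing order."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)

    # Total angle: sum of the signed turns between successive segments.
    total = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        turn = (math.atan2(c[1] - b[1], c[0] - b[0])
                - math.atan2(b[1] - a[1], b[0] - a[0]))
        # Wrap each turn into [-pi, pi) before accumulating.
        total += (turn + math.pi) % (2 * math.pi) - math.pi

    return {
        # Bounding-box aspect ratio (infinite for a flat horizontal stroke).
        "aspect_ratio": width / height if height else math.inf,
        # Center of mass, weighting every sample point equally.
        "center_of_mass": (sum(xs) / len(xs), sum(ys) / len(ys)),
        "total_angle": total,
        "start_to_end": (points[-1][0] - points[0][0],
                         points[-1][1] - points[0][1]),
    }

# Hypothetical sample paths: a closed 16-segment circle and a flat stroke.
circle = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
          for k in range(17)]
stroke = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

Comparing path_features(circle) with path_features(stroke)
shows the kind of quick separation described above: the circle
accumulates nearly a full turn of total angle, while the stroke
accumulates none and has a large start-to-end vector.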
Paths can be easily *transformed*. Simple rotations and
reflections are the most apparent. E.g., a reflected
circle becomes a circle "drawn in the reverse direction".
Note that most features can be easily transformed to
correspond with the transformed path! (e.g., reflect
a circle and its total angle is negated but its aspect
ratio remains the same, etc.)
Transformed gestures are easily named in a canonical
form: circle reflected, box rotated, etc. So, once
you have an understanding of the transforms, you can
visualize what any named path would look/feel like.
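
A small Python sketch of the transforms and the naming scheme,
under my own assumptions: paths are lists of (x, y) points,
"reflected" means mirrored about the vertical axis, and "rotated"
means a 90-degree rotation. The helper names (reflect, rotate,
named_variant) are hypothetical.

```python
import math

def total_angle(points):
    """Sum of signed turns between successive segments of the path."""
    total = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        turn = (math.atan2(c[1] - b[1], c[0] - b[0])
                - math.atan2(b[1] - a[1], b[0] - a[0]))
        total += (turn + math.pi) % (2 * math.pi) - math.pi
    return total

def aspect_ratio(points):
    """Bounding-box width divided by height (assumes nonzero height)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def reflect(points):
    """Mirror the path about the vertical axis."""
    return [(-x, y) for x, y in points]

def rotate(points, degrees):
    """Rotate the path counterclockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Assumed meaning of each transform name (an illustration only).
TRANSFORMS = {
    "reflected": reflect,
    "rotated": lambda pts: rotate(pts, 90),
}

def named_variant(name, points, transform):
    """Return (canonical name, transformed points), e.g. 'circle reflected'."""
    return f"{name} {transform}", TRANSFORMS[transform](points)

# A wide ellipse, so that the aspect ratio is not trivially 1.
ellipse = [(2 * math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
           for k in range(17)]
```

With this ellipse, reflecting negates total_angle and leaves
aspect_ratio alone, while the 90-degree rotation keeps
total_angle and inverts aspect_ratio. So the features of a
transformed path can be derived from the originals rather than
recomputed from the points.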
The recognizer takes a list of paths, transforms, etc.
as the "set of recognizable gestures" at any given
point in time. It examines the current input data set
trying to fit the data to one of these gestures.
*If* it can, it passes a gesture-specific event to
the application -- the equivalent of a keystroke.
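
One way such a recognizer could be sketched in Python is shown
below. This is a swapped-in technique (plain template matching on
resampled, normalized points), not necessarily what the real
recognizer does. The names resample, normalize, recognize, the
0.25 threshold, and the GESTURES table are all assumptions. The
"event" here is simply the name of the matched gesture.

```python
import math

def resample(points, n=32):
    """Return n points evenly spaced by arc length along the path."""
    cum = [0.0]
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    if total == 0:
        return [points[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / span if span else 0.0
        a, b = points[seg], points[seg + 1]
        out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out

def normalize(points):
    """Center on the centroid and scale the larger dimension to 1."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / size, (y - cy) / size) for x, y in points]

def recognize(raw, active, threshold=0.25):
    """Fit raw input to one of the active gestures; None if nothing fits."""
    probe = normalize(resample(raw))
    best_name, best_score = None, math.inf
    for name, template in active.items():
        model = normalize(resample(template))
        # Mean point-to-point distance; point order makes it direction-aware.
        score = sum(math.dist(p, q) for p, q in zip(probe, model)) / len(probe)
        if score < best_score:
            best_name, best_score = name, score
    return best_name if best_score <= threshold else None

# Hypothetical gesture set: a circle, its reflection, and a stroke.
circle = [(math.cos(2 * math.pi * k / 32), math.sin(2 * math.pi * k / 32))
          for k in range(33)]
GESTURES = {
    "circle": circle,
    "circle reflected": [(-x, y) for x, y in circle],
    "stroke": [(0.0, 0.0), (1.0, 0.0)],
}
```

If the direction shouldn't matter (the "cut"/"paste" case), the
application could simply bind both "circle" and "circle reflected"
to the same command. If only "stroke" is active, a drawn circle
fits nothing and no event is produced.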
> And, it seems that, within those 'sets'/'sequences' that are allowed
> to follow each other, choosing shapes/gestures with maximum
> 'differences in form' would also be helpful. As in straight lines
> follow curves, curves follow straight lines, etc. Semaphore comes to
> mind as well as Morse code for some reason when thinking about this
> device.
I see your point but that limits what the user can do
in the interface. E.g. you couldn't "move to next
page" followed by "move to next page".
Instead, what I do is stall the recognizer after it
emits a "keystroke" (i.e., if there is more "raw data"
to be processed, it just sits there, idle). Then, let
the application/UI process the keystroke and decide
what the *next* set of valid gestures should be.
These are then passed to the recognizer and it processes
any pending and subsequent input data in that context.
Since the recognizer is operating synchronously, it
has to be *fast* (the application/UI must be equally
quick in its decisions as to "what can come next").
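
The stall-then-resume handshake might look roughly like this in
Python. It is a sketch under assumptions: the class name
StallingRecognizer, its methods, and the classify callback are
hypothetical, and the "raw data" is represented by whatever
object the callback understands.

```python
from collections import deque

class StallingRecognizer:
    """Emits one event, then idles until the application sets a new context."""

    def __init__(self, classify):
        # classify(raw, active) -> gesture name or None (assumed interface).
        self.classify = classify
        self.pending = deque()   # raw input that has not been examined yet
        self.active = None       # None means "stalled, waiting for context"

    def feed(self, raw):
        """Queue raw input; it is NOT processed while the recognizer is stalled."""
        self.pending.append(raw)

    def set_context(self, names):
        """Application/UI supplies the next set of valid gestures."""
        self.active = set(names)

    def poll(self):
        """Return the next event, or None if stalled or nothing fits."""
        while self.active is not None and self.pending:
            raw = self.pending.popleft()
            event = self.classify(raw, self.active)
            if event is not None:
                self.active = None   # stall until set_context() is called
                return event
        return None
```

Usage would be a loop of poll(), handle the event, then
set_context() with whatever the UI decides may come next. Note
that this allows "move to next page" followed by "move to next
page": the application just passes the same set again. In this
sketch, input that fits nothing in the current context is
silently dropped; that is a simplification, not a claim about
the real behavior.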
> I once read where a signature verifying program weighted the TIME a
> signature was written more than the shape. Seems a forger would go
> quite slowly compared to the legitimate 'author'.
Yes, I think that was initially true (no idea if it remains
so). AFAICT, many of the "signature pads" don't do any
checking of the data but, rather, act simply as "paper
replacements".
> Dunno if your devices would be 'assigned' to the same user
> or not.
No. A big part of the motivation behind this particular
implementation was to provide "operator independence".
But, not just in allowing multiple users to share a
device! Instead, facilitating the transport of "operator
characteristics" between devices and technologies.
I.e., *you* should be able to use your friend's
device with the same level of recognition accuracy
that you get on your own!
> If it is, then 'training' and 'auto-correlating' should improve
> accuracy.
I don't want an explicit "training phase" in which the
device is "taught" how you issue each gesture. Rather,
I want you to be able to begin using the device
immediately. The device can *learn* how you issue each
gesture as you use it. E.g., each "recognition" that
you "accept" (as correct) can be used to identify
your particular "gesture style". Similarly, each
gesture that you *reject* (as incorrect) can be
*discounted* from the training set (I haven't
yet figured out an algorithmic way of using those
"misrecognitions" to identify aspects of the models
that are *incorrect/ambiguous*.)
> it's all over my head except for the basic concept. I applaud your
> efforts though. It seems to me some of what you're working on could be
> useful for disabled folks with certain neurological or motor-control
> problems.
The approach is intended to be *inclusive* of folks with
these sorts of problems. But, not specifically geared
towards them at the expense of (ahem) "normal" people.
For example, most "recognizers" use training to modify
the templates against which future input data are compared.
I.e., tweak the machine's idea of a <circle> to match
what the user *draws* as a <circle>.
This *might* increase recognition accuracy for future
<circle>s. But, it does nothing for future "figures-of-8"!
To get a similar training benefit for those, the user
would have to issue a comparable number of examples
from which the recognizer could "learn" the deviations
from the model that the user introduces for *that*
gesture. If figures-of-8 are less common in the
interface, then this learning can take quite a long
time -- without an explicit "learning exercise"
(in which the user is unnaturally prompted to issue
gestures under the COMMAND of the device! IMO, a
device should never "drive" the user! :< )
If, instead, you can identify "deformations" to the
model that account for the user's "drawing habits",
then you can conceivably apply those to *other*
models -- regardless of the shapes involved -- to
improve their recognition as well!
E.g., if you tend to draw <vertical_stroke>s on a
slight angle, chances are you will draw other
paths with a similar bias. For example, that touchpad
mounted on the jacket lapel might be a little crooked.
*Or*, it might be perfectly straight but the position
of your arm across your chest as you write on it might
lend that sort of skew to ALL of your gestures!
If this bias can be extracted and applied to all
of the gesture models (for *you*), then recognition
of ALL of them can be improved.
[And, *your* "characteristics" can be codified in this
"transform" which can be ported to other similar devices
to summarily adjust their gesture models to your needs]
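
As a rough illustration of extracting one such bias and applying
it everywhere, the Python sketch below handles only the simplest
case: a constant rotational skew, estimated from accepted open
strokes. Everything named here (estimate_skew, personalize, the
MODELS table) is hypothetical. Real "deformations" would likely
be richer than a single angle.

```python
import math

def direction(points):
    """Angle of the start-to-end vector (only meaningful for open paths)."""
    return math.atan2(points[-1][1] - points[0][1],
                      points[-1][0] - points[0][0])

def rotate(points, angle):
    """Rotate a path about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def estimate_skew(accepted):
    """Circular mean of (drawn - model) direction over accepted gestures.

    accepted is a list of (model_points, drawn_points) pairs that the
    user confirmed as correctly recognized.
    """
    s = c = 0.0
    for model, drawn in accepted:
        d = direction(drawn) - direction(model)
        s += math.sin(d)
        c += math.cos(d)
    return math.atan2(s, c)

def personalize(models, skew):
    """Apply the user's skew to ALL gesture models, not just the ones seen."""
    return {name: rotate(pts, skew) for name, pts in models.items()}

# Hypothetical models; only the vertical stroke needs to have been drawn.
MODELS = {
    "vertical_stroke": [(0.0, 0.0), (0.0, 1.0)],
    "horizontal_stroke": [(0.0, 0.0), (1.0, 0.0)],
}
```

The single number returned by estimate_skew is the portable part:
it could be stored as the user's "transform" and handed to
personalize() on another device with a different set of models.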
For certain physical characteristics (e.g., disabilities),
the nature of the transforms would change -- but would
still apply to *all* gestures (as there is a physical
or anatomical basis underlying the transform). E.g.,
folks with ET (essential tremor) would tend to exhibit a "high" frequency
left-right oscillation that wouldn't be present in folks
without that problem. At the same time, there might
be *no* top-bottom component present.
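
A per-axis oscillation like that could be measured with something
as simple as the Python sketch below. It is my illustration, not
a description of the real transform: it smooths each axis with a
moving average and treats the leftover residual as "jitter". The
names, the window size, and the idea of smoothing only the x axis
are all assumptions.

```python
import math

def moving_average(values, window=5):
    """Centered moving average; the window shrinks symmetrically at the ends."""
    n, out = len(values), []
    for i in range(n):
        half = min(window // 2, i, n - 1 - i)
        chunk = values[i - half:i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def axis_jitter(values, window=5):
    """RMS of what remains after removing the smoothed trend."""
    smooth = moving_average(values, window)
    return math.sqrt(sum((v - s) ** 2 for v, s in zip(values, smooth))
                     / len(values))

def tremor_profile(points, window=5):
    """High-frequency content on each axis, measured separately."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {"x_jitter": axis_jitter(xs, window),
            "y_jitter": axis_jitter(ys, window)}

def steady_x(points, window=5):
    """Smooth only the left-right component; leave top-bottom untouched."""
    xs = moving_average([x for x, _ in points], window)
    return [(x, y) for x, (_, y) in zip(xs, points)]

# Hypothetical input: a vertical stroke with a left-right wobble.
wobbly = [(0.3 * (-1) ** i, float(i)) for i in range(40)]
```

For the wobbly stroke, tremor_profile reports jitter on x and
none on y, and steady_x() reduces the x jitter without touching
the vertical motion. A profile like this is per-user rather than
per-gesture, so it fits the same "apply it to all models" scheme.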
> fun stuff to think about
<frown> Beats the hell out of "shell sorts"! ;-)