Computer Friendly Handwriting


vasten

Jul 20, 2014, 8:25:28 AM
to loj...@googlegroups.com
A Lojban user contacted me and says he is using this for Lojban, so I figured you guys might like it too :)
(http://www.facebook.com/Dscripting - see the comments in the top post on this page for details)

He says tendinitis in his right hand forces him to use his left hand, and he finds Cscript easier.
It seems he has mapped it to Lojban quite easily.

CSCRIPT - Computer / Human Bi-Friendly Writing System

http://dscript.ca/cscript.pdf

Cscript is designed to be easy for humans to read and write, and also easy for computers to read and write digitally and programmatically.

There is of course a trade-off between the two. Cscript could be thought of as "lying somewhere between QR codes and standard handwriting".

For humans, it is easy to produce on standard lined paper and allows some intuitive "cursive" elements.

For computers, it entirely removes the need for glyph recognition and shape/vector analysis. It could be read straight off the raster level as a string of absolute values. It eliminates an entire level of OCR shape comparison, drastically reduces ambiguity, requires FAR less processing power, and increases accuracy.

It is not meant to be "perfect" for either, but instead offer a more balanced alternative.

**technically it would not be considered "OCR" as it does not actually require the "character recognition" level of the software at all.

***keep in mind the "value range zones" can even be flexible based on context (i.e. assume top=1, bottom=0, with [0->0.33]=0, [0.34->0.66]=1, [0.67->1]=2; the values [0, 0.1, 0.1] would still read as "0-1-1", because there would be no corner point in 000)
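
A minimal Python sketch of the zone idea in the note above. It assumes the three zones split the range into equal thirds and that a "corner" is any height change larger than a small tolerance. The names `fixed_zone`, `read_levels` and `corner_eps` are made up for illustration and are not part of the Cscript spec.

```python
def fixed_zone(h):
    """Map a normalized height (0 = bottom line, 1 = top line) to level 0, 1 or 2."""
    if h <= 1 / 3:
        return 0
    if h <= 2 / 3:
        return 1
    return 2


def read_levels(heights, corner_eps=0.05):
    """Quantize heights, then apply the context rule from the note.

    corner_eps is an assumed tolerance: a height change larger than this
    counts as a real corner rather than pen wobble.
    """
    levels = [fixed_zone(h) for h in heights]
    spread = max(heights) - min(heights)
    # Context rule: the fixed zones say "000", but the stroke has a corner,
    # and 000 has no corner, so the raised points must be level 1.
    if set(levels) == {0} and spread > corner_eps:
        base = min(heights)
        levels = [0 if h - base <= corner_eps else 1 for h in heights]
    return levels


print(read_levels([0, 0.1, 0.1]))  # [0, 1, 1]
print(read_levels([0, 1.0, 1.0]))  # [0, 2, 2]
```

Only the one case described in the note (everything collapsing into the bottom zone) is handled here; a real reader would presumably need a similar rule for the other zones.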

Another new project of mine that might be fun too....

CHEMICAL CALLIGRAPHY

http://dscript.org/chem.pdf

This is a mnemonic device and art form.

It is designed to allow simpler representation of bio-chem molecules with "less noise".

It drops some information that is assumed to be "obvious" to someone with basic chemistry knowledge. The missing information can usually be "filled in" with a basic understanding of chemistry.

It allows various forms of any one molecule (the larger the molecule, the more possible forms), which adds greatly to the user's ability to make artistic and aesthetic choices without altering the molecular and structural information.



Still have not rebuilt my lab, current apt too small :( So still just hacking with "pen and paper" ;)

Hopefully soon will be able to get back to my mad science  :)
chem.pdf
cscript.pdf

.arpis.

Jul 20, 2014, 3:17:39 PM
to Lojban
Not to be unreasonably pessimistic, but I will believe that the design goal of computer legibility is satisfied when I see a program that demonstrates it. Both Cscript and Dscript are cool, and I'll definitely play around with them on my next con-script kick, but I worry that you're underestimating the difficulties of OCR on real text written at a reasonable speed.


--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+un...@googlegroups.com.
To post to this group, send email to loj...@googlegroups.com.
Visit this group at http://groups.google.com/group/lojban.
For more options, visit https://groups.google.com/d/optout.



--
mu'o mi'e .arpis.

Robin Lee Powell

Jul 21, 2014, 1:17:01 AM
to loj...@googlegroups.com
Rather the opposite; http://en.wikipedia.org/wiki/Graffiti_(Palm_OS)
worked perfectly on CPUs that now probably are used to drive
*watches*; a modern smartphone is many many times more powerful.

My question is why the hell you would invent something as painfully
terrible on the human as Cscript when Graffiti is known to be
entirely workable in real time on what is in the modern day a
trivial processor.
http://intelligence.org/ : Our last, best hope for a fantastic future.
.i ko na cpedu lo nu stidi vau loi jbopre .i dafsku lu na go'i li'u .e
lu go'i li'u .i ji'a go'i lu na'e go'i li'u .e lu go'i na'i li'u .e
lu no'e go'i li'u .e lu to'e go'i li'u .e lu lo mamta be do cu sofybakni li'u

Matthew DeBlock

Jul 21, 2014, 8:54:28 AM
to loj...@googlegroups.com
As per "underestimating" OCR requirements:
As I pointed out, this would be more like QR codes. The reader only has to "trace" the line and it has all the data. OCR needs an extra step where it compares likeness to known glyphs. A whole level is gone.

I plan to build some reading software as soon as I have time. I am planning on first doing it in PHP so it is easy for others to test and modify, but to do a real "speed test" I will need to do it in C.

As per Palm OS Graffiti:
That only works if it is drawn into the system in real time, not if it is scanned in after the fact. It relies on knowing the stroke direction, something that is lost in a scan of written text.




Robin Lee Powell

Jul 21, 2014, 2:22:58 PM
to loj...@googlegroups.com
On Mon, Jul 21, 2014 at 08:54:26PM +0800, Matthew DeBlock wrote:
> As per Palm OS graffiti
> That only works if it is draw into the system in real time. not if it is
> scanned in after the fact. it relies on knowing the stroke direction,
> something that is lost in scan of written text.

Aaaah, good point.

.arpis.

Jul 21, 2014, 3:42:15 PM
to Lojban
(Suppressing snobbery about language choice)

That sounds really cool! I hope you'll post on your facebook page when it's done.

Matthew DeBlock

Jul 22, 2014, 12:16:24 AM
to loj...@googlegroups.com
Of course I will.

I'll try to remember to send you a copy of the code when I get around to it :)

Seth Turner

Jul 22, 2014, 1:07:46 AM
to loj...@googlegroups.com
I'm not sure how you are supposed to tell the difference between letters of the same shape at different heights, like a and n, d and q. I'm otherwise quite a fan of all of your work

Matthew DeBlock

Jul 22, 2014, 8:22:58 AM
to loj...@googlegroups.com
Well, using lined paper it should be pretty easy (middle vs. touching top or bottom).

Also, assume top=100, bottom=0.
The difference in height can be more than half.

e.g.
0,1,1 would be 0-50-50
0,2,2 would be 0-100-100

But 011 could also be written 0-10-10.
This would make the difference even greater and easier to see (it would not be confused with 000, because 000 would not have any corners).

If you mean for the computer reading,
then you are perhaps confusing this with glyph-based OCR.
The computer would never look at the "shape as a whole". That is needed for glyph recognition (like the Latin alphabet), but that whole level of the software is not needed here.

Basically, don't think of them like you would normal glyphs. You would never "compare shapes", just read "absolute height values".
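
To make "read absolute height values" a bit more concrete, here is a rough Python sketch, not actual reading software. It assumes the scan is already a binary raster (a list of strings, `#` = ink), that the rows of the top and bottom guide lines are known, and that the sampled columns fall on horizontal parts of the stroke. The function name `column_height` and the tiny test image are invented for this example.

```python
def column_height(image, col, top_row, bottom_row):
    """Return the normalized height of the ink in one pixel column.

    0.0 means the stroke sits on the bottom line, 1.0 on the top line.
    Returns None if the column holds no ink between the two lines.
    """
    ink_rows = [r for r in range(top_row, bottom_row + 1) if image[r][col] == "#"]
    if not ink_rows:
        return None
    mid = sum(ink_rows) / len(ink_rows)  # centre of the ink in this column
    return (bottom_row - mid) / (bottom_row - top_row)


# A stroke that runs along the bottom line, steps up, then runs at mid height.
image = [
    ".......",  # row 0: top of the range
    ".......",
    "...####",  # row 2: middle of the range
    "...#...",
    "####...",  # row 4: bottom of the range
]

print(column_height(image, 1, 0, 4))  # 0.0
print(column_height(image, 5, 0, 4))  # 0.5
```

No shape is ever compared against a template here; each sample is just a number between 0 and 1 that can then be mapped to a level.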

.arpis.

Jul 22, 2014, 9:13:27 AM
to Lojban
How does the computer know the "absolute height value" that is in force at a particular glyph? For example, assume two different runs of the program, each trying to read the only character on a page; in one case, the character is "c" and in the other case, the character is "b"; the paper is unlined. I recognize that this is a problem for people too, but that doesn't make the computer version go away. The same problem arises for "bed" vs "cig" (short for cigarette) if it's the only word on lined paper.

What about someone like me, who has a bad habit of not writing at a consistent height? How will the computer keep from, partway through some of my handwriting, treating all my strokes as short?

What I'm trying to get at is that I think there's a trickier calibration problem (though by no means unsolvable) than you seem to be acknowledging.

Seth Turner

Jul 22, 2014, 10:53:36 AM
to loj...@googlegroups.com
I was thinking without lined paper, written by a person. The computer should have a very easy time with this stuff though.

Matthew DeBlock

Jul 22, 2014, 1:19:13 PM
to loj...@googlegroups.com
A single letter on unlined paper is probably the one possible scenario where it does have a problem, but that is easily solved.

You could use the ratio between line segment lengths. With a clear vertical range the ratio is moot: horizontal lines can be longer or shorter than the vertical range with no effect on the data. If you wanted to draw one character and scan it all by itself, just ensure the ratio can be used to determine the range.

eg
vertical line is either the same length or shorter than the horizontal
or is longer/ double length
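
As a rough illustration of that ratio rule in Python: the function name `step_from_ratio` and the 1.5 cut-off (halfway between "same length" and "double length") are assumptions made for this sketch, not something specified for Cscript.

```python
def step_from_ratio(vertical_len, horizontal_len, threshold=1.5):
    """Guess how many levels a vertical segment spans from its length ratio.

    A vertical no longer than the horizontal reads as a one-level step;
    a vertical about twice as long reads as a two-level step.
    """
    if horizontal_len <= 0:
        raise ValueError("horizontal_len must be positive")
    return 1 if vertical_len / horizontal_len < threshold else 2


print(step_from_ratio(10, 10))  # 1
print(step_from_ratio(20, 10))  # 2
```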

The greater problem for "single character scanning" in my mind would be rotation, what if it is upside down, but this is also a problem for standard latin alphabet (p vs d, q vs b)

In my mind there are three possible cases:
1) human-written on lined paper - No Problems
2) computer-generated, no lines - No Problem (the ratio would be perfect, and a computer can generate perfect alignment without lines)
3) human-written, no lines - difficult, with minor Problems

The minor problem is identifying the range, especially if you don't write in a straight line or write only a few characters that all happen to be the same height.

It would be easy enough to just draw a vertical line before the text to establish the range (e.g. "| cig"), or to use any other method where a marking establishes a unit of length or height to refer to.
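
A small Python sketch of how such a reference stroke could calibrate the range. It assumes the scanner has already found the pixel rows of the top and bottom ends of the "|" mark (image y grows downward); `make_normalizer` is a hypothetical helper name.

```python
def make_normalizer(ref_top_y, ref_bottom_y):
    """Build a function that maps a pixel row to a 0..1 height.

    The reference stroke defines the range: its bottom end is 0, its top end is 1.
    """
    span = ref_bottom_y - ref_top_y
    if span <= 0:
        raise ValueError("reference stroke must have its top above its bottom")

    def normalize(y):
        height = (ref_bottom_y - y) / span
        # Clamp, so a stroke that overshoots the reference still reads as 0 or 1.
        return min(1.0, max(0.0, height))

    return normalize


normalize = make_normalizer(ref_top_y=100, ref_bottom_y=200)
print(normalize(200))  # 0.0
print(normalize(150))  # 0.5
print(normalize(100))  # 1.0
```

With the range fixed this way, a lone "c" and a lone "b" no longer look the same to the reader, because each is measured against the reference mark rather than against itself.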

As per "sloppy handwriting": this problem exists with ALL computer reading of human-written text. NO system can fix this.
Even humans have trouble reading each other's sloppy handwriting.

The question should be "can it read all sloppy handwriting"?
but instead "what is the margin of error? how easily are errors made while writing? and how does this compare to other formats?"

Without proper study I can't say, but I think it is safe to guess Cscript will outperform Latin script on lined paper.

basically it is designed for lined paper
while it CAN be written by hand without lines, it is not optimized for that use

don't have lined paper???

just grab something straight and draw your own lines ;)

I plan to actually use the lines in the first versions of the scanning software. First I will scan for long lines, establish rows, and then scan them, which should be much more accurate and faster than trying to auto-detect the range.
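
One way that first pass might look, sketched in Python. It assumes a binary page (lists of 0/1, 1 = ink), perfectly horizontal ruled lines, and a made-up `min_fill` cut-off for what counts as a "long line"; the function names are illustrative only.

```python
def find_ruled_lines(page, min_fill=0.9):
    """Return the first pixel row of every ruled line on the page.

    A row counts as part of a ruled line when at least min_fill of its
    pixels are ink. Adjacent qualifying rows are treated as one thick line.
    """
    lines = []
    previous_was_line = False
    for r, row in enumerate(page):
        is_line = sum(row) / len(row) >= min_fill
        if is_line and not previous_was_line:
            lines.append(r)
        previous_was_line = is_line
    return lines


def writing_rows(lines):
    """Pair up neighbouring ruled lines into (top, bottom) writing rows."""
    return list(zip(lines, lines[1:]))


page = [
    [1, 1, 1, 1, 1, 1],  # ruled line
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],  # ruled line
    [0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],  # ruled line
]

lines = find_ruled_lines(page)
print(lines)                # [0, 3, 5]
print(writing_rows(lines))  # [(0, 3), (3, 5)]
```

A real scan would be skewed and noisy, so this only shows the idea of using the printed lines as the range instead of guessing it from the handwriting.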


Matthew DeBlock

Jul 22, 2014, 1:20:38 PM
to loj...@googlegroups.com
*the question should NOT be

Andrew Browne

Jul 23, 2014, 8:58:37 AM
to loj...@googlegroups.com

Glyphs are allocated in alphabetical order. Would it make sense to prioritize some (i.e. vowels, maybe the most common consonants - although consonant frequency will be much more language-specific) and assign these the simplest glyphs?

Korean hangul does a nice job of prioritizing vowels for efficiency.



How does Cscript compare to:

- http://www.reddit.com/r/elianscript (see links on the right)
  Most of the elian examples are quite messy - but what is stopping Cscript from being written just as messily?
  One cool feature of elian is its simple/robust rules for the order in which to read letters when they are fit together in different/creative ways.

  Dotsies appears to be more compact, but dots may be harder to write than strokes.


If you want to make the claim that Cscript is easier to OCR than either of these, it would be a good experiment to write an OCR implementation for each and compare the performance/complexity/difficulties.

Matthew DeBlock

Jul 24, 2014, 2:16:04 AM
to loj...@googlegroups.com
It most definitely makes sense to re-index the alphabet.

I would not jump to the conclusion that "vowels be prioritized", because what would that mean? Which letters are "better"?

It would make better sense, as I point out in the PDF, to analyse letter combinations in the language and ensure the most frequent combos are easily and efficiently connectible/compressible.
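
A quick Python sketch of that kind of analysis, counting adjacent letter pairs within words of a sample text. The function name `bigram_counts` and the sample string are made up for illustration; a real study would need a proper corpus for the target language.

```python
from collections import Counter


def bigram_counts(text):
    """Count adjacent letter pairs inside each word of the text."""
    counts = Counter()
    for word in text.lower().split():
        # Keep letters only; punctuation such as apostrophes is dropped.
        letters = [ch for ch in word if ch.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts


counts = bigram_counts("coi coi do")
print(counts["co"])  # 2
print(counts["do"])  # 1
```

The most frequent pairs from `counts.most_common()` would then be the candidates to give the most easily connected glyph combinations.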

Compared to elian script:
Cscript compresses better and requires less overall detail/pen strokes to write the same text (downside: it requires a single-row format and is best used on lined paper, so it is more "strict").

compared to dotsies
humans can easily write Cscript.. dotsies is only designed for reading.

While I do believe in the scientific method, I never "made such a claim".

You inferred it yourself ;)

I am only "half scientist". I'm also half "artist".

I have no desire to "prove" anything. The ideas/concepts are like artworks (as opposed to the individual drawing itself).

I have FAR too many things to build and ideas I want to get out there. I do not plan to waste massive amounts of time trying to "prove anything".

Funny note:
I personally find that the more people ask me to "prove it", the more likely the idea is to be valuable and useful.
(people just ignore bad ideas lol)

I will write software for it, but on my own terms, when I feel like it and when I get around to it.
The idea is free, so like many of my other inventions, it is likely others will further the development before I do. I in fact prefer it if they do, because for me "people liking the idea, using it, and further evolving it themselves" is the best proof of "value".

I have received enough positive feedback from various sources that I think I could even just drop this project and never touch it again and expect others to do the rest ;)  This seed seems to have already taken root.

I wanna build as many "viral" ideas as I can in my lifetime.
To me, "virality" (without hard selling) proves that the idea has an inherent value of its own.

