project concept


Free Beachler

Apr 18, 2012, 6:41:31 PM
to boulder-hacke...@googlegroups.com
Hey Hackers,
Please excuse this long message.  I wanted to share a project concept that's been on my mind for several years.  What if we could talk to our computers to build programs instead of writing code?  What if we could talk to them in our native language - say English, Russian, or Swahili?

Here are 5 examples:

Hello World:
Me to Computer:  "create an Application named quote helloworld unquote that prints the message quote hello world unquote to the screen in the color red, go"
Computer to Me:  done.

result=  Computer builds an application named HelloWorld as described

Defining an Object:
Me to Computer:  "define an object with 8 members and go.  the first member is named quote foo unquote, go.  show available data types [pause]"
Computer:  shows a window in an IDE with available data types
Me:  touch screen or say, "the data type of the first member is string, go"
... and so on to define an object ...
Computer to me:  done

result=  Computer defines the object as described

Defining a Procedure:
Me to Computer:  "create a method named 'foo' with 3 inputs, go"
Computer to Me:  "describe the inputs"
Computer:  a window pops up in an IDE that shows the available data types
Me to Computer:  "input 1 is named 'bar' and it's an integer [pause - just got distracted by incoming Email]"
Computer to Me:  "describe the remaining two inputs"
Me to Computer:  "they are [finish description...] period"
... and so on to define a method signature ...
Computer to me:  done

result=  Computer defines the procedure as described -- we skipped how to bind the method to an object or something that can be executed

Implementing a Procedure (method):
Me to computer:  create a local variable named quote foo unquote of type string, set its value to quote bar unquote, next
Me to computer:  return foo from [name of procedure]
Computer to me:  done

result=  Computer displays a visual description of the method I've just described and binds it to the appropriate object

Implementing a database connection within a Procedure:
Me to Computer:  "show available database connections"
Computer:  opens a window that shows the database connectors defined in my programming environment 
Me to computer:  "open a database connection to the MySQL server on this computer and call the file handle 'foo', the login username is 'bar', and I'll type the password, go"
Computer to Me:  "you need to install the MySQL binary and [driver/software interface] first"
...

Architecting this properly would ultimately require a team with deep skills in several disciplines - including at least a couple disciplines I know nothing about such as linguistics and lexical analysis.  Still, I think we could learn what we need to find natural and efficient ways to speak to a computer and generate code.  I think we can do this for initial prototypes without the need to build or purchase a massive lexical analysis engine - a small one should suffice at first.  We would prototype a system that allows humans to write code in their native language with the goal of being 100% translatable to both humans and machines.
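To make the "small one should suffice at first" idea concrete, here is a minimal sketch of what a first prototype might do with the Hello World utterance from the examples above. It assumes speech has already been transcribed to text, and it only understands that one command shape. The names `CreateApp`, `extract_quoted`, and `parse_create_app` are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CreateApp:
    """Structured intent recovered from one 'create an application' utterance."""
    name: str
    message: str
    color: str


def extract_quoted(utterance: str) -> list[str]:
    """Return the text between each spoken 'quote' ... 'unquote' pair, in order."""
    return [m.strip() for m in re.findall(r"\bquote\b(.*?)\bunquote\b", utterance)]


def parse_create_app(utterance: str) -> CreateApp:
    """Parse the Hello World style command; reject anything outside that pattern."""
    # Speech carries no capitalization, so the sketch normalizes to lower case.
    text = utterance.strip().lower()
    if not re.search(r"\bgo\W*$", text):
        raise ValueError("utterance must end with the terminator 'go'")
    if not text.startswith("create an application named"):
        raise ValueError("only the 'create an application' command is sketched")
    quoted = extract_quoted(text)
    if len(quoted) != 2:
        raise ValueError("expected a quoted name and a quoted message")
    color = re.search(r"\bin the color (\w+)", text)
    return CreateApp(quoted[0], quoted[1], color.group(1) if color else "default")
```

The spoken "quote"/"unquote" and "go" markers do the work that string delimiters and a statement terminator do in typed code, which is why a few regular expressions are enough for this one command. Anything outside the pattern is rejected rather than guessed at.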

Are any of you interested in working on this with me?  Do you know anybody with a linguistics background who might be interested?


Regards,
Free

aso...@gmail.com

Apr 18, 2012, 7:46:21 PM
to boulder-hacke...@googlegroups.com
I've seen attempts at programming with natural languages before, especially for test systems.  Stuff like defining a test case as "if I call foo(27), the result should be 54".  The problem is that English (and probably every other natural language) is inherently imprecise.  In order to unambiguously interpret English by machine, you have to place restrictions on what the user can say or type that make it less natural.  As your system becomes more and more flexible, your input becomes less and less natural.  By the time you can translate English into arbitrary programs, the user is probably speaking an actual programming language, not English.  But it's probably a clunky and difficult programming language.  I don't think that it will ever be possible to automatically generate complex programs based solely on a freeform English description.

Database operations are an interesting subset, though.  SQL statements already have a somewhat natural format, and nonprogrammers frequently work with databases.  I think that you might be able to get somewhere processing speech to database puts and queries.
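As a rough illustration of how narrow that database subset would have to be, here is a minimal sketch that maps exactly one phrase shape onto a parameterized query. The phrase pattern "show COLUMN of TABLE where COLUMN is VALUE", the function name `phrase_to_sql`, and the schema dictionary are all assumptions made up for this example. It uses only Python's standard `re` and `sqlite3` modules.

```python
import re
import sqlite3

# Hypothetical restricted phrase: "show <column> of <table> where <column> is <value>"
QUERY_PATTERN = re.compile(
    r"^show (?P<col>\w+) of (?P<table>\w+) where (?P<key>\w+) is (?P<value>.+)$"
)


def phrase_to_sql(phrase: str, schema: dict[str, set[str]]) -> tuple[str, tuple[str, ...]]:
    """Translate one restricted phrase into (sql, parameters) or raise ValueError."""
    match = QUERY_PATTERN.match(phrase.strip().lower())
    if match is None:
        raise ValueError("phrase is outside the restricted pattern")
    table, col, key = match["table"], match["col"], match["key"]
    # Identifiers cannot be bound as parameters, so check them against the schema.
    if table not in schema or not {col, key} <= schema[table]:
        raise ValueError("unknown table or column")
    # The spoken value is passed as a bound parameter, never spliced into the SQL.
    return f"SELECT {col} FROM {table} WHERE {key} = ?", (match["value"],)


def run_phrase(conn: sqlite3.Connection, phrase: str, schema: dict[str, set[str]]) -> list[tuple]:
    """Execute the translated phrase against an open SQLite connection."""
    sql, params = phrase_to_sql(phrase, schema)
    return conn.execute(sql, params).fetchall()
```

The sketch accepts one sentence shape and refuses everything else, which is the restriction described above: the input is only "natural" as long as the speaker stays inside the pattern.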

Free Beachler

Apr 18, 2012, 8:02:10 PM
to boulder-hacke...@googlegroups.com
Exactly.  The end-result of the project would be to deliver a human language dialect for building machine code, not a programming language in the existing sense.  Yes there would be constraints on the language constructs.  Lexical analysis would be de-coupled from the action of forming object-linker code before compilation.  Thus improved lexical analysis engines would impact the dialect supported by this engine.

Tim Mensch

Apr 18, 2012, 8:12:13 PM
to boulder-hacke...@googlegroups.com
Not only is it extremely difficult to end up with something that turns requests into code, a big part of the problem is that, while "toy" natural language systems have been created (dating back to the '60s, IIRC), as soon as you get out of a very tight domain, you run into another problem entirely: The user typically doesn't know to a high level of precision what they really need the program to do.

All the way back into the '50s people have been predicting that strong AI (which is what you'd really need to interpret arbitrary natural language) would be here within 20 years. I think that's probably still a good estimate, and if you ask me again in 20 years, the answer might still be the same. :)

The problem CAN be solved for a narrow domain, though -- something like SQL queries could be amenable, for instance -- so if you pick the domain carefully, you could potentially create a really useful product. Good luck.

Tim

Free Beachler

Apr 18, 2012, 8:16:50 PM
to boulder-hacke...@googlegroups.com
One more thing.  If you've read my examples and still think it's unreasonably difficult to build complex software from a (constrained, yet) naturally spoken language - and be _much_ more productive than typing it, then I probably didn't articulate the concept well.  It's definitely possible to generate complex programs from a natural dialect.  The dialect simply needs to provide a way to always be faster than typing the same code is.  Rich visual interfaces and touch screens are all "game" for this concept.  The 5 simple examples I provided are naturally spoken and imply massive productivity gains to the developer for each use-case.  They are just the beginning - I'm sure we can think of many more.  The fundamental examples also demand a rich visual interface, intentionally.  There is more to say on this topic methinks...

Free

Free Beachler

Apr 18, 2012, 8:56:14 PM
to boulder-hacke...@googlegroups.com
It was never in my interest to debate the feasibility of the project.  I know it's feasible to build a system that achieves the stated goals, though I currently lack the skills to prove this mathematically.  There are many other skills that would go into making this a serious project that we might lack individually or even as the entire hacker group.  I doubt anybody who might be interested in doing this would possess all of the necessary skills.

The debate that seems to arise is - not whether it's possible to visually and aurally represent all the constructs one needs to build complex software - but whether it's practical?  Actually...I'm not sure if that is the criticism...but I'll take that on.  It certainly is practical.  Example #1 demonstrates a massive productivity increase for the simple use-case of "create a new executable application".  That should be enough proof - but the remaining 4 examples all demonstrate similar productivity gains.  I'm not sure there is anything holier in software engineering than productivity - and if you wish to debate that assertion you've lost me altogether.

"I don't know what the program is going to do yet".  True dat.  Indeed...we never do until we build it.  This system has to recognize this as a design goal.

"AI isn't mature enough".  I disagree.  Irrespective of that - AI isn't a huge part of this concept.  It's needed for lexical analysis.  The system needs to provide both aural and visual interfaces so the user can define and bind the constructs they need.  I want to bridge the gap between the definition of resources, bindings, and the constructs that go into complex programs - and a human dialect to describe them.

Ross Hendrickson

Apr 18, 2012, 9:35:21 PM
to boulder-hacke...@googlegroups.com
Free,

I'm going to weigh in here, as Linguistics (specifically computational linguistics) is my area of expertise (well, one of them). I've done work with speech recognition engines, computational semantics, and machine translation. I have to take issue with your statement below.

"I know it's feasible to build a system that achieves the stated goals, though I currently lack the skills to prove this mathematically."

Simply put, current open source systems are not able to fulfill the basic input into your system, much less work towards translating the utterances into some sort of a semantic understanding of what the user "wants" to program which is then translated to some form of code. For a simple test, try using Windows ASR to control your computer / write a Word document. It is a nightmare to try to control / edit your document (at least IMHO). If you really want me to get into details I will, but you didn't seem to want to debate feasibility.

As for debating whether or not this can be done, or is practical, I think you should take a look at http://scratch.mit.edu/ and http://doublesvsoop.sourceforge.net/pwcthelp/main.htm for some examples of what is being done with just visually based systems (drag/drop components). There is also http://www.alice.org/, http://www.limnor.com/, and http://www.tersus.com/. I believe Google also tried to make a simple app builder for Android, but I couldn't find it.

The inherent issue at hand with all these projects is that usually programming is about creating something new, and if you constrain what people can do, they usually all end up making similar if not the same apps. It is easy to say "build me a photo gallery" but if that widget doesn't already exist, is it more productive to try and "talk" through all the steps that would create that gallery? The more you abstract the less variation you get in an ecosystem. I think by the time you are expert enough in the language / visual builder you will not be any faster at building complex new things than someone using a more traditional approach. 

As always, I applaud those who have ambitious goals, however I felt I should at least bring up some of the challenges I see for such an ambitious project. Echoing Tim, Good luck.

Best,

Ross

Free Beachler

Apr 18, 2012, 11:00:41 PM
to boulder-hacke...@googlegroups.com
The concept is:
- A tool for serious and professional software engineers.
- A real challenge to build effectively.
- Feasible using existing technologies for speech recognition and lexical analysis - whether open-source or proprietary.

The concept is not:
- A "toy" programming language
- A tool for non-programmers, drag-and-drop or otherwise

Please allow me to restate what the basic concept is:

I want to be able to tell the computer - aurally and visually - how to build the symbolic/linker code for an application I want to build.  Then I want to be able to compile that linker/symbolic code to native bytecode - Linux or Windows x86 architecture to start.  Lastly, as a fundamental design goal, I want that system to perform the above for a Chinese or Russian speaker.  They'll have their own dialect in Chinese or Russian - I'm not sure what that is - and a linguistics expert would be needed to design that seriously.


Cheers,
Free

Nicholas Dale Farrow

Apr 19, 2012, 2:50:55 AM
to boulder-hacke...@googlegroups.com
Free,

I love this idea, but I would have to agree with the predominant opinion expressed here: this is too challenging of a problem to be tackled in a reasonable amount of time. I have witnessed some natural language processing research to be used in robotics at CU. The idea of course, is that you shouldn't have to know programming to be able to program a robot.

A particular problem faced is how to parse an ambiguous command: "Put the red block on the blue block on the yellow block." -> 1) the red block is currently on the blue block and needs to go onto the yellow block, 2) the red block needs to go onto the blue block that is already on top of the yellow block. Even though this anecdote isn't exactly your concept, you can appreciate some of the problems people face with natural language processing.
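The two readings can be enumerated mechanically. Below is a minimal sketch that assumes a deliberately tiny sentence shape, "put OBJECT on PLACE on PLACE ...", and treats each "on" phrase as either describing the thing to move or belonging to the destination. The function names `readings` and `parse_put` are hypothetical, and a real parser would face many more attachment possibilities than this.

```python
def readings(obj: str, places: list[str]) -> list[tuple[str, str]]:
    """Enumerate (thing to move, destination) pairs for 'put OBJ on P1 on P2 ...'.

    Each split point decides which 'on' phrases describe where the object
    currently is and which ones describe where it should go.
    """
    results = []
    for i in range(len(places)):
        thing = " on ".join([obj, *places[:i]])
        destination = " on ".join(places[i:])
        results.append((thing, destination))
    return results


def parse_put(sentence: str) -> list[tuple[str, str]]:
    """Split a 'put ... on ... on ...' sentence and return every reading."""
    words = sentence.strip().rstrip(".").lower()
    if not words.startswith("put "):
        raise ValueError("only 'put' commands are sketched here")
    obj, *places = words[len("put "):].split(" on ")
    if not places:
        raise ValueError("a destination introduced by 'on' is required")
    return readings(obj, places)
```

For the block sentence this returns two candidate pairs, one per reading, and nothing in the text itself says which one the speaker meant. The system would have to ask, or look at the current state of the blocks.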

I can only imagine the frustrations of the users of your programming interface while refactoring their code... "OK, now remember that structure 'bar' I told you about with the 3 objects, now forget about that, we need to do something differently... no, no don't delete 'bar', just split the string object out and make it a new parameter...the other two objects can now be processed by 'foo' but only before calling 'rhubarb', and now we can ignore them if 'carrot' is true...". If that's not bad enough, how would you debug such a program?

I do, however, have a possible solution to one of your fundamental requirements, "I want that system to perform the above for a Chinese or Russian speaker. They'll have their own dialect in Chinese or Russian - I'm not sure what that is - and a linguistics expert would be needed to design that seriously." If you can implement all of the above for an English speaker, then for a Russian or Chinese user, all you need to do is first translate their commands into English, then feed the English command into your coding algorithm.

Nick

Ben Burdette

Apr 19, 2012, 12:37:28 PM
to boulder-hacke...@googlegroups.com
Hey Free;

I'm interested in audio-based computing - wouldn't it be great to be
able to do sophisticated tasks using an audio interface on a
smartphone? Or alternatively, connected to a server at home. I could
get an earpiece/mic and do work while I hike the flatirons!

Personally I'd settle for a vocabulary of key words that would allow me
to accomplish various tasks from an audio 'command line'. This shell
would be optimized for audio processing.

As far as natural language goes, I have my doubts as to the usefulness
of that. With a smaller vocabulary of useful keywords - essentially a
sort of programming language - the task of audio word recognition
becomes easier and more reliable, perhaps even doable on smartphone
hardware. And then you don't have to deal with the well known
difficulties inherent in natural language - ambiguity, context,
synonyms, etc etc.
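A minimal sketch of that keyword idea, assuming a recognizer that hands back one noisy word at a time: snap each word to the closest entry in a small fixed vocabulary, or reject it. The vocabulary and the name `snap_to_keyword` are made up for illustration, and text similarity from Python's standard `difflib` is only a stand-in for real acoustic matching.

```python
import difflib
from typing import Optional

# Hypothetical vocabulary for an audio "command line".
KEYWORDS = ["open", "close", "copy", "paste", "delete", "run"]


def snap_to_keyword(heard: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the keyword closest to the recognized word, or None if none is close."""
    matches = difflib.get_close_matches(heard.lower(), KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With only a handful of keywords, a slightly misheard word still has one obvious nearest neighbor, and anything far from every keyword can be rejected instead of guessed.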

I haven't seriously investigated all this, but maybe there are some
existing interfaces out there for sight impaired programmers? I've
heard of a few people out there that make their living programming
without the benefit of sight. What tools are they using?

- Ben

Tim Mensch

Apr 19, 2012, 1:23:59 PM
to boulder-hacke...@googlegroups.com
On 4/18/2012 9:00 PM, Free Beachler wrote:
> The concept is not: - A "toy" programming language - A tool for
> non-programmers, drag-and-drop or otherwise


If it's not for non-programmers, then using "natural language" isn't an advantage. As has been mentioned several times, natural language is terribly imprecise.

By the time you have a "dialect" that it CAN process, you will have designed a programming language. Good programming language design is HARD. Additionally, there are lots of good languages out there you could start from. But if you really want to reinvent the wheel starting from first principles, then feel free.


> Example #1 demonstrates a massive productivity increase for the simple use-case of "create a new executable application".

Umm...if I'm creating new executable applications all the time, a "new project" script bound to, say, Ctrl-N, would be faster. But I don't create new projects that often, and really I'd like to have several options that I can set while creating new projects, so bringing up a dialog box and letting me click a few options would probably be a superior solution.

And in a sufficiently productive language/IDE, creating "Hello World" is faster than saying "create an Application named quote helloworld unquote that prints the message quote hello world unquote to the screen in the color red, go". For one thing, your text doesn't fully specify the problem: How do you want it capitalized? Should it put a carriage return at the end?

Where if I'm in a Lua IDE and I type:

print "Hello World"

And then hit F5 to run it, it's done, with exactly the capitalization I want.

A good programmer can get a higher bandwidth of information out of their ten fingers on a 101-key keyboard than out of a single stream of text from their mouth. So if you're not trying to create a tool for non-programmers, I think that you're trying to solve the wrong problem.


> It was never in my interest to debate the feasibility of the project.

Actually, I would think it would be in your interest to learn what challenges you would face when approaching a project. Honest, you're not the first person to try to solve this problem. But if you don't want to debate feasibility, then by all means, discover the challenges yourself. Who knows, maybe you'll be the one who comes at the problem with a completely fresh approach and solves it where everyone else has failed.

Good luck,

Tim

Free Beachler

Apr 19, 2012, 1:31:32 PM
to boulder-hacke...@googlegroups.com
I worked heavily with assistive web technologies at CU but it was about 10 years ago now.  I used Lynx and built WAI compliant sites and witnessed some of the tools in use.  I'm out of the loop but would guess that being a visually impaired programmer in 2012 is extremely challenging and inefficient.  If I was visually impaired I'd give my left sphero for a programming language based on any of these concepts.  The market for this demographic of society is tiny, so it isn't served well by the private sector.

After criticism and interest from you guys - thinking about it more - the concept has kind of morphed.  It remains essentially the same.  I'm really tired of using a keyboard and text to write what we call 'code'.  There are other interactions that I want today.  I want to talk about code - I want to touch objects, move them around (in a virtual space)...things like that.

That means ditching the keyboard and trackpad/mouse as primary means of input.  For example:  for a completely aural based programming language we would write a 100% aural metasyntax.  Then we would build an aural IDE for a compiler that supported our "language".  Yes, it would be constrained.  It could be elegantly constrained - like the text we write is.  The IDE could be controlled aurally.  Ross expresses doubts about the feasibility - but it seems weird to me to assert the technology isn't here, today, to achieve that goal.

Like any decent programmer - I'm not sure exactly how the project looks until I start building it.  What if we bring a Kinect into this concept?  Create a 3-D workspace (IDE) with visual, tactile, and aural interfaces?  It would let us touch the objects we create and define, talk about them, how they bind together, their sequence in methods, etc.

Correct me if I'm wrong.  There are currently no open, or known proprietary for that matter, 100% aural metasyntaxes that exist for a fully featured OOP language.

Are there any open, or known proprietary, 100% visual metasyntaxes that exist for a fully featured OOP language?

Regards,
Free

Tim Mensch

Apr 19, 2012, 1:38:47 PM
to boulder-hacke...@googlegroups.com
On 4/19/2012 11:31 AM, Free Beachler wrote:
> Are there any open, or known proprietary, 100% visual metasyntaxes
> that exist for a fully featured OOP language?


Why restrict yourself to OOP, btw? I mentioned Lua; it's paradigm-agnostic, and can do OOP, but a lot of the time you can save thousands of keystrokes by NOT doing OOP.

Tim

Bitreaper

Apr 19, 2012, 3:10:40 PM
to Boulder Hackerspace Public

First, I'd like to say that I admire your ambition, and I think that
it's awesome and in the hacker spirit.

I've found I agree mostly with what Tim has said, and wanted to add:

Most coding, no, all coding I've come across in my entire coding life
has always been an iterative refinement process. This means that you
need to return to the code that was written (or spoken, which would be
translated to your metasyntax) and ponder it. It means that as you
learn more and more about your problem domain that you're attempting
to solve, you refine what you were thinking and refactor/rework the
areas where you were wrong. If this is spoken, that process becomes
quite cumbersome I would imagine. "Strike that section out, no wait,
only part of it, now write this..." I just can't imagine it being any
faster than a keyboard, and can only imagine it being more
frustrating.

You can prototype this today, no equipment or software needs to be
further developed. Start with a few programmers that know a language,
maybe python due to its lack of extraneous formatting chars (like
curly braces), and talk through a problem while they type it out. If
you can work on code that way, then you might have something to work
towards. If it gets cumbersome and starts bogging down, you will begin
to see what your true issues will be. And humans are a whole hell of
a lot more forgiving of gaps (assumptions) in your speech than a
computer will be.

That's my half nybble of opinion.

Bit.

Peter Klipfel

Apr 19, 2012, 3:29:50 PM
to boulder-hacke...@googlegroups.com
Someone previously brought up visual programming using the Kinect or something.  I think that if you wanted to make a programming language that could be used for audible programming, the language paradigm should shift away from the ones that use text files and towards the ones that use jigsaw puzzles and visual elements for programming.  The first ones that come to mind are MaxMSP (and I think PureData is visual as well), which are visual programming environments.  I think that the shift needs to be even farther than that though.  I was thinking recently about what the best way to present data audibly is.  For me, I would prefer to be able to build a "language" out of sounds of my choosing.  Rather than trying to represent code in words, we could represent it in user defined (or language defined) sounds.  Saying "semicolon" is frustratingly long.  There would be a separate skill involved in programming with such languages, but so it is with all languages.  The user would end up listening to a type of music to program, and then input code by speaking.  This could be followed by a response from the program.  Perhaps the programming could be done by motion using feedback from the motion of an individual.  This could be translated to text if someone else needed to read it.

Seems like a fun challenge!

Peter

Bitreaper

Apr 19, 2012, 4:41:15 PM
to Boulder Hackerspace Public
Again, I think the biggest hurdle to adoption of any new programming
paradigms like these will not be how it's used on an individual basis,
but how well it can be used by a team or by multiple people. The
harder it is for a team of people to grok and modify each other's
code, the less accepted it will be. If you require a set of sounds to
be learned by more than one person, the question then becomes, what if
you have a sound that someone in the team can't form properly? This
is compounded when you have speakers of different languages or
nationalities in one team, as many languages have native sounds that
are hard for non-native speakers to pronounce.

This reminds me of a discussion I had with a graphics guy back
in '99 when I asked him about creating a 3d window manager. He
mentioned that there had been efforts he knew of, but the problem
remained that most of what we already had worked just as well, if not
better than the concepts they came up with at the time. I'd have to
say that as much as I think it would be cool in some ways to not
purely use the keyboard, it's just damn hard to beat. It's been
around for a long time for a reason.

It might be that I've spent the entirety of my career on a keyboard
that taints me this way, but I know that if I tried to move to an
auditory coding scheme, I wouldn't be able to do it. Different parts
of my brain are wired to write than to speak, and the writing parts
are hooked to those parts that generate code. Maybe new people who
are introduced to this method first might be able to wire up their
brains to code by speaking, but I'm fairly sure I could not.

Bit.

Free Beachler

Apr 19, 2012, 5:22:34 PM
to boulder-hacke...@googlegroups.com
Everything Peter said...not just audio but visual too.

Free

Free Beachler

Apr 19, 2012, 5:53:05 PM
to boulder-hacke...@googlegroups.com
Yes, software has to be usable by a team in order to be useful.  The part that resonates with me is the ability to create a universal syntax any human could understand in their native tongue.

Free

Free Beachler

Apr 19, 2012, 11:35:54 PM
to boulder-hacke...@googlegroups.com
It seems there might be enough interest to form a core team around the concept of creating a metasyntax for a fully-featured OOP (or functional) programming language based 100% on aural and/or visual syntaxes, along with an immersive audio+Kinect IDE.  I have a server we can use to host anything we need, a couple of Shure microphones, and a Kinect that's begging to find a purpose in life.  I too can think of ways to handle some (all?) of the nightmarish use-cases imagined in the criticisms in this thread - purely with voice - but they aren't traditional.  Some of you are already suggesting and envisioning this apparently.  For the record, the original concept relied on visual interaction for more than one use-case.

I'm very interested in mashing this audio/visual concept up with a Kinect and something attached to the fingers to provide a stimulus for touching 'items' and completing interactions.  Perhaps a vibe and a way to touch two fingers together for a click - like programming gloves.  Having a person sit or stand in front of a computer and build a program using an OOP language based on a visual+aural metasyntax is a compelling concept.

For some inspiration, here's a 3D IDE in Second Life SE 3 and its predecessor:

Searches for audio-based analogs come up empty, even with heavily rephrased Google searches.  That's because, from my experience, they either don't exist or are buried.  We can ask the Assistive Tech department at CU for help when the time comes.

Cheers,
Free

Bryant Hadley

Apr 20, 2012, 1:54:27 PM
to boulder-hacke...@googlegroups.com
Love the concept Free! Thought you might want to check this out, it doesn't deal with programming that I know of but it's a form of communicating with computer based systems using visual cues. 


They have several interactive technologies that may be worth taking note of during your design phase. 


--
Bryant Hadley



Free Beachler

Apr 20, 2012, 7:46:11 PM
to boulder-hacke...@googlegroups.com
Wow...very inspirational.  The link you shared sent me on a mini journey across the interwebs.  There's another awesome project at http://www.fit.fraunhofer.de/en/fb/cscw/projects/3d-multi-touch.html.  Funding and scope are the keys to successfully executing on this concept, not to mention legal concerns.  No problem is too large to solve with the right group of people - well, maybe some are, but you get my point.  Right now I'm mainly interested in ideas around scope and cost - what we could build, how much that costs, and ideas on how we might get funding.

Cheers,
Free

Liz Baumann

Apr 23, 2012, 2:29:00 PM
to boulder-hacke...@googlegroups.com
In addition to the blind, people with repetitive stress issues from keyboard use might want this technology.

Re: programming gloves, I know there's more out there, but I echatted with the guy behind Keygloves about a year ago.
http://www.keyglove.net/

I like the hybrid idea of using multiple senses / capabilities based on type of task, rather than just switch from coding by typing & eyes to another set of input/output senses. What does each do better than the others? Gestures/movement seem to have a universal language quality to them, while aural or visual words do not. It would seem more natural to select code (for cut/copy/paste/deletion/insertion) by touch or gesture, use voice for commands like cut, copy, paste, insert, etc. (shift, control, command, alt etc?), combined with either typing or keyglove for the nitty gritty.
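To sketch how that division of labor might look in code: a touch or gesture event sets the selection, and a spoken word acts on it. Everything here is hypothetical, including the `Editor` class and its four voice commands. It assumes the gesture and speech layers already deliver clean character offsets and clean command words.

```python
from dataclasses import dataclass


@dataclass
class Editor:
    """Text buffer where gestures pick the span and voice picks the action."""
    text: str
    selection: tuple[int, int] = (0, 0)  # (start, end) offsets set by a gesture
    clipboard: str = ""

    def select(self, start: int, end: int) -> None:
        """Called by the gesture layer when the user touches or sweeps over code."""
        if not 0 <= start <= end <= len(self.text):
            raise ValueError("selection outside the buffer")
        self.selection = (start, end)

    def voice(self, command: str) -> None:
        """Called by the speech layer with one recognized command word."""
        start, end = self.selection
        if command == "copy":
            self.clipboard = self.text[start:end]
        elif command == "cut":
            self.clipboard = self.text[start:end]
            self.text = self.text[:start] + self.text[end:]
            self.selection = (start, start)
        elif command == "delete":
            self.text = self.text[:start] + self.text[end:]
            self.selection = (start, start)
        elif command == "paste":
            self.text = self.text[:start] + self.clipboard + self.text[end:]
            caret = start + len(self.clipboard)
            self.selection = (caret, caret)
        else:
            raise ValueError(f"unknown voice command: {command}")
```

Each modality only has to be good at one thing here: the gesture never names an action, and the voice never has to describe a location.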

A practical issue with speaking to your computer to write code: you could become annoying to other people in the same room. Imagine coffee shops...

Lizz
--
Liz Baumann
805-428-4754 | l...@lizbaumann.com


Bryant Hadley

Apr 24, 2012, 12:10:47 PM
to boulder-hacke...@googlegroups.com
You know it would be interesting to have a simulation consisting of multiple inputs such as body movements, visual cues, and also eye movements:


May speed up the process of selecting an image and whatnot. 

Keyglove looks awesome by the way=)
--
Bryant Hadley


