Making semicolons and/or braces optional - is it possible through annotations?


Clay Sweetser

Dec 18, 2012, 8:50:37 AM
to crack-l...@googlegroups.com
I would like to construct an annotation for crack programs which allows a file to be written without semicolons and braces, a la python.
Is this feasible with the current annotation api? If not, are there other ways this could be done, or will it be possible in the future?

Michael Muller

Dec 18, 2012, 9:33:21 AM
to Clay Sweetser, crack-l...@googlegroups.com
It's almost possible - you can rewrite the token stream to mutate the syntax
any way you like. What you can't do at this time is read whitespace, which is
what you would need to do python-like syntax.

What needs to happen in order to make this possible is to provide a way for
the annotation system to read the input stream before tokenization.

Are you interested in taking a stab at this? I can point you at the code you
need to modify.

=============================================================================
michaelMuller = mmu...@enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
In this book it is spoken of the Sephiroth, and the Paths, of Spirits and
Conjurations; of Gods, Spheres, Planes and many other things which may or
may not exist. It is immaterial whether they exist or not. By doing
certain things certain results follow. - Aleister Crowley
=============================================================================

Clay Sweetser

Dec 18, 2012, 10:18:50 AM
to crack-l...@googlegroups.com, Clay Sweetser

Which are you referring to: modifying the source code to provide a way to alter the input stream, or creating the actual annotation?
I'm confident that I could create the annotation, given the right API; however, I am less certain about being able to modify the source code, since my experience with C++ and the crack source code is limited.

Michael Muller

Dec 18, 2012, 10:19:41 AM
to Clay Sweetser, crack-l...@googlegroups.com, Clay Sweetser
I was referring to modifying the crack compiler to provide access to the raw
input stream. I think it should be fairly easy to do and it's a desirable
feature, but right now we don't really have any free hands to make the change.

Without it, you can still do mutations on the syntax, you just won't be able
to recognize whitespace. So a semicolon-free syntax is feasible, but an
indentation-based one is not.


Clay Sweetser

Dec 18, 2012, 10:37:07 AM
to crack-l...@googlegroups.com
Well, as I said, I'm certainly willing to try my best. Could you give me an idea of which files should be modified, and possibly what I should seek to do within the existing code?

Clay Sweetser

Dec 18, 2012, 10:55:59 AM
to crack-l...@googlegroups.com
(Sorry if this is a duplicate)

Michael Muller

Dec 18, 2012, 7:06:49 PM
to Clay Sweetser, crack-l...@googlegroups.com
- Add a readSource(char *buffer, size) method to the tokenizer
  (parser/Toker.{h,cc})
- Add a readSource(buffer, size) method to compiler/CrackContext.{h,cc};
  you'll also need to add a private static wrapper ("_readSource")
- Add method metadata for CrackContext::readSource() to compiler/init.cc
- Add a test for the new functionality to test/testann.crk
- Bask in the warm glow of your own awesomeness. :-)

Be aware that once we start reading from the tokenizer's source stream, it's
going to be very hard to put that data back. The only safe way to do this is
to read a byte at a time; then you can revert to the original parser if you
need to, based on some terminal condition. If you feel particularly
motivated, you might want to add byte getSourceChar() and
ungetSourceChar(byte) methods to the set of classes while you're at it. But
"readSource()" should let you accomplish what you have in mind.
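
The tokenizer-side step could look roughly like this in C++. This is a minimal sketch, not the actual parser/Toker code: the class name SketchToker, the size_t return value of readSource(), and the int return of getSourceChar() (so that end of input can be reported as -1) are assumptions made for illustration. Only the method names readSource, getSourceChar and ungetSourceChar come from this message.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the tokenizer; NOT the real parser/Toker class.
class SketchToker {
public:
    explicit SketchToker(std::istream &src) : src(src) {}

    // Copy up to 'size' raw bytes into 'buffer'.
    // Returns the number of bytes actually read (assumed return type).
    std::size_t readSource(char *buffer, std::size_t size) {
        src.read(buffer, static_cast<std::streamsize>(size));
        return static_cast<std::size_t>(src.gcount());
    }

    // Read one raw byte, or return -1 at end of input (assumed convention).
    int getSourceChar() {
        int ch = src.get();
        return ch == std::char_traits<char>::eof() ? -1 : ch;
    }

    // Hand the most recently read byte back to the stream.
    void ungetSourceChar(char ch) {
        src.clear();      // drop eof/fail flags so the putback can succeed
        src.putback(ch);
    }

private:
    std::istream &src;  // the raw, untokenized source
};
```

A caller on the annotation side would read one byte at a time with getSourceChar() and hand the byte back with ungetSourceChar() once it decides the normal parser should take over.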


Clay Sweetser

Dec 19, 2012, 1:08:10 AM
to crack-l...@googlegroups.com
So, let me make sure I understand what's supposed to happen, and please correct me if my understanding is faulty. The new function readSource() will take a pointer to a char array (the buffer) and fill that buffer with characters from the tokenizer's istream, up to the length specified. Since the annotation logic has access to this buffer, this will enable annotations to read the raw source code and inject code into particular areas.

The part I don't quite understand is your warning about the trouble of putting back data read from the stream. Is this because reading from the istream internally changes the pointer that tracks the current position within it, or because of something else?
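
The position part of this question can at least be checked on an ordinary C++ stream. A minimal sketch, assuming a seekable in-memory std::istringstream; the helper name positionAfterRead is made up for illustration, and this says nothing about how crack's tokenizer buffers its input:

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Reads 'n' bytes from 'text' and reports where the stream's read position
// ends up. Assumes n <= text.size(); a short read sets failbit, after which
// tellg() reports -1 instead of a position.
std::streamoff positionAfterRead(const std::string &text, std::size_t n) {
    std::istringstream in(text);
    std::string buffer(n, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(n));
    return static_cast<std::streamoff>(in.tellg());
}
```

Reading n bytes moves the position forward by n, so the consumer that reads next starts after them. A string or file stream can seek back, but a pipe or terminal generally cannot, which may be part of why putting data back is hard.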
