Simple style grabber for STC

Rob McMullen

Feb 24, 2007, 7:21:50 PM

I'm looking through your explorer implementation and have incorporated
it as a minor mode in peppy. It was pretty straightforward to
incorporate, although I haven't grokked the internals yet. Still
reading the code. :)

I threw together this code that looks at the stc and grabs anything
that is of the requested styles -- it outputs a list of tuples: (start
pos, end pos, style type, and text). This is the totally naive
approach of just marching through the text and gathering styles that
are marked as the same.

In your Panel, for example, I added this method:

    def updateExploreGeneric(self):
        classes = { 'Class definition', 'Function or method',
        getLexedItems(self.source, classes)

that calls this function:

def getLexedItems(stc, classes):
    # GetStyledText returns two bytes per character: the character
    # itself followed by its style byte.
    text = stc.GetStyledText(0, stc.GetTextLength())
    length = len(text)
    bits = (2**stc.GetStyleBits()) - 1
    print "searching for %s" % classes

    i = 1 # styling bytes are the odd bytes
    parsed = []
    while i < length:
        # get the style, stripping off the indicator bits
        style = ord(text[i]) & bits
        if style in classes:
            # it's a style we're interested in, so gather
            # characters until the style changes.
            found = i - 1
            while i < length and (ord(text[i]) & bits) == style:
                i += 2
            # halve the buffer offsets to get document positions
            parsed.append((found // 2, (i - 1) // 2, style, text[found:i-1:2]))
        else:
            i += 2
    print parsed
    return parsed

So... It's a start. I've got to figure out your hierarchies and I'll
see if I can merge a generic implementation into your Panel.


SPE Stani's Python Editor

Feb 28, 2007, 8:38:38 PM
Hi Rob,

Good job!

Concerning the hierarchy in SPE, that is the simplest part of the
answer. As said before, I use my own custom classes:

Their API is (nearly) identical to that of wx.TreeCtrl and wx.ListCtrl.
There is a small, subtle difference in resetting them, but don't worry
about that. So I would suggest you implement it with a normal
wx.TreeCtrl (for a hierarchy) or wx.ListCtrl (for a sorted list). If you
manage to do that, I'll help you port them to my realtime classes if
you want to use them.

The lexer already gives a lot of information, but as you noticed it is
not enough for a tree. However, maybe you should start with an index
list which can be sorted by its columns (by name, style or line
number). I think that is step 1.
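
A minimal sketch of such a sortable index, assuming the items are the (start pos, end pos, style, text) tuples from the prototype earlier in the thread. The function names, the `line_of` callable (standing in for something like the control's position-to-line lookup) and the sample data are hypothetical.

```python
from operator import itemgetter

def build_index(items, line_of):
    """Turn (start, end, style, text) tuples into rows with named columns."""
    rows = []
    for start, end, style, text in items:
        rows.append({'name': text, 'style': style, 'line': line_of(start)})
    return rows

def sort_index(rows, column):
    """Return the rows sorted by one column: 'name', 'style' or 'line'."""
    return sorted(rows, key=itemgetter(column))

# Hypothetical sample data; 8 and 9 stand for "class name" and
# "function or method name" styles.
items = [(40, 43, 9, 'run'), (6, 10, 9, 'main'),
         (20, 28, 9, '__init__'), (60, 63, 8, 'App')]
# Pretend every line is 20 characters long (illustration only).
rows = build_index(items, line_of=lambda pos: pos // 20)

by_name = [r['name'] for r in sort_index(rows, 'name')]
by_line = [r['name'] for r in sort_index(rows, 'line')]
```

Each row could then back one line of a wx.ListCtrl, with a column click simply choosing the `column` argument.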

Step 2 would be trying to find out the hierarchy. This is not so
difficult for Python, but making it work in a generic way for all the
lexers is harder. For that we probably need to use the folding
information as well.

These are some issues I immediately notice, although I am not an expert:
- the style bits are not the same across languages, and in some
languages classes are not even highlighted (C++). For example:

* python
# Class name definition
# Function or method name definition

* C++
# UUIDs (only in IDL)
# Preprocessor

* Pascal
# Symbols
# Preprocessor

But languages such as Python and Ruby do share the same style bits. So
maybe it boils down to sorting the languages Scintilla supports into
groups, which still might be less work than writing it yourself for
every language. (The grouping could be done automatically by mapping
the style bits to their labels in dictionaries and scanning for
similarities.) But as classes and functions in C++ are not
highlighted, how can you extract them? Maybe you or someone else
(anyone?) on this list has better ideas. Otherwise it could be good to
ask Neil whether this is utopian or realistic.
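
The automatic grouping could look roughly like the sketch below. The label tables are hypothetical and hand-written from the examples above (the style numbers are only illustrative); a real version would need a full table for every lexer.

```python
# Hypothetical tables: style number -> label, one table per lexer.
STYLE_LABELS = {
    'python': {8: 'Class name definition',
               9: 'Function or method name definition'},
    'ruby':   {8: 'Class name definition',
               9: 'Function or method name definition'},
    'cpp':    {8: 'UUIDs (only in IDL)', 9: 'Preprocessor'},
    'pascal': {8: 'Symbols', 9: 'Preprocessor'},
}

def group_by_labels(style_labels):
    """Group lexers whose (style number, label) pairs are identical."""
    groups = {}
    for lexer, labels in style_labels.items():
        key = frozenset(labels.items())
        groups.setdefault(key, []).append(lexer)
    # Sort names and groups so the result is stable.
    return sorted(sorted(names) for names in groups.values())

groups = group_by_labels(STYLE_LABELS)
```

Here exact equality is used as the similarity test; a looser match (for example, sharing only the class and function labels) would be a small change to how `key` is built.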

Even if this approach doesn't work, it would still be good to develop a
generic hierarchical parser API, for which we could maybe write plugins
using regular expressions.



SPE Stani's Python Editor

Mar 1, 2007, 6:01:04 PM
Hi Rob,

I thought about it longer and maybe we have to reverse our strategy.
Instead of wanting to build a class explorer out of Scintilla, we should
do the opposite: look at what Scintilla does and shape that into an
interesting form. Basically, what Scintilla does that is interesting
for an editor is lexing and folding.

* folding
Folding is about hierarchy. So instead of a class explorer we could
rather think of a 'fold explorer' in which we group all folding nodes
in a tree. This offers another hierarchical view of the code, which
is not necessarily less interesting than a class explorer. It is maybe
something new, but can be very handy when you have to explore a
programming language or xml. Scintilla gives it all for free. The
problem of real class explorers is that they are language specific and
that Scintilla does not provide enough information without resorting
to writing custom scripts for every language.

An interesting possibility is that this fold explorer (as it will
have start and end positions) could support drag and drop, so you can
reorganize your code with a click.

* lexing
I see lexing more as a way to identify. We could build a list of all
words which are styled from which we could generate all kinds of
lists, which can be represented in list controls with columns. There
could be options to turn certain styles (classes, methods, keywords,
...) on or off.

Folding and lexing can also be combined to provide the folding nodes
with style tags, which makes all kinds of visibility filters possible
for the 'fold explorer'.
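
One way to picture the combination, as a sketch only: fold nodes are assumed to be (start line, end line) pairs and styled items (line, style, text) triples, and a node is tagged with whatever styled items sit on its first line. All names and the sample data are hypothetical.

```python
def tag_fold_nodes(nodes, items):
    """Attach to each fold node the styled items found on its header line."""
    by_line = {}
    for line, style, text in items:
        by_line.setdefault(line, []).append((style, text))
    return [{'start': start, 'end': end, 'tags': by_line.get(start, [])}
            for start, end in nodes]

def visible_nodes(tagged, wanted_styles):
    """Keep only the nodes carrying at least one of the wanted styles."""
    return [node for node in tagged
            if any(style in wanted_styles for style, _ in node['tags'])]

# Hypothetical data: a class (style 8) with two methods (style 9);
# the fold at lines 6-8 is something unstyled, such as a loop.
nodes = [(0, 9), (1, 4), (5, 9), (6, 8)]
items = [(0, 8, 'App'), (1, 9, '__init__'), (5, 9, 'run')]
tagged = tag_fold_nodes(nodes, items)

only_classes = [n['start'] for n in visible_nodes(tagged, {8})]
only_methods = [n['start'] for n in visible_nodes(tagged, {9})]
```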

For retrieving the lexing/styling information you already have a
prototype. A prototype to find folding nodes with their hierarchies
should also be doable.

With these two prototypes we can develop two minor modes: 'fold explorer'
and 'style index'.

What do you think?


Rob McMullen

Mar 1, 2007, 10:36:57 PM
Hey Stani,

Just a quick note to let you know that I'm thinking about this. I'm
not a night person, and here on the east coast of the US it's getting
late. :)

Very interesting idea about the fold explorer -- do you mean that
scintilla can automatically fold code, and we don't have to tell it
any more about it than just specifying the lexer? I haven't ever
looked at code folding because I'd never used it, but I'll take a look
at it this weekend and further digest your note. I do think that your
idea is worth thinking about -- I'm still trying to wrap my head
around it.

So, this weekend hopefully I'll be at the place where I can ask you
more questions about it.


SPE Stani's Python Editor

Mar 2, 2007, 6:05:41 AM
Hi Rob,

The folding is not necessarily done by the lexer. Scintilla
recommends implementing lexing and folding with different code, but
the lexer may contain the folder code. However, these are Scintilla
internals we don't have to worry about. From a quick look at the
documentation, this is what matters:

"Generally, the fold points of a document are based on the
hierarchical structure of the document contents."

or in more detail:
"The fundamental operation in folding is making lines invisible or
visible. Line visibility is a property of the view rather than the
document so each view may be displaying a different set of lines. From
the point of view of the user, lines are hidden and displayed using
fold points. Generally, the fold points of a document are based on the
hierarchical structure of the document contents. In Python, the
hierarchy is determined by indentation and in C++ by brace characters.
This hierarchy can be represented within a Scintilla document object
by attaching a numeric "fold level" to each line. The fold level is
most easily set by a lexer, but you can also set it with messages."

And these are the relevant calls:

SCI_SHOWLINES(int lineStart, int lineEnd)
SCI_HIDELINES(int lineStart, int lineEnd)
SCI_SETFOLDLEVEL(int line, int level)
SCI_GETLASTCHILD(int line, int level) *
SCI_SETFOLDEXPANDED(int line, bool expanded)

The ones I marked with * can be of use to us. There are two
approaches: one is brute force, checking the fold levels of all lines
(this should be very easy); the other is to try to build a smarter,
faster way with SCI_GETFOLDPARENT and SCI_GETLASTCHILD or other
routines, but I don't know if that is possible.
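
The brute-force approach can be sketched without a live control by working on a plain list of per-line fold levels, as one would collect by asking the control for the fold level of every line. The three constants carry the values Scintilla defines for its fold-level mask, header flag and base level; the function name, the node layout and the sample levels are assumptions.

```python
FOLD_NUMBER_MASK = 0x0FFF  # SC_FOLDLEVELNUMBERMASK
FOLD_HEADER_FLAG = 0x2000  # SC_FOLDLEVELHEADERFLAG
FOLD_BASE = 0x400          # SC_FOLDLEVELBASE

def build_fold_tree(fold_levels):
    """Build a nested tree of fold headers from raw per-line fold levels."""
    last_line = len(fold_levels) - 1
    root = {'line': -1, 'level': FOLD_BASE - 1, 'end': last_line,
            'children': []}
    stack = [root]
    for line, raw in enumerate(fold_levels):
        level = raw & FOLD_NUMBER_MASK
        # Close every open node that this line is no longer nested in.
        while len(stack) > 1 and level <= stack[-1]['level']:
            stack.pop()['end'] = line - 1
        if raw & FOLD_HEADER_FLAG:
            node = {'line': line, 'level': level, 'end': line,
                    'children': []}
            stack[-1]['children'].append(node)
            stack.append(node)
    # Nodes still open at the end run to the last line.
    for node in stack[1:]:
        node['end'] = last_line
    return root

# Hypothetical levels for: class / def / body / def / body / top-level line
levels = [FOLD_BASE | FOLD_HEADER_FLAG,
          (FOLD_BASE + 1) | FOLD_HEADER_FLAG, FOLD_BASE + 2,
          (FOLD_BASE + 1) | FOLD_HEADER_FLAG, FOLD_BASE + 2,
          FOLD_BASE]
tree = build_fold_tree(levels)
```

The result is one pass over all lines, which is the "check everything" variant; each node's 'line' and 'end' give the range a tree item would represent.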

A first step is to do it for the whole source of the document. A
second step could be to work with only the small parts of the source
which are being edited. For example, I presume that you only need to
scan for updates within the current working fold (level), which saves
CPU and memory resources while updating the fold explorer or the style
index. This is necessary for bigger files if you want to work with
real-time updating (it updates while you are typing your code).
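
To limit a rescan to the current working fold, one first needs the line range of the fold around the edited line. With a live control, SCI_GETFOLDPARENT and SCI_GETLASTCHILD are the calls that should answer this; the sketch below does the same lookup on a plain list of raw fold levels so it can be tried standalone. The function name and sample data are hypothetical, and it assumes well-formed fold levels.

```python
FOLD_NUMBER_MASK = 0x0FFF  # SC_FOLDLEVELNUMBERMASK
FOLD_HEADER_FLAG = 0x2000  # SC_FOLDLEVELHEADERFLAG
FOLD_BASE = 0x400          # SC_FOLDLEVELBASE

def enclosing_fold(fold_levels, line):
    """Return (first, last) line of the innermost fold around `line`.

    Falls back to the whole document when the line is at top level.
    """
    level = fold_levels[line] & FOLD_NUMBER_MASK
    # Walk upwards to the nearest header with a smaller level.
    parent = line - 1
    while parent >= 0:
        raw = fold_levels[parent]
        if raw & FOLD_HEADER_FLAG and (raw & FOLD_NUMBER_MASK) < level:
            break
        parent -= 1
    if parent < 0:
        return (0, len(fold_levels) - 1)
    # Walk downwards while lines stay nested inside that header.
    parent_level = fold_levels[parent] & FOLD_NUMBER_MASK
    last = parent
    while (last + 1 < len(fold_levels) and
           (fold_levels[last + 1] & FOLD_NUMBER_MASK) > parent_level):
        last += 1
    return (parent, last)

# Hypothetical levels for: class / def / body / def / body / top-level line
levels = [FOLD_BASE | FOLD_HEADER_FLAG,
          (FOLD_BASE + 1) | FOLD_HEADER_FLAG, FOLD_BASE + 2,
          (FOLD_BASE + 1) | FOLD_HEADER_FLAG, FOLD_BASE + 2,
          FOLD_BASE]
```

After an edit on line 4, for instance, only the range returned for that line would need to be re-lexed into the style index or re-attached in the fold explorer.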

There are already enough real class explorers for Python in the
existing IDEs, so they can be ported to peppy anyway. I don't know if
any of the editors has implemented class explorers for other
languages. Maybe other people on this list can respond to that.

The point of the new controls 'fold explorer' and 'style index' is
that they are generic out of the box and that immediately they can
make an IDE a nice environment for all the languages (and there are
many) that scintilla supports. As far as I know no open source IDE
does this at the moment, so this is a nice opportunity.

