
sed - please help me


Szyk

Feb 3, 2012, 3:56:58 PM
Hi

I want to extract class names from C++ source files, so I think sed will
perfectly fit my needs. I wrote two almost identical sed commands, but the
second does not work at all. Please tell me why.

First command:
$ sed -n '/^class[[:space:]]\+[[:alpha:]]\+.*$/p' ./User.h
class RoomState
class UserBase
class UserData : public UserBase
class User : public UserBase

And a similar one, but with a simple transformation:
$ sed -n 's/^class[[:space:]]\+\([[:alpha:]]\+\).*$/\1\n/' ./User.h
[no output]


thank you
Szyk

Baho Utot

Feb 3, 2012, 5:59:18 PM
Szyk wrote:

> Hi
>
> I want to extract class names from C++ source files, so I think sed will
> perfectly fit my needs. I wrote two almost identical sed commands, but the
> second does not work at all. Please tell me why.
>
> First command:
> $ sed -n '/^class[[:space:]]\+[[:alpha:]]\+.*$/p' ./User.h
^


> class RoomState
> class UserBase
> class UserData : public UserBase
> class User : public UserBase
>
> And a similar one, but with a simple transformation:
> $ sed -n 's/^class[[:space:]]\+\([[:alpha:]]\+\).*$/\1\n/' ./User.h
^
> [no output]
>
>
> thank you
> Szyk


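Missing /p at the end. With -n, sed prints only what you explicitly tell it
to print, so the s command needs the p flag: 's/.../\1/p'.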
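
A minimal sketch of that fix, assuming GNU sed (for \+) and a few invented sample lines standing in for User.h. The \n in the replacement is left out here so that no blank line follows each name.

```shell
# Hypothetical sample lines standing in for ./User.h.
sample=$(printf '%s\n' \
  '#include <string>' \
  'class RoomState' \
  '{' \
  '};' \
  'class UserData : public UserBase')

# -n suppresses automatic printing; the trailing p flag prints only
# the lines on which the substitution succeeded.
names=$(printf '%s\n' "$sample" |
  sed -n 's/^class[[:space:]]\+\([[:alpha:]]\+\).*$/\1/p')

printf '%s\n' "$names"
```

Without the p flag the same command prints nothing at all, which matches the "[no output]" reported above.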

Alexandre Aguiar

Feb 3, 2012, 7:42:03 PM
On Fri 03 Feb 2012 18:56, Szyk said to people at <alt.comp.linux>:
> And a similar one, but with a simple transformation:
> $ sed -n 's/^class[[:space:]]\+\([[:alpha:]]\+\).*$/\1\n/' ./User.h

Not sure why this doesn't do what you want, but it is definitely doing
what you told it to do. :-)

However, your method assumes too much about code formatting, and that may
frustrate future use of your script. Below is how I would do it.

cat ./somefile.h | \
# do not neglect lines with leading whitespace; it can be
# added by automatic indentation or intentionally.
# keep only lines that (really) start with the word 'class'
grep -E '^[[:space:]]*class[[:space:]]' | \
# now print the word 'class' and the class name, no matter
# how much space exists between them
awk '{ print $1 " " $2 }'

This works with any combination of leading spaces and tabs, and with any
combination of them between words. It will *not* work when the class name
is not on the same line as 'class'. Two sed scripts can resolve this.

cat ./somefile.h | \
sed -e 's/^[ \t]*//;s/[ \t]*$//;/^$/d' | \
sed -e :a -e '/class$/N; s/\n/ /; ta' | \
awk '$1 == "class" { print $1 " " $2 }'

The sed scripts above do the following:
1. delete all leading and trailing whitespace;
2. delete all empty lines;
3. if a line ends with 'class', append a space and the next line to it.
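
As a rough illustration of these three steps, here is a small run on invented input where the class name sits on the line after 'class'. It assumes GNU sed, which understands \t inside brackets and \n in the pattern. The input lines are hypothetical.

```shell
# Hypothetical input: indented, with an empty line, and with the
# class name on the line after the keyword.
input=$(printf '%s\n' '  class' '    Foo : public Bar {' '' '  int x;  ')

joined=$(printf '%s\n' "$input" |
  # steps 1 and 2: trim leading/trailing blanks, drop empty lines
  sed -e 's/^[ \t]*//;s/[ \t]*$//;/^$/d' |
  # step 3: when a line ends with 'class', pull in the next line
  # and turn the embedded newline into a space
  sed -e :a -e '/class$/N; s/\n/ /; ta')

printf '%s\n' "$joined"
```

The first two input lines come out as the single line 'class Foo : public Bar {', so awk can then find the name in $2.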

A class may also be declared on the same line where another statement
ends. In fact, a commonly seen obfuscation technique is to pack a large
number of statements into long lines. This can be handled simply by
replacing every ';' with '\n' just before the removal of empty lines. It
also strips the trailing ';' that can follow the class name, as in
'class X;', so that it does not end up in the final output.

cat ./somefile.h | \
sed 's/;/\n/g' | \
sed -e 's/^[ \t]*//;s/[ \t]*$//;/^$/d' | \
sed -e :a -e '/class$/N; s/\n/ /; ta' | \
awk '$1 == "class" { print $1 " " $2 }'
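
A sketch of how this behaves on two invented, deliberately crammed lines, assuming GNU sed (which accepts \n in the replacement). Here the awk step is written so that it keeps only lines whose first word is 'class'. The input is hypothetical.

```shell
# Hypothetical obfuscated input: several statements per line, and a
# class name split from its keyword by a line break.
input=$(printf '%s\n' \
  'int a; class X; struct Y { int b; }; class' \
  'Z : public X {};')

decls=$(printf '%s\n' "$input" |
  sed 's/;/\n/g' |
  sed -e 's/^[ \t]*//;s/[ \t]*$//;/^$/d' |
  sed -e :a -e '/class$/N; s/\n/ /; ta' |
  awk '$1 == "class" { print $1 " " $2 }')

printf '%s\n' "$decls"
```

Both 'class X' and 'class Z' are reported, and the ';' after X is gone because it was turned into a line break first.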

Note that you can use this same script to extract any other kind of type
declaration. All you have to do is substitute the desired keyword for
'class' wherever it appears in the pipeline.

Even the most bizarrely obfuscated code won't resist confessing all of its
type declarations. :-)

HTH.

Please let me know if you find a bug or a case that is not covered.

--

Alexandre

Szyk

Feb 4, 2012, 9:18:18 AM
Thank you all!
I found the solution to my problem: grep was needed, and sed must be
called without -n!

$ grep -i -E "^class[[:space:]]+[[:alnum:]_]+.*$" ./User.h | \
  sed 's/^class[[:space:]]\+\([[:alnum:]_]\+\).*$/\1/'
Room_State
UserBase
UserData
User

Alexandre Aguiar

Feb 5, 2012, 8:07:10 AM
On Sat 04 Feb 2012 12:18, Szyk said to people at <alt.comp.linux>:

> I found the solution to my problem: grep was needed, and sed must be
> called without -n!

Great.

> $ grep -i -E "^class[[:space:]]+[[:alnum:]_]+.*$" ./User.h | sed

Since C and C++ are case-sensitive, you probably do not need grep's '-i'
option.
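
A small sketch of what '-i' can let through, assuming GNU grep and sed. The 'CLASS Legacy' line is an invented example, for instance a macro. Since sed still matches only lowercase 'class' and runs without -n, such a line is printed unchanged.

```shell
# Hypothetical input: one real declaration and one upper-case look-alike.
input=$(printf '%s\n' 'class Room_State' 'CLASS Legacy')

# Shared sed step: strip everything but the name after lowercase 'class'.
extract() {
  sed 's/^class[[:space:]]\+\([[:alnum:]_]\+\).*$/\1/'
}

with_i=$(printf '%s\n' "$input" |
  grep -i -E '^class[[:space:]]+[[:alnum:]_]+.*$' | extract)
without_i=$(printf '%s\n' "$input" |
  grep -E '^class[[:space:]]+[[:alnum:]_]+.*$' | extract)

printf 'with -i:\n%s\nwithout -i:\n%s\n' "$with_i" "$without_i"
```

With '-i' the output also contains the whole line 'CLASS Legacy'; without it only 'Room_State' is printed.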

--

Alexandre
