Search for CR, LF & TAB in TSE 2.5 for DOS?


Clueless in Seattle

Apr 5, 2013, 11:15:49 AM
to SemWare TSE Pro text editor
I'd like to write a little macro I could use to regularize the
formatting of text in three decades' worth of old text files going back
to my WordStar days in CP/M in the 80s.

So I suspect I'm going to have to figure out how to search for,
replace, delete or insert the ASCII characters for carriage returns,
line feeds and tabs. Does that make sense?

Just this morning I finally figured out how to make the -b-3 display
mode work (sort of...see my post in the thread http://tinyurl.com/Further-on-Binary)
so I can now see the CR/LF characters.

One of them appears as a musical note, and the other as a Space
Odyssey monolith with an oval embedded in it.

So which one is the CR and which one is the LF? May I assume that
they appear in that order, i.e., the first one is the CR and the
second one is the LF?

And, more to the point, how do I enter them into the search field, and
how would I enter them in a macro so that I could have the macro
automatically enter them into existing text.

Oh yeah, one more thing: What does the tab character look like in the
-b-3 mode?

Will in Seattle
a.k.a. "Clueless"
Running TSE 2.5 for DOS
Under MS-DOS 6.21

knud van eeden

Apr 5, 2013, 11:45:42 AM
to sem...@googlegroups.com
> So I suspect I'm going to have to figure out how to search for,
> replace, delete or insert the ASCII characters for carriage returns,
> line feeds and tabs.  Does that make sense?

Carriage returns and line feeds are generally used by TSE as line separators
(so they are typically located at the end of a line).
So, in general, do not search for and replace them.

===

> So which one is the CR and which one is the LF?  

CR is the 'music' symbol,
(almost always) followed by the other linefeed symbol.

===

Run the TSE macro (located in the ..\mac directory of TSE) called

 ascii.mac

And yes, also in v2.5.

===


> May I assume that
> they appear in that order, i.e., the first one is the CR and the
> second one is the LF?

Correct

===


> And, more to the point, how do I enter them into the search field, and
> how would I enter them in a macro so that I could have the macro
> automatically enter them into existing text.

E.g. this will replace 'horizontal tab' ASCII 9 with CR (=ascii 13) then LF (=ascii 10).

LReplace( "\d009", "\d013\d010", "glx" )


E.g. this will replace 'horizontal tab' ASCII 9 with a space (=ascii 32)

LReplace( "\d009", "\d032", "glx" )


E.g. this will replace 'horizontal tab' ASCII 9 with a character "A" (=ascii 65)

LReplace( "\d009", "\d065", "glx" )

===

In general, use \d followed by the ASCII value in decimal, always 3 digits (add zeroes in front if needed).
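
The three-digit padding rule can be checked mechanically. The sketch below is Python, not TSE macro code, and the helper name tse_decimal_escape is made up for illustration; it only formats a character code the way the \d notation described in this thread expects.

```python
def tse_decimal_escape(code: int) -> str:
    """Return the backslash-d escape text for one character code (hypothetical helper)."""
    # The thread states that values run from 0 to 255.
    if not 0 <= code <= 255:
        raise ValueError(f"character code out of range: {code}")
    # Always three digits, padded with leading zeroes.
    return f"\\d{code:03d}"


print(tse_decimal_escape(9))    # \d009  (horizontal tab)
print(tse_decimal_escape(13))   # \d013  (carriage return)
print(tse_decimal_escape(65))   # \d065  (the letter A)
```

The printed text is what would be typed into the search field, or placed between quotes inside a macro.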

---

> Oh yeah, one more thing:  What does the tab character look like in the
> -b-3 mode?

Like a donut shape.

Run the ascii.mac macro and look at the top of the list; there you will see 'horizontal tab' and its symbol.

with friendly greetings,
Knud van Eeden
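
For reference, the three symbols described in this thread match the glyphs that the IBM PC code page 437 character set assigns to these control codes. The lookup table below is a Python sketch, not TSE code, and its names (CP437_CONTROL_GLYPHS, describe) are hypothetical. It assumes the DOS screen uses the standard code page 437 font, and it prints the Unicode name of each glyph rather than the glyph itself so that it runs on any console.

```python
import unicodedata

# Assumption: standard code page 437 screen font, as on a typical MS-DOS machine.
CP437_CONTROL_GLYPHS = {
    9: ("TAB", "\u25cb"),   # white circle: the "donut"
    10: ("LF", "\u25d9"),   # inverse white circle: the "monolith with an oval"
    13: ("CR", "\u266a"),   # eighth note: the "musical note"
}


def describe(code: int) -> str:
    """Summarize one control code: its name, its search escape, and its glyph name."""
    name, glyph = CP437_CONTROL_GLYPHS[code]
    return (f"{name}: ASCII {code}, search escape \\d{code:03d}, "
            f"glyph {unicodedata.name(glyph)}")


for code in sorted(CP437_CONTROL_GLYPHS):
    print(describe(code))
```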

knud van eeden

Apr 5, 2013, 11:48:03 AM
to sem...@googlegroups.com
In the previous e-mail:

"glx" means that you first have to *highlight the block in which you want to do the search and replace*, and after that start the replace.

===

To search and replace in the whole file, use e.g. this instead:

"gx"

Clueless in Seattle

Apr 7, 2013, 9:40:07 AM
to SemWare TSE Pro text editor


On Apr 5, 8:45 am, knud van eeden <knud_van_ee...@yahoo.com> wrote:

> Run the TSE macro (located in the ..\mac directory of TSE) called
>
>  ascii.mac

Thanks, Ludo! That chart is a great help.

Now I can try to figure out how to write a little macro that will
search for paragraphs without blank lines between them, and insert
blank lines.

I think the trick will be to find a way to distinguish between
paragraph breaks that have blank lines between them, and those that
don't.

So would I be right in assuming that paragraphs with line breaks
between them would be preceded by this sequence:

CRLF CRLF TAB

Whereas paragraphs with no line break, but only a Tab at the
beginning, would be preceded by just a single CRLF, like this:

CRLF TAB

If so, then perhaps the simplest way to solve the problem of
distinguishing between those two kinds of paragraph breaks would be to
simply eliminate the problem altogether by first going through the
file and replacing all CRLF CRLF TAB sequences with just a CRLF TAB
sequence. Then I could go through the file a second time and replace
all CRLF TAB sequences with CRLF CRLF TAB.

That way I could get the job done just using TSE's built-in Find-and-
Replace command and wouldn't have to figure out how to test each
paragraph break with an IF/ELSE routine, which I haven't a clue how to
write.
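
The two-pass idea can be tried out on raw bytes before committing to it. The following is a Python sketch, not a TSE macro, and the function name is made up for illustration. It assumes every paragraph starts with a tab and that line endings are CRLF; whether TSE's own Find-and-Replace can match across line ends in the same way is not confirmed in this thread.

```python
CRLF = b"\r\n"
TAB = b"\t"


def add_blank_lines_between_paragraphs(data: bytes) -> bytes:
    """Two-pass replace: collapse existing blank lines, then add one everywhere."""
    # Pass 1: CRLF CRLF TAB becomes CRLF TAB, so every break looks the same.
    collapsed = data.replace(CRLF + CRLF + TAB, CRLF + TAB)
    # Pass 2: CRLF TAB becomes CRLF CRLF TAB, inserting the blank line.
    return collapsed.replace(CRLF + TAB, CRLF + CRLF + TAB)


sample = b"\tone\r\n\ttwo\r\n\r\n\tthree\r\n"
print(add_blank_lines_between_paragraphs(sample))
```

Both kinds of break in the sample come out as CRLF CRLF TAB. A break that already has two or more blank lines keeps them, since each pass touches only one CRLF next to the tab.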

Clueless in Seattle

Apr 7, 2013, 8:10:04 PM
to SemWare TSE Pro text editor
I'm reading the instructions on page 120 of my TSE 2 for DOS User's
Guide where it tells how to search for what it calls "regular
expressions."

I've got a test file opened in binary mode, and am trying to search
for the CRs in the file.

I enter search pattern \f and then "x" but I get the error:

\f not found.

What am I doing wrong?

Will in Seattle
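
One possible explanation, offered as an assumption about TSE 2.5 rather than something confirmed in this thread: in the C-style escape convention that many regular-expression dialects follow, \f stands for form feed (ASCII 12), not carriage return (ASCII 13). A file with no form feeds in it would then report "not found". The short Python check below only illustrates that convention; it is not TSE code.

```python
# C-style escapes and the character codes they stand for.
FORM_FEED = "\f"
CARRIAGE_RETURN = "\r"
LINE_FEED = "\n"
TAB = "\t"

print(ord(FORM_FEED), ord(CARRIAGE_RETURN), ord(LINE_FEED), ord(TAB))  # 12 13 10 9

# A DOS-style sample with CRLF line ends and no form feeds.
sample = "line one\r\nline two\r\n"
print(FORM_FEED in sample)            # False: nothing for \f to match
print(sample.count(CARRIAGE_RETURN))  # 2: one CR per line end
```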

knud van eeden

Apr 8, 2013, 5:05:35 AM
to sem...@googlegroups.com
The 'd' in '\d' stands for 'd'ecimal. 

Thus the ASCII value for that character.

E.g. 009 is ASCII 009 or thus 9 or thus horizontal tab.

E.g. 032 is ASCII 032 or thus 32 or thus space.

E.g. 065 is ASCII 065 or thus 'A' 

and so on.

In general the ASCII value can be between 0 and 255.

Always use 3 digits (possibly add zeroes in front).

You must put it between quotes.

E.g.

\d9 is no good.
"\d009" is good.

E.g.
\d32 is no good.
"\d032" is good.

E.g.
\d65 is no good.
"\d065" is good.

===

E.g. this will replace in the whole file 'horizontal tab' ASCII 9 with "whatever".

proc main()
 LReplace( "\d009", "whatever", "gx" )
end

===

E.g. this will replace in the whole file 'horizontal tab' ASCII 9 with CR (=ascii 13) then LF (=ascii 10).

proc main()
 LReplace( "\d009", "\d013\d010", "gx" )
end


E.g. this will replace 'horizontal tab' ASCII 9 in the whole file with a space (=ascii 32)

proc main()
 LReplace( "\d009", "\d032", "gx" )
end


E.g. this will replace 'horizontal tab' ASCII 9 in the whole file with a character "A" (=ascii 65)

proc main()
 LReplace( "\d009", "\d065", "gx" )
end

===

Similar from the search/replace:



Clueless in Seattle

Apr 8, 2013, 10:28:25 AM
to SemWare TSE Pro text editor
Hi again, Knud!

I'm afraid I'm having trouble following your instructions, so, as we
say "There's more than one way to skin a cat."

I once had WordStar installed on my little DOS laptop, but recently
deleted it, thinking that I wouldn't be needing it anymore, now that
I'm using TSE.

But WordStar had a simple and easy way to find and replace CRLFs and
Tabs using control characters in the search strings.

So I think I'll try to put WS back onto my laptop and try using it to
manually reformat my text files. This might be the simplest solution
for this immediate problem.

However, I would still like to try to eventually get a handle on how
to search for sequences of CRLFs and tabs in TSE.

For example, what would I enter into the TSE search field to search
for this sequence: CRLF CRLF Tab?

I'm afraid my aging and illness-related cognitive decline is making it
more and more difficult for me to comprehend abstract explanations.
(And my abstract cognition was never all that great to begin with: I
failed 9th grade algebra).

So I need to break down problems into small steps, and then try to
tackle them one at a time.

I hope you don't find my mental slowness to be too exasperating,

knud van eeden

Apr 8, 2013, 10:53:45 AM
to sem...@googlegroups.com
> CRLF CRLF Tab?

CR = \d013
LF = \d010
Tab = \d009

CRLF Tab =

\d013\d010\d009

CRLF CRLF Tab =

\d013\d010\d013\d010\d009

Search for e.g. using "gx"

===

Important, when using usual TSE menu search/replace (thus not using a macro) 
do not *use* the quotes. There one should use

search for:

 \d013\d010\d009

replace with:

 whatever

options

 gnx

===

But in TSE macros you will have to use the quotes.

proc main()
 LReplace( "\d013\d010\d009", "whatever", "gxn" )
end
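
To see which bytes a \d pattern actually stands for, the escapes can be expanded outside the editor. This is a Python sketch, not TSE code, and decode_decimal_escapes is a hypothetical helper name. It simply applies the CR = \d013, LF = \d010, Tab = \d009 definitions given above, with the escape for CRLF repeated to cover the CRLF CRLF Tab case that was asked about.

```python
import re


def decode_decimal_escapes(pattern: str) -> bytes:
    """Expand each backslash-d three-digit escape into the single byte it names."""
    def expand(match: re.Match) -> str:
        code = int(match.group(1))
        if code > 255:
            raise ValueError(f"character code out of range: {code}")
        return chr(code)

    # Text outside the escapes is kept unchanged.
    return re.sub(r"\\d(\d{3})", expand, pattern).encode("latin-1")


crlf_tab = decode_decimal_escapes(r"\d013\d010\d009")
crlf_crlf_tab = decode_decimal_escapes(r"\d013\d010\d013\d010\d009")

sample = b"first paragraph\r\n\r\n\tsecond paragraph"
print(crlf_tab)                    # b'\r\n\t'
print(sample.find(crlf_crlf_tab))  # 15: the blank-line break starts here
print(sample.find(crlf_tab))       # 17: the shorter pattern matches two bytes later
```

The shorter pattern matches inside the longer one, which is why the order of the replacements matters when both kinds of paragraph break occur in one file.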