HTML files, particularly those generated by software, can have
very long lines. I've been playing with such a file that is 29 Mbytes
in length and has numerous lines that exceed the 32000 byte limit
that this version of TSE seems to have (TSE - Linux v4.40.79).
BTW the TSE help says under
>Basic Concepts >Introduction >The Editor Features
"Edit lines up to 16,000 characters in length"
If one loads a file with long lines in TSE, it simply breaks each
such line at 32000 bytes, often in the middle of a string that
should not be broken.
My file has basically a very repetitive format:
<h2 ... </h2> sequences followed by one or more
<tr> ... </tr> sequences
where ... represents up to about 300 characters of other text.
There are some other HTML tags etc., but the above is the vast
majority of the file. It is the sequences of <tr> ... </tr>
that get to be greater than 32000 bytes. What I'd like to do is
have each <tr> ... </tr> sequence be a separate line.
Is there a way to add newlines with TSE BEFORE long lines are
arbitrarily broken at 32000 bytes? Or does anyone have a
reliable macro or algorithm for breaking up long lines at
selected places AND putting lines back together at breaks?
If "Remove Trailing Whitespace" is on when long lines are broken up,
the first part of a broken line may not be 32000 bytes long if there
happened to be spaces at that point.
Given that, is there another way to know where lines were broken?
Turn "Remove Trailing Whitespace" off before loading long lines?
What if there happened to be a line exactly 32000 bytes long?
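One thing that could be done before loading is to list which lines are at or over the limit, so I would at least know where TSE is going to break them. A rough Python sketch of that check, run outside TSE; the 32000 figure is only what I observed, and the name `long_lines` is made up for illustration:

```python
LIMIT = 32000  # break point observed in TSE; an assumption, not a documented value


def long_lines(data: bytes, limit: int = LIMIT):
    """Return (line_number, length) for every line of at least `limit` bytes."""
    hits = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        length = len(line.rstrip(b"\r"))  # ignore a trailing CR from DOS line ends
        if length >= limit:
            hits.append((number, length))
    return hits
```

Something like `long_lines(open("big.html", "rb").read())` would give the list for a whole file (the file name is a placeholder). A line of exactly 32000 bytes is reported too, which covers the ambiguous case.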
It seems like breaking lines at selected points before they are
broken at 32000 bytes would be a lot cleaner. Would doing it with a
stream editor like sed be the best approach?
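To make the idea concrete, here is a minimal Python sketch of the kind of pre-processing I have in mind, done before the file ever reaches TSE. It puts a newline after every `</tr>` and `</h2>` that is not already followed by one. The name `break_at_rows` is hypothetical, and it assumes the closing tags are written exactly as `</tr>` and `</h2>` (any letter case) with no spaces inside them:

```python
import re

# A closing </tr> or </h2> that is not already followed by a line end
# and is not the very last thing in the text.
_CLOSE_TAG = re.compile(r"(</(?:tr|h2)>)(?!\r?\n|\Z)", re.IGNORECASE)


def break_at_rows(text: str) -> str:
    """Insert a newline after each </tr> and </h2> that lacks one."""
    return _CLOSE_TAG.sub(r"\1\n", text)
```

Running it a second time changes nothing, so it should be safe to re-apply. Since each row is only about 300 characters, every resulting line should stay far below 32000 bytes, provided the file really follows the format described above.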
Fred
--
Fred H. Olson  Minneapolis,MN 55411  USA  (near north Mpls)
Email: fholson at cohousing.org  612-588-9532
My Link Pg: http://fholson.cohousing.org
My org: Communications for Justice -- Free, superior listserv's w/o ads