How to get Balance Tags to include the tags

DaveHein

Aug 23, 2011, 7:25:09 PM
to BBEdit Talk
I have a lot of HTML files that have do-nothing <span> blocks in them.
I'd like to select all the text -- including the opening and closing
tags -- and then strip HTML. Actually I'd like to just strip the
opening and closing span tags, and leave what's inside them alone.

The problem I'm running into is that Balance Tags will select the
inner HTML but not the span tags themselves. So if I put the cursor
somewhere on or in "<span>some normal text here</span>" and did a
Cmd-B, the "some normal text here" would be selected, but the opening
"<span>" and closing "</span>" would not be selected.

I cannot see any way to get the tags that delimit the selected text to
be selected as well.

Any ideas?


NOTE: what I really want to do is just click on the opening <span> tag
and have a command that will remove that tag (along with its closing
tag), leaving the inner text alone.

Matthew Schinckel

Aug 23, 2011, 11:53:12 PM
to bbe...@googlegroups.com
Will there be nested tags?

i.e. <span>foo<span>bar</span>baz</span>

If not, a script or macro that starts at the current insertion point, backtracks until it finds an opening <span> tag, removes it, and then moves forward until it finds a </span> and removes that should do the trick.

Matt.
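
The backtrack-then-forward idea above can be sketched outside BBEdit. This is a minimal Python illustration, not a BBEdit script: the function name strip_enclosing_span and the cursor argument (a 0-based index into the text) are hypothetical, and it assumes attribute-less, non-nested <span> tags with the cursor inside the span's content.

```python
def strip_enclosing_span(text: str, cursor: int) -> str:
    """Remove the <span>...</span> pair around `cursor`, keeping the inner text.

    Assumes spans are not nested and carry no attributes.
    """
    open_tag, close_tag = "<span>", "</span>"

    # Backtrack: last "<span>" that ends at or before the cursor.
    start = text.rfind(open_tag, 0, cursor)
    # Move forward: first "</span>" that starts at or after the cursor.
    end = text.find(close_tag, cursor)
    if start == -1 or end == -1:
        return text  # no enclosing pair found; leave the text alone

    # A "</span>" between the opening tag and the cursor means the cursor
    # sits after a finished span, not inside one.
    if text.find(close_tag, start, cursor) != -1:
        return text

    inner = text[start + len(open_tag):end]
    return text[:start] + inner + text[end + len(close_tag):]
```

With nested spans the forward search would stop at the inner </span>, which is why the question about nesting matters.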

Matthew Schinckel

Aug 24, 2011, 12:19:50 AM
to bbe...@googlegroups.com
Here is an AppleScript that will do most of what you want:

This will look at the current cursor position, and use the 'balance tags' command to select the content of the current tag.

It will then extend that selection to contain the <span> and </span>, and if the enclosing tag is indeed a <span> tag, it will remove those tags.

It will not affect any nested tags. I wasn't able to find a scriptable version of the Remove Markup menu item.

Feel free to use this as you see fit.

tell application "BBEdit"
    tell front window
        -- remember where the insertion point started
        set cursorPos to characterOffset of selection
        -- select the content of the enclosing tag
        balance tags
        set startPos to characterOffset of selection
        set endPos to startPos + (length of selection)
        -- widen the selection by the 6 characters of "<span>" and the 7 of "</span>"
        select (characters (startPos - 6) thru (endPos + 6))
        set selectedText to selection as text
        if characters 1 thru 6 of selectedText as text is equal to "<span>" then
            -- replace the whole element with just its content
            set replaceText to characters startPos thru (endPos - 1) as text
            set selection to replaceText
        end if
        select insertion point before character (cursorPos - 6)
    end tell
end tell


Prachi Gauriar

Aug 24, 2011, 12:13:48 AM
to BBEdit Talk
On Aug 23, 7:25 pm, DaveHein <dhein.li...@freshthought.com> wrote:
> I have a lot of HTML files that have do-nothing <span> blocks in them.
> I'd like to select all the text -- including the opening and closing
> tags -- and then strip HTML. Actually I'd like to just strip the
> opening and closing span tags, and leave what's inside them alone.

Why not do a Grep search/replace?

Search: <span.*?>(.*?)</span>
Replace: \1

-Prachi
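
To see what that pattern does on flat and nested input, here is a small sketch using Python's re module as a stand-in for BBEdit's grep engine (an assumption: the two engines are close for this pattern but not identical). The helper name strip_spans_with_grep and the sample strings are made up for illustration.

```python
import re

# Same pattern as the suggested Search field; \1 is the Replace field.
SPAN_RE = re.compile(r"<span.*?>(.*?)</span>")


def strip_spans_with_grep(html: str) -> str:
    """Apply the search/replace once over the whole string."""
    return SPAN_RE.sub(r"\1", html)


# Flat spans: every pair is removed cleanly.
flat = "<p><span>one</span> and <span>two</span></p>"
flat_result = strip_spans_with_grep(flat)

# Nested spans: the outer <span> is paired with the inner </span>,
# so "baz" ends up inside the span that was meant to be kept.
nested = '<span>foo<span class="keep">bar</span>baz</span>'
nested_result = strip_spans_with_grep(nested)
```

Here flat_result is "<p>one and two</p>", while nested_result is 'foo<span class="keep">barbaz</span>'. Note also that <span.*?> matches spans that carry attributes, so a standalone <span class="keep"> would be stripped too.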

Matthew Schinckel

Aug 24, 2011, 12:29:20 AM
to bbe...@googlegroups.com
Whoops: bit of a bug in that last one. If you weren't in a tag, or weren't in a span tag, it moves the insertion point.

Also note that it will not re-select text if you had a selection.

This one fixes the first bug, but not the second.

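tell application "BBEdit"
    tell front window
        -- remember where the insertion point started
        set cursorPos to characterOffset of selection
        -- select the content of the enclosing tag
        balance tags
        set startPos to characterOffset of selection
        set endPos to startPos + (length of selection)
        -- widen the selection by the 6 characters of "<span>" and the 7 of "</span>"
        select (characters (startPos - 6) thru (endPos + 6))
        set selectedText to selection as text
        if characters 1 thru 6 of selectedText as text is equal to "<span>" then
            -- replace the whole element with just its content
            set replaceText to characters startPos thru (endPos - 1) as text
            set selection to replaceText
            -- the opening tag is gone, so the original position shifts left by 6
            select insertion point before character (cursorPos - 6)
        else
            -- not a <span>: put the insertion point back where it was
            select insertion point before character (cursorPos)
        end if
    end tell
end tell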

Roland Küffner

Aug 24, 2011, 1:51:14 AM
to bbe...@googlegroups.com
Hi,
On 24.08.2011 at 01:25, DaveHein wrote:

> The problem I'm running into is that Balance Tags will select the
> inner HTML but not the span tags themselves. So if I put the cursor
> somewhere on or in "<span>some normal text here</span>" and did a
> Cmd-B, the "some normal text here" would be selected, but the opening
> "<span>" and closing "</span>" would not be selected.
>
> I cannot see any way to get the tags that delimit the selected text to
> be selected as well.

The following script does exactly what you want. I think it came out of a similar discussion on this list years ago. I didn't write it myself, and unfortunately I don't know who should be credited for it:


tell application "BBEdit"
    if (balance tags) then
        set x to characterOffset of selection
        set y to x + (length of selection)
        inside tag start range (x - 2) end range (x - 2)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)
        set x to x - tagLength - 1
        inside tag start range (y + 1) end range (y + 1)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)
        set y to y + tagLength
        select characters x thru y of window 1
    else
        beep -- script beeps if it could not create an initial balance
    end if
end tell


happy balancing,
Roland

Dave Hein

Aug 27, 2011, 12:43:19 PM
to bbe...@googlegroups.com
OK, with help from Roland Küffner and Christopher Stone, I've been able to create the script I need. More on that in a bit.

There have been a number of suggestions that find/replace with grep should work. That only works well if there are no nested spans. In my files, some inner spans need to be kept: those with class attributes, as opposed to span elements with no attributes. Matching opening and closing tags is difficult with nested elements, so a script seems like the safest approach.
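
As an illustration of why nesting needs tag matching rather than a single pattern, here is a minimal Python sketch (not the BBEdit script that follows) that walks the tags with a stack and drops only attribute-less spans together with their own closing tags. The name strip_bare_spans is hypothetical, and the sketch assumes there are no self-closing <span/> tags.

```python
import re

# Matches any opening <span ...> or closing </span> tag.
TAG_RE = re.compile(r"<span\b[^>]*>|</span\s*>", re.IGNORECASE)
# An opening tag with no attributes at all.
BARE_OPEN_RE = re.compile(r"<span\s*>", re.IGNORECASE)


def strip_bare_spans(html: str) -> str:
    """Drop attribute-less <span> tags and their matching </span> tags."""
    out = []
    stack = []  # one flag per open span: True if its opening tag was dropped
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])  # text between tags is always kept
        pos = m.end()
        tag = m.group(0)
        if tag.startswith("</"):
            # Pair this close with the most recent open span.
            dropped = stack.pop() if stack else False
            if not dropped:
                out.append(tag)
        else:
            bare = BARE_OPEN_RE.fullmatch(tag) is not None
            stack.append(bare)
            if not bare:
                out.append(tag)
    out.append(html[pos:])
    return "".join(out)
```

For example, '<span>foo<span class="keep">bar</span>baz</span>' becomes 'foo<span class="keep">bar</span>baz': the inner span keeps both of its tags, and only the outer pair is removed.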

So here is what I did. First, I started with the script from Roland that selected the outer tags of a balanced tag selection. I'll repeat that original script here:

tell application "BBEdit"
    if (balance tags) then
        set x to characterOffset of selection
        set y to x + (length of selection)
        inside tag start range (x - 2) end range (x - 2)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)
        set x to x - tagLength - 1
        inside tag start range (y + 1) end range (y + 1)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)
        set y to y + tagLength
        select characters x thru y of window 1
    else
        beep -- script beeps if it could not create an initial balance
    end if
end tell

Then I modified it to remove the outer tags:

tell application "BBEdit"
    if (balance tags) then
        set x to characterOffset of selection
        set y to x + (length of selection)
        inside tag start range (x - 2) end range (x - 2)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)

        set xOpen to x - tagLength - 1
        set lenOpen to tagLength
        set yOpen to xOpen + lenOpen

        inside tag start range (y + 1) end range (y + 1)
        set tagLength to (end_offset of tag of result) - (start_offset of tag of result)

        set xClose to y
        set lenClose to tagLength
        set yClose to xClose + lenClose
        -- set insertion point before character xClose
        tell window 1 to select (characters xClose thru yClose)
        tell window 1 to delete selection
        tell window 1 to select (characters xOpen thru yOpen)
        tell window 1 to delete selection
    else
        beep -- script beeps if it could not create an initial balance
    end if
end tell

Then I made it specific to a <span> element:

tell application "BBEdit"
    set origInsPt to characterOffset of selection

    if (balance tags) then
        set x to characterOffset of selection
        set y to x + (length of selection)
        inside tag start range (x - 2) end range (x - 2)

        set t to tag of result
        if ("span" is equal to name of t) then
            set tagLength to (end_offset of t) - (start_offset of t)
            set xOpen to x - tagLength - 1
            set lenOpen to tagLength
            set yOpen to xOpen + lenOpen

            inside tag start range (y + 1) end range (y + 1)
            set tagLength to (end_offset of tag of result) - (start_offset of tag of result)

            set xClose to y
            set lenClose to tagLength
            set yClose to xClose + lenClose
            -- remove the closing tag first so the opening tag's offsets stay valid
            tell window 1 to select (characters xClose thru yClose)
            tell window 1 to delete selection
            tell window 1 to select (characters xOpen thru yOpen)
            tell window 1 to delete selection
        else
            tell window 1 to select (insertion point before character origInsPt)
            beep -- script beeps if the outer element is not a <span>
        end if
    else
        beep -- script beeps if it could not create an initial balance
    end if
end tell

Then I used a script from Chris to create the wrapper code that finds an empty <span> element (one with no attributes) and processes it; I added a loop that repeats until no empty <span> elements remain:

tell application "BBEdit"
    set keepGoing to true
    repeat while keepGoing = true
        tell text of window 1
            set fndRsltStart to find "<span>" options ¬
                {search mode:grep, case sensitive:false, starting at top:true} with selecting match
        end tell
        set keepGoing to found of fndRsltStart
        if keepGoing = true then
            set z to characterOffset of selection
            tell window 1 to select insertion point before character (z + 2)

            if (balance tags) then
                set x to characterOffset of selection
                set y to x + (length of selection)
                inside tag start range (x - 2) end range (x - 2)

                set t to tag of result
                if ("span" is equal to name of t) then
                    set tagLength to (end_offset of t) - (start_offset of t)
                    set xOpen to x - tagLength - 1
                    set lenOpen to tagLength
                    set yOpen to xOpen + lenOpen

                    inside tag start range (y + 1) end range (y + 1)
                    set tagLength to (end_offset of tag of result) - (start_offset of tag of result)

                    set xClose to y
                    set lenClose to tagLength
                    set yClose to xClose + lenClose
                    -- now remove the closing and opening element tags
                    tell window 1 to select (characters xClose thru yClose)
                    tell window 1 to delete selection
                    tell window 1 to select (characters xOpen thru yOpen)
                    tell window 1 to delete selection
                else
                    -- origInsPt is not defined in this version, so stop here instead of
                    -- restoring the insertion point; otherwise the same <span> would be
                    -- found again on every pass
                    set keepGoing to false
                end if
            else
                -- could not balance; stop rather than find the same <span> forever
                set keepGoing to false
            end if
        end if
    end repeat
end tell

That is it!! I assigned a shortcut key to that script and am ripping through the nearly 100 files that need to be cleaned up. Very cool.

Now, there could be more to be done here. I probably need an "on error" handler (Chris had one, but I didn't keep it). And I might need to be referencing the current document with something other than "window 1" ... or maybe check that window 1 is actually a text window. And I might want a "beep" if the script did no work.

I'll probably add some of that cleanup later (I anticipate needing this script quite a bit). For now, I'm quite happy: lots of tedium avoided.

Thanks to everyone who contributed suggestions. All were appreciated.

Awesome response folks!

--
Dave Hein
