Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Stripping HTML tags from excel

4,366 views
Skip to first unread message

maunder

unread,
Jan 20, 2008, 3:21:55 AM1/20/08
to
Does anyone know how to strip html tags from excel worksheets?
I regularly use a worksheet, (generated by IT from the web), that
creates an excel sheet. The problem is, the html tags remain, making
it very difficult to read. A sample is below:-

<ul> <p>Unable to log as showing up too earlyin workstream.</p></ul>


I have tried using search and replace, but this does not delete all
the tags. For instance, when it encounters a <div> tag, it generates
the error, 'formula is too long'

I would like to be able to strip ALL tags from the sheet using a
macro. Can anyone help?

Thanks,

Karl


Rick Rothstein (MVP - VB)

unread,
Jan 20, 2008, 4:35:50 AM1/20/08
to
Start Excel's Replace (Edit/Replace or Ctrl+H) and put <*> (that is 3
characters, a less-than symbol, an asterisk, and a greater-than symbol) for
your "Find what" condition and leave the "Replace with" field empty.

Rick


"maunder" <kema...@googlemail.com> wrote in message
news:c64267de-c1de-4c9b...@c4g2000hsg.googlegroups.com...

maunder

unread,
Jan 22, 2008, 9:04:22 AM1/22/08
to
On Jan 20, 4:35 am, "Rick Rothstein \(MVP - VB\)"

<rickNOSPAMn...@NOSPAMcomcast.net> wrote:
> Start Excel's Replace (Edit/Replace or Ctrl+H) and put <*> (that is 3
> characters, a less-than symbol, an asterisk, and a greater-than symbol) for
> your "Find what" condition and leave the "Replace with" field empty.
>
> Rick
>
> "maunder" <kemars...@googlemail.com> wrote in message

>
> news:c64267de-c1de-4c9b...@c4g2000hsg.googlegroups.com...
>
>
>
> > Does anyone know how to strip html tags from excel worksheets?
> > I regularly use a worksheet, (generated by IT from the web), that
> > creates an excel sheet. The problem is, the html tags remain, making
> > it very difficult to read. A sample is below:-
>
> > <ul> <p>Unable to log as showing up too earlyin workstream.</p></ul>
>
> > I have tried using search and replace, but this does not delete all
> > the tags. For instance, when it encounters a <div> tag, it generates
> > the error, 'formula is too long'
>
> > I would like to be able to strip ALL tags from the sheet using a
> > macro. Can anyone help?
>
> > Thanks,
>
> > Karl- Hide quoted text -
>
> - Show quoted text -

Thanks Rick, but I have tried that and still get the 'formula too
long' message. It seems that because the offending cells contain more
than 900 characters, it will not work. I think I need vba code to get
any further.

Karl

Rick Rothstein (MVP - VB)

unread,
Jan 22, 2008, 10:53:40 AM1/22/08
to
> > Start Excel's Replace (Edit/Replace or Ctrl+H) and put <*> (that is 3
> > characters, a less-than symbol, an asterisk, and a greater-than symbol)
> > for
> > your "Find what" condition and leave the "Replace with" field empty.
>
> Thanks Rick, but I have tried that and still get the 'formula too
> long' message. It seems that because the offending cells contain more
> than 900 characters, it will not work. I think I need vba code to get
> any further.

Okay, I tried experimenting with VBA and found that for very long entries in
a cell, the formula bar drop down made it impossible to read the contents of
the cell. Now I can modify the following for you if you have another idea on
how you would want to proceed; but, for now, give the following a try...

In the VBA editor, add a UserForm to your project and put a TextBox and a
CommandButton on it. Make the UserForm somewhat large in order to house a
large TextBox and then make your TextBox large to fill all but enough room
for the CommandButton. For the TextBox in the Properties window, set the
MultiLine property to True and set the ScrollBars property to
3-fmScrollBarsBoth. Now, copy/paste the following code into the UserForm's
code window...

'*********Start UserForm Code*********
Private Sub CommandButton1_Click()
Unload Me
End Sub

Private Sub UserForm_Activate()
Dim X As Long
Dim Cel As Range
Dim CellContents As String
Dim Temp() As String
If Selection.Count = 1 Then
CellContents = Selection.Value
CellContents = Replace(CellContents, ">", "<")
Temp = Split(CellContents, "<")
For X = 1 To UBound(Temp) Step 2
Temp(X) = ""
Next
CellContents = Trim(Replace(Join(Temp), vbLf, vbCr))
Do While InStr(CellContents, vbCr & " ")
CellContents = Replace(CellContents, vbCr & " ", vbCr)
Loop
Do While InStr(CellContents, vbCr & vbCr)
CellContents = Replace(CellContents, vbCr & vbCr, vbCr)
Loop
CellContents = Replace(CellContents, vbCr, vbCrLf)
TextBox1.Text = CellContents
End If
End Sub
'*********End UserForm Code*********

Now we need some way to start this all off. For my test condition, I used
the BeforeDoubleClick event of the Worksheet where your HTML text is going
to be located, but you can change this to a button on the Toolbar or some
other mechanism if you want. In the VBA editor, bring up the code window for
the Worksheet you have your HTML text on and copy/paste the following into
it...

'*********Start Worksheet Code*********
Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, _
Cancel As Boolean)
Cancel = True
UserForm1.Show
End Sub
'*********End Worksheet Code*********

Now, simply double-click a cell with your HTML text in it and read the
parsed content in the TextBox. When you are finished, click the
CommandButton to dismiss the UserForm.

Rick

0 new messages