
"Headless mshtml"


Andrew Downum

May 15, 2002, 8:39:13 AM
I am trying to get at the HTML DOM in mshtml without any of the UI parts.
What I need to be able to do is to create the DOM, give it an HTML string,
and then traverse the tree.

First, I add a reference to the mshtml Primary Interop Assembly, and then
try the following:
//begin code
mshtml.HTMLDocumentClass doc = new mshtml.HTMLDocumentClass();

doc.write("<H1>HTML GOES HERE</H1>".ToCharArray());

//end code
The signature for the call is
HTMLDocumentClass.write(params object[] psarray).

When that is executed, I get a COM interop error (Type mismatch).

So I decided to try using the IPersistFile interface...
//begin code
HTMLDocumentClass dom = new HTMLDocumentClass();

UCOMIPersistFile pf = (UCOMIPersistFile)dom;


pf.Load(@"C:\TEMP\file.html", 0);

//end code

I can make this work, but it requires that I write the HTML out to disk and
then read it back in. Does anyone have any ideas on how to make it work
directly?
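
One likely cause of the type mismatch above: a char[] is not an object[], so
C# wraps it as a single element of the params array, and mshtml then receives
a character array where it expects a string. A commonly reported workaround is
to cast the document to IHTMLDocument2 and pass the HTML as a string inside an
object[]. The following is only a minimal sketch under those assumptions; the
class name HeadlessWrite is made up for illustration, and the behavior has not
been verified against every mshtml version.

```csharp
using System;

class HeadlessWrite
{
    // MSHTML is an apartment-threaded COM component, so run on an STA thread.
    [STAThread]
    static void Main()
    {
        // Coclass wrapper from the mshtml Primary Interop Assembly.
        mshtml.HTMLDocumentClass doc = new mshtml.HTMLDocumentClass();

        // Work through the IHTMLDocument2 interface rather than the class.
        mshtml.IHTMLDocument2 doc2 = (mshtml.IHTMLDocument2)doc;

        // Pass the markup as a string element of an object[]; each element
        // is marshaled as a VARIANT, and a string becomes a BSTR.
        doc2.write(new object[] { "<H1>HTML GOES HERE</H1>" });

        // Close the output stream so the parser finishes the document.
        doc2.close();

        // Assumption: the body is available once close() returns.
        mshtml.IHTMLElement body = doc2.body;
        if (body != null)
        {
            Console.WriteLine(body.innerHTML);
        }
    }
}
```

If this works in your environment, it avoids the round trip through a file on
disk, since the HTML string goes straight into the document.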

Praveen Kosuri

Jun 3, 2002, 5:58:23 PM
Hi Andrew,

What you have found is the way to write to the HTMLDocument from a .NET
application. If you do this from VB6, the VB6 runtime automatically
initializes MSHTML.HTMLDocument internally (with
"<html><body></body></html>"). After initialization, the Length of the object
works correctly. In the .NET world (VB.NET or C#, for that matter), however,
you need to initialize the object explicitly before calling any methods. How
do you do that? Use the methods of the IPersistStreamInit interface (much as
you did with IPersistFile). COM Interop Services provides wrappers for many
interfaces (check out the members of System.Runtime.InteropServices).
However, it does not provide a wrapper for this interface, so you will need
to declare the interface yourself.
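
A hand-written declaration of that interface might look like the sketch below.
It is an illustration, not the documented Microsoft solution: the IID and
method order are taken from the COM definition of IPersistStreamInit as best
recalled and should be checked against ocidl.h, and the class name
InitExample is hypothetical. UCOMIStream matches the .NET 1.x types used
elsewhere in this thread; later framework versions offer
System.Runtime.InteropServices.ComTypes.IStream instead.

```csharp
using System;
using System.Runtime.InteropServices;

// Managed declaration of the COM interface IPersistStreamInit.
// Methods must appear in vtable order, starting with IPersist.GetClassID.
[ComImport]
[Guid("7FD52380-4E07-101B-AE2D-08002B2EC713")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IPersistStreamInit
{
    void GetClassID(out Guid pClassID);

    // Returns S_OK (0) if dirty, S_FALSE (1) if not, so keep the HRESULT.
    [PreserveSig]
    int IsDirty();

    void Load(UCOMIStream pStm);
    void Save(UCOMIStream pStm, [MarshalAs(UnmanagedType.Bool)] bool fClearDirty);
    void GetSizeMax(out long pcbSize);
    void InitNew();
}

class InitExample
{
    [STAThread]
    static void Main()
    {
        mshtml.HTMLDocumentClass doc = new mshtml.HTMLDocumentClass();

        // Explicitly initialize the document before calling other methods.
        IPersistStreamInit init = (IPersistStreamInit)doc;
        init.InitNew();

        Console.WriteLine(doc.readyState);
    }
}
```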

I'm currently in the process of documenting this issue.

Let me know if you have any further questions.

Thank you!
Praveen Kosuri
Microsoft Developer Support

This posting is provided “AS IS” with no warranties, and confers no rights.
--------------------
From: "Andrew Downum" <ado...@designsbydownum.com>
Subject: "Headless mshtml"
Date: Wed, 15 May 2002 06:39:13 -0600
Lines: 33
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2600.0000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
Message-ID: <OOcMG2A$BHA.1108@tkmsftngp04>
Newsgroups: microsoft.public.dotnet.framework.interop
NNTP-Posting-Host: 12-252-47-87.client.attbi.com 12.252.47.87
Path: cpmsftngxa08!tkmsftngp01!tkmsftngp04
Xref: cpmsftngxa08 microsoft.public.dotnet.framework.interop:5464
X-Tomcat-NG: microsoft.public.dotnet.framework.interop

Ed

Jun 10, 2002, 6:26:33 PM
Hi Andrew,

I would like to use the mshtml object the same way you are. I tried
your sample code, and the document object's readyState changes
from "uninitialized" to "loading" but goes no further. The
body object remains null. Is there something more I must do
to get the document loaded?

My sample code:
// BEGIN CODE
UCOMIPersistFile pf = (UCOMIPersistFile)doc;
pf.Load( htmlfile, 0 );
while( doc.body == null )
{
    Thread.Sleep( 100 );
    log( doc.readyState );
}
// END CODE

Thanks,
Ed
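
A possible explanation for the document stalling at "loading": MSHTML loads
asynchronously and relies on window messages being dispatched on the thread
that owns it, and Thread.Sleep does not dispatch any. The sketch below assumes
that this is the cause and pumps messages with Application.DoEvents while
waiting. The class name HeadlessLoad, the 10-second timeout, and taking the
file path from the command line are illustrative choices, not part of the
original code.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;
using System.Windows.Forms;

class HeadlessLoad
{
    // MSHTML is apartment-threaded; the loading thread must be STA.
    [STAThread]
    static void Main(string[] args)
    {
        mshtml.HTMLDocumentClass doc = new mshtml.HTMLDocumentClass();

        // Same load call as in the thread; args[0] is the HTML file path.
        UCOMIPersistFile pf = (UCOMIPersistFile)doc;
        pf.Load(args[0], 0);

        // Wait for the load to finish, dispatching pending window messages
        // on each pass instead of only sleeping.
        DateTime deadline = DateTime.Now.AddSeconds(10);
        while (doc.readyState != "complete" && DateTime.Now < deadline)
        {
            Application.DoEvents();
            Thread.Sleep(10);
        }

        if (doc.readyState == "complete" && doc.body != null)
        {
            Console.WriteLine(doc.body.innerHTML);
        }
        else
        {
            Console.WriteLine("Load did not complete: " + doc.readyState);
        }
    }
}
```

Waiting on readyState rather than on body being non-null also avoids reading
a partially parsed document, and the deadline keeps the loop from spinning
forever if the load never completes.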
