TWebBrowser - get html source into a listbox

Jens Möller

Dec 10, 2001, 4:38:15 PM
Hello NG,

could someone please post a code snippet showing how a program that uses the
TWebBrowser ActiveX control can copy the HTML source of the current
page into a listbox once the page has fully loaded?

First I tried to use code from here:
http://www.bytesandmore.de/rad/cpp/snipp/sc08007.php
But this does not work: after I look at the source I can't continue
to surf the web (click on the source tab, click back on the Browser tab,
and click a link on the website; the program now thinks the website is
on the hard disk).


Regards,
Jens


Jens Möller

Dec 10, 2001, 5:04:29 PM
What I basically need is a translation of this Delphi code into C++:

var
HTMLDocument: IHTMLDocument2;
Source :String;
begin
HTMLDocument := Web.Document as IHTMLDocument2;
Source := HTMLDocument.Body.Get_outerHTML;
Memo1.Text := Source;
end;


Hans Galema

Dec 10, 2001, 5:13:41 PM
"Jens Möller" wrote:

> First I tried to use code from here:
> http://www.bytesandmore.de/rad/cpp/snipp/sc08007.php
> But this does not work

I use that code and it works perfectly.
It's for a TRichEdit.
Did you try it as-is before adapting it to a listbox?

Hans.

Jens Möller

Dec 10, 2001, 5:22:18 PM
I hope I don't double post this text (in my newsreader it does not show up so I suppose
my first try did not submit the following text):


The __Web.Document as IHTMLDocument2__ part is the problem.


Hans Galema

Dec 10, 2001, 5:21:58 PM
"Jens Möller" wrote:

> First I tried to use code from here:
> http://www.bytesandmore.de/rad/cpp/snipp/sc08007.php
> But this does not work -

So I thought you could not get the source!
That works too.

> when I take a look at the source I can't continue
> to surf the web (click on the source tab, click back on the Browser tab
> and click a link on the website - the program now thinks the website is
> on your harddisk).

Well, I can use back and forward and they work.
Clicking on links does not work, however.
With back and forward the page is shown but images
are not found.

We will have to find out why.

Hans.

Remy Lebeau

Dec 10, 2001, 5:22:45 PM
Have a look at this example:

http://www.mers.com/MERLIST/BORLAND/PUBLIC/CPPBUILDER/VCL/COMPONENTS/USING/29776.HTML
(may be word-wrapped)


Gambit

"Jens Möller" <jen...@geekmail.de> wrote in message
news:3c153581$1_1@dnews...

> What I basically need is a translation of this Delphi code into C++:

<snip>

Jens Möller

Dec 10, 2001, 5:31:13 PM
> http://www.mers.com/MERLIST/BORLAND/PUBLIC/CPPBUILDER/VCL/COMPONENTS/USING/29776.HTML
> (may be word-wrapped)

I already tried this. At http://www.bytesandmore.de/rad/cpp/snipp/sc08007.php you can download
an example which has the same problems as my application has:
when you execute this snippet of code the browser thinks the website is on your harddisk.
So when you show the html source in a Memo for example and then click on a link on the site you
will get an error message.

If someone could translate this Delphi code:

var
HTMLDocument: IHTMLDocument2;
Source :String;
begin
HTMLDocument := Web.Document as IHTMLDocument2;
Source := HTMLDocument.Body.Get_outerHTML;
Memo1.Text := Source;
end;

to C++ - then we could get html source during runtime without saving the file.


Hans Galema

Dec 10, 2001, 6:05:16 PM
"Jens Möller" wrote:

> to C++ - then we could get html source during runtime without saving the file.

I don't know how to translate the Pascal code, but
the C++ code could be changed so that the content is saved
not to a file but to a TMemoryStream. Then the 'file thinking problem'
would not be there, I suppose.

Who can adapt the code below for that?

Hans.

#include <mshtml.h>

IHTMLDocument2 *HTMLDocument = NULL;
IPersistFile *PersistFile = NULL;

if(SUCCEEDED(CppWebBrowser1->Document->QueryInterface(IID_IHTMLDocument2,
        (LPVOID*)&HTMLDocument)))
{
    if(SUCCEEDED(HTMLDocument->QueryInterface(IID_IPersistFile,
            (LPVOID*)&PersistFile)))
    {
        // The second argument is fRemember: true makes the saved file the
        // document's current file, which is possibly why the browser later
        // treats the page as a local file.
        PersistFile->Save(WideString(String("remytemp.html")), true);
        PersistFile->Release();
    }

    HTMLDocument->Release();

    // Load the same file that was just saved.
    TStringList *HTMLSource = new TStringList;
    HTMLSource->LoadFromFile("remytemp.html");
    // use HTMLSource then...
    delete HTMLSource;
}

Jens Möller

Dec 10, 2001, 6:37:10 PM
In borland.public.cppbuilder.activex someone translated the Delphi code into:

IHTMLDocument2* HTMLDocument;
AnsiString Source;
HTMLDocument = Web->Document;
Source = HTMLDocument->Body->Get_outerHTML;
Memo1->Text = Source;

But this does not work, I get these two error messages:
Cannot convert 'System::DelphiInterface<IDispatch>' to 'IHTMLDocument2 *'.
'Body' is not a member of 'IHTMLDocument2'

It seems that Delphi can do this "type cast".
Any other ideas?


Jens


Remy Lebeau

Dec 10, 2001, 7:00:55 PM
Here's your C++ translation:

IHTMLDocument2 *HTMLDocument = NULL;
if(SUCCEEDED(Web->Document->QueryInterface(IID_IHTMLDocument2,
        (LPVOID*)&HTMLDocument)))
{
    IHTMLElement *HTMLBody = NULL;
    // get_body can succeed and still return no element, so check the pointer.
    if(SUCCEEDED(HTMLDocument->get_body(&HTMLBody)) && HTMLBody)
    {
        BSTR HTMLSource = NULL;
        if(SUCCEEDED(HTMLBody->get_outerHTML(&HTMLSource)))
        {
            Memo1->Text = WideString(HTMLSource);
            SysFreeString(HTMLSource); // the caller owns the returned BSTR
        }
        HTMLBody->Release();
    }
    HTMLDocument->Release();
}


Gambit

"Jens Möller" <jen...@geekmail.de> wrote in message

news:3c15470c$1_1@dnews...

Hans Galema

Dec 11, 2001, 12:31:15 PM
Remy Lebeau wrote:
>
> Here's your C++ translation:

Thank you for your C++ translation of his Delphi code.
This is really very useful.

Hans.

Hans Galema

Dec 12, 2001, 4:09:59 PM
Remy and Jens.

Well, we cheered too soon.

The code works, but it shows the source reassembled from
the HTML elements after parsing. It also shows only the
source of the body.

To get the original source take the code below.
I once received it from
#Subject: need help on com interface
#Date: Tue, 27 Nov 2001 21:23:48 +0800
#From: "leon" <le...@desunsoft.com>
#Organization: desunsoft
#Newsgroups: borland.public.cppbuilder.internet

.... but had forgotten about it.
My impression is that the code could be simpler.
But it works ok.

By the way: why did I (and Jens too) not see that this
was only the body? We only checked whether the side effect of
saving to a file was gone.

Hans.

// Needs <mshtml.h>, <ocidl.h> (IPersistStreamInit) and <objidl.h> (IStream).
AnsiString GetSource ( TCppWebBrowser *CppWebBrowser )
{
    AnsiString Source = "";

    IHTMLDocument2 *htm = NULL;

    if(CppWebBrowser->Document
       && SUCCEEDED(CppWebBrowser->Document->QueryInterface(
              IID_IHTMLDocument2, (LPVOID*)&htm)))
    {
        IPersistStreamInit *spPsi = NULL;

        if(SUCCEEDED(htm->QueryInterface(IID_IPersistStreamInit,
                (LPVOID*)&spPsi)) && spPsi)
        {
            IStream *spStream = NULL;
            OleCheck(CreateStreamOnHGlobal(NULL, true, &spStream));
            if(spStream)
            {
                STATSTG ss;
                LARGE_INTEGER nMove;
                nMove.QuadPart = 0;
                ULONG nRead = 0;

                // Save the document into the memory stream, then rewind it.
                OleCheck(spPsi->Save(spStream, true));
                OleCheck(spStream->Seek(nMove, STREAM_SEEK_SET, NULL));
                OleCheck(spStream->Stat(&ss, STATFLAG_NONAME));
                __int64 nSize = ss.cbSize.QuadPart;

                // Copy the stream contents into the result string.
                Source.SetLength(static_cast<int>(nSize));
                OleCheck(spStream->Read((void *)Source.data(),
                                        static_cast<ULONG>(nSize), &nRead));
                // Release returns a reference count, not an HRESULT.
                spStream->Release();
            }

            spPsi->Release();
        }
        htm->Release();
    }

    return Source;
}
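
For the listbox part of the original question, the string returned by GetSource still has to be broken into one entry per line. Below is a minimal sketch in standard C++ of one way to do that. SplitLines is a hypothetical helper name that does not appear anywhere in this thread, and the sketch assumes the source uses "\r\n", "\n", or a lone "\r" as line endings.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): split page source into lines,
// accepting "\r\n", "\n", or a lone "\r" as a line break.
std::vector<std::string> SplitLines(const std::string &source)
{
    std::vector<std::string> lines;
    std::string current;

    for (std::string::size_type i = 0; i < source.size(); ++i)
    {
        const char c = source[i];
        if (c == '\r' || c == '\n')
        {
            lines.push_back(current);
            current.clear();
            // Treat "\r\n" as a single line break.
            if (c == '\r' && i + 1 < source.size() && source[i + 1] == '\n')
                ++i;
        }
        else
        {
            current += c;
        }
    }

    // Keep a final line that has no trailing line break.
    if (!current.empty())
        lines.push_back(current);

    return lines;
}
```

In a C++Builder form, each resulting line could then be passed to ListBox1->Items->Add(line.c_str()). Assigning the whole string to ListBox1->Items->Text should give much the same result without a helper, since TStrings splits the text on line breaks itself.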
