The problem: some web sites don't provide a web page with a src tag that
has the URL of the PDF file. So how do I get the URL of the PDF file if the
web site doesn't give me a web page with the src tag? Also, I run 9
"Collect PDF" apps on each PC, so reading the cache is not a good idea.
This is not easy and needs a lot of casting (it is easier to start with Option
Strict Off and then set it On afterward to correct that).
http://msdn.microsoft.com/en-us/library/aa741317.aspx
You should not add an Imports statement for it, but fully qualify it every time,
because everything becomes terribly slow in VB.NET 2003 when the import is used.
Cor
When you say "some web sites don't provide the web page with a src tag", do
you mean those "web pages" actually just let the Acrobat reader take over
the entire browser control (the PDF document fills the whole
control)? Or do those web pages use other means to embed a PDF file as a
portion of the entire HTML page?
Could you give me a specific sample of the problem you encountered so I can better
understand the situation and see if we can come up with a solution?
Thanks,
Jie Wang (jie...@online.microsoft.com, remove 'online.')
Microsoft Online Community Support
WebDisplay.Document.All(0).outerhtml gives me the following error message:
Run-time exception thrown : System.MissingMemberException - Public member
'All' on type 'IAcroAXDocShim' not found.
But with the web sites that do work, I'm able to see the html code that has
the src tag.
Private Sub Form1_Load(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles MyBase.Load
    AxWebBrowser1.Navigate( _
        "http://research.microsoft.com/en-us/um/people/cmbishop/prml/bishop-prml-sample.pdf")
End Sub

Private Sub AxWebBrowser1_DocumentComplete(ByVal sender As System.Object, _
        ByVal e As AxSHDocVw.DWebBrowserEvents2_DocumentCompleteEvent) _
        Handles AxWebBrowser1.DocumentComplete
    MessageBox.Show(e.uRL.ToString())
End Sub
With the code above, a MessageBox pops up showing the URL of the
PDF file.
Will this help you download the PDF file?
Regards,
I would like to ask you how you download the file.
I tried to do that by upgrading an existing vb6 app
using the internet transfer control but it didn't seem to
work.
Will you please tell me?
Thanks in advance.
Any updates on this issue?
If you have any further questions regarding this issue, please kindly let
me know.
Thanks,
Due to company policy I can't list the full urls, but I did list the ending
of the urls that are the difference.
Here is a good link on the topic, as of 5/12/09
http://www.vbforums.com/showthread.php?p=2686017
From the form of the URL we can't really see why some work while others do not.
However, I can think of two ways a server can send a PDF file to the client
so that it is displayed in the entire browser control:
1. Write the PDF stream directly to the response stream, while setting the
content type to application/pdf.
2. Send an HTTP redirect code to the client, so the client is responsible
for re-sending the request to the new URL to get the *real* PDF file.
I suspect the "not working" scenario could be caused by the second way.
I'll try to set up an environment to test these two scenarios and see what
exactly happens.
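A minimal sketch of how the two cases might be told apart from code, assuming .NET 2.0 or later. The module and function names (PdfResponseProbe, Classify, Probe) are made up for illustration. Note that HttpWebRequest does not share the browser control's cookies or session, so a site that requires a login may answer differently here than it does inside the control.

```vbnet
Imports System
Imports System.Net

Module PdfResponseProbe

    ' Classifies a response from its status code and headers only.
    ' Returns "redirect", "pdf", or "other".
    Function Classify(ByVal statusCode As Integer, _
                      ByVal contentType As String, _
                      ByVal location As String) As String
        ' Case 2: a 3xx status with a Location header is a redirect
        If statusCode >= 300 AndAlso statusCode < 400 _
           AndAlso Not String.IsNullOrEmpty(location) Then
            Return "redirect"
        End If
        ' Case 1: the PDF stream is written directly to the response
        If Not contentType Is Nothing AndAlso _
           contentType.ToLower().StartsWith("application/pdf") Then
            Return "pdf"
        End If
        Return "other"
    End Function

    ' Issues one request WITHOUT following redirects and classifies the answer.
    ' 4xx/5xx answers surface as a WebException from GetResponse.
    Function Probe(ByVal url As String) As String
        Dim request As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        request.AllowAutoRedirect = False
        Dim response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
        Try
            Return Classify(CInt(response.StatusCode), _
                            response.ContentType, _
                            response.Headers("Location"))
        Finally
            response.Close()
        End Try
    End Function

End Module
```

If Probe reports "redirect", the Location header of that response is the next URL to look at.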
Could you let me know, for the non-working URLs like "...
doc1/038110817935", what happens if you manually navigate to that URL in an IE
browser? Do you get the PDF file?
Thanks,
If I redirect the web control to the URL "... doc1/038110817935" it will
show the web page, not the PDF file. This occurs for both the web sites that work
and those that don't.
To clarify, for the web sites that work, DocumentComplete captures
/doc1/038110817935
and cgi-bin/show_temp.pl?file=pdf18445142694797&type=application/pdf
For the web sites that don't work, DocumentComplete captures only
/doc1/038110817935
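A small sketch of how the DocumentComplete URLs could be filtered so that only the PDF one is kept. The helper name LooksLikePdfUrl and the lastPdfUrl field are hypothetical, and the check is only a heuristic based on the two URL shapes seen in this thread (a ".pdf" path, or a "type=application/pdf" query parameter). It cannot help for sites where DocumentComplete never reports a second URL.

```vbnet
Imports System

Module PdfUrlFilter

    ' Heuristic only: True when the URL looks like it points at a PDF.
    Function LooksLikePdfUrl(ByVal url As String) As Boolean
        If url Is Nothing Then Return False
        Dim lower As String = url.ToLower()
        ' Query-string hint, as in "...show_temp.pl?file=...&type=application/pdf"
        If lower.IndexOf("type=application/pdf") >= 0 Then Return True
        ' Plain file URL: drop any query string, then check the extension
        Dim queryStart As Integer = lower.IndexOf("?"c)
        If queryStart >= 0 Then lower = lower.Substring(0, queryStart)
        Return lower.EndsWith(".pdf")
    End Function

    ' Possible use inside the form's DocumentComplete handler
    ' (lastPdfUrl is a hypothetical String field on the form):
    '
    '   Dim url As String = e.uRL.ToString()
    '   If PdfUrlFilter.LooksLikePdfUrl(url) Then lastPdfUrl = url

End Module
```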
I've found another way to get the URL of the PDF file currently being
displayed "full screen" in the web browser control.
Actually, when the PDF document is displayed "full screen" (or shall we say
full control) in the web browser control, the Document property
returns an IAcroAXDocShim interface instead of an HTML DOM interface. This
matches the error description in your second post.
So all we need to do is check the type of the Document property. If
it is IAcroAXDocShim, we can read the src property on that interface to get
the URL of the PDF file.
The code looks like this (suppose I have a WebBrowser control named
AxWebBrowser1):
If (TypeOf AxWebBrowser1.Document Is AcroPDFLib.IAcroAXDocShim) Then
    ' this is a full screen PDF file
    Dim pdfSrc As String
    pdfSrc = CType(AxWebBrowser1.Document, AcroPDFLib.IAcroAXDocShim).src
Else
    ' this is a normal HTML page, process the page.
End If
To access the IAcroAXDocShim interface, you need to add a COM reference to
your project, named "Adobe Acrobat Browser Control Type Library 1.0".
What I'm not sure about is whether we can download the file even once we have
the URL. In some cases, for example if the page request requires an authenticated
session, this approach may still fail. I tried to save the PDF file via
IAcroAXDocShim, but failed to find a way to do so. This control is made by
Adobe and I was not able to find documentation on how to use it on their
website.
Anyway, please let me know first whether IAcroAXDocShim can help you get the URL.
Then we'll think of a way to get the file via the URL. You can also
try Adobe's online forum to ask questions related to the IAcroAXDocShim
interface to get more information.
Best regards,
Returns "… /doc1/038110817935"
I want "… /cgi-bin/show_temp.pl?file=pdf44974838787197&type=application/pdf"
This is a challenging problem, to say the least. I searched the Adobe site,
the Adobe forums and Google, and didn't find anything relevant in any of those oceans.
I posted a new question on the Adobe forums and will keep this thread updated with
any helpful info. It might be a few days.
Another notable method on the IAcroAXDocShim interface is execCommand. I
don't know whether some command could be executed through it to save the
PDF to local disk (there should be one, because the PDF reader control itself has
a save button in the UI, so I guess there is a corresponding object model way
to do that). But the lack of documentation from Adobe makes it hard to figure
out, too.
Hope these clues help in some way.
Regards,
Dim PDDoc As Object
Dim AVDoc As Object
Dim AcroExchApp As Object
Dim AVDocTarget As Object
AcroExchApp = CreateObject("AcroExch.App")
AVDocTarget = CreateObject("AcroExch.AVDoc")
AVDoc = AcroExchApp.GetActiveDoc
PDDoc = AVDocTarget.GetPDDoc
PDDoc.Save(1 Or 4 Or 32, "C:\IMAGE\test.pdf")
PDDoc.Close()
PDDoc = Nothing
AVDoc.Close(True)
AVDoc = Nothing
AcroExchApp.Exit()
AcroExchApp = Nothing
AVDocTarget.Exit()
AVDocTarget = Nothing
But AVDoc and PDDoc are coming back as Nothing, so PDDoc.Save fails
because PDDoc is Nothing.
I might be on the wrong track, and need to get back to just using the
IAcroAXDocShim interface.
I researched execCommand. Like you said, there is no documentation; in fact the white
paper "Adobe Interapplication Communication API Reference" doesn't even list
it as one of the methods.
Some of the code I got from
http://support.adobe.com/devsup/devsup.nsf/docs/51415.htm
I need to vent some more, I hate Adobe. 99% of what I read was crap but I
had to read through it so I can find the 1% that I did need. I also talked
with Adobe tech support, they were no help.
Let's keep looking into IAcroAXDocShim execCommand, unless you know how to
solve the problem stated above of AVDoc and PDDoc coming back as Nothing.
Not sure how I can cast the document to the IDownloadManager interface?
I was looking at the interfaces implemented by the Adobe PDF Reader object.
It implements the IPersistFile interface, and I tried to call the Save method
of that interface. However, I got a Not Implemented exception.
I'll try checking the ROT (Running Object Table) to see what other COM interfaces I can get.
I don't know why Adobe implemented a LoadFile method on the IAcroAXDocShim
interface, but didn't put a SaveAs method there.
Meanwhile, please keep trying to get some help from Adobe. I will keep
assisting on this issue and see if there are any other workarounds beyond
dealing with the Acrobat object model, but it just looks weird that a MSFT
engineer is supporting Adobe products. ;-)
I will continue to research.
Thanks,
I would like to try other interfaces. Back on 5/25 you posted, "I was
looking at the interfaces implemented by the Adobe PDF Reader object,
it implemented the IPersistFile interface and I tried to call Save method
of that interface. However, I got a Not Implemented exception.
I'll try check the ROT to see what else COM interfaces I can get. "
I, too, received "Not implemented".
I want to try to just save the PDF file. How can I get the web browser
control AxSHDocVw.AxWebBrowser to save the PDF file it is displaying?
CType(WebDisplay.Document, UCOMIPersistFile).Save("C:\IMAGE\test.pdf", True)
but error with
And since Adobe said there is no way to get the PDF file from the PDF
Reader control, it looks like our only hope is to try extracting the PDF
file from the IE cache.
I'll check the possibility of the cache approach and get back here.
Regards,
Jie Wang
WebDisplay.ExecWB(SHDocVw.OLECMDID.OLECMDID_SAVEAS, _
    SHDocVw.OLECMDEXECOPT.OLECMDEXECOPT_DONTPROMPTUSER, _
    "C:\IMAGE\test.pdf", "C:\IMAGE\test.pdf")
This saves the PDF that is displayed in the web browser; however, a prompt comes up
even though the "dontpromptuser" parameter is used. There are lots of posts on the net
about this; one post stated that after IE4, MS plugged the security hole and now
requires the prompt regardless of "dontpromptuser".
Do you know how to make ExecWB not prompt the user on Save As?
I tried OLECMDID_SAVE, but nothing happened; no file was saved to the HD.
Regarding the save dialog, there is no way to suppress it. However, we can
use another thread to automate the dialog:

' Requires at the top of the file:
'   Imports System.Runtime.InteropServices
'   Imports System.Threading

<DllImport("user32.dll", SetLastError:=True, CharSet:=CharSet.Ansi)> _
Private Shared Function FindWindowEx(ByVal parentHandle As IntPtr, _
                                     ByVal childAfter As IntPtr, _
                                     ByVal lclassName As String, _
                                     ByVal windowTitle As String) As IntPtr
End Function

<DllImport("user32.dll", SetLastError:=True, CharSet:=CharSet.Auto)> _
Private Shared Function SendMessage( _
    ByVal hWnd As IntPtr, _
    ByVal Msg As UInteger, _
    ByVal wParam As IntPtr, _
    ByVal lParam As IntPtr) As IntPtr
End Function

Private Const WM_SETTEXT As UInteger = &HC
Private Const BM_CLICK As UInteger = &HF5
Private Const MutexName As String = "SavePDFMutex"

' Runs on a worker thread: waits for the Adobe Reader save dialog,
' types the file name into it, and clicks Save.
Private Sub SavePDF(ByVal param As Object)
    Dim fileName As String = CType(param, String)
    Dim timeOut As Integer = 5
    Dim hWndSaveAs As IntPtr

    Thread.Sleep(500)
    Do While True
        ' Get the Adobe Reader "Save a Copy..." dialog window handle
        hWndSaveAs = FindWindowEx(IntPtr.Zero, IntPtr.Zero, "#32770", "Save a Copy...")
        If hWndSaveAs = IntPtr.Zero Then
            Thread.Sleep(1000)
            timeOut = timeOut - 1
            If timeOut = 0 Then
                ' 5 retries (about 5 seconds), still can't find the dialog
                Throw New ApplicationException("Unable to find the Save dialog window")
            End If
        Else
            ' Dialog found, proceed.
            Exit Do
        End If
    Loop

    Dim lpFileName As IntPtr = Marshal.StringToHGlobalAuto(fileName)
    Try
        ' Walk down to the file name edit box: ComboBoxEx32 > ComboBox > Edit
        Dim hWndCboEx As IntPtr = FindWindowEx(hWndSaveAs, IntPtr.Zero, "ComboBoxEx32", Nothing)
        Dim hWndCbo As IntPtr = FindWindowEx(hWndCboEx, IntPtr.Zero, "ComboBox", Nothing)
        Dim hWndTxt As IntPtr = FindWindowEx(hWndCbo, IntPtr.Zero, "Edit", Nothing)
        Dim hWndSave As IntPtr = FindWindowEx(hWndSaveAs, IntPtr.Zero, "Button", "Save")
        ' Set the filename
        SendMessage(hWndTxt, WM_SETTEXT, IntPtr.Zero, lpFileName)
        ' Click on the button
        SendMessage(hWndSave, BM_CLICK, IntPtr.Zero, IntPtr.Zero)
    Finally
        Marshal.FreeHGlobal(lpFileName)
    End Try
End Sub
Now at the time you want to save the PDF, use the following code:

' Since you're going to have more than one instance of the application running,
' the mutex will make sure there will be only one save dialog at a time.
Dim mu As New Mutex(False, MutexName)
Dim t As New Thread(AddressOf SavePDF)
mu.WaitOne()
Try
    ' Start the save PDF thread, passing the filename to be saved.
    t.Start("D:\SavedPDF" & Guid.NewGuid().ToString("N") & ".pdf")
    AxWebBrowser1.ExecWB(SHDocVw.OLECMDID.OLECMDID_SAVEAS, _
                         SHDocVw.OLECMDEXECOPT.OLECMDEXECOPT_DONTPROMPTUSER)
    ' Wait until the thread exits
    t.Join()
Finally
    ' Release the mutex even if ExecWB throws
    mu.ReleaseMutex()
End Try
**************************
Another possible alternative to ExecWB is the URLDownloadToFile
function.

<DllImport("urlmon.dll", CharSet:=CharSet.Auto, PreserveSig:=False)> _
Private Shared Sub URLDownloadToFile( _
    <MarshalAs(UnmanagedType.IUnknown)> ByVal pCaller As Object, _
    ByVal szURL As String, _
    ByVal szFileName As String, _
    ByVal dwReserved As Integer, _
    ByVal lpfnCB As IntPtr)
End Sub
Now at the time you want to save the PDF, use the following code:
If (TypeOf AxWebBrowser1.Document Is AcroPDFLib.IAcroAXDocShim) Then
    ' Get the PDF source URL
    Dim url As String = CType(AxWebBrowser1.Document, AcroPDFLib.IAcroAXDocShim).src
    ' Download the PDF file in the web browser control's context
    URLDownloadToFile(AxWebBrowser1.GetOcx(), url, "D:\test.pdf", 0, IntPtr.Zero)
End If
Please let me know how these two solutions work.
Thanks,
The first solution might work. I'm waiting to hear back from the webmaster to see if
the remaining sites can provide me the temp page that has the PDF URL. I'll
keep you updated.
The URLDownloadToFile works if the "web page" on the server side actually
writes a stream of the PDF file to the response.
In my test, if that is the case, the URLDownloadToFile can actually get the
PDF file saved to the disk:
URLDownloadToFile(AxWebBrowser1.GetOcx(), _
    "http://testSever/getPDF.aspx?file=test.pdf", "D:\test.pdf", 0, IntPtr.Zero)
So why not have a try if you got a minute or two? Just one line of code. :)
Anyway, I'll keep watching this post for your update.
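Since some of the URLs in this thread return a web page rather than the PDF itself, it may be worth checking what URLDownloadToFile actually saved. A minimal sketch, assuming only that a PDF file starts with the bytes "%PDF"; the module and function names (PdfFileCheck, IsPdfFile) are made up for illustration.

```vbnet
Imports System
Imports System.IO

Module PdfFileCheck

    ' True when the file starts with the PDF signature "%PDF".
    Function IsPdfFile(ByVal filePath As String) As Boolean
        Dim signature() As Byte = {&H25, &H50, &H44, &H46} ' "%PDF"
        Dim header(3) As Byte
        Dim stream As New FileStream(filePath, FileMode.Open, FileAccess.Read)
        Try
            ' A file too short to hold the signature is not a PDF
            If stream.Read(header, 0, header.Length) < header.Length Then
                Return False
            End If
        Finally
            stream.Close()
        End Try
        For i As Integer = 0 To signature.Length - 1
            If header(i) <> signature(i) Then Return False
        Next
        Return True
    End Function

End Module
```

Calling IsPdfFile("D:\test.pdf") right after the download gives a quick indication of whether the server streamed the PDF or sent back an HTML page instead.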
Best regards,