
Re: Reading text file at a URL


Paul Randall

Mar 14, 2009, 3:50:35 PM

"Codeblack" <Code...@discussions.microsoft.com> wrote in message
news:49258A75-03A4-47A1...@microsoft.com...
> Hello,
>
> Is there way to read a text file which is located at a URL. for example
> http://www.abc.com/test/name.txt
>
> I want to read the contents of the text file which is available at the URL
> and copy the contents to another file locally.

Why restrict yourself to downloading text files?
VBScript can easily download any type of file, including password-protected
files (if you know the username and password), by using the xmlhttp object
included with most recent versions of Windows.

Google Groups can give you a relatively short list of hits, so you don't
have to wade through millions of useless results. Go to groups.google.com and
paste the following into the search box:
download file xmlhttp group:*.scripting.vbscript

-Paul Randall
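
A minimal sketch of the password-protected case mentioned above, assuming the server accepts credentials passed through the optional user and password arguments of the xmlhttp open method. The function name GetProtectedText, the URL, and the credentials are placeholders made up for illustration.

```vbscript
' Hypothetical helper: fetch a text resource that requires a login.
Function GetProtectedText(sURL, sUser, sPassword)
  Dim oXML
  Set oXML = CreateObject("MSXML2.XMLHTTP")
  ' open(method, url, async, user, password): the last two arguments are optional
  oXML.open "GET", sURL, False, sUser, sPassword
  oXML.send
  If oXML.status = 200 Then
    GetProtectedText = oXML.responseText
  Else
    Err.Raise vbObjectError + 1, "GetProtectedText", "HTTP status " & oXML.status
  End If
End Function

' Placeholder URL and credentials:
WScript.Echo GetProtectedText("http://www.abc.com/test/name.txt", "user", "secret")
```

Raising an error on any status other than 200 keeps a login failure page from being mistaken for the file's contents.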


vasuki.ra...@gmail.com

Apr 1, 2009, 7:35:20 AM
Thanks for this. But is it possible to download a PDF file and save it
as a text file?

T Lavedas

Apr 1, 2009, 8:04:08 AM
On Apr 1, 7:35 am, vasuki.ramakris...@gmail.com wrote:
> Thanks for this. But is it possible to download a pdf file and save it
> as text file.

The short answer is that it can be downloaded from the URL, but it will
still be a PDF unless you have a PDF-to-text converter to do that part of
the job.

The following routine will retrieve the file from the web ...

Sub DownBinFile(FilePath, sURL)
  Const adTypeBinary = 1
  Const adModeReadWrite = 3
  Const adSaveCreateOverwrite = 2
  Dim oXML
  ' Create an xmlhttp object and request the file synchronously:
  Set oXML = CreateObject("MSXML2.XMLHTTP")
  oXML.open "GET", sURL, False
  oXML.send
  ' The request is synchronous, so this loop is only a safeguard:
  Do Until oXML.readyState = 4 : WScript.Sleep 50 : Loop
  ' Write the raw response bytes to disk:
  With CreateObject("ADODB.Stream")
    .Type = adTypeBinary
    .Mode = adModeReadWrite
    .Open
    .Write oXML.responseBody
    .SaveToFile FilePath, adSaveCreateOverwrite
    .Close
  End With ' ADODB.Stream
End Sub

FilePath is the name and location where the result is to be stored,
while sURL contains the full URL, including the http:// part. This
could be used for ftp downloads as well, as long as the proper inline
login syntax is provided as part of the sURL string.

The conversion of the PDF is not part of my expertise, so you're on
your own on that part.

Tom Lavedas
***********
http://there.is.no.more/tglbatch/

HL0105

Apr 6, 2009, 3:03:31 PM


Free Text To PDF Converter:

http://www.sanface.com/txt2pdf.html

I use txt2pdf.exe in my scripts.

T Lavedas

Apr 6, 2009, 4:12:05 PM

The OP was in need of a PDF to text converter.

Alex K. Angelopoulos

Apr 9, 2009, 6:01:27 AM
Here's a link for one of those; in fact, there are several handy PDF tools
in the free Xpdf toolkit:

http://www.foolabs.com/xpdf/download.html

"T Lavedas" <tglb...@cox.net> wrote in message
news:cf68fdb1-e510-4a73...@l1g2000yqk.googlegroups.com...
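
The two steps discussed so far (download the PDF, then convert it) might be chained as in the sketch below. It assumes the pdftotext.exe tool from the Xpdf toolkit is on the PATH; the URL and both file names are placeholders, not taken from the thread.

```vbscript
' Sketch: download a PDF, then convert it to text with Xpdf's pdftotext.
' Assumptions: pdftotext.exe is on the PATH; URL and file names are placeholders.
Const adTypeBinary = 1
Const adSaveCreateOverwrite = 2
Dim sURL, sPdf, sTxt, oXML, oStream, oShell, rc
sURL = "http://www.abc.com/test/name.pdf"
sPdf = "name.pdf"
sTxt = "name.txt"

' Step 1: fetch the raw bytes and save them unchanged.
Set oXML = CreateObject("MSXML2.XMLHTTP")
oXML.open "GET", sURL, False
oXML.send

Set oStream = CreateObject("ADODB.Stream")
oStream.Type = adTypeBinary
oStream.Open
oStream.Write oXML.responseBody
oStream.SaveToFile sPdf, adSaveCreateOverwrite
oStream.Close

' Step 2: run the converter and wait for it to finish.
Set oShell = CreateObject("WScript.Shell")
' Run(command, windowStyle, waitOnReturn): 0 = hidden window, True = wait for exit
rc = oShell.Run("pdftotext """ & sPdf & """ """ & sTxt & """", 0, True)
If rc <> 0 Then WScript.Echo "pdftotext exited with code " & rc
```

Waiting on the converter (the third argument to Run) means the text file is complete before the script continues.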

Dakota Reno

Apr 14, 2012, 8:14:52 PM
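This is how you do it in C/C++ with the Windows API:
#include <stdio.h>
#include <windows.h>
#include <urlmon.h>   /* URLDownloadToFile; link with urlmon.lib */

int main(void)
{
    /* URL: */
    const char *url = "http://www.abc.com/test/name.txt";
    /* Filename: */
    const char *filen = "C:\\Full\\Path\\To\\File\\Name.ext";
    /* Download file using WinAPI (ANSI variant, since these are char strings) */
    HRESULT hr = URLDownloadToFileA(NULL, url, filen, 0, NULL);
    if (FAILED(hr)) {
        fprintf(stderr, "Download failed: 0x%08lx\n", (unsigned long)hr);
        return 1;
    }
    /* Exit */
    return 0;
}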

> On Friday, March 13, 2009 10:50 AM Codeblac wrote:

> Hello,
>
> Is there way to read a text file which is located at a URL. for example
> http://www.abc.com/test/name.txt
>
> I want to read the contents of the text file which is available at the URL
> and copy the contents to another file locally.
>
> Thanks


>> On Saturday, March 14, 2009 12:46 AM Codeblac wrote:

>> Does any one have any clue on this?



Dave "Crash" Dummy

Apr 14, 2012, 10:21:14 PM
>> On Friday, March 13, 2009 10:50 AM Codeblac wrote:
>
>> Hello,
>>
>> Is there way to read a text file which is located at a URL. for example
>> http://www.abc.com/test/name.txt
>>
>> I want to read the contents of the text file which is available at the URL
>> and copy the contents to another file locally.
>>
>> Thanks

Here's a simple script that will do it.

Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "http://www.abc.com/test/name.txt"
do until IE.readyState=4 : wscript.sleep 10 : loop

txt=IE.document.body.innerText

Set fso = CreateObject("Scripting.FileSystemObject")
Set textfile=fso.CreateTextFile("name.txt")
textFile.write txt

--
Crash

Atheism is a matter of faith, too.
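
A variant of the script above with explicit cleanup, offered as a sketch rather than as Dave's original. It assumes the same placeholder URL and output file name; the additions are the Busy check, the IE.Quit call (without it the invisible Internet Explorer instance keeps running after the script ends), and closing the output file.

```vbscript
' Sketch: same IE-automation approach, with explicit cleanup.
Dim IE, fso, textfile, txt
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.Navigate "http://www.abc.com/test/name.txt"
' Wait for navigation to finish (readyState 4 = complete)
Do While IE.Busy Or IE.readyState <> 4 : WScript.Sleep 10 : Loop

txt = IE.document.body.innerText
IE.Quit   ' close the hidden browser instance

Set fso = CreateObject("Scripting.FileSystemObject")
Set textfile = fso.CreateTextFile("name.txt", True)   ' True = overwrite an existing file
textfile.Write txt
textfile.Close
```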

Mayayana

Apr 15, 2012, 9:05:08 AM
Actually, your version is close to the C++ version.
URLDownloadToFile is just a wrapper around IE
automation. It's not a "real" http operation. MS
has basic sockets functions in the API, so that people
can write code to communicate with a server, and then
they threw in some IE automation functions, but they
never created any kind of mid-level functions that
would allow people flexible Internet functionality.

Come to think of it, I wonder why URLDownloadToFile
and related functions don't have a scripting alternative.
They're really just script automation functions in the first
place.

I think there's one point worth noting (though you
probably know this yourself):
Your method will work for a text file, but doesn't
work in what is now defined as standards mode
with an actual webpage. Anyone downloading
webpages and parsing them in the DOM now has
to use two different sets of functions.

But maybe the most compelling question here is
why someone is posting C++ code to a 3-year-old
scripting thread. I don't understand why that happens
so often. The *most believable* reason I can think of
is that people come across a website "forum" posting,
thinking it's an active forum. They don't realize it's a
Usenet re-post (perhaps they don't know what Usenet
is), and they never bother looking at the dates of the
posts. "On a lark" they then decide to join the forum
and post "any old thing", just for the pleasure of holding
forth authoritatively. I'm imagining that such a person
may have actually had a cellphone call coming into one
ear, while pop music was coming into the other ear,
titillating them with canned emotional entertainment...
thus they never actually had any awareness of having
posted. :)



Dave "Crash" Dummy

Apr 15, 2012, 10:06:58 AM
Mayayana wrote:
<snipped to get to the point>

> I think there's one point worth noting (though you probably know this
> yourself): Your method will work for a text file, but doesn't work in
> what is now defined as standards mode with an actual webpage. Anyone
> downloading webpages and parsing them in the DOM now has to use two
> different sets of functions.

Actually, I use this method, with some elaboration, to extract
information from regular web sites. For example,

txt=IE.document.all.tags("table")(3).rows(1).cells(1).innerText

> But maybe the most compelling question here is why someone is posting
> C++ code to a 3-year-old scripting thread. I don't understand why
> that happens so often.

Yeah. Annoying. I just treated it as a new query.

--
Crash

"When you want to fool the world, tell the truth."
~ Otto von Bismarck ~

Michael Bednarek

Apr 16, 2012, 8:05:47 PM
On Sat, 14 Apr 2012 22:21:14 -0400, "Dave \"Crash\" Dummy" wrote in
microsoft.public.scripting.vbscript:

>>> On Friday, March 13, 2009 10:50 AM Codeblac wrote:
>>
>>> Hello,
>>>
>>> Is there way to read a text file which is located at a URL. for example
>>> http://www.abc.com/test/name.txt
>>>
>>> I want to read the contents of the text file which is available at the URL
>>> and copy the contents to another file locally.
>
>Here's a simple script that will do it.
>
>Set IE = CreateObject("InternetExplorer.Application")
>IE.Navigate "http://www.abc.com/test/name.txt"
>do until IE.readyState=4 : wscript.sleep 10 : loop
>
>txt=IE.document.body.innerText
>
>Set fso = CreateObject("Scripting.FileSystemObject")
>Set textfile=fso.CreateTextFile("name.txt")
>textFile.write txt

Here's a simpler script:
copy "http://www.abc.com/test/name.txt"

Details at <http://jpsoft.com/help/index.htm?copy.htm> and
<http://jpsoft.com/help/index.htm?ftpservers.htm>. Take Command is a
commercial product.

--
Michael Bednarek, Brisbane "ONWARD"

Mayayana

Apr 16, 2012, 8:24:30 PM
| Here's a simpler script:
| copy "http://www.abc.com/test/name.txt"
|

How is it simpler when it requires learning a new
tool and paying $100? Your post is unhelpful spam.

(Not to mention that jpsoft.com is a mess. Apparently
it's not able to function without javascript. I had to
paste URLs and disable CSS just to read the webpages.)


Michael Bednarek

Apr 18, 2012, 7:28:40 PM
On Mon, 16 Apr 2012 20:24:30 -0400, "Mayayana" wrote in
microsoft.public.scripting.vbscript:

>| Here's a simpler script:
>| copy "http://www.abc.com/test/name.txt"
>
> How is it simpler when it requires learning a new
>tool and paying $100? Your post is unhelpful spam.

You find it difficult to learn "copy"? Paying $100 may not be
everybody's preference, but I can't see how it is not simple. If you
found it unhelpful, I'd think the normal reaction is not to respond.

--
Michael Bednarek http://mbednarek.com/ "POST NO BILLS"

Mayayana

Apr 19, 2012, 8:22:25 AM
| >| Here's a simpler script:
| >| copy "http://www.abc.com/test/name.txt"
| >
| > How is it simpler when it requires learning a new
| >tool and paying $100? Your post is unhelpful spam.
|
| You find it difficult to learn "copy"?

  You're being deliberately misleading, which is potentially
confusing to people, which is why I posted in the first
place. Your snide reply that "copy" isn't hard to learn
makes it sound like you're talking about the DOS command.
But you're not. If you really think you have a better solution
then why not be clear in explaining what you're talking about?
Like so:

"I recommend that you forget about VBScript, the Windows
Script Host and the GUI scripting options available with WSH.
Instead you should buy a 3rd-party command interpreter
program. The program is named Take Command. It costs $100.
What you get is a console window in which to use a custom,
3rd-party console command language that is loosely based
on DOS. (Note that for $100 you can only get the 32-bit *or*
the 64-bit version, and your "code" can only run in the specialized
command window, and only on one PC.)"

Ironically, there's a good chance that after spending $100,
studying the docs, and limiting oneself to console operations
that can only work on one machine, the "easy" way will end
up just being a complex, bloated wrapper around Dave's 7-line
script. As can be seen in the original post, even C++ programmers
are often just using IE to download files, because to do it
without a browser requires using the sockets API to carry out
a conversation directly with the hosting server. It's not such
a simple operation.

(This issue came up recently in a VB group. Someone was
asking about downloading files from https. I've written code
myself to download files from servers using winsock with no
dependencies, which I provide free in the form of an ActiveX
EXE. But my control can't handle SSL. That requires not only
talking to the server, but also dealing with encryption and
checking out the server certificate. Once you decide to deal
with certificates then you need to provide some kind of GUI so
that the person using your software can choose how to deal
with something like an expired certificate. It gets complicated
fast. As a result most code that can download files is really just
wrapping IE.... and introducing the security risks that come with
IE.)
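
For scripts, one route that does not automate Internet Explorer is the WinHTTP COM object (ProgID WinHttp.WinHttpRequest.5.1). The sketch below assumes that object is installed and that the placeholder URL is reachable over HTTPS; encryption and the server certificate check are left to WinHTTP's defaults rather than handled in script.

```vbscript
' Sketch: fetch a text file with the WinHTTP COM object and save it locally.
' The URL and output file name are placeholders.
Dim oHttp, fso, f
Set oHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
oHttp.Open "GET", "https://www.abc.com/test/name.txt", False
oHttp.Send   ' raises a run-time error if the connection or certificate check fails
If oHttp.Status = 200 Then
  Set fso = CreateObject("Scripting.FileSystemObject")
  Set f = fso.CreateTextFile("name.txt", True)   ' True = overwrite an existing file
  f.Write oHttp.ResponseText
  f.Close
Else
  WScript.Echo "HTTP status " & oHttp.Status
End If
```

This does not settle how a script should react to an expired certificate; by default the request simply fails.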


Michael Bednarek

Apr 26, 2012, 1:51:56 AM
On Thu, 19 Apr 2012 08:22:25 -0400, "Mayayana" wrote in
microsoft.public.scripting.vbscript:

>| >| Here's a simpler script:
>| >| copy "http://www.abc.com/test/name.txt"
>| >
>| > How is it simpler when it requires learning a new
>| >tool and paying $100? Your post is unhelpful spam.
>|
>| You find it difficult to learn "copy"?
>
> You're being deliberately misleading, which is potentially
>confusing to people, which is why I posted in the first
>place. Your snide reply that "copy" isn't hard to learn
>makes it sound like you're talking about the DOS command.
>But you're not.

What's a DOS command? The command "copy" is part of Microsoft's default
command processor COMMAND.COM / cmd.exe which shipped as the default
command processor with every Microsoft operating system. Microsoft
always made it very easy to replace it with something else; 4DOS, 4NT,
and now Take Command (TCC/TCMD), provide a replacement. What about the
copy command from cmd.exe needs to be relearned?

> If you really think you have a better solution
>then why not be clear in explaining what you're talking about?
>Like so:
>
>""I recommend that you forget about VBScript, the Windows
>Script Host and the GUI scripting options available with WSH.
>Instead you should buy a 3rd-party command interpreter
>program. The program is named Take Command. It costs $100.
>What you get is a console window in which to use a custom,
>3rd-party console command language that is loosely based
>on DOS. (Note that for $100 you can only get 32-bit *or*
>64-bit version, and your "code" can only run in the specialized
>command window, and only on one PC.)""

The original question was:

Is there way to read a text file which is located at a URL. for
example http://www.abc.com/test/name.txt

That question doesn't seem to restrict replies to VBScript. A reply as
verbose as you suggest might well be called spamming; I did write:

Details at <http://jpsoft.com/help/index.htm?copy.htm> and
<http://jpsoft.com/help/index.htm?ftpservers.htm>. Take Command is a
commercial product.

The interested reader can find out the details, as you apparently did.
Underestimating your fellow readers seems insulting.

> Ironically, there's a good chance that after spending $100,
>studying the docs, and limiting oneself to console operations
>that can only work on one machine, the "easy" way will end
>up just being a complex, bloated wrapper around Dave's 7-line
>script. As can be seen in the original post, even C++ programmers
>are often just using IE to download files, because to do it
>without a browser requires using the sockets API to carry out
>a conversation directly with the hosting server. It's not such
>a simple operation.

How is
copy "http://www.abc.com/test/name.txt"
a complex, bloated wrapper around Dave's script? Of course it's not a
simple operation; that's why paying someone may be an appropriate
option. It seems to me more efficient than me reinventing that wheel
again.

> (This issue came up recently in a VB group. Someone was
>asking about downloading files from https. I've written code
>myself to download files from servers using winsock with no
>dependencies, which I provide free in the form of an ActiveX
>EXE. But my control can't handle SSL. That requires not only
>talking to the server, but also dealing with encryption and
>checking out the server certificate. Once you decide to deal
>with certificates then you need to provide some kind of GUI so
>that the person using your software can choose how to deal
>with something like an expired certificate. It gets complicated
>fast. As a result most code that can download files is really just
>wrapping IE.... and introducing the security risks that come with
>IE.)

Good points; it turns out that Take Command has implemented HTTPS, so
the command under discussion simply changes to:
copy "https://www.abc.com/test/name.txt"

Second, about using IE as the engine: I suspect that's what's happening
under the hood with the PowerShell command
(New-Object System.Net.WebClient).DownloadFile(`
"https://www.abc.com/test/name.txt", "name.txt")
(which is also no great improvement on Dave's script in the
runs-easy-off-the-tongue stakes).