Bug#704496: lynx-cur: lynx prepends html stuff when downloading a pure text file

Rick Thomas

Apr 1, 2013, 7:40:01 PM

Package: lynx-cur
Version: 2.8.8dev.5-1
Severity: normal
Tags: upstream

Hi there!

The following discussion occurred on the "debian-users" email list.

Enjoy!

Rick

======================================================
Rick wrote:

On an apple macintosh G4, running debian squeeze, I use lynx to download cd-images from cdimage.debian.org. I have no problem getting CD ".iso" images.

But when I try downloading the MD5SUMS file from the same directory, I get a few lines of HTML pre-pended to the downloaded file.

Has anybody else seen this behavior? Am I doing something wrong? Is this a bug in Lynx?

Rick



Then wes wrote, in reply:

hi rick.

> But when I try downloading the MD5SUMS file from the same directory, I get a few lines of HTML pre-pended to the downloaded file.

lines like the following?

<!-- X-URL: http://cdimage.debian.org/debian-cd/6.0.7/powerpc/iso-cd/MD5SUMS -->
<!-- Date: Mon, 01 Apr 2013 00:34:12 GMT -->
<!-- Last-Modified: Sun, 24 Feb 2013 00:31:23 GMT -->
<BASE HREF="http://cdimage.debian.org/debian-cd/6.0.7/powerpc/iso-cd/MD5SUMS">


> Has anybody else seen this behavior?

well, i just noticed it now, retracing your steps.

> Is this a bug in Lynx?

i do not believe so. afaict, lynx adds those lines for generally
sensible reasons, as explained here:

http://lynx.isc.org/current/lynx2-8-8/lynx_help/Lynx_users_guide.html#RemoteSource

as mentioned there, if you really want to, you can disable this
behavior by adding

PREPEND_BASE_TO_SOURCE:FALSE

to your lynx.cfg.
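
For files that were already saved with the extra lines, a small cleanup script is another option. The following is a minimal sketch in Python, not part of lynx: the function name strip_lynx_header is hypothetical, the patterns cover only the four kinds of lines quoted above, and the handling of one blank separator line after them is an assumption.

```python
import re

# Matches only the header lines quoted in this thread; anything else is kept.
HEADER_LINE = re.compile(
    rb'(<!-- (X-URL|Date|Last-Modified): .* -->|<BASE HREF=".*">)\r?\n?$'
)


def strip_lynx_header(data: bytes) -> bytes:
    """Drop leading lynx-style header lines from a saved file (hypothetical helper)."""
    lines = data.splitlines(keepends=True)
    i = 0
    while i < len(lines) and HEADER_LINE.match(lines[i]):
        i += 1
    # Assumption: if a header was found, one blank separator line follows it.
    if 0 < i < len(lines) and lines[i].strip() == b"":
        i += 1
    return b"".join(lines[i:])
```

Working on bytes rather than decoded text keeps the remainder of the file untouched, which matters if the result is going to be compared against a checksum.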

<some discussion of using wget instead of lynx for this particular application>

hope this helps,
wes



To which Rick replied:

Thank you, Wes, for the very complete and helpful explanation.

You seem to know a lot about this.
I hope you don't mind if I continue to pick your brain on this subject... (-:


On Mar 31, 2013, at 7:49 PM, wes wrote:

> hi rick.

>> But when I try downloading the MD5SUMS file from the same directory, I get a few lines of HTML pre-pended to the downloaded file.

> lines like the following?

> <!-- X-URL: http://cdimage.debian.org/debian-cd/6.0.7/powerpc/iso-cd/MD5SUMS -->
> <!-- Date: Mon, 01 Apr 2013 00:34:12 GMT -->
> <!-- Last-Modified: Sun, 24 Feb 2013 00:31:23 GMT -->
> <BASE HREF="http://cdimage.debian.org/debian-cd/6.0.7/powerpc/iso-cd/MD5SUMS">


Yes, exactly.

As you probably figured out, I'm using the "d" command, not the "<cr>" option, to start the download.

So my next question is: Why does it do this when downloading MD5SUMS, but *not* when downloading the ".iso" file? Does it recognize a difference between the two formats (binary vs. text) and invoke a corresponding difference in treatment for download?

And the next-next question: The text file is not an html text file. This is a difference that it could also recognize, one would think. It seems reasonable to me that a non-html file should be treated just like a binary file for download purposes. Am I missing something?



>> Has anybody else seen this behavior?

> well, i just noticed it now, retracing your steps.

>> Is this a bug in Lynx?

> i do not believe so. afaict, lynx adds those lines for generally
> sensible reasons, as explained here:

> http://lynx.isc.org/current/lynx2-8-8/lynx_help/Lynx_users_guide.html#RemoteSource


So it's a feature (hence, not a bug) when downloading html text files. Is it, then, a bug when that feature gets applied indiscriminately to all text files, even those that don't contain html?



> as mentioned there, if you really want to, you can disable this
> behavior by adding

> PREPEND_BASE_TO_SOURCE:FALSE

> to your lynx.cfg.



I'd prefer, of course, if it automatically knew the difference and acted correctly in all cases. But if that's not possible, I guess having a config option to disable the behavior is the next best thing.

For what it's worth, I've never known any browser other than lynx that behaves this way. Is there something special about the design of lynx that makes this desirable?

Thanks!

Rick


And wes replied:

fair enough. report bugs to this address:

lynx...@nongnu.org


-- System Information:
Debian Release: 6.0.7
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: powerpc (ppc)

Kernel: Linux 2.6.32-5-powerpc-smp (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash

Versions of packages lynx-cur depends on:
ii debconf [debconf-2.0] 1.5.36.1 Debian configuration management sy
ii libbsd0 0.2.0-1 utility functions from BSD systems
ii libc6 2.11.3-4 Embedded GNU C Library: Shared lib
ii libgcrypt11 1.4.5-2 LGPL Crypto library - runtime libr
ii libgnutls26 2.8.6-1+squeeze2 the GNU TLS library - runtime libr
ii libidn11 1.15-2 GNU Libidn library, implementation
ii libncursesw5 5.7+20100313-5 shared libraries for terminal hand
ii zlib1g 1:1.2.3.4.dfsg-3 compression library - runtime

Versions of packages lynx-cur recommends:
ii mime-support 3.48-1 MIME files 'mime.types' & 'mailcap

Versions of packages lynx-cur suggests:
pn lynx-cur-wrapper <none> (no description available)

-- debconf information:
lynx-cur/defaulturl: http://www.rcthomas.org/
lynx-cur/etc_lynx.cfg:



Atsuhito Kohda

Apr 1, 2013, 8:40:02 PM

Hi Rick,

On Mon, 01 Apr 2013 16:25:18 -0700, Rick Thomas wrote:

> Hi there!
>
> The following discussion occurred on the "debian-users" email list.
>
> Enjoy!
>
> Rick
>
> ======================================================
> Rick wrote:
>
> On an apple macintosh G4, running debian squeeze, I use lynx to download cd-images from cdimage.debian.org. I have no problem getting CD ".iso" images.
>
> But when I try downloading the MD5SUMS file from the same directory, I get a few lines of HTML pre-pended to the downloaded file.
>
> Has anybody else seen this behavior? Am I doing something wrong? Is this a bug in Lynx?

I guess "downloading" the MD5SUMS file is not quite the right approach,
but pressing the "P" key (you should see "P)rint" on the screen)
might solve your problem.

I hope this helps you. Thanks for your interest in lynx.

Best regards, 2013-4-2(Tue)

--
Debian Developer - much more I18N of Debian
Atsuhito Kohda <kohda AT debian.org>
Department of Math., Univ. of Tokushima

Thomas Dickey

Apr 1, 2013, 9:00:02 PM

On Mon, Apr 01, 2013 at 04:25:18PM -0700, Rick Thomas wrote:
> Package: lynx-cur
> Version: 2.8.8dev.5-1
> Severity: normal
> Tags: upstream
>
> Hi there!
>
> The following discussion occurred on the "debian-users" email list.
>
> Enjoy!
>
> Rick
>
> ======================================================
> Rick wrote:
>
> On an apple macintosh G4, running debian squeeze, I use lynx to download
> cd-images from cdimage.debian.org. I have no problem getting CD ".iso"
> images.
>
> But when I try downloading the MD5SUMS file from the same directory, I get a
> few lines of HTML pre-pended to the downloaded file.
>
> Has anybody else seen this behavior? Am I doing something wrong? Is this a
> bug in Lynx?

no - the check that Lynx is making essentially says that it's found something
at that URL which it can present (display) rather than some other type of
content. Given that, it's a "document". Consider it a feature.
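
To make the distinction concrete, here is a rough Python sketch of that kind of decision. It is an illustration only and not lynx's actual code: the names is_presentable and save_with_d are hypothetical, treating every text/* media type as "presentable" is an assumption, and the header it writes is shortened to two of the lines quoted earlier in the thread.

```python
def is_presentable(content_type: str) -> bool:
    """Hypothetical stand-in for the check: can this content be displayed?"""
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Assumption: any text/* type counts as a displayable "document".
    return media_type.startswith("text/")


def save_with_d(body: bytes, content_type: str, url: str) -> bytes:
    """Return the bytes that would be written to disk under this sketch."""
    if is_presentable(content_type):
        # A "document": prepend the source header, HTML or not.
        header = f'<!-- X-URL: {url} -->\n<BASE HREF="{url}">\n\n'
        return header.encode("ascii") + body
    # Anything else (for example an .iso image) is written unchanged.
    return body
```

Under these assumptions a plain-text MD5SUMS file served as text/plain gets the header, while an image served with a non-text type does not, which matches the behavior reported here.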

--
Thomas E. Dickey <dic...@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net

Rick Thomas

Apr 1, 2013, 9:00:02 PM

On Apr 1, 2013, at 5:01 PM, Atsuhito Kohda wrote:

>> On an apple macintosh G4, running debian squeeze, I use lynx to
>> download cd-images from cdimage.debian.org. I have no problem
>> getting CD ".iso" images.
>>
>> But when I try downloading the MD5SUMS file from the same
>> directory, I get a few lines of HTML pre-pended to the downloaded
>> file.

> I guess "downloading" the MD5SUMS file is not quite the right approach,
> but pressing the "P" key (you should see "P)rint" on the screen)
> might solve your problem.
>
> I hope this helps you. Thanks for your interest in lynx.

For this particular application, that would work.

But in general, using P is a fragile way to "download" a file. It
expands tabs, and interprets other formatting stuff. If what I want
is a bit-for-bit copy of what's there on the server, that's not useful.
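
A small Python sketch of why that matters. The sample line is made up (it is not a real MD5SUMS entry), and bytes.expandtabs is only a stand-in for whatever reformatting "P" actually performs:

```python
import hashlib

# Made-up sample content with a tab in it; not taken from a real file.
original = b"0123abcd\tdebian-example.iso\n"

# Stand-in for rendering: replace the tab with spaces up to the next
# 8-column tab stop.
rendered = original.expandtabs(8)


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# The rendered copy looks the same on screen but is a different byte
# sequence, so any checksum taken over it no longer matches the original.
print(rendered == original)
print(md5_hex(rendered) == md5_hex(original))
```

Both comparisons come out false for this sample, which is the sense in which a printed copy is not a substitute for a bit-for-bit download.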

Maybe the right solution is to have "d" download in binary (bit-for-
bit) mode and "D" download assuming I really want to use it as part of
a clone of the source website. I.e. "D" would do what "d" does now
(including recognizing real binary files and not prepending) and "d"
would never alter the file being downloaded. Or, if you prefer, it
could be reversed. "d" remains unchanged from its present behavior,
and "D" does the new bit-for-bit download.

Rick