[Dillo-dev] Bug in meta tag refresh code [with cause]

4 views
Skip to first unread message

Nick Warne

unread,
May 1, 2016, 11:09:11 AM5/1/16
to Dillo Dev
I have been looking at this for about 5 hours now, and it has fried my
brain.

Reading the code, and then reading the email thread about depreciated
meta tag 'refresh', I agree with Dillo's actions.

But what I found was that on some sites (mine included), if there was a tag:

<meta http-equiv="refresh" content="15" />

Dillo reports and the warning message reports:

The author wanted you to go _here_ after 10 seconds.

What happens now is the link (_here_) gets appended with the refresh
value, so for example:

www.example.com/refesh.html

becomes:

www.example.com/10 (or 15 or whatever the refresh value is).

I have found that if the meta tag is right:

<meta http-equiv="refresh" content="5;URL='./index.html'" />

then the code creates the right URL (but it's not seen as no warning is
given.

html.cc from line 3218

if the meta tag refresh is correct HTML on the web page, mr_url returns
the correct path to the refresh. If it isn't, then mr_url returns the
value of the refresh - which fubars the warning messages link.

I tested this running dillo from the commandline with this insertion and
testing the correct and wrong meta tags:

}
/* Skip to anything after "URL=" or ";" if "URL=" is not found */
if ((p = dStriAsciiStr(content, "url=")))
content = p + strlen("url=");
else if ((p = strstr(content, ";")))
content = p + strlen(";");
/* Handle the case of a quoted URL */
if (*content == '"' || *content == '\'') {
if ((p = strchr(content + 1, *content)))
mr_url = dStrndup(content + 1, p - content - 1);
else
mr_url = dStrdup(content + 1);
} else {
mr_url = dStrdup(content);
}
//nick
printf ("%s\n", mr_url);
new_url = a_Html_url_new(html, mr_url, NULL, 0);
// new_url = a_Html_url_new(html, ".", NULL, 0);


Not being very good with C++, I can't work out what goes wrong here -
but it looks like that if 'URL=' or';' isn't found, then the code still
looks for quotes - which does something to the string?

I have two pages you can test on - one with the proper meta tag, and one
with out:

http://irpi.linicks.net:8080/static_simple.html <- good

http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows
the bug

If possible view both with that printf I put in - you can see what
mr_url returns.

Phew*

Nick
--
Gosh that takes me back... or is it forward? That's the trouble with
time travel, you never can tell."
-- Doctor Who "Androids of Tara"

_______________________________________________
Dillo-dev mailing list
Dill...@dillo.org
http://lists.dillo.org/cgi-bin/mailman/listinfo/dillo-dev

Jorge Arellano Cid

unread,
May 2, 2016, 9:02:06 PM5/2/16
to dill...@dillo.org
Hi,
Please try to be clear & concise.

There's a bug, yes. You may try this:

diff -r c20e74568504 src/html.cc
--- a/src/html.cc Sun May 01 10:49:17 2016 -0300
+++ b/src/html.cc Sun May 01 23:26:48 2016 -0300
@@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml
content = p + strlen("url=");
else if ((p = strstr(content, ";")))
content = p + strlen(";");
+ else
+ content = "";
/* Handle the case of a quoted URL */
if (*content == '"' || *content == '\'') {
if ((p = strchr(content + 1, *content)))


@Johannes, does it look OK to you?

--
Cheers
Jorge.-

Nick Warne

unread,
May 3, 2016, 11:48:47 AM5/3/16
to dill...@dillo.org


On 03/05/16 01:59, Jorge Arellano Cid wrote:

> Please try to be clear & concise.

I thought I was :)

> There's a bug, yes. You may try this:
>
> diff -r c20e74568504 src/html.cc
> --- a/src/html.cc Sun May 01 10:49:17 2016 -0300
> +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300
> @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml
> content = p + strlen("url=");
> else if ((p = strstr(content, ";")))
> content = p + strlen(";");
> + else
> + content = "";
> /* Handle the case of a quoted URL */
> if (*content == '"' || *content == '\'') {
> if ((p = strchr(content + 1, *content)))
>
>
> @Johannes, does it look OK to you?

OK, that doesn't quite work correctly - it removes the warning messages
altogether.

Using:

content = "./";

produces the warning message and with the top level URL - but, a case
like this:

www.example.com/refresh.html

should produce a refresh to that URL, but it drops the page and points
to www.example.com/

Nearly there...

Nick

--
Gosh that takes me back... or is it forward? That's the trouble with
time travel, you never can tell."
-- Doctor Who "Androids of Tara"

Johannes Hofmann

unread,
May 10, 2016, 3:02:00 AM5/10/16
to dill...@dillo.org
Looks ok to me, but the whole code there is a bit hairy.

Cheers,
Johannes

Jorge Arellano Cid

unread,
May 10, 2016, 9:42:15 AM5/10/16
to dill...@dillo.org
On Tue, May 10, 2016 at 08:59:49AM +0200, Johannes Hofmann wrote:
> On Mon, May 02, 2016 at 09:59:09PM -0300, Jorge Arellano Cid wrote:
> > > [...]
> > >
> > > I have two pages you can test on - one with the proper meta tag, and one
> > > with out:
> > >
> > > http://irpi.linicks.net:8080/static_simple.html <- good
> > >
> > > http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the
> > > bug
> > >
> > > If possible view both with that printf I put in - you can see what mr_url
> > > returns.
> >
> > Please try to be clear & concise.
> >
> > There's a bug, yes. You may try this:
> >
> > diff -r c20e74568504 src/html.cc
> > --- a/src/html.cc Sun May 01 10:49:17 2016 -0300
> > +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300
> > @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml
> > content = p + strlen("url=");
> > else if ((p = strstr(content, ";")))
> > content = p + strlen(";");
> > + else
> > + content = "";
> > /* Handle the case of a quoted URL */
> > if (*content == '"' || *content == '\'') {
> > if ((p = strchr(content + 1, *content)))
> >
> >
> > @Johannes, does it look OK to you?
>
> Looks ok to me, but the whole code there is a bit hairy.

Yes. This stems from two reasons: the meta-refresh element has no
defined standard, and our code has evolved accomodating to work
with what we've found along the way.

Do you have a testcase battery for meta refresh?

Currently, I'm working on the BODY and HTML element handling
by the parser (special cases since their implicit optional-magic
open and close).

Fortunately I have the test cases for BODY and HTML. I also
found my old set for META refresh, but I don't have the case
you made the last patch for. It would be great to consolidate
them.

--
Cheers
Jorge.-

eocene

unread,
May 10, 2016, 3:23:19 PM5/10/16
to dill...@dillo.org
On Tue, May 10, 2016 at 10:38:57AM -0300, Jorge Arellano Cid wrote:
> Yes. This stems from two reasons: the meta-refresh element has no
> defined standard, and our code has evolved accomodating to work
> with what we've found along the way.

They do have refresh in html5:
https://www.w3.org/TR/html51/document-metadata.html#statedef-http-equiv-refresh

Nick Warne

unread,
May 10, 2016, 3:31:02 PM5/10/16
to dill...@dillo.org
On Tue, 10 May 2016 19:20:21 +0000
eocene <eoc...@gmx.com> wrote:

> On Tue, May 10, 2016 at 10:38:57AM -0300, Jorge Arellano Cid wrote:
> > Yes. This stems from two reasons: the meta-refresh element has no
> > defined standard, and our code has evolved accomodating to work
> > with what we've found along the way.
>
> They do have refresh in html5:
> https://www.w3.org/TR/html51/document-metadata.html#statedef-http-equiv-refresh

Quick reading, that is the way to go.

Nick
--
Gosh that takes me back... or is it forward? That's the trouble with
time travel, you never can tell."
-- Doctor Who "Androids of Tara"

Reply all
Reply to author
Forward
0 new messages