Canonical issues

gigi...@gmail.com

Oct 21, 2014, 4:02:28 AM
to hippo-c...@googlegroups.com
Hi,

I use this HST tag to generate my canonical URLs:
<hst:link var="pageURL" hippobean="${document}" canonical="true" fullyQualified="true"/>

I have 2 issues with this:

1. For requests that are not matched by the root item, the trailing slash is missing (for 301 redirects and SEO purposes the URL always needs a trailing slash). Should I add it manually, is there an existing way to get it that I missed, or could this be a Hippo improvement?
2. The protocol is always resolved as HTTP, although my requests use HTTPS.

Regards,
Alex

Ard Schrijvers

Oct 21, 2014, 12:00:35 PM
to hippo-c...@googlegroups.com
On Tue, Oct 21, 2014 at 10:02 AM, <gigi...@gmail.com> wrote:
> Hi,
>
> I use this HST tag to generate my canonical URLs:
> <hst:link var="pageURL" hippobean="${document}" canonical="true"
> fullyQualified="true"/>
>
> I have 2 issues with this:
>
> 1. For requests that are not matched by the root item, the trailing
> slash is missing. ( for 301 redirects and SEO purposes this always needs a
> trailing slash ) Do I just add this manually or could it be a Hippo

You mean 'that *are* matched by the root item', right? So, for example,
the home page document? When running with a context path such as /site,
link rewriting now returns /site/; we fixed that last week. On production
we normally do not have this problem because httpd redirects / to /site/.

Does this answer your question? If not, then I do not fully understand it.

> improvement or some other means to get it and I missed out on it?
> 2. The protocol is always resolved as HTTP, although I have HTTPS on my
> requests.

Then you need to adjust the hst:scheme property on the mounts or sitemap
items. We support multiple schemes per site, see [1].

Hope this helps,

Regards Ard

[1] http://www.onehippo.org/library/concepts/request-handling/hst-seamless-https-support.html

>
> Regards,
> Alex
>
> --
> Hippo Community Group: The place for all discussions and announcements about
> Hippo CMS (and HST, repository etc. etc.)
>
> To post to this group, send email to hippo-c...@googlegroups.com
> RSS:
> https://groups.google.com/group/hippo-community/feed/rss_v2_0_msgs.xml?num=50
> ---
> You received this message because you are subscribed to the Google Groups
> "Hippo Community" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to hippo-communi...@googlegroups.com.
> Visit this group at http://groups.google.com/group/hippo-community.
> For more options, visit https://groups.google.com/d/optout.



--
Amsterdam - Oosteinde 11, 1017 WT Amsterdam
Boston - 1 Broadway, Cambridge, MA 02142

US +1 877 414 4776 (toll free)
Europe +31(0)20 522 4466
www.onehippo.com

gigi...@gmail.com

Oct 22, 2014, 4:20:07 AM
to hippo-c...@googlegroups.com, gigi...@gmail.com
Hi,

Regarding the HTTPS, I will try that configuration.
For the other topic, I just need this:
<hst:link var="pageURL" hippobean="${document}" canonical="true" fullyQualified="true"/>
to return links with a trailing slash.

Example URL currently returned: www.example.org/mypage
What I need: www.example.org/mypage/

This is for SEO purposes, so that pages are not indexed twice.
I could post-process what hst:link returns and add the "/" manually, but I thought there might be an easier fix.
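
For reference, the manual approach mentioned above could be sketched as a small Java helper applied to the string that hst:link produces. This is only an illustration: the class name TrailingSlash and the method ensure are made up for this example and are not part of the HST API. It assumes the URL may carry a query string or fragment, which must stay after the slash, and it does not special-case file-like paths such as binaries.

```java
/** Hypothetical helper, not part of HST: appends "/" to the path part of a URL string. */
final class TrailingSlash {
    private TrailingSlash() {}

    static String ensure(String url) {
        if (url == null || url.isEmpty()) {
            return url;
        }
        // Find where the path ends: at the first '?' or '#', or at the end of the string.
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = Math.min(cut, query);
        }
        if (fragment >= 0) {
            cut = Math.min(cut, fragment);
        }
        String path = url.substring(0, cut);
        String rest = url.substring(cut);
        // Leave URLs that already end in a slash untouched.
        return path.endsWith("/") ? url : path + "/" + rest;
    }
}
```

With this sketch, TrailingSlash.ensure("http://www.example.org/mypage") gives "http://www.example.org/mypage/", and a URL that already ends in "/" is returned unchanged.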

Regards,
Alex

Ard Schrijvers

Oct 22, 2014, 11:07:17 AM
to hippo-c...@googlegroups.com
Hey Alex,
I think messing around with adding an extra "/" is tricky. Removing it
is tricky as well, since the channel manager, for example, really needs
an initial request to https://{cmshost}/site/. Also, imo, it doesn't
really solve anything: by default, the HST never creates links
with a trailing slash! If you now create links *with* a trailing
slash, nothing really changes, since /foo and /foo/ still serve
the same response.

With respect to SEO, I agree it is a tiny bit nicer not to serve the same
content for foo and foo/, although Google says [1]:

"Leave it as-is. Many sites have duplicate content. Our indexing
process often handles this case for webmasters and users. While it’s
not totally optimal behavior, it’s perfectly legitimate and a-okay.
:)"

We already had a customer that required SEO optimization wrt trailing
slashes as well. However, the optimization lies not in changing the created
links but in returning a 301 or 302 redirect. That is the real fix.
We can add support for this in the HST, see [2]. However, a simple,
well-working solution for now is to use the url rewriter to redirect either
/foo to /foo/ or vice versa [3]. If you rewrite /foo/ to /foo, make sure
you do not do this for URLs containing the cms host,
because the channel manager will then break.

You then need to add one condition, something like

header host != ^cms
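
To make the redirect rule and its host condition concrete, here is a rough Java sketch of the decision only; it is not the url rewriter's configuration syntax or API. The class TrailingSlashRedirect, the method target and the wantSlash flag are hypothetical names, and the "^cms" pattern simply mirrors the condition above, assuming CMS host names start with "cms". The caller would send the returned path as the Location of a 301 response.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch of the redirect decision; not the url rewriter's own API. */
final class TrailingSlashRedirect {
    // Assumption: CMS host names start with "cms", as in the condition above.
    private static final Pattern CMS_HOST = Pattern.compile("^cms");

    private TrailingSlashRedirect() {}

    /**
     * Returns the path to redirect to, or empty when no redirect is needed.
     * wantSlash = true redirects /foo to /foo/, false redirects /foo/ to /foo.
     */
    static Optional<String> target(String host, String path, boolean wantSlash) {
        // Never touch requests on the CMS host, so the channel manager keeps working.
        if (host != null && CMS_HOST.matcher(host).find()) {
            return Optional.empty();
        }
        // Nothing to normalize for a missing path or the root itself.
        if (path == null || path.isEmpty() || path.equals("/")) {
            return Optional.empty();
        }
        boolean hasSlash = path.endsWith("/");
        if (hasSlash == wantSlash) {
            return Optional.empty();
        }
        return Optional.of(wantSlash ? path + "/" : path.substring(0, path.length() - 1));
    }
}
```

Keeping the host check first means the rule behaves the same in both directions, although, as noted above, it is the /foo/ to /foo direction that would break the channel manager.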

Hope this helps

Regards Ard

[1] http://googlewebmastercentral.blogspot.nl/2010/04/to-slash-or-not-to-slash.html
[2] https://issues.onehippo.com/browse/HSTTWO-2733
[3] https://forge.onehippo.org/gf/project/urlrewriter/

Minos Chatzidakis

Jun 28, 2017, 11:00:32 AM
to Hippo Community
Hi all,

I'm reviving this because I wrote a simple (but a bit ugly, more on that later) link processor that adds a trailing slash to all generated URLs. I just do this in the postProcess:

protected HstLink doPostProcess(final HstLink link) {
    // Wrap the original link in a decorator that appends a trailing slash.
    // link.setPath() cannot be used for this because it strips the slash again.
    HstLink newLink = new HstLink() {
        @Override
        public String getPath() {
            String path = link.getPath();
            return path.endsWith("/") ? path : path + "/";
        }

        @Override
        public String toUrlForm(final HstRequestContext requestContext, final boolean fullyQualified) {
            String linkStr = link.toUrlForm(requestContext, fullyQualified);
            if (linkStr == null) {
                return null;
            }
            return linkStr.endsWith("/") ? linkStr : linkStr + "/";
        }

        @Override
        public void setPath(final String path) {
            link.setPath(path);
        }

        // ... implement all remaining HstLink methods the same way, simply
        // delegating to the corresponding method on 'link' ...
    };

    return newLink;
}

This is a bit ugly because I'm creating a new object of an anonymous class on every invocation of postProcess(), though I could just write an adapter class and use that instead. If I'm not mistaken, this is the decorator pattern. The only reason I'm doing it this way is that HstLinkImpl normalizes the path inside its setPath() method (which I found awkward: changing an input value in a setter), so I couldn't simply call link.setPath(myPathWithTrailingSlash). The trailing slash is removed from that input parameter; see org.hippoecm.hst.core.linking.HstLinkImpl#setPath.
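
The adapter-class variant mentioned above could look roughly like the following self-contained sketch. To keep it runnable without HST on the classpath, it uses a made-up three-method Link interface as a stand-in; this is NOT the real HstLink interface, which has more methods and a different toUrlForm signature. NormalizingLink only imitates the slash-stripping setter described above, and TrailingSlashLink is the named decorator.

```java
/** Hypothetical stand-in for the link interface; NOT the real HstLink API. */
interface Link {
    String getPath();
    void setPath(String path);
    String toUrlForm(boolean fullyQualified);
}

/** Imitates the described setter behaviour: a trailing slash is stripped on setPath(). */
final class NormalizingLink implements Link {
    private final String base; // e.g. "http://www.example.org" (assumed for the example)
    private String path;

    NormalizingLink(String base, String path) {
        this.base = base;
        setPath(path);
    }

    @Override
    public String getPath() {
        return path;
    }

    @Override
    public void setPath(String path) {
        this.path = path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
    }

    @Override
    public String toUrlForm(boolean fullyQualified) {
        return (fullyQualified ? base : "") + "/" + path;
    }
}

/** Named decorator: delegates everything and appends "/" on the way out. */
final class TrailingSlashLink implements Link {
    private final Link delegate;

    TrailingSlashLink(Link delegate) {
        this.delegate = delegate;
    }

    private static String withSlash(String value) {
        return (value == null || value.endsWith("/")) ? value : value + "/";
    }

    @Override
    public String getPath() {
        return withSlash(delegate.getPath());
    }

    @Override
    public void setPath(String path) {
        delegate.setPath(path); // the delegate may normalize; the getters re-add the slash
    }

    @Override
    public String toUrlForm(boolean fullyQualified) {
        return withSlash(delegate.toUrlForm(fullyQualified));
    }
}
```

Compared with the anonymous class, a named decorator makes the delegation explicit in one place, and postProcess would only need to return new TrailingSlashLink(link).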

I've tested the above but I'm not confident it's robust:
- Works with links to folders, documents, images, all binaries
- Seems to be the place to do this if you want it to have global effect, on every single generated link. I'm not sure whether I've missed something!
- Works in Channel manager

I've only tested it partially, but wanted to share. Perhaps others have also done similar implementations for this? 
Thoughts?

thanks,
Minos
--



Minos Chatzidakis
Product consultant
p. +31 20 522 44 66
e. minos.ch...@bloomreach.com
