Re: base_url


onedmc

Sep 3, 2005, 7:40:15 AM
to SOFTplus GSiteCrawler
Thanks, I understand.

onedmc

Sep 2, 2005, 6:31:54 AM
to SOFTplus GSiteCrawler
How come GSiteCrawler doesn't create a
<Site
base_url="http://xxx.com/"
>

section?

And does this matter? I need to know, as I'm trying to work out why my
base URL is causing an HTTP error for Google.

adjmc

John Mueller

Sep 2, 2005, 6:36:26 AM
to gsitec...@googlegroups.com
Hi
I'm sorry, I don't understand what you mean? When crawling sites with
a base url, it should extract the URLs correctly... or do you mean
within the sitemap file? (if so, the sitemap file standard from Google
does not support that.) Can you send me an example? Thanks!
John

onedmc

Sep 2, 2005, 7:10:41 AM
to SOFTplus GSiteCrawler
When I look at the Google recommendations for a sitemap.xml file, it
says something like

Required attributes:
base_url - the top-level URL of the site being mapped

But when GSiteCrawler creates a sitemap.xml, there is no <Site
base_url="http://xxx.com/"> section in the XML file. I would like to know if
this is causing the issue.

I do have a section <url><loc>http://www.xxx.com/</loc>.... just after
the beginning of the <urlset ...>, and this seems to be where Google is
reporting the HTTP error.

John Mueller

Sep 2, 2005, 7:42:36 AM
to gsitec...@googlegroups.com
Aha! Now I see -- you're looking at the configuration file for the
Python Google Sitemap generator! Those things are only needed if you
use the program from Google to create sitemaps (which you aren't, if
you're using my program). You should be looking at this page for more
information about the file format:
https://www.google.com/webmasters/sitemaps/docs/en/protocol.html

Hope it helps :-) (otherwise send me the URL to your sitemap file via
email, I'll take a look)

Cheers
John
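
For anyone else who mixes up the two files, here is a rough Python sketch of the distinction described above: a sitemap file has a <urlset> root with <url><loc> entries, while the generator's configuration file is where a base_url attribute would appear. The function names (local_name, classify, listed_urls) and both sample documents are made up for illustration, the namespace URI in the sample is an assumption (check the protocol page for the real one), and the config root is matched case-insensitively as "site" because the exact spelling is not confirmed in this thread.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Reduce a namespaced tag such as '{http://...}urlset' to 'urlset'."""
    return tag.rsplit("}", 1)[-1]


def classify(xml_text):
    """Return 'sitemap', 'generator-config', or 'unknown' for an XML document."""
    root = ET.fromstring(xml_text)
    name = local_name(root.tag).lower()
    if name == "urlset":
        return "sitemap"
    # Assumption: the generator config has a 'site' root carrying base_url.
    if name == "site" and "base_url" in root.attrib:
        return "generator-config"
    return "unknown"


def listed_urls(xml_text):
    """Collect the text of every <loc> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if local_name(el.tag) == "loc" and el.text
    ]


# Hypothetical sample documents; the namespace URI is an assumption.
SITEMAP = """
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url><loc>http://www.xxx.com/</loc></url>
</urlset>
"""

CONFIG = '<site base_url="http://xxx.com/"></site>'

if __name__ == "__main__":
    print(classify(SITEMAP), listed_urls(SITEMAP))
    print(classify(CONFIG))
```

If classify reports "sitemap", a missing base_url is not the problem, and the next thing to check is whether each URL returned by listed_urls actually loads in a browser.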
