A sitemap is a way of organizing a website, identifying the URLs and the data under each section. Previously, sitemaps were geared primarily toward a website's users. Google's XML format, however, was designed for search engines, allowing them to find the data faster and more efficiently.
A sitemap is a file where you provide information about the pages, videos, and other files on your site, and the relationships between them. Search engines like Google read this file to crawl your site more efficiently. A sitemap tells Google which pages and files you think are important in your site, and also provides valuable information about these files, for example when a page was last updated and whether it has any alternate language versions.
If your site's pages are properly linked, Google can usually discover most of your site. Proper linking means that all pages that you deem important can be reached through some form of navigation, be that your site's menu or links that you placed on pages. Even so, a sitemap can improve the crawling of larger or more complex sites, or more specialized files.
You can provide multiple Sitemap files, but each Sitemap file that you provide must have no more than 50,000 URLs and must be no larger than 50MB (52,428,800 bytes). If you would like, you may compress your Sitemap files using gzip to reduce your bandwidth requirement; however the sitemap file once uncompressed must be no larger than 50MB. If you want to list more than 50,000 URLs, you must create multiple Sitemap files.
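The limits above can be enforced in code. The following is a minimal Python sketch, not a reference implementation: it splits a list of URLs into chunks of at most 50,000 and refuses to emit a file larger than 52,428,800 bytes uncompressed. The function names and the example.com URLs in the usage note are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000        # per-file URL limit stated above
MAX_BYTES = 52_428_800   # 50MB, measured on the uncompressed file


def chunk_urls(urls, max_urls=MAX_URLS):
    """Yield consecutive slices of at most max_urls URLs."""
    for start in range(0, len(urls), max_urls):
        yield urls[start:start + max_urls]


def build_sitemap(urls):
    """Return one sitemap document as UTF-8 bytes (hypothetical helper)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() entity-encodes &, < and > as XML requires
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    data = "\n".join(lines).encode("utf-8")
    if len(data) > MAX_BYTES:
        raise ValueError("sitemap exceeds 50MB uncompressed; use smaller chunks")
    return data


def build_sitemaps(urls):
    """Return a list of sitemap documents, one per chunk of URLs."""
    return [build_sitemap(chunk) for chunk in chunk_urls(list(urls))]
```

As a usage example, passing 120,000 URLs such as `https://www.example.com/page-0` onward to `build_sitemaps` yields three documents: two with 50,000 entries and one with 20,000. Each document can then be written to its own file and, if desired, compressed with gzip.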
If you submit a Sitemap using a path with a port number, you must include that port number as part of the path in each URL listed in the Sitemap file. For instance, if your Sitemap is located at :100/sitemap.xml, then each URL listed in the Sitemap must begin with :100.
To submit Sitemaps for multiple hosts from a single host, you need to "prove" ownership of the host(s) for which URLs are being submitted in a Sitemap. Here's an example. Let's say that you want to submit Sitemaps for 3 hosts, with all three Sitemap files hosted on www.sitemaphost.com:
www.host1.com with Sitemap file sitemap-host1.xml
www.host2.com with Sitemap file sitemap-host2.xml
www.host3.com with Sitemap file sitemap-host3.xml
By default, this will result in a "cross submission" error since you are trying to submit URLs for www.host1.com through a Sitemap that is hosted on www.sitemaphost.com (and same for the other two hosts). One way to avoid the error is to prove that you own (i.e. have the authority to modify files) www.host1.com. You can do this by modifying the robots.txt file on www.host1.com to point to the Sitemap on www.sitemaphost.com.
In this example, the robots.txt file at would contain the line "Sitemap: -host1.xml". By modifying the robots.txt file on www.host1.com and having it point to the Sitemap on www.sitemaphost.com, you have implicitly proven that you own www.host1.com. In other words, whoever controls the robots.txt file on www.host1.com trusts the Sitemap at -host1.xml to contain URLs for www.host1.com. The same process can be repeated for the other two hosts.
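The ownership check described above can be sketched in Python with the standard library's `urllib.robotparser`, whose `site_maps()` method (Python 3.8 and later) returns the `Sitemap:` lines found in a robots.txt file. This is only an illustration of the idea, not how any search engine implements it. The helper names and the example hostnames are hypothetical and are not the URLs from the example above.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the Sitemap URLs listed in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present
    return parser.site_maps() or []


def may_cross_submit(sitemap_url, url_host, robots_txt):
    """Decide whether sitemap_url may list URLs for url_host.

    robots_txt is the content of the robots.txt file served by url_host.
    A Sitemap on the same host is always acceptable; a Sitemap on another
    host is acceptable only if url_host's robots.txt points to it.
    """
    if urlsplit(sitemap_url).hostname == url_host:
        return True
    return sitemap_url in declared_sitemaps(robots_txt)


# Hypothetical robots.txt content for the host whose URLs are being submitted.
robots = (
    "User-agent: *\n"
    "Disallow:\n"
    "Sitemap: https://sitemaps.example.net/sitemap-host1.xml\n"
)
print(may_cross_submit(
    "https://sitemaps.example.net/sitemap-host1.xml", "www.example.com", robots
))
```

The sketch compares URLs as exact strings. A real check would likely normalize scheme, case, and trailing slashes before comparing.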
Many sites have user-visible sitemaps which present a systematic view, typically hierarchical, of the site. These are intended to help visitors find specific pages, and can also be used by crawlers. They also act as a navigation aid[1] by providing an overview of a site's content at a single glance. Alphabetically organized sitemaps, sometimes called site indexes, are a different approach.
For use by search engines and other crawlers, there is a structured format, the XML Sitemap, which lists the pages in a site, their relative importance, and how often they are updated.[2] This is pointed to from the robots.txt file and is typically called sitemap.xml. The structured format is particularly important for websites which include pages that are not accessible through links from other pages, but only through the site's search tools or by dynamic construction of URLs in JavaScript.
Since the major search engines use the same protocol,[3] having a Sitemap lets them have the updated page information. Sitemaps do not guarantee all links will be crawled, and being crawled does not guarantee indexing.[4] Google Webmaster Tools allow a website owner to upload a sitemap that Google will crawl, or they can accomplish the same thing with the robots.txt file.[5]
If you've tried all the ways mentioned above and couldn't locate your XML sitemap, your website probably doesn't have one. In that case, read our guide to XML sitemaps to learn how to create a sitemap for a website. Or use a sitemap generator.
To ensure your sitemap is set up correctly, you can use a website auditing tool like Semrush's Site Audit. The tool will crawl your website (similar to the way Googlebot does) and detect any technical SEO issues.
The function will be called for every page on your site. The page function parameter is the full URL of the page currently under consideration, including your site domain. Return true to include the page in your sitemap, and false to leave it out.
The maximum number of entries per sitemap file. The default value is 45000. A sitemap index and multiple sitemaps are created if you have more entries. See this explanation of splitting up a large sitemap.
Huge Site? Break Things Up Into Smaller Sitemaps: Sitemaps have a limit of 50k URLs. So if you run a site with a ton of pages, Google recommends breaking up your sitemap into several smaller sitemaps.
Using Sitemaps to help Google find content hosted on your site: Quick video from the Google Webmaster YouTube channel on how sitemaps can help your site appear higher and more often in the search results.
The XML sitemap module creates a sitemap that conforms to the sitemaps.org specification. This helps search engines to more intelligently crawl a website and keep their results up to date. The sitemap created by the module can be automatically submitted to Ask, Google, Bing (formerly Windows Live Search), and Yahoo! search engines. The module also comes with several submodules that can add sitemap links for content, menu items, taxonomy terms, and user profiles.
(One of my very first sites, still new) So I built the website www.singoutgeelong.com.au for a client. They got back to me 4 months later saying it's not on Google. I did some research and found the sitemap was empty. I contacted Squarespace and got it fixed, then connected to Google Search Console and resubmitted. I've done everything I can think of, but they are complaining about the time it is taking and the fact that they "still aren't on Google", and they want to take it up with Squarespace even though they've done what they can. They are on Google; I have checked with site:www.singoutgeelong.com.au. So yes, they are indexed, but the first webpage I see when I search "sing out geelong" is for an event at result 8, not their homepage. On other sites I have done, I did not need to do this as they are the very first result.
I just paid for and launched a new site. I transferred an old squarespace domain to the new site. When I check the sitemap for the new site it is empty. Is there a delay in sitemaps being generated for new sites?
HTML sitemaps: This is more like your content sitemap that users can see and use to navigate your site. They're also commonly referred to as your "website archive." Some marketers view HTML sitemaps as outdated or even entirely unnecessary.
XML sitemaps: This is the sitemap that's purely used for indexing and crawling your website and is manually submitted. It's the more modern form of handling how all your content is stored across your website.
While HTML sitemaps might help users find pages on your site, as John Mueller said, your internal linking should take care of that anyways. So the focus from an SEO perspective should be on XML sitemaps.
A page sitemap, or regular sitemap, improves the indexation of pages and posts. For sites that are not image-focused or video-focused (unlike, say, photography and videography sites), a page sitemap can also include the images and videos on each page.
An XML video sitemap is similar to a page sitemap but focuses largely on video content, which means it is only necessary if videos are critical to your business. If they aren't, save your crawl budget (the finite number of pages and resources on your site that a search engine will crawl) and add the video link to your page sitemap.
If you publish news and want your articles featured in Top Stories and Google News, you need a news sitemap. There's a crucial rule here: do not include articles that were published more than two days ago.
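The two-day rule is easy to apply when the news sitemap is generated. Here is a small Python sketch, assuming each article is a dictionary with a `url` and a timezone-aware `published` datetime. That data shape and the function name are assumptions for illustration, not part of any sitemap tool.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(days=2)  # articles older than this are left out


def recent_articles(articles, now=None):
    """Return only the articles published within the last two days.

    `now` can be passed explicitly to make the function easy to test;
    it defaults to the current time in UTC.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - NEWS_WINDOW
    return [a for a in articles if a["published"] >= cutoff]


# Hypothetical articles, relative to a fixed "now" for a repeatable example.
now = datetime(2024, 5, 10, 12, 0, tzinfo=timezone.utc)
articles = [
    {"url": "https://www.example.com/news/today",
     "published": now - timedelta(hours=3)},
    {"url": "https://www.example.com/news/last-week",
     "published": now - timedelta(days=7)},
]
for article in recent_articles(articles, now=now):
    print(article["url"])
```

Only the URLs that pass this filter would then be written into the news sitemap. Older articles stay in the regular page sitemap.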
Google News sitemaps aren't favored in regular ranking results, so make sure you only add news articles. Also, they do not support image links, so Google recommends you use structured data to specify your article thumbnail.
Like the video sitemaps, image sitemaps are only necessary if images are critical to your business, such as a photography or stock photo site. If they aren't, you can leave them in your page sitemap and mark them up with the image object schema, and they will be crawled along with the page content/URL.
As a result of those limitations, you might need to have more than one sitemap. When you use more than one sitemap file, you need an index file that lists all of those sitemaps. It's the index file that you submit in Google Search Console and Bing Webmaster Tools. That file should look like this:
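As a rough illustration of the index file format defined by sitemaps.org, the Python sketch below builds a sitemap index from a list of sitemap URLs and prints it. The function name and the example.com URLs are hypothetical. The printed XML shows the general shape of such a file: a `sitemapindex` root element containing one `sitemap` entry, with a `loc` child, for each sitemap file.

```python
from xml.sax.saxutils import escape


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap files."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in sitemap_urls:
        lines.append("  <sitemap>")
        # escape() entity-encodes &, < and > as XML requires
        lines.append(f"    <loc>{escape(url)}</loc>")
        lines.append("  </sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


# Hypothetical sitemap locations; replace with your own files.
print(build_sitemap_index([
    "https://www.example.com/sitemap-pages.xml",
    "https://www.example.com/sitemap-posts.xml",
]))
```

The output would be saved as a single file (for example, a sitemap index at the site root) and that file is the one submitted to the search engines.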
Adding priorities to your sitemap is one of the things many people do to differentiate between how important different pages are, but Google's Gary Illyes mentioned that Google ignores these priorities. In his exact words:
Generally speaking, as long as you are honest about when your content was actually modified, include it in your sitemap so that Google and other search engines know to re-crawl the modified page and index the new content.
In this section, I will show you how to create a sitemap without using any generator or plugin. If your website is on WordPress or you'd rather use a generator (which makes this easy), skip to the next section.