<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://groups.google.com/group/gsitecrawler</id>
  <title type="text">SOFTplus GSiteCrawler Google Group</title>
  <subtitle type="text">
  Discussion group for the GSiteCrawler, a Windows tool used to crawl websites and automatically create Google Sitemap files (and much more).
  </subtitle>
  <link href="/group/gsitecrawler/feed/atom_v1_0_msgs.xml" rel="self" title="SOFTplus GSiteCrawler feed"/>
  <updated>2009-11-10T13:35:24Z</updated>
  <generator uri="http://groups.google.com" version="1.99">Google Groups</generator>
  <entry>
  <author>
  <name>webado</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-11-10T13:35:24Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/e74b49b9f1f9ad64/416428a043ec806e?show_docid=416428a043ec806e</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/e74b49b9f1f9ad64/416428a043ec806e?show_docid=416428a043ec806e"/>
  <title type="text">Re: Heavy Site Map Issues</title>
  <summary type="html" xml:space="preserve">
  Hi Karthick, &lt;br&gt; &lt;p&gt;First of all, I have to say I have no first-hand knowledge of how to &lt;br&gt; manage such very large sites. The largest site I have has about 12,000 &lt;br&gt; URLs, easily managed by GSC, though it takes 3 hours or so to recrawl. &lt;br&gt; &lt;p&gt;GSiteCrawler has the option of making sitemap indexes for multiple &lt;br&gt; sitemaps.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Karthick</name>
  <email>sifychen...@gmail.com</email>
  </author>
  <updated>2009-11-10T07:24:50Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/e74b49b9f1f9ad64/039574d5ac8cd9df?show_docid=039574d5ac8cd9df</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/e74b49b9f1f9ad64/039574d5ac8cd9df?show_docid=039574d5ac8cd9df"/>
  <title type="text">Heavy Site Map Issues</title>
  <summary type="html" xml:space="preserve">
  Hi there, &lt;br&gt; &lt;p&gt;I am Karthick, working for Sify Technologies India. I work on the &lt;br&gt; domain sify.com, one of the premier portals in India. We have &lt;br&gt; decided to create a sitemap for our domain. We have many channels, such as &lt;br&gt; sports, news, finance, and movies, and we plan to create a separate &lt;br&gt; sitemap for each channel. The issue here is there will be thousands of
  </summary>
  </entry>
  <entry>
  <author>
  <name>Christina S</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-10-30T05:23:54Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/c4d02eba5d557875?show_docid=c4d02eba5d557875</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/c4d02eba5d557875?show_docid=c4d02eba5d557875"/>
  <title type="text">Re: [GSiteCrawler] Re: Issue with ftp upload using explicit SSL</title>
  <summary type="html" xml:space="preserve">
  I cannot check because I don&#39;t have an SSL cert anywhere. &lt;br&gt; &lt;p&gt;I fail to see how the upload function can change protocol - I thought that &lt;br&gt; was established at connect time. But what do I know ... &lt;br&gt; &lt;p&gt;----- Original Message ----- &lt;br&gt; To: &amp;quot;SOFTplus GSiteCrawler&amp;quot; &amp;lt;gsitecrawler@googlegroups.com&amp;gt; &lt;br&gt; Sent: Thursday, October 29, 2009 9:40 AM
  </summary>
  </entry>
  <entry>
  <author>
  <name>spencer@3ex</name>
  <email>spen...@3ex.co.uk</email>
  </author>
  <updated>2009-10-29T13:40:55Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/bf5e515c0087f578?show_docid=bf5e515c0087f578</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/bf5e515c0087f578?show_docid=bf5e515c0087f578"/>
  <title type="text">Re: Issue with ftp upload using explicit SSL</title>
  <summary type="html" xml:space="preserve">
  I don&#39;t think the issue is with connecting to the server using ftpes, as &lt;br&gt; this works. We can see from the log that we connect correctly to the &lt;br&gt; server. &lt;br&gt; &lt;p&gt;Where it goes wrong is when we try to upload the file. At this point &lt;br&gt; the log indicates: &lt;br&gt; &lt;p&gt;534 Policy requires SSL. &lt;br&gt; Put failed: 534 Policy requires SSL.
  </summary>
  </entry>
  <entry>
  <author>
  <name>webado</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-10-28T12:37:05Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/33d7bb3d7542becc?show_docid=33d7bb3d7542becc</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/33d7bb3d7542becc?show_docid=33d7bb3d7542becc"/>
  <title type="text">Re: Issue with ftp upload using explicit SSL</title>
  <summary type="html" xml:space="preserve">
  Sorry, I cannot tell; I don&#39;t use this for my server. &lt;br&gt; &lt;p&gt;But does your FTP address reflect the SSL? Are you using ftps://.... &lt;br&gt; for the host address?
  </summary>
  </entry>
  <entry>
  <author>
  <name>spencer@3ex</name>
  <email>spen...@3ex.co.uk</email>
  </author>
  <updated>2009-10-28T12:10:23Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/232c68f3bbdf045e?show_docid=232c68f3bbdf045e</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/a5fdbc60a79dbd92/232c68f3bbdf045e?show_docid=232c68f3bbdf045e"/>
  <title type="text">Issue with ftp upload using explicit SSL</title>
  <summary type="html" xml:space="preserve">
  Hello, &lt;br&gt; &lt;p&gt;We are very new to GSiteCrawler, so please bear with us. &lt;br&gt; &lt;p&gt;We have to use explicit SSL to upload files to our website. &lt;br&gt; &lt;p&gt;When we try to upload the test file, we get the following error: &lt;br&gt; &lt;p&gt;FTP Connection 28/10/2009 11:54 &lt;br&gt; GSiteCrawler v1.23 rev. 286 &lt;br&gt; ----------------------------------------
  </summary>
  </entry>
  <entry>
  <author>
  <name>Joe Germann</name>
  <email>motorheadextraordina...@gmail.com</email>
  </author>
  <updated>2009-10-23T12:15:17Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/52010d45d980fc65?show_docid=52010d45d980fc65</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/52010d45d980fc65?show_docid=52010d45d980fc65"/>
  <title type="text">Re: [GSiteCrawler] Re: Recrawl question(s)</title>
  <summary type="html" xml:space="preserve">
  Many thanks. &lt;br&gt; &lt;p&gt;Joe &lt;br&gt; &lt;p&gt;MOTORHEAD extraordinaire &lt;br&gt; Professional Storage and Workspace Solutions &lt;br&gt; 79 Park Road - Chelmsford, MA - 01824 &lt;br&gt; Toll Free 800.618.8028 - Direct 978.618.2800 - Fax 978.418.0404 &lt;br&gt; Visit our web site at &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.MotorheadExtraordinaire.com&quot;&gt;[link]&lt;/a&gt; and &lt;br&gt; for our latest specials, &lt;br&gt; &amp;lt;&lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;https://www.motorheadextraordinaire.com/create_account.php&quot;&gt;[link]&lt;/a&gt;&amp;gt;sign up
  </summary>
  </entry>
  <entry>
  <author>
  <name>Christina S</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-10-23T04:45:00Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/d1767bdc44dcb51a?show_docid=d1767bdc44dcb51a</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/d1767bdc44dcb51a?show_docid=d1767bdc44dcb51a"/>
  <title type="text">Re: [GSiteCrawler] Recrawl question(s)</title>
  <summary type="html" xml:space="preserve">
  You can delete all the currently listed URLs from the URL List (option Delete &lt;br&gt; all non-manual urls) and start a fresh crawl after that. &lt;br&gt; &lt;p&gt;----- Original Message ----- &lt;br&gt; To: &amp;quot;SOFTplus GSiteCrawler&amp;quot; &amp;lt;gsitecrawler@googlegroups.com&amp;gt; &lt;br&gt; Sent: Thursday, October 22, 2009 5:53 PM &lt;br&gt; &lt;p&gt;Christina &lt;br&gt; &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.webado.net&quot;&gt;[link]&lt;/a&gt;
  </summary>
  </entry>
  <entry>
  <author>
  <name>Motorhead Extraordinaire</name>
  <email>motorheadextraordina...@gmail.com</email>
  </author>
  <updated>2009-10-22T21:53:18Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/90f1f7841f39ac91?show_docid=90f1f7841f39ac91</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/14affed641cb12e5/90f1f7841f39ac91?show_docid=90f1f7841f39ac91"/>
  <title type="text">Recrawl question(s)</title>
  <summary type="html" xml:space="preserve">
  I just did a major reorganization of my eCommerce web site and moved &lt;br&gt; categories and products all around. I also generated a .htaccess file &lt;br&gt; that did a &amp;quot;Redirect 301 From To&amp;quot; for everything that moved about. &lt;br&gt; &lt;p&gt;I just kicked off a recrawl with GSiteCrawler and it looks like GSC is &lt;br&gt; crawling the old URLs from what was last in the database. These must
  </summary>
  </entry>
  <entry>
  <author>
  <name>Joe Germann</name>
  <email>j...@motorheadextraordinaire.com</email>
  </author>
  <updated>2009-10-20T05:38:31Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/c96e756a746be07b?show_docid=c96e756a746be07b</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/c96e756a746be07b?show_docid=c96e756a746be07b"/>
  <title type="text">Re: [GSiteCrawler] Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  Works great. Thanks a bunch. &lt;br&gt; &lt;p&gt;Joe &lt;br&gt; &lt;p&gt;MOTORHEAD extraordinaire &lt;br&gt; Professional Storage and Workspace Solutions &lt;br&gt; 79 Park Road - Chelmsford, MA - 01824 &lt;br&gt; Toll Free 800.618.8028 - Direct 978.618.2800 - Fax 978.418.0404 &lt;br&gt; Visit our web site at &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.MotorheadExtraordinaire.com&quot;&gt;[link]&lt;/a&gt; and &lt;br&gt; for our latest specials,
  </summary>
  </entry>
  <entry>
  <author>
  <name>webado</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-10-19T19:58:12Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/c93648122003ecdf?show_docid=c93648122003ecdf</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/c93648122003ecdf?show_docid=c93648122003ecdf"/>
  <title type="text">Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  Use &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://web-sniffer.net/&quot;&gt;[link]&lt;/a&gt; and put in your URL. &lt;br&gt; &lt;p&gt;On 19 oct, 14:17, Joe Germann &amp;lt;motorheadextraordina...@gmail.com&amp;gt; &lt;br&gt; wrote:
  </summary>
  </entry>
  <entry>
  <author>
  <name>Joe Germann</name>
  <email>motorheadextraordina...@gmail.com</email>
  </author>
  <updated>2009-10-19T18:17:45Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/4e96a3156becbddf?show_docid=4e96a3156becbddf</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/4e96a3156becbddf?show_docid=4e96a3156becbddf"/>
  <title type="text">Re: [GSiteCrawler] Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  I implemented the following .htaccess and it appears to work just &lt;br&gt; fine for users. I can plug in my IP and still get access to my web site. &lt;br&gt; Options +FollowSymLinks &lt;br&gt; RewriteEngine On &lt;br&gt; RewriteBase / &lt;br&gt; RewriteCond %{HTTP_USER_AGENT} &lt;br&gt; ^.*(Googlebot|Googlebot|Mediapartners|Adsbot|Feedfetcher)-?(Google|Image)?
  </summary>
  </entry>
  <entry>
  <author>
  <name>Joe Germann</name>
  <email>j...@motorheadextraordinaire.com</email>
  </author>
  <updated>2009-10-19T01:00:41Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/a90aa75132d161e5?show_docid=a90aa75132d161e5</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/a90aa75132d161e5?show_docid=a90aa75132d161e5"/>
  <title type="text">Re: [GSiteCrawler] Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  It looks like it; just have to play around a bit to figure it all out. &lt;br&gt; &lt;p&gt;Joe
  </summary>
  </entry>
  <entry>
  <author>
  <name>Christina S</name>
  <email>web...@gmail.com</email>
  </author>
  <updated>2009-10-19T00:55:38Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/eb18d8d282c31c24?show_docid=eb18d8d282c31c24</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/eb18d8d282c31c24?show_docid=eb18d8d282c31c24"/>
  <title type="text">Re: [GSiteCrawler] Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  Yes, one of those examples should work for you. &lt;br&gt; &lt;p&gt; ----- Original Message ----- &lt;br&gt; From: Joe Germann &lt;br&gt; To: gsitecrawler@googlegroups.com &lt;br&gt; Sent: Sunday, October 18, 2009 8:44 PM &lt;br&gt; Subject: [GSiteCrawler] Re: robots.txt and web site remap &lt;br&gt; &lt;p&gt; Thanks for the guidance. I am investigating how to properly set up a .htaccess to do this. It looks like it is straightforward. I just have to read up a bit more and set up a test scenario. &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.askapache.com/htaccess/503-service-temporarily-unavailable.html&quot;&gt;[link]&lt;/a&gt;
  </summary>
  </entry>
  <entry>
  <author>
  <name>Joe Germann</name>
  <email>j...@motorheadextraordinaire.com</email>
  </author>
  <updated>2009-10-19T00:44:29Z</updated>
  <id>http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/83368c58b81ae62a?show_docid=83368c58b81ae62a</id>
  <link href="http://groups.google.com/group/gsitecrawler/browse_thread/thread/73151e82c679a125/83368c58b81ae62a?show_docid=83368c58b81ae62a"/>
  <title type="text">Re: [GSiteCrawler] Re: robots.txt and web site remap</title>
  <summary type="html" xml:space="preserve">
  Thanks for the guidance. I am investigating how to properly set up a &lt;br&gt; .htaccess to do this. It looks like it is straightforward. I just &lt;br&gt; have to read up a bit more and set up a test &lt;br&gt; scenario. &lt;br&gt; &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.askapache.com/htaccess/503-service-temporarily-unavailable.html&quot;&gt;[link]&lt;/a&gt; &lt;br&gt; &lt;p&gt;Thanks, &lt;br&gt; Joe &lt;br&gt; &lt;p&gt;MOTORHEAD extraordinaire
  </summary>
  </entry>
</feed>
