Ignore Robots.txt
From: Nico <nicolasbottar...@gmail.com>
Date: Fri, 11 Jul 2008 09:13:41 -0700 (PDT)
Local: Fri, Jul 11 2008 12:13 pm
Subject: Ignore Robots.txt
Hi, I need to crawl Google search results. Is there a way to ignore
robots.txt?

Thanks in advance,

Nicolas Bottarini


From: jhandl <jha...@gmail.com>
Date: Fri, 11 Jul 2008 11:24:49 -0700 (PDT)
Local: Fri, Jul 11 2008 2:24 pm
Subject: Re: Ignore Robots.txt
Hi Nico!
You can change Nutch's fetcher code to ignore robots.txt files,
but you shouldn't.
Regards,
-- Jorge


