How to restrict a search to a particular URL when many URLs are listed under "Start Crawling from the Following URLs"


SureshMidde

Jan 27, 2009, 11:07:05 AM
to Google Search Appliance/Google Mini
Hi All,

If I have more than one URL in the "Start Crawling from the Following
URLs" block, for example:

http://website1.com
http://website2.com
http://website3.com

how can I restrict a search so that it returns results only from the
site I specify when making the request? For example, if I search with
website2.com specified, I should get results only from website2.com.

Regards
Suresh Midde

Timothy Bowler

Jan 27, 2009, 11:13:34 AM
to Google-Search-...@googlegroups.com
Use a collection
-- 
______________________________________________
Timothy M Bowler BSc(Hons) MSc MIET MBCS | Technology Director

Or Media
Unit 5 Elm Court
156 -170 Bermondsey Street
London
SE1 3TQ

T: 020 7939 9540
F: 020 7939 9541
___________________________________________________
The information in this e-mail and any attachments is confidential and
is intended solely for the addressee. The material may not be reproduced
either in whole or in part without permission and may not be used or
disclosed without permission. No copies of the entirety or part of the
information set out in this email or any attachment may be made without
our prior approval.  Any views or opinions presented are solely those of
the author and do not necessarily represent those of Or Multimedia
Limited.  If you are not the intended recipient of this email, please
contact us at in...@or-media.com.

Joe D'Andrea

Jan 27, 2009, 11:13:49 AM
to Google-Search-...@googlegroups.com
Greetings!

On Tue, Jan 27, 2009 at 11:07 AM, SureshMidde <sures...@gmail.com> wrote:

> How can I restrict a search so that it returns results only from the
> site I specify when making the request? For example, if I search with
> website2.com specified, I should get results only from website2.com.

You could do this a few ways:

1) Append " site:website2.com" to your search query. Just type it in
to the search query field and away you go!

2) Use a hidden form field named "sitesearch" to implicitly limit
searches and set it to "website2.com".

3) Create a collection (e.g., call it "website2") that only includes
website2.com content and search that instead.
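
As a rough sketch of how the three options above might look as request
URLs: the host name gsa.example.com and the helper search_url are
made up for illustration, and the "client" and "output" values assume
the appliance's default front end. Check the parameter names against
the search protocol documentation for your software version.

```python
from urllib.parse import urlencode

# Hypothetical appliance host; replace with your own.
GSA_HOST = "http://gsa.example.com"


def search_url(query, site_filter=None, sitesearch=None,
               collection="default_collection"):
    """Build a search URL limited to one site in one of three ways."""
    if site_filter:
        # Option 1: append the site: operator to the query text itself.
        query = f"{query} site:{site_filter}"
    params = {
        "q": query,
        "site": collection,            # Option 3: collection to search
        "client": "default_frontend",  # assumed default front end
        "output": "xml_no_dtd",
    }
    if sitesearch:
        # Option 2: same value a hidden "sitesearch" form field would send.
        params["sitesearch"] = sitesearch
    return f"{GSA_HOST}/search?{urlencode(params)}"


print(search_url("pricing", site_filter="website2.com"))
print(search_url("pricing", sitesearch="website2.com"))
print(search_url("pricing", collection="website2"))
```

Options 1 and 2 need no admin changes, while option 3 requires the
"website2" collection to exist on the appliance first.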

--
Joe D'Andrea
Liquid Joe LLC
Google Enterprise Partner
www.liquidjoe.biz
+1 (908) 781-0323

SureshMidde

Jan 27, 2009, 11:53:35 AM
to Google Search Appliance/Google Mini
Thanks for the reply,

But at present I don't have more than one URL in my "Start Crawling
from the Following URLs" block.

So how should I proceed in order to test this?

Suppose I have www.website1.com. Is it enough just to add
www.website2.com to the block, along with the Follow and Crawl pattern
website2.com/, or do I have to wait until the newly added URL has been
crawled and all of its documents indexed?


Regards
Suresh Midde




Joe D'Andrea

Jan 27, 2009, 12:39:23 PM
to Google-Search-...@googlegroups.com
On Tue, Jan 27, 2009 at 11:53 AM, SureshMidde <sures...@gmail.com> wrote:

> But presently I dont have more than one url in my block of Start
> Crawling from the following urls, ...

Good! Then you won't have a problem because there's only the one URL. ;)

But seriously ... I see you're asking how to handle follow-and-crawl
when the domain is different.

The "Start Crawling" section is where you will put your starting
lineup. All crawls start from these URLs.

The "Follow and Crawl" section is where you will put all URL
_patterns_ you want to catch when crawling from the above starting
lineup.

So, for example, you might start crawling from:
http://website1.com/

Now, let's say that website1 has links to website2 and website3. So
long as following those links over to website2 and 3 will ultimately
lead to all the links you need from those sites, you only need to add
these patterns to Follow and Crawl:

http://website1.com/
http://website2.com/
http://website3.com/

... and just leave Start Crawling with the website1.com entry. Most of
the time, I find folks make those lists match, but there are always
exceptions.
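
To make the difference between the two lists concrete, here is a toy
simulation. It is only a sketch: the link graph is invented, and
patterns are treated as plain URL prefixes, which is a simplification
of how the appliance really matches patterns.

```python
from collections import deque

START_URLS = ["http://website1.com/"]
FOLLOW_PATTERNS = [
    "http://website1.com/",
    "http://website2.com/",
    "http://website3.com/",
]

# Hypothetical link graph standing in for the real pages.
LINKS = {
    "http://website1.com/": [
        "http://website2.com/",
        "http://website3.com/",
        "http://other.example/",
    ],
    "http://website2.com/": ["http://website2.com/docs"],
    "http://website2.com/docs": [],
    "http://website3.com/": [],
}


def allowed(url):
    # Simplification: a pattern matches when it is a prefix of the URL.
    return any(url.startswith(pattern) for pattern in FOLLOW_PATTERNS)


def crawl(start_urls):
    """Breadth-first walk that only follows links matching a pattern."""
    seen = set()
    queue = deque(url for url in start_urls if allowed(url))
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        for link in LINKS.get(url, []):
            if allowed(link) and link not in seen:
                queue.append(link)
    return seen


print(sorted(crawl(START_URLS)))
```

Starting from website1 alone, the walk still reaches website2 and
website3 because website1 links to them and their patterns are
listed, while other.example is skipped because no pattern matches it.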

--
Joe D'Andrea
Liquid Joe LLC
