Cannot get my search by category working :(

Adam

Apr 30, 2012, 2:34:20 AM
to DataparkSearch
Hi all, hi Maxime!

I spent almost the whole day yesterday trying to get my search by
category working, and in the end I had to give up :(

First of all, I restarted everything from scratch. Immediately after
creating a fresh database with the -Ecreate option, I used your Perl
script to create my 4 categories in the categories table.
So I have 5 rows in my categories table: the first row is named Root
and has no path, followed by my categories with the paths 01, 02, 03
and 04.

Then, of course, I have: Limit c:category
included in each configuration file and in search.htm.
I'm running the full setup, with searchd, cached and stored on the same
box, using dbmode=cache&charset=utf8.
In my search.htm I point directly at searchd: DBAddr searchd://localhost/

Also in my search.htm I have the following form:

<select name="c">
<option value="" selected="$&(c)">All results
<option value="01" selected="$&(c)">1st category
<option value="02" selected="$&(c)">2nd category
<option value="03" selected="$&(c)">3rd category
<option value="04" selected="$&(c)">4th category
</select>

In my "server" table I can see, for each domain name, a correct category
id corresponding to a rec_id from the "categories" table.

I'm not using any CategoryIf option;
all "Category" options are in an external url.txt, like the
following:

Category 01
http://www.blblb.com/

Category 02
http://www.ddskllk.com/

Category 03
and so on.........

I have a total of 55 urls in this file.


Finally, cached, running with the v5 option, writes the following output
to my dp logs when I run indexer -TW:

Apr 30 08:19:52 server2 indexer[6969]: {00} indexer from dpsearch-4.54-2012-04-29-mysql, config test OK with '/usr/local/dpsearch/etc/indexer.conf'
Apr 30 08:19:52 server2 cached[23776]: {00} [127.0.0.1] Connected. PORT: 137,23
Apr 30 08:19:52 server2 indexer[6969]: {00} Writing url data and limits for mysql://xxx:XXX@localhost/search/?dbmode=cache&cached=localhost&charset=utf8...
Apr 30 08:19:52 server2 cached[23776]: {252} Mon 30 08:19:52 [23776] Client thread started
Apr 30 08:19:52 server2 cached[23776]: {252} Writing url data and limits for mysql://xxx:XXX@localhost/search/?dbmode=cache&charset=utf8...
Apr 30 08:19:52 server2 cached[23776]: {252} Creating category index
Apr 30 08:19:52 server2 indexer[6969]: {00} url data and limits Done
Apr 30 08:19:52 server2 searchd[28590]: {00} Query Tracker: SIGTERM arrived
Apr 30 08:19:52 server2 searchd[28589]: {00} URL data preloaded. 5296 bytes of memory used
Apr 30 08:19:52 server2 searchd[28591]: {300} SIGTERM arrived
Apr 30 08:19:52 server2 searchd[28589]: {00} Ready
Apr 30 08:19:57 server2 cached[23776]: {252} Category Limit by urlinfo: 49 records processed at 55 (total:49)
Apr 30 08:19:57 server2 cached[23776]: {252} Category Limit by server: 55 records processed at 55 (total:104)
Apr 30 08:19:57 server2 cached[23776]: {252} 0019A10000000000 - 0 196
Apr 30 08:19:57 server2 cached[23776]: {252} 0033420000000000 - 196 8
Apr 30 08:19:57 server2 cached[23776]: {252} 004CE30000000000 - 204 8
Apr 30 08:19:57 server2 cached[23776]: {252} 0066840000000000 - 212 8
Apr 30 08:19:57 server2 cached[23776]: {252} Done
Apr 30 08:19:57 server2 cached[23776]: {252} Creating time index
Apr 30 08:19:57 server2 cached[23776]: {252} 55 records processed at 0
Apr 30 08:19:57 server2 cached[23776]: {252} 359814 - pos:0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 369393 - pos:4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 369632 - pos:8 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 370773 - pos:c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 371010 - pos:10 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 371023 - pos:14 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 371034 - pos:18 len:192
Apr 30 08:19:57 server2 cached[23776]: {252} Done
Apr 30 08:19:57 server2 cached[23776]: {252} Creating Site_id index
Apr 30 08:19:57 server2 cached[23776]: {252} 55 records processed at 0
Apr 30 08:19:57 server2 cached[23776]: {252} 405149477 - pos:0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 444994661 - pos:4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 499341685 - pos:8 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 511109136 - pos:c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 572839226 - pos:10 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 583869802 - pos:14 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 798180365 - pos:18 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 848368507 - pos:1c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 966541896 - pos:20 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 998485197 - pos:24 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1056522868 - pos:28 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1093028710 - pos:2c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1193598979 - pos:30 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1211943318 - pos:34 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1469722806 - pos:38 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1489398782 - pos:3c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1545075253 - pos:40 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1611854342 - pos:44 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1683243559 - pos:48 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1722839569 - pos:4c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1803112713 - pos:50 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1814710124 - pos:54 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1972570312 - pos:58 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 1979609120 - pos:5c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} 2000766612 - pos:60 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1984102908 - pos:64 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1961117299 - pos:68 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1940466927 - pos:6c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1893759763 - pos:70 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1651029947 - pos:74 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1629312980 - pos:78 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1581465961 - pos:7c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1552893508 - pos:80 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1493528772 - pos:84 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1439238164 - pos:88 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1359903747 - pos:8c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1304239854 - pos:90 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1239189764 - pos:94 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1236318232 - pos:98 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1136074474 - pos:9c len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1089881315 - pos:a0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -1085607886 - pos:a4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -944352010 - pos:a8 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -911601080 - pos:ac len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -827452332 - pos:b0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -747009614 - pos:b4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -592150150 - pos:b8 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -572018473 - pos:bc len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -279534950 - pos:c0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -230682846 - pos:c4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -171356752 - pos:c8 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -69527510 - pos:cc len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -9768523 - pos:d0 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} -8405884 - pos:d4 len:4
Apr 30 08:19:57 server2 cached[23776]: {252} Done
Apr 30 08:19:57 server2 cached[23776]: {252} 55 records of url data written, at 56
Apr 30 08:19:57 server2 cached[23776]: {252} Sending HUP signal to searchd, pid:28589
Apr 30 08:19:57 server2 cached[23776]: {252} url data and limits Done
Apr 30 08:19:57 server2 cached[23776]: {252} Mon 30 08:19:57 [23776] Client action done
Apr 30 08:19:57 server2 cached[23776]: {252} Mon 30 08:19:57 [23776] Client action BYE received.
Apr 30 08:19:57 server2 searchd[6971]: {00} Query Tracker: SIGTERM arrived
Apr 30 08:19:57 server2 searchd[28589]: {00} URL data preloaded. 5296 bytes of memory used
Apr 30 08:19:57 server2 searchd[6972]: {300} SIGTERM arrived
Apr 30 08:19:57 server2 searchd[28589]: {00} Ready


AND THE RESULT OF ALL THE ABOVE is that:
- when searching for all results, I get them all correctly;
- when searching for results from the 1st category, I get almost all of
them;
- when searching for results from the 2nd, 3rd or 4th category, I get
NOTHING!

Here is the output from -TS

Database statistics

Status    Expired      Total
-----------------------------
   200          0         48 OK
   206          0          2 Partial OK
   301          0          4 Moved Permanently
   400          0          1 Bad Request
-----------------------------
 Total          0         55

The output from -TS -g 01

Status    Expired      Total
-----------------------------
   200          0         46 OK
   206          0          2 Partial OK
   400          0          1 Bad Request
-----------------------------
 Total          0         49

The output from -TS -g 02 or 03 or 04

Status    Expired      Total
-----------------------------
-----------------------------
 Total          0          0



Do you have any idea what is going wrong?
Thanks in advance for any clues.

Adam

Adam

Apr 30, 2012, 2:45:16 AM
to DataparkSearch

One last thing, which may be important:

In my sections.conf I have the following:

Section category 0 64
Section body 1 256
Section title 2 128
Section meta.keywords 3 128
Section meta.description 4 256
Section url.file 6 64
Section url.path 7 128
Section url.host 8 64

Is this correct?
It is taken from the DP manual, but I saw something different for
category in one of the forum threads.

Maxime

May 6, 2012, 9:27:31 AM
to datapar...@googlegroups.com
Hi Adam,

Which command do you use to include the url.txt file in your indexer.conf file?
If it is not the Include command, the url.txt file is not the place to put Category commands.

The Category command applies to a Server/Realm/Subnet command.
You need to specify the Category command before the corresponding Server/Realm/Subnet command in your indexer.conf file.
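
To illustrate the ordering described above, here is a minimal Python sketch. It assumes url.txt has exactly the layout quoted earlier in the thread (a "Category NN" line followed by bare URLs) and that a plain "Server <URL>" line is acceptable for each site. The function name url_list_to_server_commands is made up for this example and is not part of DataparkSearch. It only rewrites each bare URL as a Server command, so that every Category line sits directly before the Server line it should apply to.

```python
def url_list_to_server_commands(text: str) -> str:
    """Rewrite a 'Category NN' + bare-URL listing as Category/Server pairs.

    Hypothetical helper for illustration only; it is not a DataparkSearch tool.
    """
    known_commands = {"category", "server", "realm", "subnet"}
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            out.append("")                 # keep blank separators as they are
        elif line.split()[0].lower() in known_commands:
            out.append(line)               # already a command: leave untouched
        elif "://" in line:
            out.append(f"Server {line}")   # bare URL: turn into a Server command
        else:
            out.append(line)               # anything else passes through unchanged
    return "\n".join(out)


# Sample input copied from the layout shown earlier in the thread.
sample = """Category 01
http://www.blblb.com/

Category 02
http://www.ddskllk.com/
"""

converted = url_list_to_server_commands(sample)
print(converted)
```

The printed lines are meant to go into indexer.conf itself, or into a file pulled in with the Include command. Whether this resolves the empty results for categories 02 to 04 would still need to be checked with a fresh indexing run.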


Best regards,
Maxim