语言技术中心论坛 (Language Technology Center Forum)
XMU (Xiamen University) web pages are now searchable
贾剑峰
Jun 23, 2006, 12:30:18 AM
to langua...@googlegroups.com
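The configuration I used a few days ago was wrong: the index had been building for two days without finishing. Checking the log, I found this page from the College of Foreign Languages:
http://cflc.xmu.edu.cn/yxz/?????§.files/ling/语言学.files/语言学.files/语言学.files/ling/语言学.files/ling/语言学.files/语言学.files/语言学.files/语言学.files/ling/ling/语言学.files/ling/语言学.files/ling/语言学.files/语言学.files/conf.htm
I have no idea how this page came to be linked at all. As a result, about 80% of the log is the crawler fetching the same thing over and over. The strange part is that the regular expression already excludes URLs containing any of [?*@=]. No other query-style pages were visited; this one is the only exception.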
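A minimal sketch of the kind of URL filter described above, extended with a check for repeated path segments. This is not the crawler's actual configuration; the function names and the repeat threshold are hypothetical. One possible explanation for the exception (a guess, not confirmed in this thread) is that the `?` characters in the log are placeholders for non-ASCII bytes the log could not render, so the real URL never contained a literal `?`. The trailing `§` is consistent with that: it is how the last byte of 学 in GBK (0xA7) appears when shown as Latin-1.

```python
import re
from collections import Counter
from urllib.parse import unquote, urlsplit

# Characters the post says the crawl filter excludes.
QUERY_CHARS = re.compile(r"[?*@=]")

# Hypothetical threshold: treat a URL as a trap when any single
# path segment occurs at least this many times.
MAX_SEGMENT_REPEATS = 3


def has_query_chars(url: str) -> bool:
    """Return True if the URL contains any of the excluded characters."""
    return QUERY_CHARS.search(url) is not None


def looks_like_trap(url: str, max_repeats: int = MAX_SEGMENT_REPEATS) -> bool:
    """Return True if one path segment repeats suspiciously often."""
    path = unquote(urlsplit(url).path)
    segments = [s for s in path.split("/") if s]
    if not segments:
        return False
    return max(Counter(segments).values()) >= max_repeats


def should_fetch(url: str) -> bool:
    """Combine both checks: skip query-style URLs and likely traps."""
    return not has_query_chars(url) and not looks_like_trap(url)
```

A threshold this low can also reject legitimate URLs that reuse a directory name, so it would need tuning against the real site.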
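Yesterday afternoon I changed the crawl depth to 30 levels. That URL still shows up, but at least the index finished building. I also set the crawler to use 1000 fetch threads, which is much faster and reaches a fetch rate of 2M, so by the time I came in at noon today it was all done.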
Everyone can now visit
http://59.77.17.127:8080
. Some pages show garbled characters in the search results; that still needs to be fixed.
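Garbled text in results often comes from decoding page bytes with the wrong character set. The sketch below shows one common fallback strategy; it assumes the affected pages are in UTF-8 or a GB-family encoding, which the thread does not confirm, and `decode_page` is a hypothetical helper, not part of the search engine in use.

```python
from typing import Optional


def decode_page(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode page bytes, trying the declared charset first.

    Falls back to UTF-8, then GB18030 (a superset of GBK and GB2312).
    """
    candidates = []
    if declared:
        candidates.append(declared)
    candidates += ["utf-8", "gb18030"]
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # Last resort: keep what can be decoded and mark the rest.
    return raw.decode("utf-8", errors="replace")
```

Trying UTF-8 before GB18030 works because GBK-encoded Chinese text is rarely valid UTF-8, but it is a heuristic and can still misdetect short strings.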
--
Joyce
mandel
Jun 23, 2006, 4:37:30 AM
to 语言技术中心论坛
We need to look into how to add word segmentation, or else support phrase search; otherwise there will be too many problems.
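As a rough illustration of the idea, the sketch below indexes overlapping character bigrams, a simple stand-in that is often used for Chinese text when no real word segmenter is available. It is not true word segmentation, and all names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def char_bigrams(text: str) -> List[str]:
    """Split text into overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each bigram to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_bigrams(text):
            index[gram].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return documents that contain every bigram of the query."""
    grams = char_bigrams(query)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result
```

Because the index stores no positions, a match only means that all of the query's bigrams occur somewhere in the document. Strict phrase search would also need to check that they are adjacent.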