
get the contents of the noise file


Rudolf Wiener

Mar 21, 2003, 10:13:59 AM
I need to get the contents of the noise file in order to strip noise words
from the query entered by the end user. If I don't strip the
noise words, the whole query is invalid (I get a syntax error).

Unfortunately, I can't just read the noise file with OS functions
because I do not have access rights to the directory where the noise file is
located. Is there a way to get SQL Server to give me the contents of the
noise file? Something like one of those sp_help_fulltext* procedures, maybe?

Regards
Rudi

PS: I am working with SQL 2000, WinXP, using VC++ and OLEDB to access the
system.


Hilary Cotter

Mar 21, 2003, 11:04:46 AM
AFAIK, no. You could perhaps probe around for this
file by doing this:

master..xp_fileexist 'c:\program files\microsoft sql server\mssql\ftdata\sqlserver\config\noise.deu'

Check for a "File Exists" value of 1, and then do this:

master..xp_cmdshell 'type "c:\program files\microsoft sql server\mssql\ftdata\sqlserver\config\noise.deu"'

and hope for the best.


John Kane

Mar 21, 2003, 11:14:08 AM
Rudi and Hilary,
Well you could do something like this...

Create Table noise_words
(
Noiseword varchar(50) Not Null
)
Go
Alter Table noise_words Add Constraint PK_noise_words Primary Key Clustered
(
Noiseword
)
Go

Then you can use BULK INSERT, BCP or DTS to copy the contents of the file
into the database. Before you copy in the language-specific noise word file,
you will need to edit the end of the file, because the noise word files end
with a list of single letters and characters separated by white space on one
line. For example, from noise.enu:

a b c d e f g h i j k l m n o p q r s t u v w x y z

BULK INSERT or BCP will either fail or treat this as one big string (there is
no CR/LF between the letters), so you will need to split the row above so that
each letter is on its own row in the file. Open the language-specific noise
word file in a text editor (notepad.exe) and change the above list to:

a
b
c
d
e
f

and so on. Be sure to eliminate any leading or trailing spaces for each
character. Once that's done, you can use BULK INSERT, BCP or DTS to copy the
data from the noise file to the noise_words table.
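
If editing the file by hand is impractical, the same clean-up can be scripted. The following is only a sketch, assuming the file's text has already been read into a string; the function name one_word_per_line is made up for illustration and is not part of SQL Server or any library. It puts every whitespace-separated word or letter on its own row, with no leading or trailing spaces:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: rewrite the noise file text so that every
// whitespace-separated token (word or single letter) sits on its own row.
std::string one_word_per_line(const std::string& file_contents) {
    std::istringstream in(file_contents);
    std::string token;
    std::string out;
    // operator>> skips spaces, tabs, CR and LF, so a row such as
    // "a b c d" comes back as four separate tokens with no padding.
    while (in >> token) {
        out += token;
        out += "\r\n";  // Windows line ending, one token per row
    }
    return out;
}
```

Write the returned string to a new file and import that copy, so the original noise file stays untouched.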

Once the data is imported correctly, you can use a standard SQL statement
such as:

select count(*) from noise_words where Noiseword = 'between'

in a string parser function to remove the noise words from your user's
input string, and then pass the edited string to a SQL Server Full-Text
Search query.
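
Since Rudi is calling from VC++, the parser could also live on the client side. Here is a rough sketch, assuming the rows of noise_words have already been fetched (for example over OLEDB) into a std::set of lowercase strings. The names strip_noise_words and to_lower_ascii are invented for this example, and it only handles plain whitespace-separated terms, not quoted phrases or full-text operators:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Lowercase a copy of the word so the lookup is case-insensitive (ASCII only).
std::string to_lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: return the query with every term found in `noise` removed.
// `noise` is assumed to hold lowercase words loaded from the noise_words table.
std::string strip_noise_words(const std::string& query,
                              const std::set<std::string>& noise) {
    std::istringstream in(query);
    std::string term;
    std::string result;
    while (in >> term) {
        if (noise.count(to_lower_ascii(term)) != 0) {
            continue;  // noise word: leave it out of the result
        }
        if (!result.empty()) {
            result += ' ';
        }
        result += term;
    }
    return result;
}
```

If the result comes back empty, the user typed nothing but noise words, so the caller should skip the full-text query rather than send an empty search condition.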

Regards,
John

Rudolf Wiener

Mar 22, 2003, 11:20:49 AM
Smart idea :-)
Thanks, I'll try it.
Rudi