
Full Text File Search with Indexing Service on Windows


Chung Leong

Aug 17, 2006, 12:59:30 AM

Here's a short tutorial on how to use the OLE-DB extension to access
Windows Indexing Service. Impress your office-mates with a powerful
full-text search feature on your intranet. It's easier than you think.

First, download and install the extension
(http://sourceforge.net/project/showfiles.php?group_id=171247&package_id=198554).
Simply unzip the file and copy the correct version of php_oledb.dll
into the PHP extensions folder. Then add the line
extension=php_oledb.dll in php.ini and restart your web server.
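
Before going further, it's worth confirming that PHP actually picked up the DLL. A minimal sketch: it only checks whether oledb_open (the function used later in this tutorial) exists. The helper name oledb_status is made up for illustration.

```php
<?php
// Hypothetical helper: report whether the OLE-DB extension is available.
// oledb_open() only exists once php_oledb.dll has been loaded.
function oledb_status() {
    if (function_exists('oledb_open')) {
        return "OLE-DB extension is loaded";
    }
    return "OLE-DB extension is NOT loaded; check the extensions folder and php.ini";
}

echo oledb_status(), "\n";
```

If it reports the extension is missing, make sure the DLL matches your PHP version and that the web server was restarted.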

Now, if Indexing Service isn't running on your computer, turn it on. Go
to Control Panel > Administrative Tools > Services and configure
Indexing Service to start automatically. You can also ask the little
dog in the search window to do it for you if you're using Windows XP.
You will need to wait a while for Windows to build the initial index.
It could take a couple of hours.

Once the extension is installed and the index is ready, you can start
coding. To connect to Indexing Service, you use the oledb_open
function:

$link = oledb_open("Provider=MSIDXS");

You then call oledb_query with a SQL statement. Let us start with
something simple: We'll look for all files on the computer containing
the word "love":

$keyword = 'love';
$sql = "SELECT filename, size, path
FROM SCOPE()
WHERE CONTAINS(Contents, '$keyword')";
$res = oledb_query($sql, $link);
while ($row = oledb_fetch_assoc($res)) {
    var_dump($row);
}

An example among the many results I get on my computer:

array(3) {
["FILENAME"]=>
string(11) "ap_config.h"
["SIZE"]=>
int(38787)
["PATH"]=>
string(56) "c:\program files\apache group\apache\include\ap_config.h"
}
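
One caveat: the query above drops $keyword straight into the statement, so a search term containing an apostrophe would break it. Here is a rough sketch of a safer way to build the string. The helper names ixs_quote and ixs_contains_query are hypothetical, and I'm assuming single quotes are escaped by doubling them, as in standard SQL, so test it against your own catalog.

```php
<?php
// Hypothetical helper: escape a search term for use inside a quoted literal.
// Assumption: single quotes are doubled, as in standard SQL.
function ixs_quote($term) {
    return str_replace("'", "''", $term);
}

// Hypothetical helper: build the same SELECT as above for any keyword.
function ixs_contains_query($keyword) {
    return "SELECT filename, size, path FROM SCOPE() "
         . "WHERE CONTAINS(Contents, '" . ixs_quote($keyword) . "')";
}

echo ixs_contains_query("O'Reilly"), "\n";
```

The resulting string can be passed to oledb_query exactly like the hand-written one.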

Instead of a table name we're retrieving rows from SCOPE(), which
determines which directories are searched. When there's nothing inside
the parentheses, all matching files in the catalogs are returned. To
search a particular directory, we put in 'DEEP TRAVERSAL OF <dir>' or
'SHALLOW TRAVERSAL OF <dir>'.

To search for files in C:\htdocs:

$dir = 'C:\\htdocs';
$keyword = 'love';
$sql = "SELECT filename, size, path
FROM SCOPE('DEEP TRAVERSAL OF \"$dir\"')
WHERE CONTAINS(Contents, '$keyword')";
$res = oledb_query($sql, $link);

Deep traversal means sub-directories are searched, while shallow
traversal means they're not.
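
To make the deep/shallow switch explicit, the scope clause can be built by a small function. This is only a sketch: ixs_scope is a made-up name, and it just assembles the same string shown in the example above, here with a shallow traversal.

```php
<?php
// Hypothetical helper: build the SCOPE(...) clause for a directory search.
// $deep = true also searches sub-directories; false looks only at $dir itself.
function ixs_scope($dir, $deep = true) {
    $mode = $deep ? 'DEEP' : 'SHALLOW';
    return "SCOPE('$mode TRAVERSAL OF \"$dir\"')";
}

// Search C:\htdocs without descending into its sub-directories.
$sql = "SELECT filename, size, path
FROM " . ixs_scope('C:\\htdocs', false) . "
WHERE CONTAINS(Contents, 'love')";
echo $sql, "\n";
```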

There are many fields about a file that you can search on and retrieve.
To see what they are, right click on My Computer and select Manage. In
the Computer Management window, expand Services and Applications, then
Indexing Service, then System, and select Properties. The names listed
under Friendly Names are the ones you can use in your statement.
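
Once you've picked some names from that list, you can put them in the SELECT clause. A rough sketch follows. The helper ixs_select is hypothetical, and 'write' and 'rank' are only names I'd assume show up under Friendly Names, so check them against your own list first.

```php
<?php
// Hypothetical helper: turn a list of friendly names into a SELECT statement.
function ixs_select($fields, $keyword) {
    foreach ($fields as $f) {
        // Accept only plain identifiers so a stray value can't alter the statement.
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $f)) {
            throw new InvalidArgumentException("Bad field name: $f");
        }
    }
    // Assumption: single quotes in the keyword are escaped by doubling them.
    $term = str_replace("'", "''", $keyword);
    return "SELECT " . implode(', ', $fields)
         . " FROM SCOPE() WHERE CONTAINS(Contents, '$term')";
}

// 'write' and 'rank' are assumed names; verify them in the Properties list.
echo ixs_select(array('filename', 'path', 'write', 'rank'), 'love'), "\n";
```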

Anyway, it's getting late. I'll talk more about this tomorrow :-)
