we have developed a multi-language site that runs from one server
(it is an english-based server, so i'm assuming it is set up for the
english locale?).
the content for pages comes from a db, so in order for index server to
'see' it i get the system to generate a 'dummy' html page for each piece
of db content. each dummy page has a redirect in it, so that when a user
clicks on a search result they are redirected to the appropriate page.
english content is indexed fine by index server but the japanese
content is proving to be a problem.
i have set the meta http-equiv="Content-Type" tag of the 'dummy' pages as below:
<meta http-equiv="Content-Type" content="text/html;
charset="shift_jis">
so that index server recognises the page to be indexed as japanese
and therefore (hopefully) applies the japanese wordbreaker.
however, the search will still only return the page when very specific
japanese words are entered. it doesn't seem to find anything if
content from the page is used as a search phrase, unless it's a really
long string.
being a multi-language site the search page has been designed to cope
with any of the languages needed. i have however removed the localeid
as the search just seems to give errors whenever this is in place.
any advice greatly appreciated
jonathan
So it should look like this:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset=Shift-JIS" />
<META NAME="MS.LOCALE" CONTENT="JA" />
Then set the session code page to Japanese on the page where
you enter your search, i.e.
session.codepage=932
Then make sure you set the locale in your query page to
Japanese.
additionally it appears the wbcache.jpn and wbdbase.jpn
files needed for japanese word breaking are not present. any
idea where these could be found? i'm assuming the neutral
word breaker is therefore being used, which would explain
why certain results aren't being found: it breaks at white
space, which doesn't occur very often between japanese
chars.
any further comments would be greatly appreciated.
jonathan
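
The whitespace theory above can be illustrated with a small sketch. It is written in Python purely for illustration (the site itself is ASP), and `neutral_break` is a hypothetical stand-in that only assumes the neutral word breaker splits on white space, as described in the post; it is not Index Server's actual algorithm.

```python
def neutral_break(text):
    """Split on white space only (assumed behaviour of a neutral breaker)."""
    return text.split()


english_page = "english content is indexed fine"
japanese_page = "日本語のコンテンツは検索できません"  # sample text, no spaces

english_terms = neutral_break(english_page)
japanese_terms = neutral_break(japanese_page)

print(english_terms)   # one indexed term per english word
print(japanese_terms)  # the whole japanese run becomes a single term

# A short word taken from the page is not an indexed term on its own,
# so only a query covering the whole run would match it.
print("検索" in japanese_terms)
```

Under that assumption a short japanese query never lines up with an indexed term, which would be consistent with only very long strings producing a hit.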
you're right about the japanese files missing. Let me look into this for
you.
"jonathan" <jcrou...@sevenww.co.uk> wrote in message
news:037e01c2ea44$dd8e4740$3301...@phx.gbl...
jonathan
---------------------------------------
<!-- dummy page code //-->
<!-- Note: charset="shift_jis" NOT charset=shift_jis" as
it should be. Latter means page not found by search? //-->
<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html;
charset="shift_jis">
<TITLE> V,µ,¢"'z</TITLE>
.
.
.
.
<!-- searchresults page code //-->
<!-- asp to determine the language //-->
<% If lang <> 8 Then %>
<META NAME="MS.LOCALE" CONTENT="EN-US">
<META HTTP-EQUIV="Content-Type"
content="text/html; charset=iso-8859-1">
<% Else %>
<META NAME="MS.LOCALE" CONTENT="JA">
<meta http-equiv="Content-Type"
content="text/html; charset=shift_jis">
<% End If %>
<!-- asp to determine the SiteLocale (commented out as it
appears to make search return 0 results) //-->
IF lang <> "8" THEN
'SiteLocale = "EN-US"
ELSE
'SiteLocale = "ja"
END IF
<!-- Note: FreeText is forced always on, thus we always
do $Contents //-->
IF FreeText = "on" THEN
CompSearch = "$Contents " & chr(34) & _
SearchString & chr(34)
ELSE
CompSearch = "@Contents " & SearchString
END IF
<!-- building the search string //-->
CompSearch = CompSearch + " OR @All " + SearchString
<!-- setting the LocaleID for the query (commented out as it
appears to make search return 0 results) //-->
IF SiteLocale <> "" THEN
'Q.LocaleID = util.ISOToLocaleID(SiteLocale)
END IF
----------------------------------------------
installed the japanese language pack on the server running the site
(available here -
http://www.microsoft.com/windows/ie/downloads/recommended/ime/install.asp?gssnb=1).
placed the following meta tags in the 'dummy' pages that
are indexed by index server.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset=Shift-JIS" />
<META NAME="MS.LOCALE" CONTENT="JA" />
In the searchresults page:
save the current code page (so we can change back to it
after the japanese content has been displayed) and set the
session.codepage variable for japanese
session("CurrentCodePage") = session.codepage
session.codepage = 932
.
.
.
.
change the session.codepage var back to what it was
before we set it to japanese
session.codepage = session("CurrentCodePage")
also in the searchresults page ensure
objQuery.LocaleID=1041
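
A rough illustration of why the code page setting matters, again in Python only because it is easy to run (the site code is ASP). It assumes the search phrase reaches the server as Shift_JIS bytes; the sample phrase and the variable names are made up for the example. Code page 932 is the Windows code page for Shift_JIS, and 1041 is the japanese locale id (0x411).

```python
JAPANESE_CODE_PAGE = 932   # value given to session.codepage
JAPANESE_LCID = 1041       # value given to objQuery.LocaleID

query = "日本語"  # sample search phrase

# Python names the Windows code page codec "cp932".
as_cp932 = query.encode("cp%d" % JAPANESE_CODE_PAGE)
as_shift_jis = query.encode("shift_jis")

# For ordinary kana and kanji both codecs give the same bytes.
print(as_cp932 == as_shift_jis)

# Reading those bytes under a western code page yields garbage,
# so the phrase being searched for is no longer the japanese text.
misread = as_cp932.decode("cp1252")
print(misread)

# Reading them under code page 932 restores the original phrase.
print(as_cp932.decode("cp932") == query)

print(hex(JAPANESE_LCID))
```

The same kind of mismatch would also explain a japanese page title showing up as a run of accented latin characters and punctuation.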