Japanese help please

Jonathan Croughton

Mar 12, 2003, 7:14:49 AM
please help i'm having an absolute 'mare.

we have developed a multi-language site that runs from one server
(being an english based server i'm assuming it is therefore set up for
the english locale?).

the content for pages comes from a db, so in order for index server to
'see' it i get the system to generate a 'dummy' html page for each piece
of db content. each dummy page has a redirect in it so that when a user
clicks on a search result they are redirected to the appropriate page.

english content is indexed fine by index server but the japanese
content is proving to be a problem.

i have set the 'dummy' pages' meta http-equiv="Content-Type" as below:
<meta http-equiv="Content-Type" content="text/html;
charset="shift_jis">
so that index server recognises the page to be indexed as japanese
and therefore (hopefully) applies the japanese word breaker.

however the search still only returns the page when very specific
japanese words are entered. it doesn't find anything when content
from the page is used as the search phrase, unless it's a really
long string.

being a multi-language site the search page has been designed to cope
with any of the languages needed. i have however removed the localeid
as the search just seems to give errors whenever this is in place.

any advice greatly appreciated

jonathan

Hilary Cotter

Mar 12, 2003, 12:13:00 PM
you need to set the ms.locale value for these web pages as
well.

So it should look like this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset=Shift-JIS" />
<META NAME="MS.LOCALE" CONTENT="JA" />

Then set the session code page of the page where you
enter your search to Japanese, i.e.

session.codepage=932

Then make sure you set the locale in your query page to
Japanese.


jonathan

Mar 14, 2003, 11:15:01 AM
hilary, many thanks for your reply. i have tried setting
the codepage but received the error 'invalid codepage
value'. trying to sort this at the mo. have also tried
setting the http-equiv and ms.locale values but this just
seems to return no results (to do with code page?)

additionally it appears the wbcache.jpn and wbdbase.jpn
files are not present for japanese word breaking, any
idea where these could be found? i'm assuming therefore
the neutral word breaker is being used, in which case this
would explain why certain results aren't being found,
because it breaks at white space (which with japanese
chars isn't very often).
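
As a rough illustration of why whitespace-only breaking hurts here, the Python sketch below splits an English and a Japanese sentence on whitespace. It is a simplification and not Index Server's actual neutral word breaker; the function name and the sample sentences are made up for the example.

```python
def whitespace_tokens(text):
    """Split on whitespace only (a simplified stand-in for a neutral word breaker)."""
    return text.split()


# Hypothetical sample sentences, not content from the site in question.
english = "index server finds this page"
japanese = "インデックスサーバーはこのページを検索します"

english_tokens = whitespace_tokens(english)
japanese_tokens = whitespace_tokens(japanese)

# The English sentence breaks into separate searchable words.
print(len(english_tokens))
# The Japanese sentence has no spaces, so it stays one long token,
# which would only match a query for that entire string.
print(len(japanese_tokens))
```

With a breaker like this, a short Japanese query has nothing to match against except whole unbroken runs of text, which fits the "only really long strings work" symptom described earlier in the thread.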

any further comments would be greatly appreciated.

jonathan


Hilary Cotter

Mar 16, 2003, 5:01:27 PM
can you post your code?

you're right about the japanese files missing. Let me look into this for
you.

"jonathan" <jcrou...@sevenww.co.uk> wrote in message
news:037e01c2ea44$dd8e4740$3301...@phx.gbl...

jonathan

Mar 17, 2003, 5:17:07 AM
hilary, many thanks for taking a look at this. here are the
parts of the code i think you need.

jonathan

---------------------------------------

<!-- dummy page code //-->
<!-- Note: charset="shift_jis", NOT charset=shift_jis as
it should be. The latter means the page is not found by search? //-->
<HTML>
<HEAD>


<meta http-equiv="Content-Type" content="text/html;
charset="shift_jis">

<TITLE>新しい発想</TITLE>
.
.
.
.


<!-- searchresults page code //-->
<!-- asp to determine the language //-->
if lang <> 8 then
<META NAME="MS.LOCALE" CONTENT="EN-US">
<META HTTP-EQUIV="Content-Type"
content="text/html; charset=iso-8859-1">
else


<META NAME="MS.LOCALE" CONTENT="JA">

<meta http-equiv="Content-Type"
content="text/html; charset=shift_jis">

end if

<!-- asp to determine the SiteLocale (commented out as it
appears to make search return 0 results) //-->
IF lang <> "8" THEN
'SiteLocale = "EN-US"
ELSE
'SiteLocale = "ja"
END IF

<!-- Note: FreeText is forced always on, thus we always
do $Contents //-->
IF FreeText = "on" THEN
CompSearch = "$Contents " & chr(34) &
SearchString & chr(34)
ELSE
CompSearch = "@Contents " & SearchString
END IF

<!-- building the search string //-->
CompSearch = CompSearch + " OR @All " + SearchString

<!-- setting the LocaleID for the query (commented out as it
appears to make search return 0 results) //-->
IF SiteLocale <> "" THEN
'Q.LocaleID = util.ISOToLocaleID(SiteLocale)
END IF

jonathan

Mar 18, 2003, 9:33:23 AM
just as a point of reference here's what i did to sort
this problem.

----------------------------------------------
installed the japanese language pack on the server running the site
(available here -
http://www.microsoft.com/windows/ie/downloads/recommended/ime/install.asp?gssnb=1).

placed the following meta tags in the 'dummy' pages that
are indexed by index server.

<META HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset=Shift-JIS" />

<META NAME="MS.LOCALE" CONTENT="JA" />
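
For reference, a minimal Python sketch of generating a dummy page that carries these two meta tags and is saved in the character set it declares. The site in this thread is ASP, so this is only an illustration: build_dummy_page and the sample strings are hypothetical, and the redirect is left out because its mechanism isn't shown in the thread.

```python
import html


def build_dummy_page(title, body):
    """Return the bytes of a minimal HTML page flagged as Japanese for the indexer.

    Hypothetical helper: the real site generates these pages from its database.
    """
    page = (
        "<HTML>\n<HEAD>\n"
        '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift-JIS" />\n'
        '<META NAME="MS.LOCALE" CONTENT="JA" />\n'
        f"<TITLE>{html.escape(title)}</TITLE>\n"
        "</HEAD>\n<BODY>\n"
        f"{html.escape(body)}\n"
        "</BODY>\n</HTML>\n"
    )
    # Encode with the same character set the meta tag declares,
    # so the bytes on disk agree with what the indexer is told.
    return page.encode("shift_jis")


# Made-up sample content standing in for a database row.
page_bytes = build_dummy_page("新着情報", "日本語の本文です。")
```

The point of encoding explicitly is that the declared charset and the actual bytes have to agree; a page declared as Shift-JIS but saved in another encoding would hand the word breaker garbage.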


In the searchresults page:

save the current code page (so we can change back to it
after the japanese content has been displayed) and set the
session.codepage variable for japanese

session("CurrentCodePage") = session.codepage
session.codepage = 932
.
.
.
.
change the session.codepage var back to what it was
before we set it to japanese
session.codepage = session("CurrentCodePage")


also in the searchresults page ensure
objQuery.LocaleID=1041
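
The two numbers used above can be sanity-checked outside ASP. 1041 is the Windows locale ID for Japanese (0x0411), and 932 is the Windows code page for Shift-JIS, which Python exposes as the cp932 codec. The sketch below is only a check of those values; make_lcid is a hypothetical helper that mirrors the layout of the Windows MAKELCID macro, and the sample word is made up.

```python
def make_lcid(primary_lang, sub_lang, sort_id=0):
    """Combine language parts into a Windows-style locale ID (hypothetical helper)."""
    lang_id = (sub_lang << 10) | primary_lang  # low 16 bits: language ID
    return (sort_id << 16) | lang_id           # sort order sits above the language ID


LANG_JAPANESE = 0x11    # Windows primary language ID for Japanese
SUBLANG_DEFAULT = 0x01  # default sublanguage

japanese_lcid = make_lcid(LANG_JAPANESE, SUBLANG_DEFAULT)
print(japanese_lcid)       # 1041
print(hex(japanese_lcid))  # 0x411

# Code page 932 round trip: each of these two kanji takes two bytes.
sample = "検索"
encoded = sample.encode("cp932")
print(len(encoded))        # 4
```

This is why the query object's LocaleID and the session code page have to be set together: the first selects the Japanese word breaker for the query, the second controls how the submitted and displayed bytes are interpreted.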
