Modified:
CJKSplitter/trunk/README.txt
Log:
change to rst
Modified: CJKSplitter/trunk/README.txt
==============================================================================
--- CJKSplitter/trunk/README.txt (original)
+++ CJKSplitter/trunk/README.txt Fri Oct 13 09:47:10 2006
@@ -1,41 +1,40 @@
CJKSplitter - Chinese, Japanese, Korean word splitter for ZCTextIndex
+=============================================================================
+CJKSplitter is a ZCTextIndex splitter for CJK (Chinese-Japanese-Korean) text
+stored as Unicode. It uses a simple, but workable, "hack" instead of trying
+to do real word splitting from dictionaries. Compared to a dictionary-based
+word splitter, this results in a bigger index and more matches than necessary,
+but it is a cheap price to pay for the reduced complexity.
- CJKSplitter is a ZCTextIndex splitter for CJK (Chinese-Japenese-Korea) text
- stored as Unicode. It uses a simple, but workable, "hack" instead of trying
- to do real word splitting from dictionaries. Compared to a dictionary based
- word splitter, this results in a bigger index and more matches than necessary,
- but it is a cheap price to pay for the reduced complexity.
+Features
+================
+- use a regular expression to stay compatible with the default English
+whitespace splitter
-Feature
+- much simpler code, easy to install, easy to use
- - use regular expression to compatible with defualt English white space
- splitter
+- support multiple encodings: unicode/utf-8/gb18030/gbk/gb2312/mbcs/big5.
+provide 3 splitters (more to come):
- - much simpler code, easy to install, easy to use
+ * 'CJK splitter' : support unicode/utf-8 encoding. this encoding is
+compatible with version 0.1
- - support multiple encodings: unicode/utf-8/gb18030/gbk/gb2312/mbcs/big5.
- provide 3 splitters(more to come):
+ * 'CJK GB splitter' : support unicode/gb18030/gbk/gb2312/mbcs encodings.
- * 'CJK splitter' : support unicode/utf-8 encoding. this encoding is
- compatible with version 0.1
+ * 'CJK BIG5 splitter' : support unicode/big5/mbcs encodings
- * 'CJK GB splitter' : support unicode/gb18030/gbk/gb2312/mbcs encodings.
+- smaller index storage for CJK: index stored as unicode (2 bytes) rather than
+utf-8 (3 bytes)
- * 'CJK BIG5 splitter' : support unicode/big5/mbcs encodings
+- support English globbing
- - smaller index storage for CJK: index stored as unicode(2 byts) but not
- utf-8(3 bytes)
+- support single Chinese character search
- - support english globing
+About ZOpen
+=================
+ZOpen is a professional Zope/Plone consulting company located in Shanghai,
+China. We are also the supporter for CZUG.org (China Zope User Group).
+We are trying to make Zope/CMF/Plone work for the Chinese people.
- - support single Chinese charactor search
-
-About ZopeChina
-
- ZopeChina.com is a leading ZSP(Zope Service Provider) in China. We are also
-the supporter for CZUG.org (China Zope User Group). We are trying to make
-Zope/CMF/Plone works for the Chinese people. We wish all the Chinese Zope guys
-can be together and make zope works better for Chinese:)
-
- Contact us with : pan_j...@yahoo.com.cn
+Contact us at: pa...@zopen.cn