A UTF-8 error


jar...@yeah.net

Mar 15, 2013, 1:01:39 PM
to falc...@googlegroups.com
OS:
Windows XP, Simplified Chinese, code page 936
(By the way, on Traditional Chinese XP, code page 950, there is no error.)

Path:
C:\Fal>falcon -M a.fal

The file content (the fault occurs easily when the string is a single Chinese character such as "了", "体", "大", etc.):
printl("了")

The file is saved as UTF-8.

In the XP cmd window, execute it like this:
C:\Fal>falcon -M a.fal

The console then outputs this error message:
falcon: FATAL - Program terminated with error.
SyntaxError CO0163 at /E:/Fal/a.fal:1: New line in literal string
SyntaxError CO0163 at /E:/Fal/a.fal:1: New line in literal string
SyntaxError CO0156 at /E:/Fal/a.fal:2: Unbalanced parenthesis at end of file (from line 1)

I don't know how to solve this problem, so I am submitting it here. Maybe somebody can give a correct answer.
Thanks a lot.

Steven Oliver

Mar 15, 2013, 4:07:53 PM
to FalconPL
Can you paste the code somewhere? Here, https://gist.github.com/ perhaps?
Steven N. Oliver

jar...@yeah.net

Mar 15, 2013, 11:51:06 PM
to falc...@googlegroups.com

I can execute the same source file (a.fal, saved as UTF-8) successfully on the Traditional Chinese OS (Windows XP, code page 950). Because of the decoding, the console output shows a different Chinese character or "?", but it does not output SyntaxError CO0163.

In the source file, if the statements are changed like this (two Chinese characters each):
printl("了了")
printl("体体")
printl("大大")
The console does not output SyntaxError CO0163; it outputs different characters instead.

If the source file (a.fal, saved as UTF-8, with the content printl("了")) is left unchanged but chcp 65001 is entered in the XP cmd window,
the console does not output SyntaxError CO0163.

If the source file is saved as GB2312, GBK, or GB18030, then in the XP cmd window with code page 936 the code executes perfectly.
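
One plausible reading of the behaviour above, sketched in Python: each of these Chinese characters is three bytes in UTF-8, all with the high bit set. If the decoder used under code page 936 pairs every byte from 0x81 upward with the byte that follows it (an assumption about the decoder, not something confirmed in this thread), a single character leaves one unpaired high byte that swallows the closing quote, while two characters pair up evenly. The helper names below are hypothetical.

```python
def naive_dbcs_units(data):
    """Split bytes the way a simple double-byte decoder might (assumed model):
    any byte >= 0x81 is treated as a lead byte and consumes the next byte."""
    units = []
    i = 0
    while i < len(data):
        if data[i] >= 0x81 and i + 1 < len(data):
            units.append(data[i:i + 2])  # lead byte plus whatever follows it
            i += 2
        else:
            units.append(data[i:i + 1])  # plain single byte (ASCII)
            i += 1
    return units


def closing_quote_survives(text):
    """True if the closing quote of "<text>" is still a standalone unit
    after the UTF-8 bytes are split by the assumed double-byte model."""
    raw = b'"' + text.encode("utf-8") + b'"'
    return naive_dbcs_units(raw)[-1] == b'"'


for sample in ["了", "体", "大", "了了", "体体", "大大"]:
    # Show the UTF-8 bytes and whether the closing quote is left intact.
    print(sample, sample.encode("utf-8").hex(" "), closing_quote_survives(sample))
```

Under that assumption the single-character strings lose their closing quote, which would be consistent with "New line in literal string", and the doubled strings decode to three unrelated characters with the quote intact.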
 
 
-------------------------------------------------------------------------------------------------------------------------------
On Saturday, March 16, 2013 at 4:07:53 AM UTC+8, steveno wrote:

Giancarlo Niccolai

Mar 18, 2013, 6:36:55 AM
to falc...@googlegroups.com
On 16/03/2013 04:51, jar...@yeah.net wrote:

I can execute the same source file (a.fal, saved as UTF-8) successfully on the Traditional Chinese OS (Windows XP, code page 950). Because of the decoding, the console output shows a different Chinese character or "?", but it does not output SyntaxError CO0163.

In the source file, if the statements are changed like this (two Chinese characters each):
printl("了了")
printl("体体")
printl("大大")
The console does not output SyntaxError CO0163; it outputs different characters instead.

If the source file (a.fal, saved as UTF-8, with the content printl("了")) is left unchanged but chcp 65001 is entered in the XP cmd window,
the console does not output SyntaxError CO0163.

If the source file is saved as GB2312, GBK, or GB18030, then in the XP cmd window with code page 936 the code executes perfectly.
 

We try to auto-detect the system I/O encoding, but when we're not able to, the I/O system falls back to "C", or "encode as-is".

Also, on Windows the system I/O encoding is determined by looking at the system defaults. When the console code page and the source file encoding are different, you have to either set the proper encoding option in the output stream (stdOut()) or use the command line options -e and/or -E (check out falcon -H for a description of the two).

   -E <enc>    Source files are in <enc> encoding (overrides -e)
  ...
   -e <enc>    set given encoding as default for VM I/O

Now, -e changes how > "..." encodes strings known to the engine when writing them to stdout, and how input() reads strings.
-E selects how the source files are read. Notice that after a first read, .fam files are created and the strings are "sealed" as they were interpreted during the source file read. So, if you create a .fam through the correct encoding option, you can send it anywhere and just worry about the I/O settings on the target platform.
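
As a rough illustration of combining the two options, here is a small Python sketch that builds a falcon command line with -E for the source encoding and -e for the VM I/O encoding. The encoding name "gbk" is an assumption (only "utf-8" appears as an accepted name in this thread), and falcon_command is a hypothetical helper, so check falcon -H for the names your build accepts.

```python
import shlex


def falcon_command(script, source_enc="utf-8", io_enc="gbk", exe="falcon"):
    """Build an argument list for falcon (hypothetical helper).

    -E <enc>: encoding of the source files
    -e <enc>: default encoding for VM I/O
    The default "gbk" name is an assumption, not confirmed for Falcon.
    """
    return [exe, "-E", source_enc, "-e", io_enc, script]


cmd = falcon_command("a.fal")
# Print the command as it would be typed in a shell; nothing is executed here.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a machine where falcon is installed.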

About Chinese encoding, we have GBK (http://en.wikipedia.org/wiki/GBK); in case this is not what you need, you should be able to write a transcoder by looking at engine/transcoding.cpp and copying from one of the existing transcoders (in case you do, we'd be delighted to add it).

(I know GBK is used on Windows as well, as we were testing it with the author on Windows).
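
Another route, since the original report says a GBK-saved file runs correctly under code page 936, is to re-save the source in GBK before running it. The Python sketch below does that conversion; reencode and the file names are hypothetical, and "utf-8-sig" is used so that a leading byte order mark, if the editor wrote one, is dropped.

```python
import tempfile
from pathlib import Path


def reencode(src, dst, from_enc="utf-8-sig", to_enc="gbk"):
    """Read src as from_enc and write the same text to dst as to_enc."""
    text = Path(src).read_text(encoding=from_enc)
    Path(dst).write_bytes(text.encode(to_enc))
    return text


# Self-contained demo in a temporary directory (file names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "a.fal"
    dst = Path(tmp) / "a_gbk.fal"
    src.write_bytes('printl("了")\n'.encode("utf-8"))
    reencode(src, dst)
    converted = dst.read_bytes()
    print(converted)
```

Characters that GBK cannot represent would raise UnicodeEncodeError here, which is a reasonable signal that the file should stay in UTF-8 and be run with the encoding options instead.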

Gian.

jar...@yeah.net

Mar 18, 2013, 3:19:10 PM
to falc...@googlegroups.com


On Monday, March 18, 2013 6:36:55 PM UTC+8, Giancarlo Niccolai wrote:
We try to auto-detect the system I/O encoding, but when we're not able to, the I/O system falls back to "C", or "encode as-is".

Also, on Windows the system I/O encoding is determined by looking at the system defaults. When the console code page and the source file encoding are different, you have to either set the proper encoding option in the output stream (stdOut()) or use the command line options -e and/or -E (check out falcon -H for a description of the two).

   -E <enc>    Source files are in <enc> encoding (overrides -e)
  ...
   -e <enc>    set given encoding as default for VM I/O

Now, -e changes how > "..." encodes strings known to the engine when writing them to stdout, and how input() reads strings.
-E selects how the source files are read. Notice that after a first read, .fam files are created and the strings are "sealed" as they were interpreted during the source file read. So, if you create a .fam through the correct encoding option, you can send it anywhere and just worry about the I/O settings on the target platform.
...
Gian.

------------------------------------------------------------------


When I save the source file as UTF-8, I use the command ("E:\Program Files\Falcon\bin\falcon.exe" "-M" "-e" "utf-8" "E:\Fal\a.fal") in NppExec (a Notepad++ plugin) and select UTF-8 in NppExec's Console output/input encoding dialog box. With that, the UTF-8 problem is solved.

Thanks a lot !

Jason.