Error when trying to compile the example proto

Tommi Laukkanen

Nov 12, 2009, 2:36:00 PM
to Protocol Buffers
Hello

I am getting a bit desperate because of this problem.

Whatever I do, I keep getting:

Test.proto:1:1: Invalid control characters encountered in text.

This happens when I try to compile the example proto with protoc on
Windows Vista Ultimate 64-bit:

package tutorial;

option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}

Any ideas?

-tommi

Kenton Varda

Nov 12, 2009, 4:45:30 PM
to Tommi Laukkanen, Protocol Buffers
The error indicates that the first byte of the file is a non-whitespace character with ASCII value < 0x20.  You need to remove this invalid byte from the file.

Kenton Varda

Nov 12, 2009, 5:49:41 PM
to Tommi Laukkanen, Protocol Buffers
Marc points out that your editor is probably placing a UTF-8 BOM at the beginning of the file. I had assumed this couldn't be the cause because UTF-8 characters and control characters are different things. However, looking at the code, there appears to be a bug: if char is signed on your system, the bytes of non-ASCII UTF-8 characters will be considered control characters (because they are negative and therefore < 0x20). So maybe that is the issue.

I'd be happy to accept a patch which makes protoc ignore UTF-8 BOMs.
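
To illustrate the signed-char comparison described above, here is a minimal Python sketch. It is not protoc's actual tokenizer code; the helper names `as_signed_char` and `looks_like_control_char` are hypothetical, and the check is deliberately naive (it ignores the whitespace exemption). It only shows why the first byte of a UTF-8 BOM (0xEF) falls below 0x20 once it is read as a signed 8-bit value.

```python
def as_signed_char(byte: int) -> int:
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return int.from_bytes(bytes([byte]), "big", signed=True)


def looks_like_control_char(value: int) -> bool:
    """Naive check: anything below 0x20 (space) counts as a control character."""
    return value < 0x20


# First byte of the UTF-8 BOM, which is the byte sequence EF BB BF.
BOM_FIRST_BYTE = 0xEF

# Read as unsigned, 0xEF is 239, so the naive check does not flag it.
print(looks_like_control_char(BOM_FIRST_BYTE))

# Read as signed, 0xEF becomes -17, which is below 0x20 and gets flagged.
print(looks_like_control_char(as_signed_char(BOM_FIRST_BYTE)))
```

Comparing the byte as an unsigned value avoids the misclassification, which is presumably the kind of fix a patch would need alongside skipping the BOM itself.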

Marc Gravell

Nov 12, 2009, 6:03:27 PM
to Protocol Buffers
(damn, I forgot to reply-all again!)

It is also entirely possible that the two things are unrelated, in which case: sorry for any confusion.

But I do know that it is painfully easy to get BOM-heavy files if you use Visual Studio, and that protoc doesn't like it; I can't remember which error message it displays.

I guess the real thing to do is to look at the file as binary: how does it start?

Marc
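
One way to look at the start of the file as binary is a few lines of Python. This is only a sketch: the helper names `leading_bytes` and `starts_with_utf8_bom` are made up for illustration, and `Test.proto` is assumed to be the file named in the error message.

```python
import codecs


def leading_bytes(path: str, count: int = 4) -> bytes:
    """Return the first `count` raw bytes of the file, with no decoding."""
    with open(path, "rb") as f:
        return f.read(count)


def starts_with_utf8_bom(data: bytes) -> bool:
    """True if the data begins with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Example usage (assumes Test.proto exists in the current directory):
#   head = leading_bytes("Test.proto")
#   print(head.hex(" "), starts_with_utf8_bom(head))
```

A file saved with a BOM starts with the bytes ef bb bf; a clean copy of the example proto should start directly with the bytes of the text "package".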

Tommi Laukkanen

Nov 12, 2009, 9:10:09 PM
to Protocol Buffers
Hi

Thank you for the information. I use Visual Studio, which seems to be the root of the problem. I have now created the file with Notepad and it works.

-tommi

Kenton Varda

Nov 12, 2009, 9:26:09 PM
to Tommi Laukkanen, Protocol Buffers
There should be a setting somewhere in Visual Studio to make it not use BOMs.

Marc Gravell

Nov 12, 2009, 11:46:05 PM
to Protocol Buffers
File -> Advanced Save Options... -> Encoding: Unicode (UTF-8 without signature) - Codepage 65001

Marc
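
For files that were already saved with a signature, the BOM can also be stripped after the fact. The following Python sketch assumes the file is otherwise valid UTF-8; `strip_utf8_bom` and `rewrite_without_bom` are hypothetical helper names, not part of protoc or Visual Studio.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(path: str) -> bool:
    """Rewrite the file in place without a BOM; return True if it changed."""
    with open(path, "rb") as f:
        original = f.read()
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False  # no BOM found, leave the file untouched
    with open(path, "wb") as f:
        f.write(cleaned)
    return True


# Example usage (assumes Test.proto exists in the current directory):
#   rewrite_without_bom("Test.proto")
```

Working on raw bytes keeps the rest of the file byte-for-byte identical, so only the three signature bytes are removed.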
