Re: Support for Alternate Character Encodings?


Sean Owen

Jul 10, 2012, 12:41:51 PM
to zx...@googlegroups.com
This depends a great deal on what barcode format you're talking about. For example there is no such thing as character encoding in Code 128. There is in a QR code. If the system is making Code 128 codes, it is irrelevant what encoding that system uses.

Are you sure the barcodes "really" contain EBCDIC-encoded text? You should post an example. I actually doubt it. Being unreadable to humans doesn't say anything about the encoding.

I don't know what you mean about ASCII conversion. The output of the reader is already a String; there is no such thing as character encoding within a String.


nullPointer

Jul 10, 2012, 2:17:01 PM
to zx...@googlegroups.com

Hi Sean,

Thanks a lot for taking the time to respond to this!  That’s great information about how different barcodes have varying levels of support for different character encodings.  The barcode being utilized by the old legacy system is a DataMatrix code.

I’m relatively sure that the DataMatrix barcode contains EBCDIC characters, but only insofar as the developers on the external system have stated as much.  At the very least, the barcode data is able to be decoded by the ZXing reader, so the data is ‘valid’ in terms of the barcode contents.

Here’s an image of a problematic barcode (Sorry it's not an attachment, the internet group policy in my current location appears to be a bit finicky regarding attachments):

Sample_DataMatrix.png

Naturally you’re right about Strings not knowing anything about character encodings.  It’s really the encoding of the bytes contained in the String that I’m altering.  I’m still not entirely sure I’m being clear though, so where words fail, I’ll just post some code.  Here’s my conversion routine:

public static final String stringToAsciiString(String inString)
{
    String asciiString = null;

    if (inString != null)
    {
        try
        {
            byte[] asciiBytes = inString.getBytes("US-ASCII");
            asciiString = new String(asciiBytes, "US-ASCII");
        }
        catch (UnsupportedEncodingException encEx)
        {
            return null;
            // encEx.printStackTrace();
        }
    }
    return asciiString;
}


The code above converts the barcode contents to ASCII encoding, but it doesn’t help to make the contents any more readable.

Sean Owen

Jul 10, 2012, 2:23:31 PM
to zx...@googlegroups.com
Good, that narrows it down. Yes, Data Matrix supports a byte mode which, like QR codes, also has a notion of ECI messages to specify character encoding. There is otherwise no particular relation between Data Matrix and EBCDIC. So, the Data Matrix has to have an ECI segment specifying this encoding.

Two bits of bad news there. First, I don't know if there is an ECI value assigned to EBCDIC. Second, ECI is not implemented in the DM decoder. It wouldn't be hard to fix that up, but it's not there.

Your conversion doesn't do anything. The String is already whatever it is -- too late. Translating to ASCII-encoded bytes and back does nothing.
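
To illustrate the point, here is a minimal sketch (the class and method names are made up for this example, not taken from ZXing or the original code). It shows that encoding an already-decoded String to US-ASCII and back leaves ASCII text unchanged, and can only lose information for anything else:

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Encode a String to US-ASCII bytes and decode those bytes straight back.
    public static String roundTrip(String s) {
        byte[] asciiBytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(asciiBytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Pure ASCII text survives unchanged, so the round trip is a no-op.
        String alreadyDecoded = "ABC123";
        System.out.println(roundTrip(alreadyDecoded).equals(alreadyDecoded)); // prints "true"

        // Characters outside ASCII are replaced with '?', so data is lost, not repaired.
        System.out.println(roundTrip("caf\u00e9")); // prints "caf?"
    }
}
```

Either way, the round trip never recovers bytes that were misinterpreted when the String was first built.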

However, if you know the bytes in the DM code are EBCDIC, then access them directly with Result.getRawBytes(). Then make a String from that with "EBCDIC" encoding. I don't know if that's the right name for it in Java but it seems to be "Cp1047" actually.
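
A rough sketch of that decoding step, under some assumptions: the class and method names are hypothetical, the sample byte array is hand-written rather than read from a real barcode, and "Cp1047" is assumed to be available in the running JDK (it is an extended charset, so `Charset.forName` throws `UnsupportedCharsetException` if it is missing). In real use the byte array would come from `Result.getRawBytes()`; ZXing is left out here so the snippet runs on its own.

```java
import java.nio.charset.Charset;

public class EbcdicDecode {
    // Interpret the given bytes as EBCDIC text, using the "Cp1047" name
    // suggested above. Throws UnsupportedCharsetException if the JDK lacks it.
    public static String decodeEbcdic(byte[] rawBytes) {
        return new String(rawBytes, Charset.forName("Cp1047"));
    }

    public static void main(String[] args) {
        // Hypothetical payload standing in for barcode bytes: "HELLO" in EBCDIC.
        byte[] sample = {
            (byte) 0xC8, (byte) 0xC5, (byte) 0xD3, (byte) 0xD3, (byte) 0xD6
        };
        System.out.println(decodeEbcdic(sample)); // prints "HELLO"
    }
}
```

If the external system actually uses a different EBCDIC code page, only the charset name would need to change.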

nullPointer

Jul 12, 2012, 12:37:44 PM
to zx...@googlegroups.com

Hi again Sean,

I just wanted to say thanks again for your assistance on this issue.  Turns out that getRawBytes() was just what the doctor ordered.  After using that I was able to apply EBCDIC encoding just as you postulated.  Success!

Interestingly, it was right around this same time that the external team came through with a separate breakthrough.  Ultimately they were able to apply ASCII encoding to the text contained in the barcode.  Another success!  (And for my money I’ll probably let the external system deal with the extra processing.  It’s still great to know that I’m not strictly tied to their encoding though!)

Thanks again, and this issue is most definitely resolved at this point.
