The custom search code uses URLEncoder.encode(text, "UTF-8") to encode the data for the URL:
https://github.com/zxing/zxing/blob/2f11529aa35e01354f9036c2aa7747ab23a604ef/android/src/com/google/zxing/client/android/result/ResultHandler.java
The contained binary data is obviously not UTF-8. Would the correct way to handle binary data be to add a raw variant (e.g. %p, which only percent-encodes the bytes) in addition to %s?
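To make the idea concrete, here is a minimal sketch of what such a raw variant could do. The %p placeholder, the class name, and the method name are all hypothetical and not part of ZXing. The sketch assumes the raw bytes of the barcode are available as a byte[] and leaves only the RFC 3986 unreserved characters unescaped.

```java
public class RawPercentEncoder {

    // RFC 3986 "unreserved" characters, which never need escaping in a URL.
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Hypothetical helper: percent-encode each byte as-is, with no charset
    // decoding step in between, so 0x80 becomes "%80".
    static String percentEncode(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 3);
        for (byte b : raw) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (UNRESERVED.indexOf(v) >= 0) {
                sb.append((char) v);
            } else {
                sb.append('%');
                sb.append(Character.toUpperCase(Character.forDigit(v >> 4, 16)));
                sb.append(Character.toUpperCase(Character.forDigit(v & 0x0F, 16)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {0x41, (byte) 0x80, (byte) 0xFF, 0x20};
        System.out.println(percentEncode(raw));
    }
}
```

Unlike URLEncoder, this sketch writes a space as %20 rather than +, and the receiving server would have to interpret the escaped bytes as binary rather than as UTF-8 text.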
I have attached a QR code with binary data. The first five bytes can be ignored (the first four bytes are the length in decimal digits). After that, for each value from 0 to 0xFF, the two characters of its hex representation are stored, followed by the value itself as a single byte. The QR code contains 773 bytes.
The first bytes, up to 0x7F, are correct. Every larger value is shown as %EF%BF%BD, which is the UTF-8 encoding of the standard replacement character (U+FFFD).
Here is the returned Custom Search URL (added some line breaks):
0773%07
00%0001%0102%0203%0304%0405%0506%0607%0708%08
09%09
0A%0A
0B%0B
0C%0C0D%0D0E%0E0F%0F10%10
11%1112%1213%1314%1415%1516%1617%1718%1819%191A%1A1B%1B1C%1C1D%1D1E%1E1F%1F
20+
21%2122%2223%2324%2425%2526%2627%2728%2829%29
2A*
2B%2B2C%2C
2D-2E.2F%2F
300
311
322333344355366377388399
3A%3A3B%3B3C%3C3D%3D3E%3E3F%3F40%40
41A42B43C44D45E46F47G48H49I4AJ4BK4CL4DM4EN4FO50P51Q52R53S54T55U56V57W58X59Y5AZ
5B%5B5C%5C5D%5D5E%5E
5F_
60%60
61a62b63c64d65e66f67g68h69i6Aj6Bk6Cl6Dm6En6Fo70p71q72r73s74t75u76v77w78x79y7Az
7B%7B7C%7C7D%7D7E%7E7F%7F
80%EF%BF%BD
81%EF%BF%BD
82%EF%BF%BD
83%EF%BF%BD
84%EF%BF%BD
[... more removed ...]
FF%EF%BF%BD
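The %EF%BF%BD pattern can be reproduced outside the app. The sketch below is only an illustration, not the ZXing code path: it assumes that somewhere before the URLEncoder call the raw bytes are turned into a String, and it compares two possible charsets for that step. The class and method names are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {

    // Decode raw bytes with the given charset (the assumed lossy step), then
    // URL-encode the resulting text as UTF-8, as the custom search code does.
    static String decodeThenEncode(byte[] raw, Charset charset) {
        String text = new String(raw, charset);
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x7F, (byte) 0x80};
        // 0x80 alone is not valid UTF-8, so it decodes to U+FFFD.
        System.out.println(decodeThenEncode(raw, StandardCharsets.UTF_8));      // %7F%EF%BF%BD
        // ISO-8859-1 maps 0x80 to U+0080, which UTF-8 encodes as C2 80.
        System.out.println(decodeThenEncode(raw, StandardCharsets.ISO_8859_1)); // %7F%C2%80
    }
}
```

Once a byte has been decoded to U+FFFD, the original value cannot be recovered. With ISO-8859-1 the value survives as a code point, but URLEncoder still emits its UTF-8 form (%C2%80) rather than the raw byte (%80).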
TiA
Christian
--
You received this message because you are subscribed to the Google Groups "zxing" group.
If I understand the issue, no, that's not the problem. Are you saying URLEncoder doesn't work? I doubt that; I'd suspect the contents of the barcode aren't being read as bytes in the way you expect.
--
Yeah, that's a good point about those bytes not being allowed in ISO-8859-1. You could consider base-64 encoding of course, though that makes it about 33% bigger. I think it would be better to set the ECI segment if any non-default segment is specified, not just if it doesn't match the default of ISO-8859-1. If that solves your problem I can do that.
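For a sense of the base-64 overhead, here is a small sketch using java.util.Base64 (Java 8 and later; on Android that class needs API level 26, with android.util.Base64 as the older alternative). The 773-byte payload is a stand-in filled with arbitrary values, not the actual barcode content, and the class name is invented for the example.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64Overhead {

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), with padding.
    static String encode(byte[] raw) {
        return Base64.getUrlEncoder().encodeToString(raw);
    }

    static byte[] decode(String encoded) {
        return Base64.getUrlDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Stand-in payload with the same size as the barcode in this thread.
        byte[] raw = new byte[773];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) i;
        }
        String encoded = encode(raw);
        System.out.println(raw.length + " bytes -> " + encoded.length() + " characters");
        System.out.println("round trip ok: " + Arrays.equals(raw, decode(encoded)));
    }
}
```

Base-64 emits four characters for every three input bytes, so 773 bytes become 1032 characters including one '=' padding character. All byte values, including 0x80 to 0xFF, survive the round trip.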
--