Write wide string to file


Pontus Östlund

Nov 8, 2012, 9:00:04 AM
to Pike mailinglist
Hello Pike list!

I have a problem which I don't know how to solve. I call a web service which returns a PDF document as a byte array. The actual web service call is done via Java. Pike invokes the Java web service client, which then returns the byte array to Pike, where I create a string from the array.

The problem is that I can't save the file to disk, since Pike complains about the string being a wide string. As a side note, it works when printing the string to the console, and the result looks pretty much like you would expect a PDF to look.

How do I solve this?

Cheers, Pontus

Chris Angelico

Nov 8, 2012, 9:02:50 AM
to Pike mailinglist
If they're supposed to be bytes, why is the string wide? There may be
a fundamental data issue here somewhere. But, possibly the easiest and
quickest fix is to pass everything through string_to_utf8() before
writing it out - UTF-8 being the most likely encoding to work.

ChrisA

Pontus Östlund

Nov 8, 2012, 9:08:21 AM
to Chris Angelico, Pike mailinglist
Thanks, but I should have said that I had already tried that (and a bunch of other encoding/decoding tricks).

When trying string_to_utf8() Pike says: Character 0xffffffe2 at index 10 is outside the allowed range.

I have tried writing the data to disk directly in Java and that works fine. But the file will not be written to disk in real life. I have also tried creating a string from the byte array in Java and returning the string from the Java method instead, but that creates the same wide string problem in Pike.


Chris Angelico

Nov 8, 2012, 9:16:34 AM
to Pike mailinglist
On Fri, Nov 9, 2012 at 1:08 AM, Pontus Östlund <poppa...@gmail.com> wrote:
> When trying string_to_utf8() Pike says: Character 0xffffffe2 at index 10 is
> outside the allowed range.
>
> I have tried writing the data to disk directly in Java and that works fine.
> But the file will not be written to disk in real life. I have also tried
> creating a string from the byte array in Java and return the string instead
> from the Java method, but that creates the same wide string problem in Pike.

Ah! You have negative numbers in there. Probably everything >127 is
coming out as negative, getting sign-extended, and then being utterly
unencodeable.

How do you transfer the byte array from Java to Pike? That's where the
problem is, I think.

ChrisA
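
For illustration, here is a minimal sketch of the effect described above, assuming the array elements arrive in Pike as signed values. The helper name naive_join and the sample values are made up for this example; they are not from the original code.

```pike
// Hypothetical helper: builds a string one character at a time with "%c",
// the same way the loop discussed in this thread does.
string naive_join(array(int) ints)
{
  String.Buffer b = String.Buffer();
  foreach (ints, int c)
    b->add(sprintf("%c", c));
  return b->get();
}

int main()
{
  // Made-up sample: "%PDF" followed by 0xE2 and 0xE3 expressed as
  // signed bytes (-30 and -29).
  array(int) signed_bytes = ({ 0x25, 0x50, 0x44, 0x46, -30, -29 });

  string raw = naive_join(signed_bytes);

  // String.width() reports the number of bits per character (8, 16 or 32).
  // Anything above 8 means the string is wide and cannot be written as-is.
  write("bits per character: %d\n", String.width(raw));
  return 0;
}
```

Checking String.width() before writing is a cheap way to confirm whether negative values have slipped into the string.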

Pontus Östlund

Nov 8, 2012, 9:32:27 AM
to Chris Angelico, Pike mailinglist
2012/11/8 Chris Angelico <ros...@gmail.com>
> On Fri, Nov 9, 2012 at 1:08 AM, Pontus Östlund <poppa...@gmail.com> wrote:
> [...]
>
> > I have tried writing the data to disk directly in Java and that works fine.
> > But the file will not be written to disk in real life. I have also tried
> > creating a string from the byte array in Java and return the string instead
> > from the Java method, but that creates the same wide string problem in Pike.
>
> Ah! You have negative numbers in there. Probably everything >127 is
> coming out as negative, getting sign-extended, and then being utterly
> unencodeable.
>
> How do you transfer the byte array from Java to Pike? That's where the
> problem is, I think.
>
> ChrisA


Indeed, there are negative numbers. This is pretty much how it works:

object java = Java.pkg["my/java/class"]();
object /* Java.jarray */ res = java->downloadDocument(some, params); // returns byte[]
array(int) pike_array = values(res);

String.Buffer b = String.Buffer();

foreach (pike_array, int c)
  b->add(sprintf("%c", c));

Stdio.write_file("test.pdf", b->get());

And as I wrote before: The same problem occurs if I create a string from the byte array directly in Java and transfer the string to Pike.

# Pontus

Chris Angelico

Nov 8, 2012, 9:36:57 AM
to Pike mailinglist
On Fri, Nov 9, 2012 at 1:32 AM, Pontus Östlund <poppa...@gmail.com> wrote:
> String.Buffer b = String.Buffer();
>
> foreach (pike_array, int c)
> b->add(sprintf("%c", c));
>
> Stdio.write_file("test.pdf", b->get());

Try %1c instead of %c - that'll write out single-byte values for the
negative integers.

Incidentally, you can shortcut a whole lot of that with one of
sprintf's cool features:

Stdio.write_file("test.pdf", sprintf("%@1c", pike_array));

The @ sign means "do this for every element of the argument array".

ChrisA

Henrik Grubbström (Lysator) @ Pike (-) importmöte för mailinglistan

Nov 8, 2012, 9:40:02 AM
to pi...@roxen.com
>Indeed, there are negative numbers. This is pretty much how it works:
>
>object java = Java.pkg["my/java/class"]();
>object /* Java.jarray */ res = java->downloadDocument(some, params); // returns byte[]
>array(int) pike_array = values(res);
>
>String.Buffer b = String.Buffer();
>
>foreach (pike_array, int c)
> b->add(sprintf("%c", c));

Modify the above to

b->add(sprintf("%c", c & 255));

or just replace all of it with

string res = (string)map(pike_array, `&, 255);

>Stdio.write_file("test.pdf", b->get());
>
>And as I wrote before: The same problem occurs if I create a string from
>the byte array directly in Java and transfer the string to Pike.

That does seem a bit strange though. Marcus?

/grubba

Pontus Östlund

Nov 8, 2012, 9:53:20 AM
to Henrik Grubbström (Lysator) @ Pike (-) importmöte för mailinglistan, Pike mailinglist
2012/11/8 Henrik Grubbström (Lysator) @ Pike (-) importmöte för mailinglistan <63...@lyskom.lysator.liu.se>

> >foreach (pike_array, int c)
> >  b->add(sprintf("%c", c));
>
> Modify the above to
>
>    b->add(sprintf("%c", c & 255));
>
> or just replace all of it with
>
>   string res = (string)map(pike_array, `&, 255);
> [...]
> /grubba


Awesome Grubba! You're the boss! That did the trick :)

# Pontus

Chris Angelico

Nov 8, 2012, 10:10:11 AM
to Pike mailinglist
On Fri, Nov 9, 2012 at 1:40 AM, Henrik Grubbström (Lysator) @ Pike (-)
importmöte för mailinglistan <63...@lyskom.lysator.liu.se> wrote:
> or just replace all of it with
>
> string res = (string)map(pike_array, `&, 255);

Or, picking up the other thread:

string res = (string)(pike_array[*]&255);

The explicit map is probably clearer though.

ChrisA
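
Putting the suggestions in this thread together, a minimal end-to-end sketch might look like the following. The function name bytes_to_string and the sample values are made up for illustration; the masking line is the one Grubba posted.

```pike
// Hypothetical helper: masks each element to the range 0..255 so the
// resulting string is 8-bit (narrow) and can be written to disk.
string bytes_to_string(array(int) signed_bytes)
{
  return (string)map(signed_bytes, `&, 255);
}

int main()
{
  // Made-up sample standing in for values(res): "%PDF" followed by
  // 0xE2 and 0xE3 expressed as signed bytes.
  array(int) pike_array = ({ 0x25, 0x50, 0x44, 0x46, -30, -29 });

  string data = bytes_to_string(pike_array);

  // Guard against wide characters before attempting to write.
  if (String.width(data) != 8)
    error("Expected an 8-bit string.\n");

  Stdio.write_file("test.pdf", data);
  return 0;
}
```

Masking once, right where the data crosses over from Java, keeps the rest of the code free of signed values.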

arronlee

Jan 25, 2016, 8:08:15 AM
to pi...@roxen.com
I wonder if you have any ideas about PDF extraction
<http://www.pqscan.com/extract-text/>? Something went wrong with my PDF
reader.


H. William Welliver III

Jan 25, 2016, 9:32:14 AM
to arronlee, pi...@roxen.com
I generally use iText for working with PDF content. The Java bridge makes it simpler to access Java functions from within Pike.

A simple example using Java:

http://stackoverflow.com/questions/8821107/pdf-text-extraction-using-itext

Here’s some background on the Java module:

http://bill.welliver.org/space/pike/Java+Bridge

And an example of using the bridge (though not using iText, unfortunately):

http://bill.welliver.org/space/pike/JMS+interface

I know others have used iText from within Pike, so if you have any questions, there should be some folks knowledgeable in that.

Hope this helps!

Bill