unicode problems


Tomasz Chmielewski

Apr 7, 2022, 8:35:06 AM
to ox...@googlegroups.com
Hello Group!

Some friends of mine asked me to build a simple model. The model itself caused no problems, but its visualization did. You see, the model contained some typically Romanian characters in names of sequences, phases, boundaries, and even single r-dates. In a downloaded PDF they appear as #. Is there any simple way to solve the apparent unicode problem?

Thanks in advance for any suggestions!

Tomasz

Bayliss, Alex

Apr 7, 2022, 9:12:59 AM
to ox...@googlegroups.com

Export the figures into Adobe or the like and edit manually! You are lucky: in previous versions, these characters made the code crash!

 

Alex

--
You received this message because you are subscribed to the Google Groups "OxCal" group.
To unsubscribe from this group and stop receiving emails from it, send an email to oxcal+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/oxcal/CA%2BYdYLLXDSXn7s%2Bf-NYEnJPzMmGar4xt_FTSgCBmnQZkZMFHhw%40mail.gmail.com.


Historic England Logo

Work with us to champion heritage and improve lives. Read our Future Strategy and get involved at historicengland.org.uk/strategy.
Follow us:  Facebook  |  Twitter  |  Instagram     Sign up to our newsletter     

This e-mail (and any attachments) is confidential and may contain personal views which are not the views of Historic England unless specifically stated. If you have received it in error, please delete it from your system and notify the sender immediately. Do not use, copy or disclose the information in any way nor act in reliance on it. Any information sent to Historic England may become publicly available. We respect your privacy and the use of your information. Please read our full privacy policy for more information.


Christopher Ramsey

Apr 7, 2022, 10:34:53 AM
to OxCal group
Tomasz

The issue is not within OxCal itself. Unicode characters do work, and a simple model like:

 Plot()
 {
  R_Date("δ∇πωΦẸč", 3000, 30);
 };

will display properly in the web browser. This assumes the characters have been entered from a Unicode-friendly editor, like the web interface or a text editor. If they are pasted from a Microsoft product they will probably be in some other encoding.
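A quick way to check whether a saved model file really is UTF-8 is to try decoding its bytes. The sketch below is only an illustration: the label "Brăila" is a made-up example, and cp1250 stands in for "some other format" as one legacy Windows code page, not necessarily the one a given Microsoft product would use.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical label containing a Romanian character.
label = "Brăila"

# The same text stored in two different encodings.
print(is_valid_utf8(label.encode("utf-8")))   # True
print(is_valid_utf8(label.encode("cp1250")))  # False
```

For a file on disk, the same function can be applied to the result of reading the file in binary mode.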

However, while the characters display correctly in the web browser, the conversion from SVG to PDF, which uses the Apache Batik engine on our server:

https://en.wikipedia.org/wiki/Apache_Batik

does lose some. The Greek characters are OK, but most of the others are not, and you get something like the first attached image.

You can edit the PDF or, probably easier, download the SVG file instead and open it in Inkscape (or probably Adobe Illustrator). It should be fine there, and you can then save it as a PDF, like the second attached image.
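For anyone with many figures, the Inkscape step might also be scripted rather than done by hand. This is a rough sketch under several assumptions: Inkscape 1.0 or later is installed and on the PATH (older 0.92 releases used a different option, `--export-pdf`), the file name `Quick-8.svg` is hypothetical, and whether every glyph survives still depends on the fonts available on your machine.

```python
import shutil
import subprocess
from pathlib import Path


def inkscape_pdf_command(svg_path, inkscape="inkscape"):
    """Build the Inkscape 1.x command line that exports an SVG to PDF."""
    svg = Path(svg_path)
    pdf = svg.with_suffix(".pdf")  # same name, .pdf extension
    return [inkscape, str(svg), f"--export-filename={pdf}"]


def convert_svg_to_pdf(svg_path):
    """Run Inkscape on one SVG file; raises if Inkscape is missing or fails."""
    if shutil.which("inkscape") is None:
        raise RuntimeError("Inkscape was not found on the PATH")
    subprocess.run(inkscape_pdf_command(svg_path), check=True)


# Show the command that would be run for a hypothetical downloaded figure.
print(inkscape_pdf_command("Quick-8.svg"))
```

Calling `convert_svg_to_pdf("Quick-8.svg")` would then write `Quick-8.pdf` next to the SVG.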

I will look into whether recent Batik updates have addressed this, but from the documentation it does not look as if this is part of the updates.

Best wishes

Christopher




Quick-7.pdf
Quick-8.pdf

Tomasz Chmielewski

Apr 7, 2022, 3:28:39 PM
to ox...@googlegroups.com
Dear Alex, Dear Christopher,

All the characters I was writing about were entered with the web interface (I simply switched from the Polish/English to the Romanian keyboard). That's why I thought it must have had something to do with how OxCal transforms Unicode characters. In the meantime, I also started to think about transforming the *.csv files as the best possible way of solving the problem. Now, having read Christopher's exhaustive explanation and your tips, I'm absolutely convinced by this solution. Many thanks!

Best,

Tomasz 

Christopher Ramsey

Apr 11, 2022, 7:47:09 AM
to OxCal group
Tomasz

Thanks - let me know if you need any more help. The CSV files generated by OxCal are also UTF-8 and will read correctly in text editors. However, Excel again has patchy support for UTF-8. It is possible to get my version of Excel to open the CSV files correctly if they are converted to UTF-8 with BOM rather than plain UTF-8, and I could add these special characters (the byte order mark, BOM) at the start of the CSV. The issue is that older versions of Excel don't work with that either, and you get a strange set of characters in the first cell. However, I expect this is less of an issue now than it was when I last updated the code, given that most people will have updated their Microsoft software for other reasons.
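In the meantime, the conversion to UTF-8 with BOM can be done on the user's side. The following is a minimal sketch, not what OxCal does internally: the function name `add_utf8_bom` and the file names in the usage note are made up for illustration, and it assumes the input really is UTF-8.

```python
import codecs
from pathlib import Path


def add_utf8_bom(src, dst):
    """Copy a UTF-8 CSV file to dst, prepending the UTF-8 BOM if absent."""
    data = Path(src).read_bytes()
    # Fail early (UnicodeDecodeError) if the input is not valid UTF-8.
    data.decode("utf-8")
    # Avoid adding a second BOM to a file that already has one.
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    # Writing bytes keeps the original line endings untouched.
    Path(dst).write_bytes(data)
```

For example, `add_utf8_bom("model.csv", "model_excel.csv")` would leave the original file as it is and write a copy that starts with the three BOM bytes EF BB BF.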

I can look at that and try to sort out a solution that works properly - any advice or experience of others on this would be welcome.

Best wishes

Christopher

Tomasz Chmielewski

Apr 13, 2022, 1:56:20 PM
to ox...@googlegroups.com
Dear Christopher,

Thank you again! The conversion of the *.csv files worked well, so I won't be looking for any other solution. Anyway, it's great that you expanded this thread and are ready to work on the problem. I suspect that many have already faced similar troubles but were afraid to ask. Probably most of us simply avoid using special characters in OxCal, but this may lead to some misunderstandings. Toponyms in particular, which are commonly used to describe sequences, phases, and even r_dates, might be difficult to identify when written using only the basic Latin alphabet. I'm just afraid that I cannot help in any other way than by pointing out the difficulty. I simply have no knowledge or experience of that :(

Best,

T.

Christopher Ramsey

Apr 13, 2022, 2:04:56 PM
to OxCal group
Dear Tomasz

Yes - it is important that people can use the correct characters in different languages. Fortunately the mechanisms for this have got better over the years and I have tried to make OxCal work as well as I can in this respect.

Best wishes

Christopher
