On Tue, Nov 29, 2011 at 12:31 AM, David Huynh <dfh...@gmail.com> wrote:
> Hi Thad,
> My latest checkin should fix your clipboard scenario. For some reason,
> neither Chrome nor Safari sends any charset in the POST.

Based on Thad's previous feedback, his data was Windows-1252, so we
may need to use the platform's default encoding rather than UTF-8
across all platforms.
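
As a rough illustration of the trade-off (not what the checkin does), here is a minimal Python sketch of one possible compromise: try strict UTF-8 first and fall back to the platform's default encoding only when the bytes are not valid UTF-8. The function name `decode_clipboard_bytes` and the `fallback` parameter are hypothetical, and the "UTF-8 first" ordering is an assumption rather than anything proposed in this thread.

```python
import locale
from typing import Optional


def decode_clipboard_bytes(raw: bytes, fallback: Optional[str] = None) -> str:
    """Decode POSTed bytes that arrived without a declared charset.

    Hypothetical helper: strict UTF-8 is tried first; if that fails,
    the bytes are decoded with `fallback`, or with the platform's
    preferred encoding when no fallback is given.
    """
    try:
        return raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        encoding = fallback or locale.getpreferredencoding(False)
        # errors="replace" keeps the import going even on stray bytes
        return raw.decode(encoding, errors="replace")


# Windows-1252 bytes for "Trois-Rivières" are not valid UTF-8,
# so the fallback is used here.
legacy = b"Trois-Rivi\xe8res Port Authority"
print(decode_clipboard_bytes(legacy, fallback="cp1252"))
```

The fallback is passed explicitly in the example so that the result does not depend on the machine it runs on; leaving it out would pick up whatever the local platform reports.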
> Hi Tom,
> The bigger issue here is that a single project can now be created from
> several files, each with potentially a different encoding. A project-wide
> encoding setting no longer seems to make sense.

Importing multiple files with different encodings into the same
project seems like a recipe for disaster. Do we think it's going to be a
likely case?
> I think in Reinterpret.java, line 67, encoder should still be o2 and decoder
> should be args[2], so that the decoder would be an optional parameter. What
> do you think?

Changing reinterpret to be reinterpret(new-encoding,old-encoding)
instead of vice-versa is fine with me. Optional arguments
traditionally follow mandatory arguments, but I didn't really think
there was a backward compatibility issue in this case.
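
To make the argument order concrete, here is a small Python sketch of a reinterpret-style function with the mandatory new encoding first and the optional old encoding last. It is a hypothetical stand-in, not the code in Reinterpret.java; in particular, falling back to the platform's default encoding when the old encoding is omitted is an assumption made for the example.

```python
import locale
from typing import Optional


def reinterpret(value: str, new_encoding: str,
                old_encoding: Optional[str] = None) -> str:
    """Re-read a mis-decoded string under a different encoding.

    Hypothetical sketch: the string is turned back into bytes using
    `old_encoding` (the encoding it was wrongly read with), and those
    bytes are then decoded with `new_encoding`.
    """
    # Assumption: an omitted old encoding means "platform default".
    old = old_encoding or locale.getpreferredencoding(False)
    raw = value.encode(old)
    return raw.decode(new_encoding)


# UTF-8 bytes that were wrongly read as Windows-1252, then repaired.
print(reinterpret("Trois-Rivi\u00c3\u00a8res", "utf-8", "cp1252"))
```

With this ordering, a caller who only knows the target encoding can write `reinterpret(value, "utf-8")` and leave the second encoding out.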
The paste into the clipboard is handled by the OS as an array of bytes,
specifically a MemoryStream in Windows, I think. It seems that both
the encoding chosen by the user for rendering a document in the
browser and the application itself affect how a given DOM document
is rendered and displayed. The DOM document encoding is handled by a
property called documentCharacterSet, I think.
Regardless, I do not think the platform's encoding is an issue. I
should be able to operate my Windows system with Greek as my locale
while Refine, as an application, uses UTF-8 as the default
rendering for the DOM document, which it appears Refine has always
done correctly. So we're good there. And browsers are now set to
auto-detect by parsing <html lang="en"><head><meta charset="utf-8">.
A side note: pasting a single character that was previously copied by
highlighting it on an HTML page actually does different things depending
on the application that you're pasting into. Some applications, like
PSPad (and its built-in hex editor) on Windows, allow you to paste as
Text, Unicode Text, HTML Format, OEM Text, or Text Locale, and will carry
along the metadata with the clipboard's MemoryStream, such as even the
source URL of the document you copied from.
The FORM element that Refine uses for the Clipboard can utilize this:
https://developer.mozilla.org/en/DOM/form.acceptCharset
The only real abilities that current browsers have with the Form
element are noted here:
http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#adef-enctype
We probably always want to ignore the browser encoding preference
that the user has set (under the covers, that is set with the DOM
documentCharacterSet).
The multipart/form-data that we are using for the Clipboard function
seems to pass along the raw byte stream correctly for the Korean char
챔 (in hex: 54CC, which is U+CC54 in little-endian byte order).
Parts (multipart/form-data)
clipboard    Trois-Riviì±”res Port Authority

Source
Content-Type: multipart/form-data; boundary=---------------------------491299511942
Content-Length: 173

-----------------------------491299511942
Content-Disposition: form-data; name="clipboard"

Trois-Riviì±”res Port Authority
-----------------------------491299511942--
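
As a sanity check on the dump above, the following Python sketch reproduces the three characters shown in place of 챔. It encodes the character as UTF-8 and then reads those bytes back as Windows-1252, which is an assumption about how the request viewer rendered the body, not something confirmed in the thread. The helper name `misread` is made up for this example.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode `text` one way and decode the bytes another way."""
    return text.encode(written_as).decode(read_as)


char = "\uCC54"  # the Korean syllable 챔

# UTF-8 bytes of the character: EC B1 94
print(char.encode("utf-8").hex())

# The same three bytes rendered as Windows-1252 give the
# sequence that appears in the dump.
print(misread(char, "utf-8", "cp1252"))

# The hex value 54CC is the little-endian UTF-16 form of U+CC54.
print(char.encode("utf-16-le").hex())
```

If that assumption holds, the body is carrying intact UTF-8 bytes for 챔 and only the display is off. One further hypothesis that may be worth checking, though nothing in the thread establishes it, is whether the Korean character itself came from the UTF-8 bytes of "è" being read under a Korean code page; `misread("è", "utf-8", "euc-kr")` would be one way to test that.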