Hi All,
I've been using the AddTextRaw and GetTextRaw methods of wxSTC to load and display text that contains invalid UTF-8 sequences. Everything seems to work fine except for copy/paste operations.
Let's say I have a text of three characters: '\193' (a single quote, a character with decimal code 193, and another single quote). When it is loaded into wxSTC using SetTextRaw (with the codepage set to SC_CP_UTF8/65001), the text is displayed as '[xC1]', where [xC1] is how Scintilla displays invalid characters (usually in inverted colors).
When this fragment is copied and pasted, only two characters are pasted: 'g (an opening single quote and the letter 'g'). Since ScintillaWX::CopyToClipboard uses "wxTextBuffer::Translate(stc2wx(st.Data(), st.Length()))" and "wxTextDataObject(text)", the conversion seems to happen in one (or both) of those two places. In fact, wxTextDataObject doesn't like invalid text at all (it stores an empty string in those cases). It looks like the original content is lost somewhere in the conversion to/from UTF-8.
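To illustrate the general problem (not the actual wxWidgets code path), here is a minimal sketch in plain Python showing why a byte like 0xC1 cannot survive a strict UTF-8 round trip, and how a lossy conversion changes the data. The helper name is_valid_utf8 is made up for this example, and the "surrogateescape" handler is a Python feature, not something wxSTC offers; it is only there to show that a lossless round trip needs special handling of the invalid byte. This does not reproduce the exact 'g result I see.

```python
# The three bytes from the example: quote, 0xC1 (decimal 193), quote.
raw = b"'\xc1'"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xC1 is never a valid byte in UTF-8, so a strict decode fails.
print(is_valid_utf8(raw))

# Lossy round trip: the invalid byte is replaced with U+FFFD,
# so the bytes that come back differ from the original.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")
print(lossy == raw)

# Lossless round trip (Python-specific): the invalid byte is carried
# through the string as a lone surrogate and restored on encode.
text = raw.decode("utf-8", errors="surrogateescape")
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == raw)
```

Any clipboard path that goes through a strict or lossy bytes-to-string conversion would be expected to drop or alter the 0xC1 byte in a similar way.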
Scintilla provides a TargetAsUTF8 method, which looks like it might help in this case, but it's not available in wxSTC:
# not sure what to do about these yet
'TargetAsUTF8' : ( None, 0, 0, 0),
'SetLengthForEncode' : ( None, 0, 0, 0),
'EncodedFromUTF8' : ( None, 0, 0, 0),
Is there a proper way to handle copying in this case, or a workaround?
Paul.