How do I change the default charset option

Alan

Jul 17, 2013, 5:50:31 PM
to blueg...@googlegroups.com
Hi

Since my last upgrade, using the new file option or new file button creates an HTML5 file with

    <meta content="text/html; charset=windows-1252" http-equiv="content-type">

I want to use UTF-8 as my character set.  I can't find an option to make it the default.  If I select more options at the side of the new file button, or use the new file wizard, the UTF-8 option is set as default.

How can I change the default option for the new file option?

Thanks

Alan

Greg Chapman

Jul 17, 2013, 6:44:04 PM
to blueg...@googlegroups.com
Hi Alan,

On 17 Jul 13 22:50 Alan <alid...@googlemail.com> said:
> I want to use UTF-8 as my character set. I can't find an option
> to change it for default. If I select more options to the side of
> the new file button or use the new file wizard the UTF-8 option is
> set as default
>
> how can I change the default option for the new file option

Select: Tools > Preferences > Advanced > Configuration Editor, then click
the "I'll be careful, I promise" button.

Search for: intl.charset.default

Double-click the item and, in the "Enter string value" dialogue, enter:
utf-8


Greg Chapman
http://www.gregtutor.plus.com
Helping new users of KompoZer and The GIMP
Still exploring BlueGriffon

Philip Goddard

Oct 23, 2016, 12:48:48 PM
to bluegriffon
Hi, Greg.

I'm now starting the conversion of my websites to HTML5, and have read that I need to convert to utf-8 encoding. I've just seen this now out-of-date reply of yours, and would much appreciate a note of which about:config values to change to utf-8 in the current BG version. It doesn't have intl.charset.default, but it does have the following lines, whose values may or may not need changing:
intl.accept_charsets (value iso-8859-1,*,utf-8)
intl.charset.detector (unset)
intl.charset.fallback.override (unset)
intl.fallbackCharsetList.ISO-8859-1;windows-1252 (value windows-1252)

At the moment BG is saving to a charset that EditPad identifies as ASCII+#65535;NCR, and converts pages that have been converted to utf-8 in EditPad back into the former charset - not exactly useful!

If the appropriate BG about:config line is set to utf-8, would BG then convert a page with a different charset into utf-8, so that it then saves with that encoding?  Also, would BG then change the charset declaration in the page's header, or would one have to do that manually?  Any ideas?  :-)  - Many thanks!

--
Philip

Philip Goddard

Oct 24, 2016, 7:07:43 AM
to bluegriffon
Hmmm...  Well, I tried adding intl.charset.default to BG's about:config, set to utf-8, but BG still saves in what EditPad describes as ASCII+#65535;NCR (while leaving the <meta charset="utf8"> tag intact). It looks to me as though the current version of BG is broken, and I don't know what to do. It's really the only properly usable web-page editor available to me that suits my modus operandi, but this intransigent failure to save in utf-8 is now a show-stopper, as it doesn't make sense for me not to have all my pages in properly compliant HTML5 with utf-8 encoding.  I'll report this bug, but I'm stymied at the moment.

--
Philip

Greg Chapman

Oct 24, 2016, 9:42:29 AM
to blueg...@googlegroups.com
Hi Philip,

On 24/10/16 12:07, Philip Goddard wrote:
> Hmmm... Well, I tried adding intl.charset.default to BG's
> about:config, set to utf-8, but BG still saves in what EditPad
> describes as ASCII+#65535;NCR (while leaving the <meta charset="utf8">
> tag intact).

I think that's exactly what is meant to happen. The meta tag is an
instruction to the browser on how to render a page. It has nothing to do
with the format that some external editor uses or the file you upload to
a server - well, indirectly it does, as the character set you save the
file in will impact on the range of characters that the browser is
instructed to use.

Or that is my understanding, but I'm not expert in this area.
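
To make the distinction between the declared charset and the bytes on disk concrete, here is a minimal Python sketch. It is not anything BlueGriffon or EditPad does internally; it simply reads a page's raw bytes, pulls out the charset named in its meta tag, and checks whether the bytes really decode as UTF-8. The function names and the sample bytes are made up for illustration.

```python
import re

# Matches both <meta charset="utf-8"> and the older
# <meta content="text/html; charset=utf-8" http-equiv="content-type"> form.
CHARSET_RE = re.compile(rb"""charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)


def declared_charset(raw: bytes):
    """Return the charset named near the top of the page, or None."""
    match = CHARSET_RE.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def decodes_as_utf8(raw: bytes) -> bool:
    """Report whether the bytes on disk are valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Declared as utf-8, but the e-acute is stored as the single
# windows-1252 byte 0xE9, which is not valid UTF-8 on its own.
sample = b'<meta charset="utf-8">\n<p>caf\xe9</p>'
print(declared_charset(sample), decodes_as_utf8(sample))  # utf-8 False
```

A mismatch like the one in the sample is the situation where a browser is told one thing by the meta tag and handed something else in the file.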

--
Greg Chapman
http://www.gregtutor.co.uk
Still helping users of KompoZer but using BlueGriffon

Philip Goddard

Oct 24, 2016, 9:56:17 AM
to bluegriffon
Hello again, Greg!

Sure, the metatag is meant for browsers, but nonetheless it makes absolutely no sense for an editor, especially a Web page editor, to save in an encoding different from what is specified in the charset metatag. After all, that is a serious error for a browser to have to cope with, and an editor should at least warn the user and give sensible options for the current save, rather than silently save in a different encoding to the declared charset. And in any case we still have the clear bug, that BG is consistently saving in non-utf-8 encoding despite my having done all I can to configure BG to save in utf-8. A dead loss at the moment!  :-(  I'll have to struggle along using EditPad instead of a WYSIWYG editor for the moment until this serious BG issue has been fixed.

--
Philip.

Greg Chapman

Oct 24, 2016, 10:33:34 AM
to blueg...@googlegroups.com
Hi Philip,

On 24/10/16 14:56, Philip Goddard wrote:
> Sure, the metatag is meant for browsers, but nonetheless it makes
> absolutely no sense for an editor, especially a Web page editor, to
> save in an encoding different from what is specified in the charset
> metatag.
Character Sets are a field that is beyond my level of competence, but I
think there may be some confusion here about what EditPad is telling you
and what is of relevance to a browser.

Interestingly, when I search for ASCII+#65535;NCR the early results are
pages from EditPad help files and they seem to be explaining that they
are madly trying to get rid of the UTF coding for you.
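
For illustration only, and assuming that label means "plain ASCII, with every non-ASCII character written as a numeric character reference (NCR) such as &#233;", the Python sketch below shows what such a file looks like next to a real UTF-8 one. The sample text and the helper name are invented; nothing here is taken from EditPad or BlueGriffon.

```python
import html


def to_ascii_ncr(text: str) -> bytes:
    """Encode as ASCII, writing non-ASCII characters as decimal NCRs."""
    return text.encode("ascii", errors="xmlcharrefreplace")


sample = "caf\u00e9 \u2013 na\u00efve"   # e-acute, en dash, i-diaeresis
ncr_bytes = to_ascii_ncr(sample)
utf8_bytes = sample.encode("utf-8")

print(ncr_bytes)   # b'caf&#233; &#8211; na&#239;ve'
print(utf8_bytes)  # multi-byte UTF-8 sequences instead of references

# Pure ASCII bytes are, byte for byte, also valid UTF-8 ...
assert ncr_bytes.decode("utf-8") == ncr_bytes.decode("ascii")
# ... and an HTML parser resolves the references to the same characters.
assert html.unescape(ncr_bytes.decode("ascii")) == sample
```

If that assumption holds, a file in this form contains no bytes that contradict a utf-8 declaration, even though an editor may label it differently from a file that stores the characters directly.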

It strikes me that your setting in EditPad may not be optimal for
handling UTF-8, but this has gone beyond my level of competence, so I'm
reluctant to offer more.

Philip Goddard

Oct 24, 2016, 10:56:35 AM
to bluegriffon
Actually I'd already set EditPad to save HTML / HTM filetype as utf-8, and when I reload a page saved from EditPad, it shows 'utf-8' in the status bar. It's only after the file has been loaded into BG and then saved from that program that EditPad shows it to be anything other than utf-8.  No, unfortunately it does appear that BG is the culprit.

--
Philip

Charles Cooke

Oct 24, 2016, 3:52:50 PM
to bluegriffon
Hi Philip and Greg,

After a long absence I’m back though my priorities are elsewhere so I may not stick with this long.

I’ve only just installed BG 2.1.1 so it is a new build.

I confirm the settings Philip listed in bold in his post of 23 October.

HOWEVER my files (I’ve created only a few) declare <meta content="text/html; charset=utf-8" http-equiv="content-type"> which is what is expected.

So, it seems to me that BG does have a bug. That is not to say it was delivered with one but rather that something might have crept into Philip’s build.


I found a W3C page https://www.w3.org/International/questions/qa-choosing-encodings which has links to several others which seem to be pointing to areas where problems might arise but I haven’t been able to pursue all the leads.

Greg Chapman

Oct 24, 2016, 4:58:42 PM
to blueg...@googlegroups.com

Hi Charles,

Great to hear from you. (I read up your KompoZer manual today to remind myself of some of this character set stuff, but I'm not sure I fully understood it!)


On 24/10/16 20:52, Charles Cooke wrote:
> I’ve only just installed BG 2.1.1 so it is a new build.
>
> I confirm the settings Philip listed in bold in his post of 23 October.
>
> HOWEVER my files (I’ve created only a few) declare <meta content="text/html; charset=utf-8" http-equiv="content-type"> which is what is expected.
>
> So, it seems to me that BG does have a bug. That is not to say it was delivered with one but rather that something might have crept into Philip’s build.

I'm still using BG 1.7 (it's the latest available in the Ubuntu repository). My install of 2.1.1 refused to accept my registration key, and emails about it to Daniel have gone unanswered. I've had an email from someone who read my post here on the topic and has the same issue, so it doesn't seem to be just me.

I've sorted the about:config list by "Status" and can't find anything in my copy that changes the default doctype, and mine does produce <!DOCTYPE html> on clicking "File > New". By default, it also produced the line:


<meta content="text/html; charset=windows-1252" http-equiv="content-type">

but this was easily changed to produce charset=UTF-8 by editing the Preference name "intl.charset.default" to "utf-8". This doesn't seem to be amongst those that Philip tried. Is 2.1.1 missing that setting?

My solution would be an answer to Alan's 2013 post and, possibly, Philip's, but in his case I understand that he is editing existing files, initially generated outside of BlueGriffon as HTML4.01 Transitional and I know that KompoZer, and perhaps BlueGriffon, can wobble when DOCTYPE lines are changed.

Then, on top of that, Philip's concern appears to be about how EditPad then handles the file after editing within BlueGriffon. I think that there may be some misunderstanding about character sets in that issue, but it's not inside my area of competence.

Philip Goddard

Oct 25, 2016, 4:20:04 AM
to bluegriffon
I've just done a quick repeated test, creating a bare-bones new file in BG and saving it, and that comes out as utf-8, but with the old-style content-type metatag, which is supposed to be replaced by just <meta charset="utf-8"> for HTML5.

That loads and saves as utf-8 from BG and from EditPad.  However, when I replaced the content-type metatag with the charset one, sometimes BG would save as utf-8, sometimes not - haven't had time yet to establish just what was the difference of circumstances for that difference.  But my established web page files all appear to be blighted by BG, consistently saving as non-utf-8, no matter what the relevant metatag. This will need more experimentation, but clearly BG is seriously buggy in this respect, and until this is fixed I'll have to avoid using it as far as possible, and would have to use EditPad to reconvert any BG-saved page of mine back to utf-8.

As I've already noted, BG v.2.1.1 doesn't show an intl.charset.default option to set, but I created one anyway and set it to utf-8; that hasn't changed the misbehaviour with my own web pages.

--
Philip

Philip Goddard

Oct 27, 2016, 2:46:07 PM
to bluegriffon
I have nothing further to report on this issue with BG so far, but at least I think now I can return to using BG as I have set up a work-around for the problem. Indeed, it consists of three corrective actions, to be carried out each time before I upload the edited files.

1.  I've set up a macro in EditPad so that, after loading EP with all my web pages in a particular local website folder, I run the macro, which changes the encoding to UTF-8 and ensures that line endings are all Windows ones (CR/LF), and then I just press Shift+Ctrl+S to save all changed files.  I thought this would cause all the files to have to be saved and then uploaded as supposedly changed, but in my tests only those files that needed to be changed actually got changed. So although the macro is applied globally per folder (and there is one for each of my five sites), normally only one to a few files would have to be saved.

2. I've set up a 'favorite' search/replace action in PowerGrep, to change the legacy 'content-type' Meta tag back to <meta charset="utf-8">, so I can quickly apply that action to all my web pages, fully globally.

3. Another PowerGrep action, set up to convert non-breaking spaces (which in my testing occasionally got changed into inappropriate characters for that function in utf-8) back to &#160;, a character code that at least works properly and doesn't display as other, visible, characters. That again is applied globally.

So, before uploading changed web pages I now have to remember to carry out those three actions before I run Beyond Compare to mirror-sync to my remote website folders. It's a hassle that I'd prefer to do without, but at least manageable until such time as the BG bug(s) relating to maintaining file encoding of utf-8 and ensuring the correct (charset) Meta tag have been fixed.
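
For anyone without EditPad or PowerGrep, the three corrective actions above can be sketched as one Python function. This is only a rough equivalent under stated assumptions, not the actual macro or PowerGrep actions: the function name is invented, and windows-1252 is assumed as the fallback encoding for files that are not already valid UTF-8.

```python
import re

# Legacy declaration, e.g.
# <meta content="text/html; charset=windows-1252" http-equiv="content-type">
LEGACY_META = re.compile(
    r"""<meta\b(?=[^>]*\bhttp-equiv\s*=\s*["']?content-type)[^>]*>""",
    re.IGNORECASE,
)


def normalize_page(raw: bytes, fallback: str = "windows-1252") -> bytes:
    """Return the page as UTF-8 with CR/LF endings and an HTML5 charset tag."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was saved as windows-1252.
        text = raw.decode(fallback, errors="replace")

    # Step 1: Windows (CR/LF) line endings throughout.
    text = re.sub(r"\r\n|\r|\n", "\r\n", text)
    # Step 2: swap the legacy content-type tag for the HTML5 form.
    text = LEGACY_META.sub('<meta charset="utf-8">', text)
    # Step 3: write literal non-breaking spaces as &#160;.
    text = text.replace("\u00a0", "&#160;")

    return text.encode("utf-8")
```

Comparing the result with the original bytes before writing, and saving only when they differ, would match the behaviour noted in step 1, where only the files that needed changing end up being saved and uploaded.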

--
Philip.