Progress report cppx - Unicode console i/o via streams in Windows


Alf P. Steinbach

Jan 20, 2016, 7:44:23 PM
Finally a version that provides Unicode i/o via the standard streams, in
Windows, and works with both g++ and Visual C++.

The code below is an overview of how I configure the streams, in 5 more
or less intricate steps. I think no beginner can be expected to
implement this on their own, especially not for a first little
personalized greeting program where the user's name should be presented.

So, I think it's way past time that someone provides basic i/o as a
library. After all, almost every other language has working basic i/o...

---------------------------------------------------------------------
File on GitHub:
https://github.com/alf-p-steinbach/cppx/blob/955b7b85733287934969fc7fc0add5fa011474bf/basics/io/stdstreams/iostreams_config.hpp

<code>
#pragma once
// p/cppx/basics/io/stdstreams/iostreams_config.hpp
// Copyright © Alf P. Steinbach 2015. Boost Software License 1.0.

#include <p/cppx/basics/io/stdstreams/external_stream_kind.hpp>
#include <p/cppx/basics/io/stdstreams/iostreams_config/Console_narrow_encoding.hpp>
#include <p/cppx/basics/io/stdstreams/iostreams_config/console_input_streambuffer_.hpp>
#include <p/cppx/basics/io/stdstreams/iostreams_config/Console_output_streambuffer_.hpp>
#include <p/cppx/basics/io/stdstreams/streams_from_ids.hpp>
#include <p/cppx/core_language_support.hpp>     // cppx::Default_c_level_locale_setter

namespace progrock{ namespace cppx{

    // This is probably idempotent, but there's no point calling it
    // more than once.
    inline void configure_iostreams()
    {
        // We're dealing with changing apparently nonsense functionality.
        // So it's difficult to say whether order is or can later become
        // significant here, but the given order below feels safest. I.e.
        // going inward from physical environs to software abstractions,
        // so as to not change things that X depends on after changing X.

        // 1
        // Set console active codepage to the one used for narrow literals.
        // In Unixland the "codepage" will just be 0, with no effect.
        int const codepage = atoi( execution_character_set_encoding.c_str() );
        static const Console_narrow_encoding using_encoding( codepage );

        // 2
        // Set the C level locale to the same codepage, if possible.
        static const Default_c_level_locale_setter using_default_locale;

        // 3
        // Set the C++ global locale also to that codepage, if possible.
        // With at least some g++ variants it's not possible, but happily
        // with those compilers the wide streams apparently translate
        // narrow strings according to the C level locale.
        using std::locale;
        try
        {
            // The default native locale with the C++ execution character
            // set encoding, established by Default_c_level_locale_setter:
            const auto specification = setlocale( LC_ALL, nullptr );
            locale::global( locale( specification ) );
        }
        catch( ... )
        {
            // Could try just the default native locale, sans encoding,
            // locale::global(locale(""));   // but would reset the C level!
            #ifndef __GNUC__    // g++ can deal OK with things anyway.
                throw;
            #endif
        }

        using Id = process::Stream_id;
        using Kind = process::External_stream_kind;

        // 4
        // In Windows use direct console i/o to be able to submit UTF-16
        // encoded output, and receive UTF-16 encoded keyboard input.
        if( syschar_is_wide )
        {
            if( external_stream_kind( Id::std_input ) == Kind::console )
            {
                static Console_input_streambuffer_<Syschar_base> buffer;
                wcin.rdbuf( &buffer );
            }

            for( const Id::Enum id : process::outstream_ids() )
            {
                if( external_stream_kind( id ) == Kind::console )
                {
                    static Console_output_streambuffer_<Syschar_base> buffers[3];
                    wide_stream( id ).rdbuf( &buffers[id - Id::std_output] );
                }
            }
        }

        // 5
        // Imbue the right-codepage global C++ locale in each stream.
        // Note: the stream delegates the imbuing to the stream's buffer.
        wcin.imbue( locale() );
        for( const Id::Enum id : process::outstream_ids() )
        {
            wide_stream( id ).imbue( locale() );
        }
    }

}} // namespace progrock::cppx
</code>
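
Step 4 above hinges on one standard mechanism: giving a stream a different buffer via `rdbuf`. Here is a minimal, portable sketch of that mechanism only. `Recording_wbuf` and `Scoped_rdbuf` are hypothetical names made up for illustration; they are not the library's `Console_output_streambuffer_`, whose implementation isn't shown in this post. Where a real Windows console buffer would presumably hand the characters to a console API, this stand-in just collects them in a string.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for a console output buffer: instead of calling a
// console API it appends every wide character it receives to `captured`.
class Recording_wbuf : public std::wstreambuf
{
public:
    std::wstring captured;

protected:
    // Single-character path; used because no put area is set up.
    int_type overflow( int_type ch ) override
    {
        if( !traits_type::eq_int_type( ch, traits_type::eof() ) )
        {
            captured += traits_type::to_char_type( ch );
        }
        return traits_type::not_eof( ch );
    }

    // Bulk path, typically used when a whole string is inserted.
    std::streamsize xsputn( const wchar_t* s, std::streamsize n ) override
    {
        captured.append( s, static_cast<std::size_t>( n ) );
        return n;
    }
};

// Hypothetical RAII helper: installs a replacement buffer in a stream and
// restores the original buffer on destruction.
class Scoped_rdbuf
{
    std::wostream&      stream_;
    std::wstreambuf*    original_;

public:
    Scoped_rdbuf( std::wostream& stream, std::wstreambuf& replacement )
        : stream_( stream ), original_( stream.rdbuf( &replacement ) )
    {}

    ~Scoped_rdbuf() { stream_.rdbuf( original_ ); }

    Scoped_rdbuf( const Scoped_rdbuf& ) = delete;
    Scoped_rdbuf& operator=( const Scoped_rdbuf& ) = delete;
};
```

With a `Recording_wbuf buffer;` and a `Scoped_rdbuf redirect( std::wcout, buffer );` in scope, anything written to `std::wcout` lands in `buffer.captured` until `redirect` is destroyed. The posted code instead installs static buffers and leaves them in place for the rest of the program.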

---------------------------------------------------------------------
One of the example programs I used for very-informal testing:

<code>
#include <p/cppx/basics_and_main.hpp>   // This header brings in a default "main".
using namespace progrock::cppx;

void cpp_main()
{
    // Basics: output of text parts, input of Unicode line.
    sys.out << "Hi, what’s your name? ";
    const String username = line_from( sys.in );
    sys.out << "Pleased to meet you, " << username << "!" << endl;

    // Just to input at least 2 lines, so as to test g++:
    sys.out << endl;
    sys.out << "What’s your favorite activity, then, " << username << "? ";
    const String activity = line_from( sys.in );
    sys.out
        << "What a coincidence!" << endl
        << "Earlier I favored making Norwegian blåbærsyltetøy," << endl
        << "but now my favorite activity is also " << activity << "!"
        << endl;

    sys.out << endl;
    sys.out << "Well, have a nice day, " << username << "! Bye!" << endl;
}
</code>

This works nicely with Norwegian and Cyrillic, but my console windows
refuse to DISPLAY Chinese ideograms. However, copying and pasting
ideograms works for input, in the sense that the program gets the right
Unicode code points. The main restriction is that in Windows one is
limited to the Basic Multilingual Plane of Unicode (essentially original
16-bit Unicode), due to the console window architecture in Windows.
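
To make the Basic Multilingual Plane remark concrete: a code point up to U+FFFF fits in a single 16-bit UTF-16 code unit, while anything above that is encoded as a pair of surrogate code units. The sketch below shows just that arithmetic. The function names `is_in_bmp` and `to_utf16` are made up for illustration and are not part of cppx.

```cpp
#include <string>

// True if the code point fits in a single UTF-16 code unit.
constexpr bool is_in_bmp( char32_t code_point )
{
    return code_point <= 0xFFFF;
}

// Encodes one Unicode code point as UTF-16.
// Assumes a valid code point: at most U+10FFFF and not a surrogate value.
inline std::u16string to_utf16( char32_t code_point )
{
    if( is_in_bmp( code_point ) )
    {
        return std::u16string( 1, static_cast<char16_t>( code_point ) );
    }
    const char32_t offset = code_point - 0x10000;       // 20 significant bits
    const char16_t high = static_cast<char16_t>( 0xD800 + (offset >> 10) );
    const char16_t low  = static_cast<char16_t>( 0xDC00 + (offset & 0x3FF) );
    return std::u16string{ high, low };
}
```

For example, å (U+00E5) and the ideogram 中 (U+4E2D) each come out as one code unit, while U+1F600 comes out as the two units 0xD83D 0xDE00.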

Cheers, & enjoy,

- Alf

PS: I run unit tests for some functions, but most of the i/o
functionality is only tested by running some example programs. So there
are probably some bugs. This is a Work In Progress™, as we used to say
on 1990s web pages. :)
