Imbuing a certain niceness on one’s formatting

Oh my, std::locale("") is gorgeous when slapped on a std::stringstream. So long as you’re not writing super-duper-performance-sensitive code, you can do wonderful things like this:

#include <string>
#include <sstream>
#include <locale>
#include <iostream>
#include <iomanip>

int main()
{
	std::stringstream ss;
	ss.imbue(std::locale(""));
	ss << std::setiosflags(std::ios::fixed) << std::setprecision(2) << 12345678.123;
	std::cout << ss.str();
	return (0);
}

Compile and run the above little C++ program with the latest GCC under Linux or with Microsoft’s Visual Studio and, assuming your machine is set to an English (US or UK) locale, the output you’ll get from this firm, rounded and nicely tanned piece of code is this:

12,345,678.12

Isn’t that splendiferous? Well, I think it is. All the hard work of inserting commas for thousand separators has been taken care of because we said “hey, you, yes, you, stringstream. I’m wanting you to format us up using the user’s locale setting!” Then, if you’re in Germany you’ll get dots, and if you’re in France you’ll get spaces, onions and croissants rather than commas. Et voilà, instant human-readable numbers.
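What imbue actually hands the stream is a std::numpunct facet, which the number formatter consults for the separator character and the grouping rule. Here is a small sketch of poking at that facet directly. The helper name DescribeGrouping is made up for illustration and is not part of any library:

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (name invented for this sketch): report what a
// locale's numpunct facet says about digit grouping.
std::string DescribeGrouping(const std::locale& loc)
{
	const std::numpunct<char>& punct = std::use_facet<std::numpunct<char> >(loc);
	std::ostringstream out;
	out << "sep='" << punct.thousands_sep() << "' groups=";
	//
	// grouping() returns a string whose chars are group sizes,
	// counted from the right; empty means "do not group at all":
	const std::string grouping = punct.grouping();
	if (grouping.empty())
	{
		out << "none";
	}
	for (std::string::size_type i = 0; i < grouping.size(); ++i)
	{
		out << (i ? "," : "") << static_cast<int>(grouping[i]);
	}
	return out.str();
}
```

Calling DescribeGrouping(std::locale::classic()) should give sep=',' groups=none: the plain "C" locale names a comma as its separator but has an empty grouping, so nothing ever gets inserted. What you get for std::locale("") depends entirely on the machine, so no promises there.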

So, basically, the crowd goes wild. Or at least they did under Windows using Microsoft’s C++ compiler. Under GCC on OSX (gcc version 4.2.1 (Apple Inc. build 5666)), though, something really rather odd happens. It doesn’t work. Not at all. Not even the smallest sausage. It turns out that the only locale that this GCC seems to support is “C” – which means “programmer bollocks” rather than “human friendly”. Thus, no matter how you tweak, twiddle and fondle the above code, it would output:

[Image: Irregular Pigeon is here to point out just how bloody irregular this is.]

12345678.12

… if you are lucky, like, inside a debugger, for example. You’re much more likely to see this:

cobrascobras: ~/Desktop $./a.out
terminate called after throwing an instance of 'std::runtime_error'
what(): locale::facet::_S_create_c_locale name not valid
Abort trap
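That abort is just an uncaught std::runtime_error thrown by the std::locale constructor, so the least a program can do is catch it and carry on with the classic "C" locale. A minimal sketch, assuming you would rather have ugly numbers than a dead process. The names UserLocaleOrClassic and FormatFixed2 are invented for this example:

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: the user's locale if the library can build it,
// otherwise the always-available classic "C" locale.
std::locale UserLocaleOrClassic()
{
	try
	{
		return std::locale("");
	}
	catch (const std::runtime_error&)
	{
		// Named locale not supported here; fall back quietly.
		return std::locale::classic();
	}
}

// Format a value with two decimal places using whichever locale we got.
std::string FormatFixed2(double value)
{
	std::stringstream ss;
	ss.imbue(UserLocaleOrClassic());
	ss << std::fixed << std::setprecision(2) << value;
	return ss.str();
}
```

This does not make the separators appear on a library that only knows "C"; it only swaps the crash for unseparated digits.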

Now, as we know, when something that should work under the C++ standard doesn’t, a kitten gets punched in the face. Having “invested” an hour or so in research on this, I’ve found a lot of angry OSX programmers and many confused GCC users (they were discussing this in 2003). I’ve decided that life is too short to figure out why [1] the only locale supported by the standard library that comes with Apple’s GCC is “C”, so I knocked out a simple function that does this for the case I need, long unsigned integers:

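#include <locale.h>	// setlocale, localeconv, struct lconv
#include <stdint.h>	// uint64_t
#include <string>

std::string GetCommaSeparatedUINT64String(uint64_t number)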
{
	static const unsigned long UINT64_CONV_BUFFER_SIZE = 32;
	static char thousandsSep = ',';
	static bool bInitialised = false;
	//
	// If this is the first time in, grab default locale's
	// thousands separator if we can:
	if (false == bInitialised)
	{
		setlocale(LC_ALL, ""); // <- Do not forget this bit
		struct lconv* lc = localeconv();
		if (lc && lc->thousands_sep && *lc->thousands_sep)
		{
			thousandsSep = *lc->thousands_sep;
		}	// if (separator specified)
		bInitialised = true;
	}	// if (first use)
	//
	// Declare a working buffer, set cursor to end and zero-terminate:
	char workBuffer[UINT64_CONV_BUFFER_SIZE];
	char* cursor = &workBuffer[UINT64_CONV_BUFFER_SIZE - 1];
	*cursor = '\0';
	unsigned int index = 0;
	do
	{
		// If we've done three and we're not at the start,
		// insert thousands separator:
		if ((0 == (index % 3)) && (index))
		{
			--cursor;
			*cursor = thousandsSep;
		}
		//
		// Insert this digit and keep on going:
		--cursor;
		*cursor = '0' + (number % 10);
		number /= 10;
		++index;
	} while (number);
	//
	// Return an STL string built from this result:
	return (std::string(cursor));
}

This code wins no awards, but, frankly, after spending so much time figuring out why something that should work did not work, this is the best that “mildly annoyed” programming can achieve. It won’t behave properly with an Indian locale, for example, because digits there are grouped in a three followed by twos (1,23,45,678), so the spooky weird thousands separator, well, isn’t. It also won’t do floating point numbers, negative numbers, imaginary numbers or count giraffes, but if Europe is “your thing” then it at least compiles and runs.
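For the cases the hand-rolled function skips, another route is to give the stream a home-made std::numpunct facet layered on the classic locale, so the stream's own formatter does the digit shuffling. This is only a sketch: FixedGroupingPunct and FormatGrouped are invented names, the separator and grouping are hard-wired by the caller rather than read from the user's settings, and the idea that it sidesteps the named-locale lookup on Apple's GCC 4.2.1 is an assumption, not something verified here.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: answers the stream's "how do I group digits?"
// questions with values supplied by the caller.
class FixedGroupingPunct : public std::numpunct<char>
{
public:
	FixedGroupingPunct(char sep, const std::string& grouping)
		: m_sep(sep), m_grouping(grouping) {}
protected:
	virtual char do_thousands_sep() const { return m_sep; }
	virtual std::string do_grouping() const { return m_grouping; }
private:
	char m_sep;
	std::string m_grouping;
};

// Format with two decimals; grouping is a list of group sizes from the
// right, e.g. "\3" for threes or "\3\2" for a three followed by twos.
std::string FormatGrouped(double value, char sep, const std::string& grouping)
{
	std::stringstream ss;
	// The locale takes ownership of the facet and deletes it when done.
	ss.imbue(std::locale(std::locale::classic(), new FixedGroupingPunct(sep, grouping)));
	ss << std::fixed << std::setprecision(2) << value;
	return ss.str();
}
```

FormatGrouped(12345678.123, ',', "\3") gives 12,345,678.12, FormatGrouped(-1234.5, ',', "\3") gives -1,234.50, and FormatGrouped(12345678.0, ',', "\3\2") gives the Indian-style 1,23,45,678.00. The trade-off is that you are now choosing the separator yourself instead of asking the user's locale.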

Now I’ll sit here quietly until someone I know points me at some compiler option I missed along the lines of "--do_locale_right_please". I’ll dust off my screaming hat just in case.

Honestly, eh?


[1] Well, actually, I do know why. Here’s the libstdc++ source code for creating locales:

void locale::facet::_S_create_c_locale(__c_locale& __cloc, const char* __s, __c_locale) 
{ 
	// Currently, the generic model only supports the "C" locale. 
	// See http://gcc.gnu.org/ml/libstdc++/2003-02/msg00345.html
	__cloc = NULL; 
	if (strcmp(__s, "C")) 
		__throw_runtime_error(__N("locale::facet::_S_create_c_locale name not valid")); 
}


… basically, it’s “scene missing” on the code. Insert tears to continue…
