Regina Calculation Engine
Encodings for international strings

As of version 4.5, Regina (finally) pays attention to character encodings.

The calculation engine uses UTF-8 for all strings (except possibly for filenames; see below). This means that programmers who pass strings into routines must ensure that they use UTF-8, and programmers who receive strings from routines may assume that they are returned in UTF-8. Note that plain ASCII is a subset of UTF-8, so plain ASCII text is always fine to use.
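The remark about plain ASCII can be checked directly. The following Python sketch is illustrative only: it calls no Regina routines, and the sample strings are made up for the example. It shows that pure ASCII text produces the same bytes under both encodings, whereas an accented character needs more than one byte in UTF-8.

```python
# Illustrative only: no Regina routines are called here.
ascii_text = "Poincare homology sphere"
intl_text = "Poincar\u00e9"  # ends in e-acute (U+00E9)

# Pure ASCII text yields identical bytes under ASCII and UTF-8.
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")
same = ascii_bytes == utf8_bytes

# A non-ASCII character occupies more than one byte in UTF-8,
# so the byte length exceeds the character count.
intl_bytes = intl_text.encode("utf-8")
print(same, len(intl_text), len(intl_bytes))
```

The practical consequence is that code which only ever handles ASCII text needs no changes, while code that measures or truncates strings by byte count may need attention once international characters appear.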

Regina's XML data files are also stored using UTF-8. Older versions of Regina used Latin-1 (ISO 8859-1, the default for the Qt libraries) and did not specify an encoding in the XML header. Regina's file I/O routines are aware of this, and will convert older data into UTF-8 as it is loaded into memory (the files themselves are not modified). The routine versionUsesUTF8() may be useful for programmers who need to work with older data files at a low level.
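To make the conversion of older data concrete, here is a minimal Python sketch of what re-encoding Latin-1 data as UTF-8 involves. This is not Regina's own implementation (which lives in its file I/O routines); the function name latin1_to_utf8 and the sample bytes are assumptions made purely for illustration.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Reinterpret Latin-1 encoded bytes as text and re-encode as UTF-8.

    Hypothetical helper for illustration; not part of Regina.
    """
    # Every byte value is a valid Latin-1 character, so decoding cannot fail.
    text = raw.decode("latin-1")
    return text.encode("utf-8")


# In Latin-1, e-acute is the single byte 0xE9.
old_data = b"Poincar\xe9"
# In UTF-8 the same character becomes the two bytes 0xC3 0xA9.
new_data = latin1_to_utf8(old_data)
```

Plain ASCII bytes pass through such a conversion unchanged; only bytes in the range 0x80 to 0xFF are expanded.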

File names are a special case, since here Regina must interact with the underlying operating system. All filenames that are passed into routines must be presented in whatever encoding the operating system expects; Regina will simply pass them through to the standard C/C++ file I/O routines (such as fopen() or std::ifstream::open()) without modifying them in any way.
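Since Regina does no conversion of filenames, the caller is responsible for knowing what the operating system expects. As a rough sketch, assuming a Python 3 caller, the standard library can report the filesystem encoding and apply it; the helper name os_filename and the sample filename are made up for this example, and nothing here is a Regina routine.

```python
import os
import sys


def os_filename(name: str) -> bytes:
    """Encode a filename the way this platform's file APIs expect.

    Hypothetical helper for illustration; not part of Regina.
    """
    return os.fsencode(name)


# The encoding in use is platform-dependent (often UTF-8, but not always).
fs_encoding = sys.getfilesystemencoding()
encoded = os_filename("example.rga")
print(fs_encoding, encoded)
```

The point of the sketch is only that the filename encoding is a property of the platform, and may differ from the UTF-8 used for all other strings.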

Ancient data files that use the old binary format (Regina 2.x, before mid-2002) support only plain ASCII text. Support for the old binary format is likely to be removed entirely in the very near future.

Users and programmers who use the Python interface must take special care, since Python does not pass strings around in UTF-8 by default.
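One defensive approach on the Python side is to convert or validate strings explicitly before handing them on. The sketch below is an assumption-laden illustration: ensure_utf8 is a hypothetical helper, not part of Regina, and whether a given routine expects text or bytes depends on the Python and Regina versions in use, which this page does not specify.

```python
def ensure_utf8(value) -> bytes:
    """Return value as UTF-8 encoded bytes.

    Hypothetical helper for illustration; not part of Regina.
    """
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
        value.decode("utf-8")
        return value
    # Text strings are encoded explicitly rather than relying on a default.
    return value.encode("utf-8")


label = ensure_utf8("M\u00f6bius band")  # contains o-umlaut (U+00F6)
```

Passing bytes that are not valid UTF-8 (for instance, Latin-1 data containing the lone byte 0xF6) raises an error immediately, which is usually easier to diagnose than garbled text appearing later.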

Proper support for character encodings is quite new, and the main author (being a native English speaker) rarely exercises it. If you see Regina treating international characters in unexpected ways, please mail the author(s) or file a bug report so the problem can be fixed!

Copyright © 1999-2014, The Regina development team
This software is released under the GNU General Public License, with some additional permissions; see the source code for details.
For further information, or to submit a bug or other problem, please contact Ben Burton (