(Windows Terminal) Unicode Characters outside Basic Multilingual Plane not rendering #176

@AmmoniumX

Description

PDCurses Version: 3.9

In Windows Terminal, most characters outside the Basic Multilingual Plane (BMP, Plane 0, U+0000 - U+FFFF) don't render correctly.
Example characters from Plane 1 (Supplementary Multilingual Plane, U+10000 - U+1FFFF):
U+1F600 - U+1F64F, Emoticons: 😊👍🔥 (⭐ and ⚔️ are on the BMP and therefore they do render)
U+10900 - U+1091F, Phoenician Characters: 𐤀𐤁𐤂
U+1D400 - U+1D7FF, Mathematical Alphanumeric Symbols: 𝑎𝑏𝑐

See https://www.compart.com/en/unicode/plane/U+10000 for all blocks in the Supplementary Multilingual Plane. Many of them aren't really implemented by most fonts, but there are many common ones (like the ones mentioned above) that are.
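As a quick way to check which of the sample characters fall outside the BMP, here is a minimal sketch. The helpers `plane_of` and `in_bmp` are hypothetical names for illustration and are not part of PDCurses; they rely only on the fact that each Unicode plane spans 0x10000 code points.

```cpp
#include <cassert>

// Hypothetical helper (not a PDCurses API): returns the Unicode plane
// number (0 to 16) of a code point. Plane 0 is the BMP.
inline unsigned plane_of(char32_t cp) {
    assert(cp <= 0x10FFFF);  // highest valid Unicode code point
    return static_cast<unsigned>(cp >> 16);
}

// True when the code point lies in the Basic Multilingual Plane,
// i.e. it fits in a single UTF-16 code unit.
inline bool in_bmp(char32_t cp) {
    return plane_of(cp) == 0;
}
```

With these, `in_bmp(0x2B50)` (⭐) and `in_bmp(0x2694)` (⚔) are true, while U+1F600, U+10900 and U+1D44E all report plane 1.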

Sample code:

#include <curses.h>
#include <cstdlib>
#include <locale>
#include <string>

void setup() {
    initscr();
    cbreak();
    noecho();
    curs_set(0);
    keypad(stdscr, TRUE);
}

void cleanup() {
    endwin();
}

int main() {
    // Initialize locale
    std::locale::global(std::locale(""));

    // Initialize curses
    setup();

    std::wstring ws = L"Emoticons: (😊👍⭐⚔️🔥) \nPhoenician: (𐤀𐤁𐤂) \nMathematical Alphanumeric: (𝑎𝑏𝑐)";

    // Add the string to the main screen
    addnwstr(ws.c_str(), static_cast<int>(ws.length()));

    // Refresh the screen to show the changes.
    refresh();

    // Wait for the user to press a key.
    getch();

    // Clean up and exit
    cleanup();

    return EXIT_SUCCESS;
}

Building and running:

> cl.exe /std:c++23preview /EHsc /W4 /utf-8 /DPDC_WIDE /I "C:\Users\diego\src\PDCurses-3.9\" .\testpdcurses.cpp ..\PDCurses-3.9\wincon\pdcurses.lib user32.lib gdi32.lib advapi32.lib

> .\testpdcurses.exe
Emoticons: (����⭐⚔️��)
Phoenician: (������)
Mathematical Alphanumeric: (������)

The issue is probably in https://github.com/wmcbrine/PDCurses/blob/master/wincon/pdcdisp.c. I see that it uses the WriteConsoleW API call, which I have tested in isolation and which does print these Unicode characters correctly to the terminal. So this may be a bug in the specific way the call is used and in how PDCurses handles UTF-16 surrogate pairs.
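For context on the surrogate-pair hypothesis, here is a minimal sketch of how a code point above U+FFFF becomes two UTF-16 code units. The helper names are hypothetical and unrelated to the PDCurses source. It uses `char16_t` so it behaves the same on any platform; with MSVC, `wchar_t` is also 16 bits, so each non-BMP character in the sample `std::wstring` occupies two elements and counts twice in `ws.length()`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-16.
// BMP code points yield one code unit; all others yield a surrogate pair.
inline std::u16string to_utf16(char32_t cp) {
    // Must be a valid scalar value: in range and not itself a surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        return std::u16string(1, static_cast<char16_t>(cp));
    }
    const char32_t v = cp - 0x10000;  // 20-bit value split into 10 + 10 bits
    std::u16string out;
    out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
    out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    return out;
}

// Hypothetical helpers: classify a single UTF-16 code unit.
inline bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }
```

For example, `to_utf16(0x1F60A)` (😊) yields the two units 0xD83D and 0xDE0A, whereas `to_utf16(0x2B50)` (⭐) yields a single unit. If the two halves of a pair are written to the console separately rather than together, neither half is a valid character on its own, which would be consistent with the replacement characters in the output above. That is an assumption about the cause, not a confirmed diagnosis.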

Working example using Windows API:

#include <cstdlib>
#include <string>

#include <Windows.h>

int main() {

    std::wstring ws = L"Emoticons: (😊👍⭐⚔️🔥) \nPhoenician: (𐤀𐤁𐤂) \nMathematical Alphanumeric: (𝑎𝑏𝑐)";

    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    WriteConsoleW(console, ws.data(), static_cast<DWORD>(ws.size()), NULL, NULL);

    return EXIT_SUCCESS;
}

Building and running:

> cl.exe /std:c++23preview /EHsc /W4 /utf-8 /DPDC_WIDE .\testwinapi.cpp
> .\testwinapi.exe
Emoticons: (😊👍⭐⚔️🔥)
Phoenician: (𐤀𐤁𐤂)
Mathematical Alphanumeric: (𝑎𝑏𝑐)
