Addressing Unicode character width ambiguities in kitty #8265

Closed
unxed opened this issue Jan 28, 2025 · 1 comment

unxed commented Jan 28, 2025

I would like to propose an improvement for kitty that addresses one more long-standing issue in terminal emulators: the ambiguity of Unicode character widths.

The problem is that there is no definitive way to determine whether a given sequence of Unicode code points should be treated as wide (full-width), regular (half-width), or non-printable. These properties can change between Unicode versions, and an application running in a terminal has no way to find out which version of the Unicode standard the terminal follows. Even worse, different libraries produce inconsistent results: for instance, the C library function wcwidth (which is notoriously unreliable on many systems) often yields results that differ from those of ICU. This inconsistency causes problems in text editors, console-based file managers, and similar applications.
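To make the ambiguity concrete, here is a minimal sketch of the kind of width heuristic an application might apply on its own side, using only Python's standard `unicodedata` module. The helper names (`naive_width`, `is_ambiguous`, `string_width`) are hypothetical, and the rules are a simplification rather than what kitty, wcwidth, or ICU actually do. Note that the answers depend on the Unicode data version bundled with the interpreter, which is exactly the kind of hidden dependency described above.

```python
import unicodedata


def naive_width(ch: str) -> int:
    """Rough column width of one code point (heuristic, not a terminal's real table)."""
    category = unicodedata.category(ch)
    if category in ("Cc", "Cf"):
        # Control and format characters: treated as non-printable here.
        return 0
    if category in ("Mn", "Me"):
        # Non-spacing and enclosing marks combine with the previous character.
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        # Wide and Fullwidth characters occupy two cells.
        return 2
    # Everything else, including East Asian Ambiguous ("A"), is assumed to be one cell.
    return 1


def is_ambiguous(ch: str) -> bool:
    """True if Unicode itself classifies the width as context-dependent."""
    return unicodedata.east_asian_width(ch) == "A"


def string_width(text: str) -> int:
    """Sum of per-code-point widths; ignores grapheme clusters and emoji sequences."""
    return sum(naive_width(ch) for ch in text)


if __name__ == "__main__":
    # The result below is only valid for this Unicode data version.
    print("Unicode data version:", unicodedata.unidata_version)
    for ch in ("a", "中", "\u0301", "\u00a1"):
        print(f"U+{ord(ch):04X}", naive_width(ch), unicodedata.east_asian_width(ch))
```

A terminal that uses a different table, or that renders Ambiguous characters as two cells, will disagree with this function, and the application cannot detect that from the inside.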

Various approaches have been proposed to address this issue. For example, iTerm2 provides a special escape sequence that lets an application tell the terminal which Unicode version's width rules to follow:
https://iterm2.com/documentation-escape-codes.html#:~:text=Unicode%20Version
However, this method has its own drawbacks, such as the complexity of implementation in terminal emulators (e.g., how should the terminal handle all the differences between Unicode versions?). Additionally, discrepancies might still arise if the terminal and application use different libraries.
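As an illustration of the iTerm2 approach, the sketch below builds the escape sequence as I understand it from the linked documentation page: an OSC 1337 sequence carrying `UnicodeVersion=n`. The helper name `iterm2_unicode_version` is hypothetical, the BEL terminator is an assumption (ST may be used instead), and the exact set of accepted values should be checked against the iTerm2 documentation before relying on it.

```python
import sys

ESC = "\x1b"
BEL = "\x07"


def iterm2_unicode_version(value) -> str:
    """Build the OSC 1337 UnicodeVersion sequence (form assumed from the iTerm2 docs)."""
    return f"{ESC}]1337;UnicodeVersion={value}{BEL}"


if __name__ == "__main__":
    # Only meaningful inside a terminal that understands the sequence;
    # other terminals are expected to ignore an unknown OSC.
    sys.stdout.write(iterm2_unicode_version(9))
    sys.stdout.flush()
```

Even with such a sequence, the application still has to carry width tables that match the requested version, which is the drawback noted above.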

Another, seemingly more robust, solution has been proposed by the developers of VTM. Their idea involves using specific escape sequences to explicitly define how the terminal should render individual characters. Here's the specification:
https://github.com/directvt/vtm/blob/master/doc/character_geometry.md

It would be fantastic if you could look into this issue and consider addressing it when you have some time. Thank you!

unxed added the bug label Jan 28, 2025
@kovidgoyal
Owner

dup of #8226
