How can I find the common name for a particular glyph?

Sometimes, I’d like to know the name of a glyph. For example, if I see –, I may want to know whether it’s a hyphen -, an en-dash –, an em-dash —, or a minus sign −. Is there a way that I can copy and paste it into a terminal to see what it is?

I am unsure whether my system knows the common names of these glyphs, but there is certainly some (partial) information available, such as in /usr/share/X11/locale/en_US.UTF-8/Compose. For example,

<Multi_key> <exclam> <question>         : "‽"   U203D # INTERROBANG

Another example glyph: 🐄.

Answers:


Method 1

Try the unicode utility:

$ unicode ‽
U+203D INTERROBANG
UTF-8: e2 80 bd  UTF-16BE: 203d  Decimal: &#8253;
‽
Category: Po (Punctuation, Other)
Bidi: ON (Other Neutrals)

Or the uconv utility from the ICU package:

$ printf %s ‽ | uconv -x any-name
\N{INTERROBANG}

You can also get information via the recode utility:

$ printf %s ‽ | recode ..dump
UCS2   Mne   Description

203D         point exclarrogatif

Or with Perl:

$ printf %s ‽ | perl -CLS -Mcharnames=:full -lne 'print charnames::viacode(ord) for /./g'
INTERROBANG

Note that those give information on the characters that make up that glyph, not on the glyph as a whole. For instance, for é (e with combining acute accent):

$ printf 'e\u0301' | uconv -x any-name
\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}

This differs from the standalone (precomposed) é character:

$ printf é | uconv -x any-name
\N{LATIN SMALL LETTER E WITH ACUTE}

You can ask uconv to recombine those (for those that have a combined form):

$ printf 'e\u0301b\u0301' | uconv -x '::nfc;::name;'
\N{LATIN SMALL LETTER E WITH ACUTE}\N{LATIN SMALL LETTER B}\N{COMBINING ACUTE ACCENT}

(é has a combined form, but not b́).

Method 2

You can use the viacode function from Perl’s charnames module:

$ printf ‽ | perl -Mcharnames=:full -CLS -nle 'print charnames::viacode(ord)'
INTERROBANG
$ printf 🐄 | perl -Mcharnames=:full -CLS -nle 'print charnames::viacode(ord)'
COW

charnames was first released with perl v5.6.0


With Perl 6 due to be production-ready this Christmas, it is worth mentioning here, since it has the best support for Unicode characters I have ever seen. You only need to call the uniname method/routine:

$ printf ‽ | perl6 -ne 'say .uniname'
INTERROBANG

é (e with combining acute accent) and the standalone é character both give you:

# e with combining acute accent
$ printf 'e\u0301' | perl6 -ne 'say .uniname'
LATIN SMALL LETTER E WITH ACUTE

# standalone é
$ printf é | perl6 -ne 'say .uniname'
LATIN SMALL LETTER E WITH ACUTE

(.uniname is the shorthand for $_.uniname)

Method 3

The best way I know is through Perl’s uniprops, which comes with the Unicode::Tussle module. You can install it with:

sudo perl -MCPAN -e 'install Unicode::Tussle'

You can then run it on any glyph you want to test:

$ uniprops ‽
U+203D ‹‽› \N{INTERROBANG}
    \pP \p{Po}
    All Any Assigned InPunctuation Punct Is_Punctuation Common Zyyy Po P
       General_Punctuation Gr_Base Grapheme_Base Graph GrBase Other_Punctuation
       Pat_Syn Pattern_Syntax PatSyn Print Punctuation STerm Term
       Terminal_Punctuation Unicode X_POSIX_Graph X_POSIX_Print X_POSIX_Punct

$ uniprops 🐄
U+1F404 ‹🐄› \N{COW}
    \pS \p{So}
    All Any Assigned InMiscPictographs Common Zyyy So S Gr_Base Grapheme_Base Graph
       GrBase Misc_Pictographs Miscellaneous_Symbols_And_Pictographs Other_Symbol
       Print Symbol Unicode X_POSIX_Graph X_POSIX_Print

Method 4

You can use the unicode utility, which outputs more information than just the name:

$ unicode –
U+2013 EN DASH
UTF-8: e2 80 93  UTF-16BE: 2013  Decimal: &#8211;
–
Category: Pd (Punctuation, Dash)
Bidi: ON (Other Neutrals)

Method 5

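Create a bash script with the following contents:

#!/bin/bash
# Print the name(s) that the X11 Compose table lists for the character given as $1.
awk -F ":" '{print $2}' /usr/share/X11/locale/en_US.UTF-8/Compose | grep -F -- "\"$1\"" | awk -F "#" '{print $2}' | sort -u

Name it as you like, for example namechar, and make it executable (chmod +x namechar).

Now you can call it, for example: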

./namechar @

and the result will be:

COMMERCIAL AT
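The same lookup can also be written as a single awk pass that matches the quoted character literally and drops duplicate names. This is a minimal sketch: compose_name is a hypothetical name, and the optional second argument (not part of the script above) is an assumption added so that a different Compose file can be used.

```shell
# Hypothetical helper: look up the character "$1" in a Compose-style
# file ("$2", defaulting to the path used above) and print each name
# found once.
compose_name() {
    awk -v ch="\"$1\"" '
        # Keep lines that contain the character in double quotes, then
        # print the text after the last "#" with blanks trimmed.
        index($0, ch) {
            name = $0
            sub(/.*#[ \t]*/, "", name)
            sub(/[ \t]+$/, "", name)
            print name
        }' "${2:-/usr/share/X11/locale/en_US.UTF-8/Compose}" | sort -u
}
```

Calling compose_name @ should then print COMMERCIAL AT once, even though several compose sequences produce that character. Like the script, it can only name characters that appear in the Compose file, and characters that are special to awk (such as a backslash) would need extra escaping.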


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
