Sometimes, I’d like to know the name of a glyph. For example, if I see −, I may want to know if it’s a hyphen -, an en-dash –, an em-dash —, or a minus symbol −. Is there a way that I can copy-paste this into a terminal to see what it is?
I am unsure whether my system knows the common names of these glyphs, but there is certainly some (partial) information available, such as in /usr/share/X11/locale/en_US.UTF-8/Compose. For example,
<Multi_key> <exclam> <question> : "‽" U203D # INTERROBANG
Another example glyph: 🐄.
Answers:
Method 1
Try the unicode utility:
$ unicode ‽
U+203D INTERROBANG
UTF-8: e2 80 bd UTF-16BE: 203d Decimal: &#8253;
‽
Category: Po (Punctuation, Other)
Bidi: ON (Other Neutrals)
Or the uconv utility from the ICU package:
$ printf %s ‽ | uconv -x any-name
\N{INTERROBANG}
You can also get information via the recode utility:
$ printf %s ‽ | recode ..dump
UCS2   Mne   Description

203D         point exclarrogatif
Or with Perl:
$ printf %s ‽ | perl -CLS -Mcharnames=:full -lne 'print charnames::viacode(ord) for /./g'
INTERROBANG
Note that those give information on the characters that make up the glyph, not on the glyph as a whole. For instance, for é (e followed by a combining acute accent):
$ printf 'e\u0301' | uconv -x any-name
\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}
Different from the standalone é character:
$ printf '\u00e9' | uconv -x any-name
\N{LATIN SMALL LETTER E WITH ACUTE}
You can ask uconv to recombine those (for those that have a combined form):
$ printf 'e\u0301b\u0301' | uconv -x '::nfc;::name;'
\N{LATIN SMALL LETTER E WITH ACUTE}\N{LATIN SMALL LETTER B}\N{COMBINING ACUTE ACCENT}
(é has a combined form, but not b́).
Method 2
You can use the viacode function from Perl's charnames module:
$ printf ‽ | perl -Mcharnames=:full -CLS -nle 'print charnames::viacode(ord)'
INTERROBANG
$ printf 🐄 | perl -Mcharnames=:full -CLS -nle 'print charnames::viacode(ord)'
COW
charnames was first released with Perl v5.6.0.
Perl 6 will be production-ready this Christmas, so it is worth mentioning here: it has the best support for Unicode characters I have ever seen. You only need to call the uniname method/routine:
$ printf ‽ | perl6 -ne 'say .uniname'
INTERROBANG
Both é (e with combining acute accent) and the standalone é character give you the same name:
# e with combining acute accent
$ printf 'e\u0301' | perl6 -ne 'say .uniname'
LATIN SMALL LETTER E WITH ACUTE
# standalone é
$ printf '\u00e9' | perl6 -ne 'say .uniname'
LATIN SMALL LETTER E WITH ACUTE
(.uniname is the shorthand for $_.uniname)
Method 3
The best way I know is Perl's uniprops, which comes with the Unicode::Tussle module. You can install it with:
sudo perl -MCPAN -e 'install Unicode::Tussle'
You can then run it on any glyph you want to test:
$ uniprops ‽
U+203D ‹‽› \N{INTERROBANG}
    \pP \p{Po}
    All Any Assigned InPunctuation Punct Is_Punctuation Common Zyyy Po P
       General_Punctuation Gr_Base Grapheme_Base Graph GrBase Other_Punctuation
       Pat_Syn Pattern_Syntax PatSyn Print Punctuation STerm Term
       Terminal_Punctuation Unicode X_POSIX_Graph X_POSIX_Print X_POSIX_Punct
$ uniprops 🐄
U+1F404 ‹🐄› \N{COW}
    \pS \p{So}
    All Any Assigned InMiscPictographs Common Zyyy So S Gr_Base Grapheme_Base Graph
       GrBase Misc_Pictographs Miscellaneous_Symbols_And_Pictographs Other_Symbol
       Print Symbol Unicode X_POSIX_Graph X_POSIX_Print
Method 4
You can use unicode, which also outputs some more information than just the name:
# unicode –
U+2013 EN DASH
UTF-8: e2 80 93 UTF-16BE: 2013 Decimal: &#8211;
–
Category: Pd (Punctuation, Dash)
Bidi: ON (Other Neutrals)
Method 5
Create a bash script with this:
#!/bin/bash
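# Take the part after ":" in each Compose line, keep lines containing $1, and print the name after "#".
# grep -F treats $1 as a literal string, so characters such as . or * are not read as a regex.
awk -F ":" '{print $2}' /usr/share/X11/locale/en_US.UTF-8/Compose | grep -F -- "$1" | awk -F "#" '{print $2}'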
Name it whatever you like, for example namechar, and make it executable.
Now you can call, for example:
./namechar @
and the result will be:
COMMERCIAL AT
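One caveat: the script above matches its argument anywhere in the line, so a letter such as A also hits every character whose name contains an A, and a character with several compose sequences is listed several times. A slightly stricter variant might look like the following sketch. It assumes Compose lines shaped like the INTERROBANG example in the question, and COMPOSE_FILE is a hypothetical override variable introduced here only so that the function can be pointed at another file.

```shell
# Sketch: stricter lookup of a character's name in an X11 Compose table.
# Assumes lines shaped like:  <Multi_key> <exclam> <question> : "‽" U203D # INTERROBANG
# COMPOSE_FILE is a hypothetical override for this example, not an X11 setting.
namechar() {
    file=${COMPOSE_FILE:-/usr/share/X11/locale/en_US.UTF-8/Compose}
    # Match the quoted result string literally, keep only the text after
    # the last '#', and collapse duplicate names.
    grep -F -- "\"$1\"" "$file" | sed 's/.*# *//' | sort -u
}
```

Matching on the quoted string means only lines whose result is exactly that character are kept. Like the original script, it can only name characters that appear in the Compose table.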
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.