Unicode Character Lookup

Search and explore Unicode characters by name, codepoint, or category. Find emoji, symbols, and special characters with their HTML entities and CSS escape codes.

How to Use Unicode Character Lookup

1. Search by character name (e.g. "snowflake"), codepoint (e.g. U+2744), or paste a character.
2. See the character's name, category, block, and encoding details.
3. Copy the HTML entity, CSS escape, or JavaScript escape code.
4. Browse Unicode blocks and categories.
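
The first two steps can be approximated offline with Python's standard `unicodedata` module. This is a minimal sketch, not the tool's actual implementation: the function names `resolve_query` and `describe` are hypothetical, and the names and categories returned depend on the Unicode version bundled with your Python build.

```python
import re
import unicodedata


def resolve_query(query: str) -> str:
    """Return the single character a search query refers to.

    Accepts a codepoint such as "U+2744", an official character name such as
    "snowflake", or a single pasted character. (Hypothetical helper.)
    """
    query = query.strip()
    match = re.fullmatch(r"[Uu]\+([0-9A-Fa-f]{4,6})", query)
    if match:
        # chr() raises ValueError for values above 0x10FFFF.
        return chr(int(match.group(1), 16))
    if len(query) == 1:
        return query
    # Official names are uppercase; lookup raises KeyError for unknown names.
    return unicodedata.lookup(query.upper())


def describe(char: str) -> dict:
    """Collect the basic details shown for one character."""
    return {
        "codepoint": f"U+{ord(char):04X}",
        "name": unicodedata.name(char, "<unnamed>"),
        "category": unicodedata.category(char),
    }


print(describe(resolve_query("snowflake")))
print(describe(resolve_query("U+2744")))
```

Both calls describe the same character, U+2744 SNOWFLAKE. The sketch only matches exact official names, so a partial search such as "snow" would need a separate index. A pasted emoji made of several codepoints is also not handled here.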

Frequently Asked Questions

What is Unicode?

Unicode is a universal character encoding standard that assigns a unique number (codepoint) to each character it encodes. Unicode 15.1 defines 149,813 characters across 161 scripts, covering writing systems such as Latin, Arabic, Chinese, Japanese, Korean, and Devanagari, along with emoji, mathematical symbols, and historic scripts. Codepoints are written as U+ followed by four to six hexadecimal digits (e.g., U+0041 = A, U+1F600 = 😀). UTF-8, UTF-16, and UTF-32 are encoding forms that represent codepoints as sequences of 8-bit, 16-bit, or 32-bit code units.

What is the difference between Unicode and UTF-8?

Unicode is the abstract standard: it assigns numbers to characters. UTF-8 is a concrete encoding: it converts those numbers to bytes. In UTF-8, ASCII characters (U+0000 to U+007F) use 1 byte; U+0080 to U+07FF use 2 bytes; U+0800 to U+FFFF use 3 bytes; U+10000 to U+10FFFF use 4 bytes. UTF-8 is backward-compatible with ASCII and is the dominant encoding on the web (~98% of websites). UTF-16 uses 2 bytes for characters up to U+FFFF and 4 bytes (a surrogate pair) for all others; JavaScript and Java strings use it internally.
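
The byte counts above can be checked with Python's built-in codecs. This is a small sketch; the helper name `encoded_sizes` is made up for illustration, and the big-endian codec variants are used so that no byte order mark is added to the output.

```python
def encoded_sizes(char: str) -> dict:
    """Return the number of bytes one character occupies in each encoding."""
    return {
        "utf-8": len(char.encode("utf-8")),
        "utf-16": len(char.encode("utf-16-be")),
        "utf-32": len(char.encode("utf-32-be")),
    }


# One sample from each UTF-8 length class: U+0041, U+00E9, U+2744, U+1F600.
for ch in ["\u0041", "\u00e9", "\u2744", "\U0001F600"]:
    utf8_hex = ch.encode("utf-8").hex(" ")
    print(f"U+{ord(ch):04X}", encoded_sizes(ch), "utf-8 bytes:", utf8_hex)
```

The four samples take 1, 2, 3, and 4 bytes in UTF-8. In UTF-16 only U+1F600 needs 4 bytes, and UTF-32 always uses 4.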

What is a Unicode codepoint and how do I escape it?

A codepoint is the unique number assigned to each character. U+0041 = decimal 65 = letter A. Escape forms: HTML: &#65; (decimal) or &#x41; (hex); some characters also have named entities (e.g., &amp; for &). JavaScript: \u0041 (BMP) or \u{1F600} (full range, ES2015+). CSS: \41 or \000041. Python: \u0041 or \U0001F600 (exactly eight hex digits after \U). JSON: \u0041 (BMP only; other characters are written as a surrogate pair, e.g., \uD83D\uDE00 for U+1F600). URL encoding: %41 (each UTF-8 byte percent-encoded).
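
The escape forms listed above can be generated from a codepoint with a few format strings. This is a sketch under stated assumptions: `escape_forms` is a hypothetical helper, it covers numeric escapes only (no named HTML entities), and it percent-encodes every UTF-8 byte even where a URL would not strictly require it.

```python
def escape_forms(char: str) -> dict:
    """Build common escape sequences for a single character."""
    cp = ord(char)
    if cp <= 0xFFFF:
        # BMP characters fit in one four-digit \u escape everywhere.
        javascript = f"\\u{cp:04X}"
        json_escape = javascript
        python = f"\\u{cp:04X}"
    else:
        # Outside the BMP, JSON needs a UTF-16 surrogate pair.
        offset = cp - 0x10000
        high = 0xD800 + (offset >> 10)
        low = 0xDC00 + (offset & 0x3FF)
        javascript = f"\\u{{{cp:X}}}"
        json_escape = f"\\u{high:04X}\\u{low:04X}"
        python = f"\\U{cp:08X}"  # \U takes exactly eight hex digits
    return {
        "html_decimal": f"&#{cp};",
        "html_hex": f"&#x{cp:X};",
        "css": f"\\{cp:06X}",
        "javascript": javascript,
        "json": json_escape,
        "python": python,
        "url": "".join(f"%{b:02X}" for b in char.encode("utf-8")),
    }


for name, value in escape_forms("\U0001F600").items():
    print(f"{name:13} {value}")
```

The six-digit CSS form is used because it stays unambiguous when the next character in the stylesheet happens to be a hex digit.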

What is a Unicode block?

Unicode 15.1 defines 328 blocks, each a contiguous range of codepoints for a related group of characters. Examples: Basic Latin (U+0000-U+007F), Latin-1 Supplement (U+0080-U+00FF), Greek and Coptic (U+0370-U+03FF), Cyrillic (U+0400-U+04FF), CJK Unified Ideographs (U+4E00-U+9FFF, 20,992 characters), Emoticons (U+1F600-U+1F64F, which holds many face emoji). Blocks are grouped into 17 planes of 65,536 codepoints each; the first, the "Basic Multilingual Plane" (BMP), covers U+0000 to U+FFFF.
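
Python's standard library does not expose block names, so a block lookup needs its own range table. The sketch below hard-codes only the example blocks mentioned above, using their official names. A complete table would be loaded from `Blocks.txt` in the Unicode Character Database. The names `BLOCKS`, `block_of`, and `plane_of` are illustrative.

```python
from bisect import bisect_right

# (first codepoint, last codepoint, block name), sorted by first codepoint.
# Deliberately incomplete: only the example blocks from the text.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_of(char: str):
    """Return the block name for a character, or None if not in the table."""
    cp = ord(char)
    # Find the last block whose start is <= cp, then check its end.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0 and cp <= BLOCKS[i][1]:
        return BLOCKS[i][2]
    return None


def plane_of(char: str) -> int:
    """Return the plane number (0 is the BMP); each plane spans 0x10000 codepoints."""
    return ord(char) >> 16


for ch in ["A", "\u03a9", "\u4e2d", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", block_of(ch), "plane", plane_of(ch))
```

Because blocks never overlap, a binary search over the sorted start values is enough to find the one candidate block for any codepoint.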

What are Unicode categories?

Unicode assigns each character a General Category, a two-letter code whose first letter names the major class:
L (Letter): Lu=uppercase, Ll=lowercase, Lt=titlecase, Lm=modifier, Lo=other.
M (Mark): Mn=nonspacing, Mc=spacing combining, Me=enclosing.
N (Number): Nd=decimal digit, Nl=letter number, No=other.
P (Punctuation): Pc=connector, Pd=dash, Ps=open, Pe=close, Pi=initial quote, Pf=final quote, Po=other.
S (Symbol): Sm=math, Sc=currency, Sk=modifier, So=other.
Z (Separator): Zs=space, Zl=line, Zp=paragraph.
C (Other): Cc=control, Cf=format, Cs=surrogate, Co=private use, Cn=unassigned.
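
In Python, `unicodedata.category` returns these two-letter codes directly. The short sketch below pairs each code with its major class. The `MAJOR_CLASSES` table and `general_category` helper are illustrative names, and the table includes M (Mark), which the Unicode Standard defines alongside the other classes.

```python
import unicodedata

# First letter of the General Category code mapped to its major class.
MAJOR_CLASSES = {
    "L": "Letter",
    "M": "Mark",
    "N": "Number",
    "P": "Punctuation",
    "S": "Symbol",
    "Z": "Separator",
    "C": "Other",
}


def general_category(char: str) -> tuple:
    """Return (two-letter code, major class name) for one character."""
    code = unicodedata.category(char)
    return code, MAJOR_CLASSES[code[0]]


# Samples: letters, a digit, punctuation, symbols, a space, a control
# character, and U+0301 COMBINING ACUTE ACCENT.
for ch in ["A", "a", "5", "(", "+", "\u20ac", " ", "\n", "\u0301"]:
    print(f"U+{ord(ch):04X}", *general_category(ch))
```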