About Unicode Escape
Escape non-ASCII characters as \uXXXX sequences (UTF-16 code units, the JavaScript / JSON form) or \xXX bytes, and decode them back again. Useful when debugging encoding issues, inspecting strings in logs, and porting strings between languages with different string conventions.
What this fixes
Encoding bugs are subtle: a backend writes UTF-8, a middleware layer reinterprets it as Latin-1, and the front end displays mojibake. An escaped string is pure ASCII, so it records the exact codepoints and survives even broken transcoding.
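A minimal Python sketch of that failure chain, assuming the middleware simply decodes UTF-8 bytes as Latin-1. The variable names and the sample string are illustrative only, and Python's built-in unicode_escape codec stands in for the tool.

```python
original = "café"

# The backend writes UTF-8; the middleware misreads the bytes as Latin-1.
utf8_bytes = original.encode("utf-8")        # b'caf\xc3\xa9'
mojibake = utf8_bytes.decode("latin-1")      # 'cafÃ©'

# Escaping first yields pure ASCII, which is byte-identical in UTF-8 and Latin-1.
escaped = original.encode("unicode_escape").decode("ascii")
survived = escaped.encode("utf-8").decode("latin-1")

# Decoding the escapes later recovers the original text.
restored = survived.encode("ascii").decode("unicode_escape")

print(mojibake)   # cafÃ©
print(escaped)    # caf\xe9
print(restored)   # café
```

The escaped form passes through the same broken pipeline unchanged because every character in it is below U+0080.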
Output forms
\u00E9: JavaScript / JSON / Java legacy (4-digit BMP codepoint)
\u{1F600}: JavaScript ES2015 (any codepoint, no surrogates)
\xE9: single-byte hex (Python bytes literal)
\U0001F600: Python long form (8-digit codepoint)
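The forms above can be generated from a codepoint with plain string formatting. The following is a rough Python sketch; escape_forms and its dictionary keys are hypothetical names, not the tool's actual implementation.

```python
def escape_forms(ch: str) -> dict[str, str]:
    """Return the escape spellings that apply to a single character."""
    cp = ord(ch)
    forms = {
        "es2015": f"\\u{{{cp:X}}}",      # \u{...}, any codepoint
        "python_long": f"\\U{cp:08X}",   # \UXXXXXXXX, 8 hex digits
    }
    if cp <= 0xFFFF:
        forms["u4"] = f"\\u{cp:04X}"     # \uXXXX, BMP only
    if cp <= 0xFF:
        # \xXX holds the codepoint value, which matches the Latin-1 byte,
        # not the UTF-8 encoding of the character.
        forms["x2"] = f"\\x{cp:02X}"
    return forms


print(escape_forms("é"))
print(escape_forms("\U0001F600"))
```

Characters above U+FFFF get no 4-digit form here; in that notation they need a surrogate pair.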
Common workflows
Make a non-ASCII string survive a Latin-1 logger. Escape before logging; decode the lines later when investigating.
Embed a Unicode character in a source file under a restricted charset. Generate the escape sequence and paste it into the source.
Inspect a string from a log. Paste escaped text, see what the codepoints actually represent.
Transcribe between languages. \u00E9 is written the same way in JavaScript and Java, but the form can differ in Python (\xE9 in a bytes literal, \U0001F600-style escapes above U+FFFF); the tool helps translate.
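The logging workflow can be approximated in Python with the standard json module, whose default ensure_ascii=True setting emits \uXXXX escapes for every non-ASCII character. This is only a sketch: escape_for_log and decode_from_log are hypothetical helper names, and the output is a quoted JSON string rather than the tool's own format.

```python
import json


def escape_for_log(text: str) -> str:
    # The default ensure_ascii=True produces an ASCII-only JSON string literal.
    return json.dumps(text)


def decode_from_log(line: str) -> str:
    # json.loads reverses the escaping, including surrogate pairs.
    return json.loads(line)


message = "naïve café"
line = escape_for_log(message)

# The escaped line is pure ASCII, so a Latin-1 logger cannot corrupt it.
logged = line.encode("latin-1").decode("latin-1")

print(line)                     # "na\u00efve caf\u00e9"
print(decode_from_log(logged))  # naïve café
```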
Frequently asked questions
\u or \x?
\u is 4 hex digits and covers any codepoint up to U+FFFF (the BMP, which holds most characters). \x is 2 hex digits and covers a single byte. For codepoints above U+FFFF (emoji, rare CJK), JavaScript uses surrogate pairs (two \u escapes); Python uses \U with 8 digits.
Why escape?
Surrogate pairs?
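JavaScript strings are UTF-16, so a codepoint above U+FFFF is stored as two 16-bit code units (a surrogate pair), and \u escapes follow suit. Toggle to ES2015 \u{...} for the modern alternative.
Round-trip?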
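A minimal Python sketch of the surrogate-pair arithmetic and its inverse, which is what a round trip through \u escapes relies on for codepoints above U+FFFF. The function names are hypothetical and are not part of the tool.

```python
def to_surrogate_escapes(ch: str) -> str:
    """Escape one character as \\uXXXX, using a surrogate pair above U+FFFF."""
    cp = ord(ch)
    if cp <= 0xFFFF:
        return f"\\u{cp:04X}"
    offset = cp - 0x10000                # 20 bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return f"\\u{high:04X}\\u{low:04X}"


def from_surrogates(high: int, low: int) -> str:
    """Combine a high/low surrogate pair back into a single character."""
    return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))


print(to_surrogate_escapes("\U0001F600"))          # \uD83D\uDE00
print(from_surrogates(0xD83D, 0xDE00) == "\U0001F600")  # True
```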
Is the input sent anywhere?
Will it touch ASCII?
Last updated: 2025-01-15