UnicodeFix
![]() |
UnicodefixNormalizes Unicode to ASCII equivalents. Contribute to unixwzrd/UnicodeFix development by creating an account on GitHub. |
UnicodeFix is a lightweight, open-source utility that cleans up messy text by converting strange or invisible Unicode characters into clean, readable ASCII. If you’ve ever pasted content into your code editor and had it break because of curly quotes, non-breaking spaces, or hidden characters — this tool can save your sanity.
Whether you’re a developer, writer, or just someone who copies and pastes from the internet, UnicodeFix helps normalize your text so it behaves.
🔧 What Does It Do?
- Replaces smart quotes, em/en dashes, ellipses, and other Unicode punctuation with standard ASCII
- Removes invisible Unicode characters like:
U+200B
(Zero-width space)U+200C
(Non-joiner)U+200D
(Joiner)
- Makes AI-generated or copied web content safer to paste into your terminal, editor, or scripts
- Avoids hidden errors in YAML, JSON, Markdown, and other plain text formats
✅ Platform Compatibility
UnicodeFix has been developed and tested on macOS.
It should work on Linux and Windows (via WSL or Python), but is not yet officially tested on those platforms. Contributions or testing feedback welcome!
🚀 Get Started
Run it as a CLI tool with a simple one-liner:
python cleanup-text.py input.txt -o output.txt
Or pipe directly from stdin and back to stdout:
cat input.txt | python cleanup-text.py
More details and usage examples on GitHub:
💬 Tip
Combine with tools like VS Code, Hex Fiend, or Clipboard managers to debug and sanitize tricky copy-pasted text.
Built by unixwzrd — bringing clarity back to your clipboard, one invisible character at a time.
Project Blog Entries
Introducing Unicodefix
April 25, 2025Welcome to the Unicodefix project blog. Here we’ll share updates, insights, and progress on our development journey.