    Remove Extra Spaces & Whitespace from Text [2026]

    TL;DR: A whitespace remover collapses multiple spaces to one, strips leading and trailing spaces, removes blank lines, and (optionally) deletes invisible characters like non-breaking space (NBSP, U+00A0) and zero-width characters. Useful when copying text from PDFs, emails, or web pages that arrive with broken formatting. Our free whitespace remover handles all of these in your browser, with toggles for which kinds of whitespace to strip.

    Text copied from PDFs, emails, Word docs, and web pages almost never arrives clean. PDFs turn paragraph breaks into \r\n, embed NBSPs, or insert hard line breaks mid-sentence. Email signatures bring trailing spaces. Web copy-paste sneaks in zero-width joiners and the invisible “Right-to-Left Mark” that breaks search-and-replace. Outlook replaces regular spaces with non-breaking spaces inside paragraphs, so two strings can look identical on screen and still fail an equality check.

    Our whitespace remover covers all of these with explicit toggles: collapse multiple spaces, trim every line, remove blank lines, normalise line endings (CRLF → LF), strip NBSP, strip zero-width characters, and, optionally, remove all line breaks for one-paragraph output. Every option is a separate checkbox so you can apply only what you want. This guide explains which problem each toggle solves and the gotchas that catch most regex-based whitespace fixes.

    Whitespace categories — what each toggle does

    Toggle                   | Removes                          | Common source
    Collapse multiple spaces | "a   b" → "a b"                  | PDF copy, double-space-after-period habit
    Trim each line           | Leading/trailing spaces per line | Email signatures, manual indent
    Remove blank lines       | Empty or whitespace-only lines   | PDF page breaks, double Enter habit
    Normalise line endings   | CRLF/CR → LF                     | Windows files, mixed editors
    Strip NBSP               | U+00A0 → regular space           | Word, Outlook, Pages, web HTML
    Strip zero-width         | U+200B, U+200C, U+200D, U+FEFF   | YouTube descriptions, Slack pastes
    Strip BiDi marks         | U+200E, U+200F, U+202A–U+202E    | Right-to-left text, accidental keystrokes
    Tabs to spaces           | \t → 2 or 4 spaces               | Code, tabular text from spreadsheet
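
    For readers who want the same behaviour in a script, here is a minimal Python sketch of how the toggles in the table could combine. The function name clean_text and its keyword arguments are hypothetical, chosen to mirror the table; this is not the tool's actual implementation, and the order of the steps is an assumption.

```python
import re

# Hypothetical character sets taken from the table above.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")
BIDI_MARKS = re.compile("[\u200e\u200f\u202a-\u202e]")


def clean_text(text, *, normalise_line_endings=True, strip_nbsp=True,
               strip_zero_width=True, strip_bidi=True, collapse_spaces=True,
               tab_width=None, trim_lines=True, remove_blank_lines=False):
    """Apply each enabled cleanup step; every flag is independent."""
    if normalise_line_endings:
        text = text.replace("\r\n", "\n").replace("\r", "\n")
    if strip_nbsp:
        text = text.replace("\u00a0", " ")      # NBSP becomes a regular space
    if strip_zero_width:
        text = ZERO_WIDTH.sub("", text)
    if strip_bidi:
        text = BIDI_MARKS.sub("", text)
    if collapse_spaces:
        text = re.sub(r" {2,}", " ", text)      # runs of spaces become one space
    if tab_width is not None:
        # Done after collapsing so the inserted spaces are not merged again.
        text = text.replace("\t", " " * tab_width)
    lines = text.split("\n")
    if trim_lines:
        lines = [line.strip(" \t") for line in lines]
    if remove_blank_lines:
        lines = [line for line in lines if line.strip()]
    return "\n".join(lines)


messy = "  Hello\u00a0 world \r\n\r\n\u200bsecond\tline  "
print(repr(clean_text(messy)))  # 'Hello world\n\nsecond\tline'
```

    With the defaults, paragraph breaks survive because remove_blank_lines is off. Passing remove_blank_lines=True and tab_width=4 would also drop the empty line and replace the tab with four spaces.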

    The invisible character problem

    Visible whitespace is the easy part. The harder problem is invisible characters that look like nothing but break string equality, sort order, and search. The four most common culprits in pasted text:

    • Non-breaking space (NBSP, U+00A0): looks like a regular space but prevents a line break at that position. Word and Outlook generate these freely. "hello world".indexOf("hello world") returns -1 when only one of the two strings contains an NBSP in place of the space.
    • Zero-width non-joiner / joiner (ZWNJ, U+200C / ZWJ, U+200D): invisible. Used legitimately in Arabic and Indic scripts; appears as garbage in copy-paste from rich-text editors.
    • Byte-order mark (BOM, U+FEFF): invisible. Often the first character of a UTF-8 file saved by Windows tools. Can break JSON parsing, CSV import, and shell scripts.
    • Right-to-left mark (RLM, U+200F): invisible. Changes how neighbouring direction-neutral characters such as punctuation are ordered on screen, so a single accidental keystroke can scramble how a line displays.

    Our remover strips all of these by default. Toggle off if you’re working with intentional Arabic / Hindi / Hebrew content where these characters carry meaning.
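
    Before stripping anything, it can help to see which invisible characters a string actually contains. The sketch below is one way to list them in Python using the standard unicodedata module; the helper name find_invisibles and the SUSPECTS set are illustrative assumptions, not part of the tool.

```python
import unicodedata

# Code points discussed above; extend the set as needed.
SUSPECTS = {"\u00a0", "\u200b", "\u200c", "\u200d", "\u200e", "\u200f", "\ufeff"}


def find_invisibles(text):
    """Return (index, 'U+XXXX', Unicode name) for each suspect character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in SUSPECTS
    ]


for hit in find_invisibles("hello\u00a0world\u200d"):
    print(hit)
# (5, 'U+00A0', 'NO-BREAK SPACE')
# (11, 'U+200D', 'ZERO WIDTH JOINER')
```

    Reporting positions instead of deleting outright lets you decide case by case, which matters for text where joiners or direction marks are intentional.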

    How to clean up text in your browser

    1. Open the whitespace remover
    2. Paste your text in the input
    3. Pick a preset (Light cleanup, Standard cleanup, Aggressive) or toggle individual options
    4. Output appears live as you type or paste
    5. Click Copy or Download .txt

    Common gotchas

    • Inside-string spaces matter in some contexts. If you’re cleaning code, “Aggressive” mode will collapse spaces inside string literals, turning "foo  bar" (two spaces) into "foo bar". Use Light or Standard for code; Aggressive for prose only.
    • Markdown is sensitive to trailing spaces. Two trailing spaces at the end of a Markdown line create a hard line break. “Trim each line” destroys that. Disable trim if you’re cleaning Markdown source.
    • YAML and Python are indent-sensitive. Don’t run “tabs to spaces” or “trim leading whitespace” on indented config — you’ll change the meaning. Use this tool on prose; use a code formatter on code.
    • NBSP in Word documents is intentional sometimes. Editorial styles use NBSP to keep “Mr. Smith” or “Page 5” together across line wraps. Aggressive NBSP-strip removes that protection.
    • BOM at the start of a file is invisible to your eye. Open a “broken” CSV in a hex editor — if the first three bytes are EF BB BF, that’s the BOM. Our tool strips it; many command-line tools don’t.
    • Tab width matters. Replacing tabs with spaces requires a width — 2 (modern web), 4 (Python / older code), or 8 (terminal default). Pick the same width your destination uses.
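
    Two of the gotchas above, the BOM and tab width, are easy to check in a script. The following Python sketch uses only the standard library; the helper name strip_utf8_bom is an assumption made for illustration.

```python
import codecs


def strip_utf8_bom(data):
    """Drop a leading UTF-8 BOM (bytes EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


raw = b"\xef\xbb\xbfname,qty\n"
print(strip_utf8_bom(raw))              # b'name,qty\n'
print(repr(raw.decode("utf-8-sig")))    # 'name,qty\n' (this codec skips the BOM)

# str.expandtabs pads to the next tab stop instead of inserting a fixed count.
print(repr("a\tb".expandtabs(4)))       # 'a   b'
print(repr("ab\tc".expandtabs(4)))      # 'ab  c'
```

    Note the difference between expandtabs, which aligns text to tab stops, and a plain replace("\t", "    "), which always inserts the same number of spaces. Which one is right depends on whether the tabs were used for alignment or for indentation.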

    When NOT to use this tool

    For programmatic cleanup inside a script, use the language-native tools: str.normalize('NFC').replace(/\s+/g, ' ').trim() in JavaScript (where str is your string), or " ".join(s.split()) in Python. Both also flatten line breaks into single spaces. For source code, use a code formatter (Prettier for JS, Black for Python). For Markdown source, use a Markdown linter that knows the format. Use this browser tool for prose, copy-pasted text from PDFs, email cleanup, and one-off formatting jobs where you don’t want to write a regex.
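
    To show what the Python one-liner does and does not cover, here is a short sketch. The sample string and the helper collapse_keep_lines are made up for illustration; the helper is one possible variant that keeps line structure, not a canonical recipe.

```python
import re

s = "  one\u00a0\u00a0two \n\n three\u200bfour  "

# str.split() with no arguments splits on any Unicode whitespace, NBSP included,
# but the zero-width space U+200B is not whitespace and survives.
flat = " ".join(s.split())
print(repr(flat))  # 'one two three\u200bfour'


def collapse_keep_lines(text):
    """Collapse whitespace runs within each line and trim, keeping line breaks."""
    return "\n".join(
        re.sub(r"[^\S\n]+", " ", line).strip() for line in text.split("\n")
    )


print(repr(collapse_keep_lines(s)))  # 'one two\n\nthree\u200bfour'
```

    The one-liner is fine when a single line of output is the goal. When paragraphs must survive, a per-line approach like the second function is closer to what you want, and zero-width characters need a separate removal step either way.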

    Frequently asked questions

    Why does my “hello world” string fail comparison even though it looks right?

    Almost always an NBSP (U+00A0) where you expect a regular space (U+0020). Word, Outlook, and many web editors substitute NBSP automatically. Run “Strip NBSP” or use str.replace(/\u00A0/g, ' ') in code.
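
    The same failure and fix can be reproduced in a few lines of Python. The variable names are illustrative, and the pasted string is a constructed example.

```python
pasted = "hello\u00a0world"   # NBSP between the words, as a word processor might produce
typed = "hello world"         # regular space

print(pasted == typed)                         # False
print(pasted.find("hello world"))              # -1

# Replacing U+00A0 with U+0020 makes the two strings equal.
fixed = pasted.replace("\u00a0", " ")
print(fixed == typed)                          # True
```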

    Will this remove blank lines from my code?

    Yes, if you enable “Remove blank lines”. Don’t run that on code, where blank lines often separate logical blocks; use a code formatter instead. For prose, removing blank lines closes the gaps between paragraphs, so the text reads as one big block. To merge everything into a single paragraph, use the “Remove all line breaks” toggle instead.

    What’s the difference between trim, collapse, and strip blank lines?

    Trim removes leading and trailing whitespace per line. Collapse merges consecutive spaces inside a line into one. Strip blank lines deletes lines that contain only whitespace. They’re independent — you can run any combination.

    Does this fix line endings between Windows and macOS / Linux?

    Yes — toggle “Normalise line endings” to convert all CRLF and CR to LF (or pick the reverse). Useful when sharing CSV or text files across operating systems where one tool expects Unix-style and another expects Windows-style line endings.
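
    In a script, the same conversion takes two replacements. This is a minimal Python sketch with hypothetical helper names to_lf and to_crlf; the order of the replacements matters, because handling CRLF first stops a lone-CR pass from doubling the line breaks.

```python
def to_lf(text):
    """Convert CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_crlf(text):
    """Convert any mix of line endings to CRLF by normalising to LF first."""
    return to_lf(text).replace("\n", "\r\n")


mixed = "a\r\nb\rc\nd"
print(repr(to_lf(mixed)))    # 'a\nb\nc\nd'
print(repr(to_crlf(mixed)))  # 'a\r\nb\r\nc\r\nd'
```

    When reading files, Python's open() in text mode already translates CRLF and CR to LF by default, so explicit conversion is mainly needed for strings that arrive some other way, such as clipboard or network data.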

    Is my text uploaded?

    No. The whitespace remover runs in your browser via JavaScript. Pasted content never reaches our servers — useful when cleaning up sensitive material (drafts, contracts, internal notes).

    Can I keep paragraph breaks but collapse internal whitespace?

    Yes — that’s the most common preset. Enable “Collapse multiple spaces” + “Trim each line” + “Strip NBSP” and leave “Remove blank lines” off. Result: clean paragraphs with no internal extra spaces, original paragraph structure preserved.
