Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: UTF-8 non-breaking spaces
- X-seq: zsh-workers 48251
- From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: UTF-8 non-breaking spaces
- Date: Fri, 26 Mar 2021 21:07:51 +0000
- Archived-at: <https://zsh.org/workers/48251>
- Archived-at: <http://www.zsh.org/sympa/arcsearch_id/zsh-workers/2021-03/20210326210751.GF18178%40tarpaulin.shahaf.local2>
- In-reply-to: <CAH+w=7Z8-AOLSLD7dKD6bMJDv31-gyw2mAwKTiAHES6bWurBzA@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <CAH+w=7Z8-AOLSLD7dKD6bMJDv31-gyw2mAwKTiAHES6bWurBzA@mail.gmail.com>
Bart Schaefer wrote on Fri, Mar 26, 2021 at 11:15:47 -0700:
> > If you're copy-pasting from an edit in browser gmail, for example, it
> > has a tendency to insert non-breaking spaces whenever there is more
> > than one consecutive space, which the shell interprets as
> > non-whitespace and attempts to execute as commands.
>
> Non-breaking space in this case is (bindkey syntax) "\M-B\M- ". The
> error message is equally confusing because you still can't see the
> non-breaking spaces when "not found" is reported.
>
> Handling this is complicated by bracketed-paste, which protects the
> non-breaking spaces from (for example) { bindkey -s '\M-B\M- ' ' ' }.
>
> "unsetopt multibyte" does not affect this but LANG=C results in (for example)
>
> (In gmail editor)
> echo " " " "
> (Pasted at shell prompt)
> % echo " " "<c2><a0> "
>
> That's totally a ZLE display thing, the actual nbsp is output when the
> command executes, but at least you can see what's going on.
>
> (The non-breaking spaces go back to normal spaces in sent email, I
> believe, or at least do so when the message is displayed in gmail;
> this is just a "thing" in the browser text editor.)
>
> Similar goofiness can result when copy-pasting from other "smart"
> multibyte editors when zsh has a UTF-8 variant in $LANG.
>
> Any good suggestions how to deal with this in a non-confusing fashion?

(I presume "Use a non-buggy MUA" isn't the answer you're after.)

With zsh-syntax-highlighting:

    . /path/to/zsh-syntax-highlighting
    ZSH_HIGHLIGHT_HIGHLIGHTERS=( pattern ) # or += if you already use z-sy-h
    typeset -A ZSH_HIGHLIGHT_PATTERNS=($'\uA0' 'bg=blue,bold')

This'll highlight nbsp's. Not change them, just highlight them. To
change them, a custom s/nbsp/space/g widget might be convenient.
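
A minimal sketch of such a widget follows, as one possible shape rather than a tested recommendation. The names nbsp_to_space and replace_nbsp_widget and the '^X ' key binding are hypothetical, picked only for illustration. It assumes the pasted nbsp arrives as the UTF-8 byte pair c2 a0, as in the LANG=C display quoted above.

```shell
# Hypothetical helper: copy "$1" into REPLY with every U+00A0 replaced
# by an ordinary space.
nbsp_to_space() {
  local nbsp=$'\xc2\xa0'   # UTF-8 encoding of U+00A0 (assumed input encoding)
  REPLY=${1//$nbsp/ }
}

# Hypothetical widget: apply the helper to the whole editing buffer.
replace_nbsp_widget() {
  nbsp_to_space "$BUFFER"
  BUFFER=$REPLY
}

# ZLE exists only in zsh, so register the widget only there.
if [[ -n ${ZSH_VERSION-} ]]; then
  zle -N replace_nbsp_widget
  bindkey '^X ' replace_nbsp_widget   # Ctrl-X Space; an arbitrary choice
fi
```

Because the widget only runs when invoked, it leaves a deliberately typed nbsp alone until you ask for the replacement. When it does run it rewrites every nbsp in the buffer, including any inside quotes, so it is best used right after a paste and before further editing.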
> Everything I've thought of (short of hacking up the lexer) risks
> corrupting parts of the input that aren't intended to be word
> separators (the bindkey -s above has that problem, for example, if
> bracketed-paste is disabled).
>