Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
[PATCH] (take two?) typeset array[position=index]=value / unset hash[$stuff]
- X-seq: zsh-workers 48980
- From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: [PATCH] (take two?) typeset array[position=index]=value / unset hash[$stuff]
- Date: Tue, 1 Jun 2021 19:51:32 -0700
- Archived-at: <https://zsh.org/workers/48980>
- In-reply-to: <CAH+w=7a+M7nTTqzmve+SZHwtEeRPTicQB5ZRy2XO6er4UhMb1w@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <20191216212706.i3xvf6hn5h3jwkjh@chaz.gmail.com> <20191217073846.4usg2hnsk66bhqvl@chaz.gmail.com> <20191217111113.z242f4g6sx7xdwru@chaz.gmail.com> <2ea6feb3-a686-4d83-ab27-6a582424487c@www.fastmail.com> <20200101140343.qwfx2xaojumuds3d@chaz.gmail.com> <20210430061117.buyhdhky5crqjrf2@chazelas.org> <CAH+w=7bHxSbFr60ZU0+oZ6+qEejhfBYTzvL7=aXadY5XzWtSzw@mail.gmail.com> <20210505114521.bemoiekpophssbug@chazelas.org> <CAH+w=7ZqE2DnxpHhvCjZnXB4A1vJ=EKB2fpWyUMaZX0VYqU9kg@mail.gmail.com> <CAH+w=7bVn2LiTsq194GjshVSOCSib7t4T=uZQ2ZvUGu6Z2XoPw@mail.gmail.com> <20210601053235.b4junj6muuwegl7b@chazelas.org> <CAH+w=7a+M7nTTqzmve+SZHwtEeRPTicQB5ZRy2XO6er4UhMb1w@mail.gmail.com>
On Tue, Jun 1, 2021 at 9:05 AM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
>
> [...] I should at least examine whether getasg() ought to be using
> parse_subscript() even though the corresponding parse.c block is using
> skipparens().
I've decided parse_subscript() is appropriate there, so that's
included below. If anyone sees any problem with that, please point it
out.
> Whereas most subscript flags are meaningless for unset, we might
> consider supporting (e) there. I'm not sure whether that would fully
> address the problem with the other characters, though.
Turns out it does seem to fix that, so that's the direction I've gone.
More discussion below.
> The issue with the empty key seems merely to be that the subscript
> validity test for associative arrays never changed from the one for
> plain arrays.
To maintain error-equivalent backward compatibility I didn't "fix"
this, instead, hash[(e)] (or hash[(e)''] if you think that more
readable) is required in order to unset the element with the empty
key.
> [...] I'm not sure what backward-compatibility would be broken? Do you
> mean that scripts that already have work-arounds for the current issue
> might stop working?
The one compatibility issue with the foregoing is this:
Current pre-patch zsh:
% typeset -A zz
% bad='(e)bang'
% zz[$bad]=x
t% typeset -p zz
typeset -A zz=( ['(e)bang']=x )
% unset zz\["$bad"\]
% typeset -p zz
typeset -A zz=( )
With the patch, the "(e)" appearing in the value of $bad becomes a
subscript flag, because $bad is expanded before "unset" parses:
% zz[$bad]=x
% typeset -p zz
typeset -A zz=( ['(e)bang']=x )
% unset zz\["$bad"\]
% typeset -p zz
typeset -A zz=( ['(e)bang']=x )
You have to double the flag:
% unset zz\["(e)$bad"\]
% typeset -p zz
typeset -A zz=( )
Is that a small enough incompatibility for this to be acceptable?
I've included doc updates, but not tests. If anyone else would like
to take a stab at those?
diff --git a/Doc/Zsh/builtins.yo b/Doc/Zsh/builtins.yo
index f8e11d79d..07d8f6ac9 100644
--- a/Doc/Zsh/builtins.yo
+++ b/Doc/Zsh/builtins.yo
@@ -2411,6 +2411,8 @@ but the previous value will still reappear when the scope ends.
Individual elements of associative array parameters may be unset by using
subscript syntax on var(name), which should be quoted (or the entire command
prefixed with tt(noglob)) to protect the subscript from filename generation.
+The `tt(LPAR())tt(e)tt(RPAR())' subscript flag may be used to require strict
+interpretation of characters in the key.
If the tt(-m) flag is specified the arguments are taken as patterns (should
be quoted) and all parameters with matching names are unset. Note that this
diff --git a/Doc/Zsh/params.yo b/Doc/Zsh/params.yo
index dc28a45ae..03547ef3f 100644
--- a/Doc/Zsh/params.yo
+++ b/Doc/Zsh/params.yo
@@ -443,7 +443,9 @@ not inhibited.
This flag can also be used to force tt(*) or tt(@) to be interpreted as
a single key rather than as a reference to all values. It may be used
-for either purpose on the left side of an assignment.
+for either purpose on the left side of an assignment. This flag may
+also be used in the subscript of an argument to tt(unset) to disable
+interpretation of special characters and quoting.
)
enditem()
diff --git a/Src/builtin.c b/Src/builtin.c
index a16fddcb7..7aff354cc 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -1933,10 +1933,14 @@ getasg(char ***argvp, LinkList assigns)
asg.flags = 0;
/* search for `=' */
- for (; *s && *s != '='; s++);
+ for (; *s && *s != '[' && *s != '=' /* && *s != '+' */; s++);
+ if (s > asg.name && *s == '[') {
+ char *se = parse_subscript(s + 1, 1, ']');
+ if (se && *se == ']') s = se + 1;
+ }
/* found `=', so return with a value */
- if (*s) {
+ if (*s && *s == '=') {
*s = '\0';
asg.value.scalar = s + 1;
} else {
@@ -3726,11 +3730,16 @@ bin_unset(char *name, char **argv, Options ops, int func)
if (ss) {
char *sse;
*ss = 0;
+ /* parse_subscript() doesn't handle empty brackets - should it? */
if ((sse = parse_subscript(ss+1, 1, ']'))) {
*sse = 0;
- subscript = dupstring(ss+1);
+ if (ss[1] == '(' && ss[2] == 'e' && ss[3] == ')') {
+ subscript = dupstring(ss+4);
+ } else {
+ subscript = dupstring(ss+1);
+ remnulargs(subscript);
+ }
*sse = ']';
- remnulargs(subscript);
untokenize(subscript);
}
}
@@ -3763,6 +3772,12 @@ bin_unset(char *name, char **argv, Options ops, int func)
} else if (PM_TYPE(pm->node.flags) == PM_SCALAR ||
PM_TYPE(pm->node.flags) == PM_ARRAY) {
struct value vbuf;
+ if (!*subscript) {
+ *ss = '[';
+ zerrnam(name, "%s: invalid parameter name", s);
+ returnval = 1;
+ continue;
+ }
vbuf.isarr = (PM_TYPE(pm->node.flags) == PM_ARRAY ?
SCANPM_ARRONLY : 0);
vbuf.pm = pm;
Messages sorted by:
Reverse Date,
Date,
Thread,
Author