Zsh Mailing List Archive

[PATCH] add zgetopt



this is a contrib function/script that wraps `zparseopts -G` to provide
an interface like util-linux's getopt(1), which is extremely useful but
not portable
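
for anyone who hasn't used the getopt(1) workflow, here's a rough sketch of the print-and-eval pattern a plain sh script would follow. zgetopt itself isn't called here: the `args` string is a hard-coded, assumed stand-in for what script mode would print (each parsed word single-quoted, `--` before the operands), and the option names are just the ones from the doc example in the patch, so read it as an illustration and not as captured output

```shell
#!/bin/sh
# Assumed sample of what script-mode zgetopt would print for something like:
#   zgetopt -n myscript -o abc: -l aaa,bbb,ccc: -- -a 'file one' --ccc 'x y'
# It is hard-coded (hypothetical) so the sketch runs without zgetopt installed.
args="'-a' '--ccc' 'x y' -- 'file one'"

# Re-split the quoted string into positional parameters.
eval set -- "$args"

a=0
c=
while [ $# -ne 0 ]; do
  case $1 in
    -a|--aaa) a=1; shift ;;       # flag without an argument
    -b|--bbb) shift ;;            # flag without an argument (unused here)
    -c|--ccc) c=$2; shift 2 ;;    # option plus its argument
    --)       shift; break ;;     # end of options
  esac
done

# Whatever is left after -- is the operand list.
nops=$#
first=$1
printf 'a=%s c=%s operands=%s first=%s\n' "$a" "$c" "$nops" "$first"
```

the quoting is what lets `x y` and `file one` survive the eval as single words, which is the main thing the pattern buys you over naive word splitting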

i think wanting to implement something like this was the initial reason
i started working on zsh's internals, so it kind of completes the cycle
for me

it should be safe to go in before 5.10 since it's only a contrib thing,
but lmk if you disagree. or if you just think it's unnecessary

ps: i have been calling single-hyphen long options (-foo) 'sun style'
for some reason, i guess because it's what java and their c compilers
use. but apparently nobody else calls them that. maybe someone knows a
better name

dana


diff --git a/Doc/Zsh/contrib.yo b/Doc/Zsh/contrib.yo
index c1bea6022..030b63029 100644
--- a/Doc/Zsh/contrib.yo
+++ b/Doc/Zsh/contrib.yo
@@ -4672,6 +4672,84 @@ Same as tt(zmv -C) and tt(zmv -L), respectively.  These functions do not
 appear in the zsh distribution, but can be created by linking tt(zmv) to
 the names tt(zcp) and tt(zln) in some directory in your tt(fpath).
 )
+findex(zgetopt)
+item(tt(zgetopt) [ tt(-a) ] [ tt(-A) var(array) ] [ tt(-l) var(spec) ] [ tt(-n) var(name) ] [ tt(-o) var(spec) ] tt(--) [ var(args) ])(
+This is a wrapper around tt(zparseopts) (from tt(zsh/zutil)) which
+provides an interface similar to the util-linux implementation of
+tt(getopt+LPAR()1+RPAR()) (sometimes called `GNU tt(getopt)'). It
+simplifies GNU-style argument parsing (including permutation) and
+can make it easier to write functions and scripts with complex APIs,
+particularly ones where the order of options is significant.
+
+The typical usage pattern is as follows:
+
+example(zgetopt -o abc: -l aaa,bbb,ccc: -- "$@" || return
+while (( $# )); do
+  case $1 in
+    -a|--aaa+RPAR() ...; shift ;;        # handle -a
+    -b|--bbb+RPAR() ...; shift ;;        # handle -b
+    -c|--ccc+RPAR() ...; shift 2 ;;      # handle -c and arg
+    --+RPAR()       ...; shift; break ;; # end of options
+  esac
+done
+# handle operands)
+
+It can also be called as a stand-alone script from other shells
+using the more traditional print-and-eval pattern:
+
+example(args="$( zgetopt -n myscript -o abc: -l aaa,bbb,ccc: -- "$@" )" || return
+eval set -- "$args"
+while [ $# -ne 0 ]; do ...; done)
+
+Options:
+
+startsitem()
+sitem(tt(-A var(array)))(When called as a function, assign the parsed
+arguments to the named array var(array). Defaults to tt(argv), which
+overwrites the caller's positional parameters. Has no meaning when
+called as a script, in which case the parsed and quoted arguments are
+always printed to standard output. An empty string forces the
+printing behaviour in either mode.)
+sitem(tt(-a))(Use Sun-style single-hyphenated long options instead of
+GNU-style double-hyphenated ones (tt(-foo) vs tt(--foo)). Note that
+long options with optional optargs can't always be distinguished
+accurately from short options with optional optargs when using this
+option. Also, due to limitations of tt(zparseopts), a Sun-style long
+option whose name is only one character long is always treated as a
+short option.)
+sitem(tt(-l var(spec)))(Specify long options to recognise when
+parsing. These should be given using just the option name (no
+dashes), suffixed by `tt(:)' or `tt(::)' if it takes a mandatory or
+optional argument respectively. Multiple options can be defined
+either by separating them with commas or by supplying tt(-l) again.
+Example: tt(-l foo,bar: -l baz))
+sitem(tt(-n var(name)))(Specify the name to use in the error message
+if argument parsing fails. Defaults to the name of the nearest
+calling function or the base name of tt($ZSH_ARGZERO). Note that
+errors related to the usage of tt(zgetopt) itself are always reported
+as coming from tt(zgetopt).)
+sitem(tt(-o var(spec)))(Specify short options to recognise when
+parsing. These should be given as a single string, in the same format
+used by the tt(getopts) built-in or the tt(getopt+LPAR()3+RPAR())
+library function, again using `tt(:)' or `tt(::)' to indicate a
+mandatory or optional argument. The spec may be prefixed with `tt(+)'
+to indicate that option parsing should stop at the first non-option
+argument (equivalent to setting the environment variable
+tt(POSIXLY_CORRECT)). Example: tt(-o ab:cd::))
+endsitem()
+
+At least one of tt(-o) or tt(-l) must be given. The function's own
+options should be followed by zero or more arguments to parse. It is
+critical that these be separated explicitly by `tt(--)', as in the
+above examples, to ensure that the function can accurately
+distinguish the arguments it's meant to parse from its own.
+
+Refer to the manual for util-linux's tt(getopt+LPAR()1+RPAR()) for
+more information about the way arguments are parsed and results are
+returned. Note however that this function is not intended to be a
+complete re-implementation. In particular, it omits all
+portability/compatibility features.
+)
 item(tt(zkbd))(
 See `Keyboard Definition'
 ifzman(above)\

diff --git a/Functions/Misc/zgetopt b/Functions/Misc/zgetopt
new file mode 100755
index 000000000..5fc1e7725
--- /dev/null
+++ b/Functions/Misc/zgetopt
@@ -0,0 +1,198 @@
+#!/bin/zsh -f
+
+# Wrapper around zparseopts which gives it an interface similar to util-linux's
+# getopt(1). See zshcontrib(1) for documentation
+
+emulate -L zsh -o extended_glob
+zmodload -i zsh/zutil || return 3
+
+# Very stupid and brittle internal wrapper around zparseopts used to insert the
+# caller name into its error messages, allowing us to implement --name. This
+# MUST be called with -v, since argv has the options to zparseopts itself
+__zgetopt_zparseopts() {
+  local __err __ret
+
+  __err=$( zparseopts "$@" 2>&1 )
+  __ret=$?
+
+  zparseopts "$@" &> /dev/null && return
+
+  # Raw error message should look like this:
+  # __zgetopt_zparseopts:zparseopts:3: bad option: -x
+  [[ -n $__err ]] && print -ru2 - ${__err/#*:zparseopts:<->:/$name:}
+  return $__ret
+}
+
+local optspec pat i posix=0
+local -a match mbegin mend optvv argvv
+local -a array alt lopts sopts name
+local -a specs no_arg_opts req_arg_opts opt_arg_opts tmp
+
+# Same as leading + in short-opts spec
+(( $+POSIXLY_CORRECT )) && posix=1
+
+# This 0=... makes any error message we get here look a little nicer when we're
+# called as a script. Unfortunately the function name overrides $0 in
+# zwarnnam() in other scenarios, so this can't be used to implement --name
+0=${0:t} zparseopts -D -F -G - \
+  {A,-array}:-=array \
+  {a,-alternative}=alt \
+  {l,-longoptions,-long-options}+:-=lopts \
+  {n,-name}:-=name \
+  {o,-options}:-=sopts \
+|| {
+  print -ru2 "usage: ${0:t} [-A <array>] [-a] [-l <spec>] [-n <name>] [-o <spec>] -- <args>"
+  return 2
+}
+
+# Default to the caller's name
+(( $#name )) && name=( "${(@)name/#(-n|--name=)/}" )
+[[ -n $name ]] || name=( ${funcstack[2]:-${ZSH_ARGZERO:t}} )
+
+(( $#array )) && array=( "${(@)array/#(-A|--array=)/}" )
+
+if [[ $ZSH_EVAL_CONTEXT != toplevel ]]; then
+  [[ $array == *[^A-Za-z0-9_.]* ]] && {
+    print -ru2 - "${0:t}: invalid array name: $array"
+    return 2
+  }
+  (( $#array )) || array=( argv )
+
+elif [[ -n $array ]]; then
+  print -ru2 - "${0:t}: -A option not meaningful unless called as function"
+  return 2
+fi
+
+# getopt requires a short option spec; we'll require either short or long
+(( $#sopts || $#lopts )) || {
+  print -ru2 - "${0:t}: missing option spec"
+  return 2
+}
+
+optspec=${(@)sopts/#(-o|--options=)/}
+sopts=( )
+
+for (( i = 1; i <= $#optspec; i++ )); do
+  # Leading '+': Act POSIXLY_CORRECT
+  if [[ $i == 1 && $optspec[i] == + ]]; then
+    posix=1
+  # Leading '-': Should leave operands interspersed with options, but this is
+  # not really possible with zparseopts
+  elif [[ $i == 1 && $optspec[i] == - ]]; then
+    print -ru2 - "${0:t}: optspec with leading - (disable operand collection) not supported"
+    return 2
+  # Special characters: [+=\\] because they're special to zparseopts, ':'
+  # because it's special to getopt, '-' because it's the parsing terminator
+  elif [[ $optspec[i] == [+:=\\-] ]]; then
+    print -ru2 - "${0:t}: invalid short-option name: $optspec[i]"
+    return 2
+  # 'a'
+  elif [[ $optspec[i+1] != : ]]; then
+    sopts+=( $optspec[i] )
+  # 'a:'
+  elif [[ $optspec[i+2] != : ]]; then
+    sopts+=( $optspec[i]: )
+    (( i += 1 ))
+  # 'a::'
+  elif [[ $optspec[i+3] != : ]]; then
+    sopts+=( $optspec[i]:: )
+    (( i += 2 ))
+  fi
+done
+
+lopts=( ${(@)lopts/#(-l|--long(|-)options=)/} )
+lopts=( ${(@s<,>)lopts} )
+
+# Don't allow characters that are special to zparseopts in long-option specs.
+# See above
+pat='(*[+=\\]*|:*|*:::##|*:[^:]*)'
+[[ -n ${(@M)lopts:#$~pat} ]] && {
+  print -ru2 - "${0:t}: invalid long-option spec: ${${(@M)lopts:#$~pat}[1]}"
+  return 2
+}
+
+(( $#alt )) || lopts=( ${(@)lopts/#/-} )
+
+specs=( $sopts $lopts )
+
+# Used below to identify options with optional optargs
+no_arg_opts=( ${(@)${(@M)specs:#*[^:]}/#/-} )
+req_arg_opts=( ${(@)${(@)${(@M)specs:#*[^:]:}/#/-}/%:#} )
+opt_arg_opts=( ${(@)${(@)${(@M)specs:#*::}/#/-}/%:#} )
+
+# getopt returns all instances of each option given, so add +
+specs=( ${(@)specs/%(#b)(:#)/+$match[1]} )
+
+# POSIXLY_CORRECT: Stop parsing options after first non-option argument
+if (( posix )); then
+  tmp=( "$@" )
+  __zgetopt_zparseopts -D -F -G -a optvv -v tmp - $specs || return 1
+  argvv=( "${(@)tmp}" )
+
+# Default: Permute options following non-option arguments
+else
+  tmp=( "$@" )
+  __zgetopt_zparseopts -D -E -F -G -a optvv -v tmp - $specs || return 1
+  argv=( "${(@)tmp}" )
+  # -D + -E leaves an explicit -- in argv wherever it might appear
+  local seen
+  while (( $# )); do
+    [[ -z $seen && $1 == -- ]] && seen=1 && shift && continue
+    argvv+=( "$1" )
+    shift
+  done
+fi
+
+# getopt outputs all optargs as separate parameters, even missing optional ones,
+# so we scan through and add/separate those if needed. This can't be perfectly
+# accurate if Sun-style (-a) long options are used with optional optargs -- e.g.
+# if you have specs a:: and abc::, then argument -abc=d is ambiguous. We don't
+# guarantee which one is prioritised
+(( $#opt_arg_opts )) && {
+  local cur next
+  local -a old_optvv=( "${(@)optvv}" )
+  optvv=( )
+
+  for (( i = 1; i <= $#old_optvv; i++ )); do
+    cur=$old_optvv[i]
+    next=$old_optvv[i+1]
+    # Option with no optarg
+    if [[ -n ${no_arg_opts[(r)$cur]} ]]; then
+      optvv+=( $cur )
+    # Option with required optarg -- will appear in next element
+    elif [[ -n ${req_arg_opts[(r)$cur]} ]]; then
+      optvv+=( $cur "$next" )
+      (( i++ ))
+    # Long option with optional optarg -- will appear in same element delimited
+    # by '=' (even if missing)
+    elif [[ $cur == *=* && -n ${opt_arg_opts[(r)${cur%%=*}]} ]]; then
+      optvv+=( ${cur%%=*} "${cur#*=}" )
+    # Short option with optional optarg -- will appear in same element with no
+    # delimiter (thus the option appears alone if the optarg is missing)
+    elif [[ -n ${opt_arg_opts[(r)${(M)cur#-?}]} ]]; then
+      optvv+=( ${(M)cur#-?} "${cur#-?}" )
+    # ???
+    else
+      print -ru2 - "${0:t}: parse error, please report!"
+      print -ru2 - "${0:t}: specs: ${(j< >)${(@q+)specs}}"
+      print -ru2 - "${0:t}: old_optvv: ${(j< >)${(@q+)old_optvv}}"
+      print -ru2 - "${0:t}: cur: $cur"
+      optvv+=( $cur ) # I guess?
+    fi
+  done
+}
+
+if [[ -n $array ]]; then
+  # Use EXIT trap to assign in caller's context
+  trap "$array=( ${(j< >)${(@q+)optvv}} -- ${(j< >)${(@q+)argvv}} )" EXIT
+
+elif [[ $ZSH_EVAL_CONTEXT != toplevel ]]; then
+  print -r - "${(@q+)optvv}" -- "${(@q+)argvv}"
+
+# If called as a script, use unconditional single-quoting. This is ugly but it's
+# the closest to what getopt does and it offers compatibility with legacy shells
+else
+  print -r - "${(@qq)optvv}" -- "${(@qq)argvv}"
+fi
+
+return 0

diff --git a/Test/Z04zgetopt.ztst b/Test/Z04zgetopt.ztst
new file mode 100644
index 000000000..c2bc22be0
--- /dev/null
+++ b/Test/Z04zgetopt.ztst
@@ -0,0 +1,206 @@
+%prep
+
+  autoload -Uz zgetopt
+
+%test
+
+  zgetopt -A '' -- a b c
+  zgetopt -A '' -o '' -- a b c
+  zgetopt -A '' -l '' -- a b c
+0:-o or -l required
+?zgetopt: missing option spec
+>-- a b c
+>-- a b c
+
+  zgetopt -A '' -o - -- a b c
+  zgetopt -A '' -o -a -- a b c
+  zgetopt -A '' -o a- -- a b c
+  zgetopt -A '' -o a+ -- a b c
+  zgetopt -A '' -o a= -- a b c
+  zgetopt -A '' -o a\\ -- a b c
+  zgetopt -A '' -o :a -- a b c
+  zgetopt -A '' -o a::: -- a b c
+  zgetopt -A '' -o '' -- a b c
+  zgetopt -A '' -o + -- a b c
+0:weird short-option specs
+?zgetopt: optspec with leading - (disable operand collection) not supported
+?zgetopt: optspec with leading - (disable operand collection) not supported
+?zgetopt: invalid short-option name: -
+?zgetopt: invalid short-option name: +
+?zgetopt: invalid short-option name: =
+?zgetopt: invalid short-option name: \
+?zgetopt: invalid short-option name: :
+?zgetopt: invalid short-option name: :
+>-- a b c
+>-- a b c
+
+  zgetopt -A '' -l a,+ -- a b c
+  zgetopt -A '' -l a,= -- a b c
+  zgetopt -A '' -l a,\\ -- a b c
+  zgetopt -A '' -l a,: -- a b c
+  zgetopt -A '' -l a,:b -- a b c
+  zgetopt -A '' -l a,b:b -- a b c
+  zgetopt -A '' -l a,b::: -- a b c
+  zgetopt -A '' -l '' -- a b c
+  zgetopt -A '' -l , -- a b c
+  zgetopt -A '' -l a,,,,,b -- a b c
+  zgetopt -A '' -l - -- a b c ---
+0:weird long-option specs
+?zgetopt: invalid long-option spec: +
+?zgetopt: invalid long-option spec: =
+?zgetopt: invalid long-option spec: \
+?zgetopt: invalid long-option spec: :
+?zgetopt: invalid long-option spec: :b
+?zgetopt: invalid long-option spec: b:b
+?zgetopt: invalid long-option spec: b:::
+>-- a b c
+>-- a b c
+>-- a b c
+>--- -- a b c
+
+  zgetopt -A '' -o ab:c:: -- a b c
+  zgetopt -A '' -o ab:c:: -- -a
+  zgetopt -A '' -o ab:c:: -- -a a b c
+  zgetopt -A '' -o ab:c:: -- -a a -b c
+  zgetopt -A '' -o ab:c:: -- -a a -b -c
+  zgetopt -A '' -o ab:c:: -- -a a -b -c d
+  zgetopt -A '' -o ab:c:: -- -a a -b -c -c
+  zgetopt -A '' -o ab:c:: -- -a a -b -c -c d
+  zgetopt -A '' -o ab:c:: -- -a a -b -c -cd
+0:short options
+>-- a b c
+>-a --
+>-a -- a b c
+>-a -b c -- a
+>-a -b -c -- a
+>-a -b -c -- a d
+>-a -b -c -c '' -- a
+>-a -b -c -c '' -- a d
+>-a -b -c -c d -- a
+
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- a b c
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a b c
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb c
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb=c
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb --ccc
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb --ccc d
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb --ccc --ccc
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb --ccc --ccc d
+  zgetopt -A '' -l aaa,bbb:,ccc:: -- --aaa a --bbb --ccc --ccc=d
+0:long options
+>-- a b c
+>--aaa --
+>--aaa -- a b c
+>--aaa --bbb c -- a
+>--aaa --bbb c -- a
+>--aaa --bbb --ccc -- a
+>--aaa --bbb --ccc -- a d
+>--aaa --bbb --ccc --ccc '' -- a
+>--aaa --bbb --ccc --ccc '' -- a d
+>--aaa --bbb --ccc --ccc d -- a
+
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- a b c
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- --aaa a b c
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a b c
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb c
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb=c
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb -ccc
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb -ccc d
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb -ccc -ccc
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb -ccc -ccc d
+  zgetopt -A '' -al aaa,bbb:,ccc:: -- -aaa a -bbb -ccc -ccc=d
+0:long options with -a (Sun style)
+>-- a b c
+?(eval): bad option: --aaa
+>-aaa --
+>-aaa -- a b c
+>-aaa -bbb c -- a
+>-aaa -bbb c -- a
+>-aaa -bbb -ccc -- a
+>-aaa -bbb -ccc -- a d
+>-aaa -bbb -ccc -ccc '' -- a
+>-aaa -bbb -ccc -ccc '' -- a d
+>-aaa -bbb -ccc -ccc d -- a
+
+  zgetopt -A '' -al a: -- -a=b
+0:single-character long option with -a
+>-a '=b' --
+
+  zgetopt -A '' -o ''
+0:zero args to parse
+>--
+
+  zgetopt -A '' -o '' -- -- a b c
+  zgetopt -A '' -o '' -- a b -- c
+  zgetopt -A '' -o '' -- a b c --
+  zgetopt -A '' -o c -- a b -- -c
+  zgetopt -A '' -o c -- a b - -c
+0:parsing terminator
+>-- a b c
+>-- a b c
+>-- a b c
+>-- a b -c
+>-c -- a b -
+
+  zgetopt -A '' -o a -- a -a b
+  zgetopt -A '' -o +a -- a -a b
+  POSIXLY_CORRECT=1 zgetopt -A '' -o a -- a -a b
+0:POSIXLY_CORRECT
+>-a -- a b
+>-- a -a b
+>-- a -a b
+
+  zgetopt -A '' -o '' -- $'\a\'\a'
+0:function-mode quoting style
+>-- $'\C-G\'\C-G'
+
+  zgetopt -A '' -o '' -- a -a b
+  zgetopt -A '' -o '' -- a --a b
+1:bad options
+?(eval): bad option: -a
+?(eval): bad option: --a
+
+  zgetopt -A ''            ; echo $? # missing spec
+  zgetopt -A '' -o '' -x   ; echo $? # bad option to zgetopt
+  zgetopt -A '' -o '' -- -y; echo $? # bad option to parse
+0:return status
+*?zgetopt: missing option spec
+*>2
+*?zgetopt:zparseopts:*: bad option: -x
+*?usage:*
+*>2
+*?\(eval\): bad option: -y
+*>1
+
+  () { zgetopt -o a -- "$@"; typeset -p argv } -a b c
+  () { local -a v; zgetopt -A v -o a -- "$@"; typeset -p argv v } -a b c
+0:array output
+>typeset -g -a argv=( -a -- b c )
+>typeset -g -a argv=( -a b c )
+>typeset -a v=( -a -- b c )
+
+  zgetopt -A '' -o a: -- -x
+  zgetopt -A '' -o a: -- -a
+  ()     { zgetopt -A '' -o a: -- "$@"; : } -x
+  func() { zgetopt -A '' -o a: -- "$@"; : }; func -x
+  f1()   { zgetopt -A '' -o a: -- "$@"; : }; f2() { f1 "$@" }; f2 -x
+0:automatic name
+?(eval): bad option: -x
+?(eval): missing argument for option: -a
+?(anon): bad option: -x
+?func: bad option: -x
+?f1: bad option: -x
+
+  zgetopt -n aaa -A '' -o a: -- -x
+  zgetopt -n aaa -A '' -o a: -- -a
+  ()     { zgetopt -n bbb -A '' -o a: -- "$@"; : } -x
+  func() { zgetopt -n ccc -A '' -o a: -- "$@"; : }; func -x
+  f1()   { zgetopt -n ddd -A '' -o a: -- "$@"; : }; f2() { f1 "$@" }; f2 -x
+0:manual name with -n
+?aaa: bad option: -x
+?aaa: missing argument for option: -a
+?bbb: bad option: -x
+?ccc: bad option: -x
+?ddd: bad option: -x



