Bash: Passing variables by reference
Problem
How can I pass variables by reference to a bash function?
I see this technique of 'passing by reference' proliferating:
f() { local b; g b; echo $b; }
g() { eval $1=bar; }  #  WRONG, although it
f                     #+ looks ok: b=bar
But this deceptively simple solution has some serious drawbacks.
For one, the eval introduces a security threat:
g() { eval $1=bar; }  # WRONG
g 'ls /; true'        # Oops, executes command
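To make the threat concrete without running anything destructive, the following sketch passes a crafted "variable name" that smuggles in an extra assignment. The marker variable `injected' and the payload string are made up for this illustration; an attacker could put any command in that position.

```bash
# The naive by-reference assignment criticised above
g() { eval $1=bar; }

# Hypothetical marker, only here to make the injection visible
injected=no

# Instead of a plain variable name, the argument carries a second command.
# eval sees:  x=1; injected=yes; y=bar
g 'x=1; injected=yes; y'

echo "injected=$injected"  # injected=yes
```

Any function that feeds an unchecked argument to eval in this way runs whatever its caller places in that argument.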
Second, the eval doesn't work if `g()' itself declares a local variable with the same name as the return variable:
g() { local b; eval $1=bar; }  # WRONG
g b                            # Conflicts with `local b'
echo $b                        # b is unexpectedly empty
The conflict remains even if the local `b' is unset:
g() { local b; unset b; eval $1=bar; }  # WRONG
g b                                     # Still conflicts with `local b'
echo $b                                 # b is unexpectedly empty
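The two cases above can be observed side by side with a small sketch. The function names `g_local' and `g_unset' are chosen for this illustration only; each prints the value it sees internally, so it becomes visible that the assignment ends up in the function's own local variable.

```bash
# Variant 1: the return variable name collides with a local of the function
g_local() { local b; eval $1=bar; echo "inside g_local: b=$b"; }

# Variant 2: the local is unset first, but it stays local within this scope
g_unset() { local b; unset b; eval $1=bar; echo "inside g_unset: b=$b"; }

b=
g_local b                     # inside g_local: b=bar
echo "after g_local: b='$b'"  # after g_local: b=''
g_unset b                     # inside g_unset: b=bar
echo "after g_unset: b='$b'"  # after g_unset: b=''
```

In both variants the value "bar" is lost as soon as the function returns.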
Why bother?
Why bother passing variables by reference, if one can return values in bash by using a subshell:
a=$(func)
Well, a subshell requires an expensive fork operation, which costs time, resources, and energy.  Consider this time comparison, using either a subshell or eval when doing 1,000 single value assignments:
                                           time
           ---------------------------------------------------------------------
                         real [%]                  real [s]   user [s]   sys [s]
           -------------------------------------   --------   --------   -------
           0                                 100
           -------------------------------------
subshell   #####################################      4.046      0.968     3.076
eval       ###                                        0.367      0.360     0.008
           -------------------------------------   --------   --------   -------
Table 1: Time consumed doing 1,000 assignments, using either a subshell or eval. See #Appendix B: time_eval.sh for the code used.
As Table 1 shows, eval is about eleven times faster than a subshell here, so trying to find a safe variant of the eval method is worth the effort.
Also, a subshell returns a single value only; returning multiple values through it is possible but very cumbersome. Passing by reference would make returning multiple values much easier.
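As a rough illustration of how cumbersome multiple return values get with a subshell, the sketch below lets a function print two values on separate lines and has the caller split them again. The helper `get_pair' and the line-based format are assumptions made for this example; values containing newlines would already break it.

```bash
# Hypothetical helper "returning" two values, one per output line
get_pair() {
    echo "first value"
    echo "second value"
}

# The caller has to know the output format and take it apart again
{ IFS= read -r x; IFS= read -r y; } <<< "$(get_pair)"

echo "x=$x"  # x=first value
echo "y=$y"  # y=second value
```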
Solutions
Both solutions, upvar and upvars, have been tested successfully on bash versions 2.05b, 3.0.0, 3.2.39, 4.0.33 and 4.1.7.
See #Test suite for the code used.
Solution 1: Upvar: Assign single variable by reference
Use upvar in a function, returning a single value, like this:
local "$1" && upvar $1 "value(s)"
Example:
f() { local b; g b; echo $b; }           # (1)
g() { local "$1" && upvar $1 bar; }      # (2)
f  # Ok: b=bar
Explicitly make the variable local in the function wishing to return a value (2).  This function then calls upvar (2), which unsets that local variable and assigns the value, so that the assignment lands in the variable of the caller (1), one level down the call-stack.
Returning an array by reference goes like this:
f() { local b; g b; declare -p b; }                   # (1)
# @param $1:  Name of variable to return value into
g() {                                                 # (2)
    # Declare array containing three elements:
    # - foo
    # - bar \" cee  # Including double quote (")
    # - dus \n enk  # Including newline (\n) followed by two spaces
    #        foo, bar \" cee,  dus\n  enk
    local a=(foo "bar \"cee" $'dus\n  enk')
    # Return array
    local "$1" && upvar $1 "${a[@]}"
}
f  # Ok: declare -a b=([0]="foo" [1]="bar \"cee" [2]=$'dus\n  enk')
The `upvar' code makes use of some surprising behaviour of unset which is capable of traversing down the call-stack and unsetting variables repeatedly.  For more information, see Bash: Unset.
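The unset behaviour can be seen in isolation with a minimal sketch. The function names `outer' and `inner' are invented for this illustration, and the shell is assumed to run with default options; newer bash versions offer a `localvar_unset' shell option that changes this behaviour.

```bash
x=global

outer() {
    local x=local-in-outer
    inner                   # removes the local x of outer
    echo "outer sees: $x"   # outer sees: global
}

# inner has no local x, so unset reaches into the scope of its caller
inner() { unset -v x; }

outer
echo "global is still: $x"  # global is still: global
```

After `inner' has removed the local variable of `outer', the next variable of that name further down the call-stack becomes visible again, and that is the one a subsequent assignment would reach.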
The name "upvar" is borrowed from Tcl's upvar command.
For the upvar code, see #Appendix A: upvars.sh.
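Why the explicit `local "$1"' matters can be checked with the sketch below. It copies upvar from Appendix A and compares the recommended call with a hypothetical variant, `g_bad', that forgets the local declaration; the `f_ok', `g_ok', `f_bad' and `g_bad' names are assumptions of this example, and default shell options are assumed.

```bash
# upvar as listed in Appendix A
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}

g_ok()  { local "$1" && upvar $1 bar; }  # As recommended
g_bad() { upvar $1 bar; }                # Hypothetical: local "$1" forgotten

f_ok()  { local b; g_ok b;  echo "f_ok sees: $b"; }
f_bad() { local b; g_bad b; echo "f_bad sees: $b"; }

unset -v b
f_ok                                       # f_ok sees: bar
echo "global b after f_ok: ${b-<unset>}"   # global b after f_ok: <unset>
f_bad                                      # f_bad sees: bar
echo "global b after f_bad: ${b-<unset>}"  # global b after f_bad: bar
```

Without the local declaration, the unset inside upvar removes the caller's local variable instead, and the value leaks into the global scope.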
Caveat: Subsequent 'upvar' calls may conflict
Consider this example:
f() { local b a; g b a; echo $b $a; }
g() {
    local a=A b=B
    # ...
    if local "$1" "$2"; then
        upvar $1 $a           # (1a)
        upvar $2 $b           # (1b)
    fi
}
f  # Error; got "A A", expected: "A B"
The problem is that the first call to upvar (1a) assigns the value "A" to `f.b'.  Unfortunately, upvar also unsets the local variable `g.b', so that in the second call to upvar (1b) the expansion of $b no longer yields `g.b' but `f.b', which is now "A".  As a result, `f.a' is assigned "A" as well.
The solution is to pass all variables and values in one call, using upvars; see solution 2.
Solution 2: Upvars: Assign multiple variables by reference
Use upvars like this:
local varname [varname ...] && 
    upvars [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
    -aN  Assign next N values to varname as array
    -v   Assign single value to varname
Example:
f() { local a b; g a b; declare -p a b; }                      # (1)
g() {
    local c=( foo bar )
    local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"  # (2)
}
f  # Ok: a=A, b=(foo bar)
Explicitly declare the variables local in the function wishing to return values (2). This function can then return them using upvars (2), effectively making the variables appear down the call-stack in the caller (1).
For the upvars code, see #Appendix A: upvars.sh.
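Applied to the caveat of solution 1, a single upvars call could look like the sketch below. To keep the example short and self-contained, it uses a condensed stand-in for upvars that performs the same assignments as the code in Appendix A but leaves out the error messages, --help and --version; it is not a replacement for the full version.

```bash
# Condensed stand-in for upvars from Appendix A (assumption of this
# example): same assignments, no error messages, --help or --version
upvars() {
    while (( $# )); do
        case $1 in
            -a*) unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) &&
                 shift $((${1#-a} + 2)) || return 1 ;;
            -v)  unset -v "$2" && eval $2=\"\$3\" &&
                 shift 3 || return 1 ;;
            *)   return 1 ;;
        esac
    done
}

f() { local b a; g b a; echo "$b $a"; }
g() {
    local a=A b=B
    # Both values are expanded here, before upvars unsets anything
    local "$1" "$2" && upvars -v $1 "$a" -v $2 "$b"
}
f  # A B
```

Because `$a' and `$b' are expanded while the argument list of the single call is built, unsetting `g.b' can no longer influence the value assigned to `f.a'.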
Download
- Single file: upvars.sh
Time comparison
                                           time
           ---------------------------------------------------------------------
                         real [%]                  real [s]   user [s]   sys [s]
           -------------------------------------   --------   --------   -------
           0                                 100
           -------------------------------------
subshell   #####################################      4.046      0.968     3.076
eval       ###                                        0.367      0.360     0.008
upvar      #######                                    0.759      0.748     0.008
upvars     ###########                                1.175      1.148     0.024
           -------------------------------------   --------   --------   -------
Table 2: Time consumed doing 1,000 single value assignments. See #Appendix C: time_upvar.sh for the code used.
Examples
Function returning array by reference
# Param $1  Name of variable to return array to
return_array() {
    local r=(e1 e2 "e3  e4" $'e5\ne6')
    local "$1" && upvars -a${#r[@]} $1 "${r[@]}"
}
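A caller might use return_array as sketched below. The `caller' function and the variable name `result' are made up for this illustration, and upvars is again replaced by a condensed stand-in without the error handling of Appendix A so that the example runs on its own.

```bash
# Condensed stand-in for upvars from Appendix A (assumption of this
# example): same assignments, no error messages, --help or --version
upvars() {
    while (( $# )); do
        case $1 in
            -a*) unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) &&
                 shift $((${1#-a} + 2)) || return 1 ;;
            -v)  unset -v "$2" && eval $2=\"\$3\" &&
                 shift 3 || return 1 ;;
            *)   return 1 ;;
        esac
    done
}

# Param $1  Name of variable to return array to
return_array() {
    local r=(e1 e2 "e3  e4" $'e5\ne6')
    local "$1" && upvars -a${#r[@]} $1 "${r[@]}"
}

# Hypothetical caller: receives the array in its own local variable
caller() {
    local result i
    return_array result
    echo "count: ${#result[@]}"
    for (( i = 0; i < ${#result[@]}; i++ )); do
        printf '%d: %q\n' "$i" "${result[i]}"  # %q shows spaces and newlines
    done
}
caller
```

The elements containing two spaces and a newline arrive in the caller as single, unmodified array elements.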
Function returning optional variables by reference
bash >= 3.1.0
# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1=(bar "cee  dee") a2=() upargs=() upvars=() v1=foo v2 var
    for var; do
        case $var in
            A1) upargs+=(-a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs+=(-a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs+=(-v $var "$v1") ;;
            V2) upargs+=(-v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars+=("$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}
bash >= 2.05b
# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1 a2 upargs upvars v1=foo v2 var
    a1=(bar "cee  dee") a2=() upargs=() upvars=()
    for var; do
        case $var in
            A1) upargs=("${upargs[@]}" -a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs=("${upargs[@]}" -a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs=("${upargs[@]}" -v $var "$v1") ;;
            V2) upargs=("${upargs[@]}" -v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars=("${upvars[@]}" "$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}
Test suite
The test suite uses the bash-completion test suite, which is written on top of the DejaGnu testing framework. DejaGnu is written in Expect, which in turn uses Tcl -- Tool command language.
Install
Git
git clone git@github.com:fvue/BashByRef.git          # BashByRef
cd BashByRef && git submodule update --init          # bash-completion
Dependencies
Debian/Ubuntu
On Debian/Ubuntu you can use `apt-get`:
sudo apt-get install dejagnu tcllib
This should also install the necessary `expect` and `tcl` packages.
Fedora/RHEL/CentOS
On Fedora and RHEL/CentOS (with EPEL) you can use `yum`:
sudo yum install dejagnu tcllib
This should also install the necessary `expect` and `tcl` packages.
Running the tests
The tests are run by calling runUnit:
cd test
./runUnit
Example output:
Test Run By me on Sun May 30 08:51:40 2010
Native configuration is i686-pc-linux-gnu
        === unit tests ===
Schedule of variations:
    unix
Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using ./config/default.exp as tool-and-target-specific interface file.
Running ./unit/upvar.exp ...
Running ./unit/upvars.exp ...
        === unit Summary ===
# of expected passes		22
# of expected failures		1
/tmp/BashByRef/test, bash-4.0.33(7)-release
See also
- Passing variables by reference conflicts with local
- Me questioning the problem on the bug-bash mailing list
Appendixes
Appendix A: upvars.sh
# Bash: Passing variables by reference
# Copyright (C) 2010 Freddy Vulto
# Version: upvars-0.9.dev
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# Example: 
#
#    f() { local b; g b; echo $b; }
#    g() { local "$1" && upvar $1 bar; }
#    f  # Ok: b=bar
#
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}
# Assign variables one scope above the caller
# Usage: local varname [varname ...] && 
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# Example:
#
#    f() { local a b; g a b; declare -p a b; }
#    g() {
#        local c=( foo bar )
#        local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"
#    }
#    f  # Ok: a=A, b=(foo bar)
#
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) && 
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}
Appendix B: time_eval.sh
#--- time_eval.sh -------------------------------------------
# Compare times doing 1,000 single value assignments:
# - subshell
# - eval
echo bash-$BASH_VERSION
echo
echo subshell:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f
echo
echo eval:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f
Example run:
$ . time_eval.sh
bash-3.2.39(1)-release

subshell:

real    0m5.089s
user    0m1.056s
sys     0m4.032s

eval:

real    0m0.374s
user    0m0.372s
sys     0m0.000s
Appendix C: time_upvar.sh
#--- time_upvar.sh ------------------------------------------
# Compare times doing 1,000 single value assignments:
# - subshell
# - eval
# - upvar
# - upvars
# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}
# Assign variables one scope above the caller
# Usage: local varname [varname ...] && 
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) && 
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}
echo bash-$BASH_VERSION
echo
echo subshell:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f
echo
echo eval:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f
echo
echo upvar:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvar $1 $b
}
time f
echo
echo upvars:
f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvars -v $1 $b
}
time f
Example run:
$ . time_upvar.sh
bash-3.2.39(1)-release

subshell:

real    0m4.233s
user    0m0.972s
sys     0m3.260s

eval:

real    0m0.372s
user    0m0.368s
sys     0m0.004s

upvar:

real    0m0.757s
user    0m0.740s
sys     0m0.016s

upvars:

real    0m1.175s
user    0m1.148s
sys     0m0.024s

