I have a shell script which calls a function that behaves differently based on the value of a global variable and whose output is a list of values that I want to store in an array.
I'm running into a problem because when I try to capture the output of the function using any variation of the obvious syntax:
mapfile -t the_array < <( the_function )
my global variable that's being set inside the_function reverts to its previous value once the_function returns. I understand that this is a known "feature" of capturing the output of a function that has side effects, and I can work around it as shown below, but I'd like to know why this is necessary and whether there is a better way.
To simplify the problem, consider this case where I want the function to print 5 numbers the first time it's called and not print anything the next time it's called (this is the obvious syntax which doesn't produce the expected output):
$ cat tst1
#!/usr/bin/env bash
the_function() {
    printf '\nENTER: %s(), the_variable=%d\n' "${FUNCNAME[0]}" "$the_variable" >&2
    if (( the_variable == 0 )); then
        seq 5
        the_variable=1
    fi
    printf 'EXIT: %s(), the_variable=%d\n' "${FUNCNAME[0]}" "$the_variable" >&2
}
the_variable=0
mapfile -t arr < <( the_function )
declare -p arr
mapfile -t arr < <( the_function )
declare -p arr
$ ./tst1
ENTER: the_function(), the_variable=0
EXIT: the_function(), the_variable=1
declare -a arr=([0]="1" [1]="2" [2]="3" [3]="4" [4]="5")
ENTER: the_function(), the_variable=0
EXIT: the_function(), the_variable=1
declare -a arr=([0]="1" [1]="2" [2]="3" [3]="4" [4]="5")
That doesn't work for the reasons stated above and I can work around it by writing the code as (this one does produce the expected output):
$ cat tst2
#!/usr/bin/env bash
the_function() {
    local arr_ref=$1
    printf '\nENTER: %s(), the_variable=%d\n' "${FUNCNAME[0]}" "$the_variable" >&2
    if (( the_variable == 0 )); then
        mapfile -t "$arr_ref" < <( seq 5 )
        the_variable=1
    else
        mapfile -t "$arr_ref" < /dev/null
    fi
    printf 'EXIT: %s(), the_variable=%d\n' "${FUNCNAME[0]}" "$the_variable" >&2
}
the_variable=0
the_function arr
declare -p arr
the_function arr
declare -p arr
$ ./tst2
ENTER: the_function(), the_variable=0
EXIT: the_function(), the_variable=1
declare -a arr=([0]="1" [1]="2" [2]="3" [3]="4" [4]="5")
ENTER: the_function(), the_variable=1
EXIT: the_function(), the_variable=1
declare -a arr=()
But while that works, it's obviously horrible code, since it requires the lower-level primitive to be more complicated than necessary and tightly coupled to the data structure used to store its output (so it isn't reusable if, for example, we just want the 5 numbers to go to stdout).
So - why do I need to do that and is there a better way?
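Process substitution runs the_function concurrently in a subshell, so any variable it sets is lost when that subshell exits. If you don't want parallelization (and thus a subshell whose scope gets lost), the alternative is buffering. Bash not doing that for you makes it explicit and visible that storage is being used, and where your data gets stored. So: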
tempfile=$(mktemp "${TMPDIR:-/tmp}/the_function_output.XXXXXX")
the_function >"$tempfile"
mapfile -t the_array < "$tempfile"
rm -f -- "$tempfile"
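As a sketch of how this plays out with the simplified the_function from the question (the diagnostic printf lines are dropped here for brevity, and the array name arr is taken from the question): because only stdout is redirected and the function itself runs in the current shell, the assignment to the_variable survives between calls, and the second call should leave arr empty.

```shell
#!/usr/bin/env bash
# Sketch: buffer the function's output in a temp file instead of a subshell.

the_function() {
    if (( the_variable == 0 )); then
        seq 5
        the_variable=1
    fi
}

the_variable=0
tempfile=$(mktemp "${TMPDIR:-/tmp}/the_function_output.XXXXXX") || exit

# First call: runs in the current shell, so the_variable=1 persists.
the_function >"$tempfile"
mapfile -t arr <"$tempfile"
declare -p arr the_variable

# Second call: the function now sees the_variable=1 and prints nothing,
# so mapfile reads an empty file and leaves arr empty.
the_function >"$tempfile"
mapfile -t arr <"$tempfile"
declare -p arr the_variable

rm -f -- "$tempfile"
```

The function stays a plain "print to stdout" primitive, so it remains reusable when the numbers should simply go to the terminal or a pipe.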
To automate this kind of pattern, I'd suggest something like:
call_and_store_output() {
    local varname tempfile retval
    varname=$1 || return; shift
    tempfile=$(mktemp "${TMPDIR:-/tmp}/cso.XXXXXX") || return
    "$@" >"$tempfile"    # runs in the current shell, so side effects persist
    retval=$?            # already declared local above
    printf -v "$varname" %s "$(<"$tempfile")"
    rm -f -- "$tempfile"
    return "$retval"
}
...thereafter:
call_and_store_output function_output_var the_function
mapfile -t the_array <<<"$function_output_var"
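One thing to watch with the here-string form: <<< always appends a newline, so when the function prints nothing, mapfile produces an array with one empty element rather than an empty array. If that matters (as it does for the second call in the question), a variant that reads the temp file straight into the array avoids it. The helper name call_and_store_lines and its prefixed local names are hypothetical, chosen only for this sketch:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the answer above): run a command in the
# current shell and store its output lines directly in the named array.
call_and_store_lines() {
    # Prefixed names reduce the chance of clashing with the caller's array name.
    local _csl_varname _csl_tempfile _csl_retval
    _csl_varname=$1; shift
    _csl_tempfile=$(mktemp "${TMPDIR:-/tmp}/csl.XXXXXX") || return
    "$@" >"$_csl_tempfile"
    _csl_retval=$?
    # mapfile assigns to the array named by its argument; an empty file
    # yields an empty array.
    mapfile -t "$_csl_varname" <"$_csl_tempfile"
    rm -f -- "$_csl_tempfile"
    return "$_csl_retval"
}

# Simplified function from the question, without the diagnostic output.
the_function() {
    if (( the_variable == 0 )); then
        seq 5
        the_variable=1
    fi
}

the_variable=0
call_and_store_lines arr the_function
declare -p arr
call_and_store_lines arr the_function
declare -p arr
```

The trade-off is that this helper is tied to arrays, whereas call_and_store_output stores the raw text and leaves the splitting to the caller.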