Yes, the criticized code is definitely bad: it glues two user-defined constructs together (an is-in-list test and a variable).
Also, it uses unquoted variables and relies on shell word-splitting...
But you don't necessarily need to shell out to awk / perl / grep / whatever.
is_in_list() {
    local thing=$1
    shift
    while [ $# -gt 0 ]; do
        [ "$1" = "$thing" ] && return 0
        shift
    done
    return 1
}
Now just do `is_in_list "$thing" "$@"` where the positional array `"$@"` is the list (or by all means use unquoted `$COMPILER_VERSIONS`, i.e. unsafe shell-splitting).
Another possibility, where the list is given as a space-separated string, is a case statement. That approach is logically equivalent to the awk version, but doesn't fork, so it is much faster.
is_in_list() {
    # Quote $1 in the patterns so glob characters in the needle match literally.
    case $2 in
        "$1"|*" $1 "*|*" $1"|"$1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
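A quick sketch of how that gets called, with the whole list passed as a single word (the version strings here are made up; the function is restated so the snippet runs standalone):

```shell
is_in_list() {
    case $2 in
        "$1"|*" $1 "*|*" $1"|"$1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical space-separated list; the values are illustrative only.
COMPILER_VERSIONS="4.8 9.3 12.1"

is_in_list 9.3 "$COMPILER_VERSIONS" && echo yes   # element in the middle
is_in_list 9 "$COMPILER_VERSIONS" || echo no      # "9" is only a substring, not an element
```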
Bash is all about shelling out: it's a shell. I would happily use grep here, and if I had an array:
printf "%s\0" "$@" | grep -qz needle
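One caveat: as written, `grep -qz` also matches `needle` as a substring of an element. Adding `-x` (whole NUL-terminated line must match) and `-F` (literal string, no regex) turns it into an exact membership test. A sketch, wrapped in a function (the name `in_array` is mine; `-z` requires GNU grep):

```shell
# Exact array-membership test via GNU grep: -z uses NUL as the line
# delimiter, -x requires a whole-line match, -F disables regex syntax.
in_array() {
    local needle=$1
    shift
    printf '%s\0' "$@" | grep -qzxF -- "$needle"
}

in_array b a b c && echo yes    # "b" is an element
in_array a ab abc || echo no    # "a" is only a substring of the elements
```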
> much more performant
I believe that performance-critical code shouldn't be written in bash. I have seen pipelines be much faster than tweaked bash functions, simply because they run in parallel & thus the cost of forking is paid once, up-front.
I agree with you. I think we are just speaking about different things.
I was thinking more about the things that happen in a configure script, for example. Not a "hot loop" like a big grep over millions of lines or something.
There are situations in shell code where the difference between a case-match or string-suffix-replace written in shell and a process spawn matters. Not only because forking a process for one simple string operation is wasteful (cost: about 1ms), but also because of the semantic problems that come with child processes (error handling).
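As an illustration of the fork-free route: a suffix-replace done with a parameter expansion versus the equivalent pipeline (the file name is made up):

```shell
file="report.txt"

# Builtin route: strip a known suffix without spawning any process.
base=${file%.txt}
echo "$base"    # report

# Fork route, for comparison: a subshell plus a sed process
# just to perform the same one string operation.
base=$(printf '%s\n' "$file" | sed 's/\.txt$//')
```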
I think the author's first sentence about "novice" shell programmers may also apply to the other submission about shell scripts currently on the front page.
For example, in C the idioms often combine several instructions into a single line, i.e., "nesting". Kernighan suggested nesting in an early C tutorial:
    while ( putchar( getchar( ) ) != '\0' )
In the shell maybe it is better to test each "expression" on a line by itself. Or maybe not. I do this anyway.
More often, I see shell scripters on the web prefer to nest as many commands as they can, perhaps to reduce the number of lines.
For example,
    variable=$(command1 $(command2));
This could be alternatively expressed as something like
    variable1=$(command2);
    # can now test variable1 before proceeding to next line
    variable2=$(command1 "$variable1");
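A runnable version of that idea, with an explicit check in between (here `printf` and `tr` stand in for command2 and command1, which are placeholders in the text):

```shell
# Step 1: run the inner command on its own.
variable1=$(printf '%s' "2024") || { echo "inner command failed" >&2; exit 1; }

# The intermediate result can now be validated before it is used.
case $variable1 in
    (''|*[!0-9]*) echo "unexpected output: $variable1" >&2; exit 1 ;;
esac

# Step 2: feed the checked value to the outer command.
variable2=$(printf '%s\n' "$variable1" | tr '0-9' 'a-j')
echo "$variable2"
```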
The result of nesting is subshells and complexity that I am not sure reluctant, occasional shell scripters are prepared to think about.
And if I am not mistaken that was at least part of the problem that Jane Street had in the other submission.