There are some subtle points that the blog is not clear about:
- The use of `1>&2` is to redirect the LHS process's stdout to stderr, so `echo green` never actually writes to the pipe and therefore never gets SIGPIPE.
- `echo` only echoes its arguments and always ignores stdin, so putting `echo blue` on the RHS of the pipe only serves to run the two sides of the pipe in parallel.
- `bash -c '(echo green 1>&2) | echo blue' 1>stdout 2>stderr` will show you that green and blue actually write to different files
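For instance, you can verify the last point by inspecting the two files afterwards (stdout and stderr here are ordinary file names, not the streams):

bash -c '(echo green 1>&2) | echo blue' 1>stdout 2>stderr
cat stdout    # blue
cat stderr    # green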
Another observation:
> if it was a separate command (or even if the shell forked to execute it internally), only the 'echo red' process would die from the SIGPIPE instead of the entire left side of the pipeline.
Most Linux distributions ship /bin/echo as a separate program. Running `(/bin/echo red; echo green 1>&2) | echo blue` will always print both green and blue, because even if /bin/echo dies from SIGPIPE, the rest of the subshell keeps running.
EDIT: fix typo of stdin/stdout/stderr as suggested
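If you want to check what your shell will actually run, `type` shows whether echo resolves to a builtin or an external binary (exact output varies by shell and distribution):

type -a echo    # e.g. "echo is a shell builtin" followed by "echo is /bin/echo"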
Thanks, this is helpful and why I came to the comments. Another question I had was what happens to red? You helped me answer that -- unlike echo green, the output of echo red is still connected to the pipe. The right-hand side of that pipe does nothing with it, so it disappears.
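A quick way to see red again is to put an actual reader on the right-hand side (the ordering of the two lines may vary, since green bypasses the pipe via stderr):

(echo red; echo green 1>&2) | cat    # red arrives via the pipe, green via stderr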
Also regarding one of your examples: maybe I'm misunderstanding, and anyway this wouldn't change your overall point, but should it maybe read like this instead:
bash -c '(echo green 1>&2) | echo blue' 1>stdout 2>stderr
since stdin is normally associated with file descriptor 0
I’ve been doing Linux stuff for ten years and I am apparently just learning that pipeline commands run in parallel, not serially. If I had put any thought into it I would have realized it, because otherwise tailing logs into grep wouldn’t work...
Pipes are commonly used when the right side is consuming the output of the left side. That introduces serialization at least insofar as the right side is forced to wait for input.
You could replace the `|` with a `;` (or an `&`) and get the same output, because nothing in the 2nd stage depends on the first stage of the pipe.
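For example, in a script (an interactive shell would also print job-control noise for the `&` version):

(echo green 1>&2) ; echo blue    # serial, no pipe
(echo green 1>&2) & echo blue    # parallel, no pipe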
In my old man grumbling at clouds voice: This is not arcane.
But physical pipes run in parallel too! If you have a pipe full of water, and you push some more water in one end, there is an instant effect at the other end. If it's empty, then sequentiality is introduced because it takes time for the water to flow through the pipe. That's how a Unix pipe works – (the processes at) both ends of the pipe come into existence at the same time, but if the right side is waiting for input (the pipe buffer is empty), it won't do anything until the left side has produced output.
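You can watch both effects from the shell:

time (sleep 2 | sleep 2)       # ~2 seconds, not 4: both sides start together
(sleep 2; echo hello) | cat    # cat starts right away but blocks until hello arrives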
The shell ties the prior program's (or user input to the first) output to the input of the next program. That's (old) stdout and (new) stdin taken care of.
stderr is sent to the terminal as output by default.
For reference, stdin, stdout, and stderr are normally numbered 0 through 2 respectively. When you're directing input and output on the shell, (usually) the default is to wire things up as they'd make sense, but (without any IFS characters in between) a number on either side of the redirection operator tells the shell to grab a different input or output.
These will yield different results due to left-to-right parsing:
echo test >/tmp/test 2>&1
echo test 2>&1 >/tmp/test
In the first, any error output would be sent to the file along with stdout; in the second it would still appear as standard output on the shell, because stderr was duplicated from stdout before stdout was redirected to the file (assuming you can make temporary files, which normally you can, neither line actually errors here).
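To actually see the difference you need a command that does write to stderr; `ls` on a missing path works (the paths here are just examples):

ls /no/such/file >/tmp/out 2>&1    # error message lands in /tmp/out
ls /no/such/file 2>&1 >/tmp/out    # error message appears on the terminal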
Here's a more interesting example:
echo test >/dev/null/test 2>/dev/null
echo test 2>/dev/null >/dev/null/test
The second line will silently fail (though it'll still return an exit value of 1) because the standard error output was already sent to /dev/null, while the first attempts to open a file that can't exist and prints the failure message before the error output is redirected.
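You can confirm the exit status even though nothing is printed (this is bash behavior):

echo test 2>/dev/null >/dev/null/test ; echo $?    # prints 1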
Here's a cheat sheet which should cover the most common situations.
Send stdout to a file.
ls myFileWhichExists > myStdLog
- or -
ls myFileWhichExists 1> myStdLog
Send stderr to a file.
ls myFileWhichDoesNotExist 2> myErrLog
Send stdout to one file and stderr to a different file.
ls myFileWhichExists myFileWhichDoesNotExist 1> myStdLog 2> myErrLog
Send stdout and stderr to the same file.
ls myFileWhichExists myFileWhichDoesNotExist 1> myBothLog 2>&1
I read that last part "2>&1" as "Send stderr (2) to the same place as stdout (1) is already going to".
Notice that if you send stdout and stderr to the same file, because of buffering, the output from stdout and stderr can interleave in unpredictable ways.
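With shell builtins each write goes straight to the file, so you may not notice this; it shows up with stdio-buffered programs, where stdout becomes block-buffered when pointed at a file while stderr stays unbuffered. For example (using python3 merely as a convenient stdio program):

python3 -c 'import sys; print("out"); print("err", file=sys.stderr)' > both.log 2>&1
cat both.log    # often shows err before out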
This "parallel execution" is one of the interesting things that distinguishes Unix pipelines from DOS pipelines; in DOS a temporary file is used, which was an old source of puzzlement for beginners doing a "dir | more" --- "what's that extra file I see?"
In retrospect, getting non-preemptive pipes (a type of coroutining) working in DOS would not have been all that difficult, if it weren't for the limited memory available to PCs of the time and the fact that most programs assumed they owned it all when they ran.
It appears that sleep (at least on a typical modern Linux desktop) behaves similarly to echo in that it does nothing with its stdin, rather than echoing it to the terminal.
It's curious because someone might assume the default behavior would be to forward all file descriptors unless something was done to the data streams. Clearly that isn't the case: the shell ties the prior program's standard output to the next program's standard input irrespective of whether anything is ever read from it.
I don't know what you mean by "forward" but I have the impression you don't understand how file descriptors (i.e. "open files or streams") work.
The shell sets up a pipe and connects the left side to the pipe's writing end, and the right side to the reading end. Now, the right side is actually a subshell (indicated by the parentheses "(...)"). And that subshell can spawn as many other processes, sequentially or in parallel, as it wants. All of them will inherit the open file descriptor (the pipe's reading end) from the subshell.
If you had multiple processes reading from the pipe in parallel, the outcome would be totally nondeterministic (dependent on the kernel's scheduling behaviour). In the example case, none of the potential readers actually read (not the subshell itself, not the sleep, and not the echo).
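You can see the inheritance directly (the sleep just makes the timing obvious):

echo hello | (sleep 1; cat)    # hello appears after ~1s: cat inherited the pipe's read end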
Arcane would be mixing the output of multiple commands into a single text stream without any readily available means to determine their origin, then writing code based upon that output that relied upon a specific ordering of said output without a preliminary explicit sort, i.e. code that this indeterminism would cause to fail. In any event, diff would sort it out. ;)
If you include > redirection, it's parsed and processed before execution of the pipe. Even if the subsequent pipeline commands fail, you have probably already smashed /thing/you/didnt/mean/to/smash if you > into it.
While testing around I found out that enclosing the commands in parentheses reverses which of 'blue green' and 'green blue' is the more common output.
Can anyone explain why this happens?
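No explanation to offer, but you can at least measure the effect; something like this counts the orderings over many runs (exact counts will differ per machine):

for i in $(seq 200); do bash -c '(echo green 1>&2) | echo blue' 2>&1 | paste -sd' '; done | sort | uniq -c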
I recently found out that you can’t easily spawn a shell and then send commands to it. It’s doable with tmux commands, but you’d think it would be easier. I just wanted to write something that locates npm/virtualenv stuff in bash, nothing fancy.
A distinction you may be having trouble with, because it's kind of hidden from the user, is the difference between a shell and a "pty" (pseudo teletype). You certainly can spawn a shell and send it commands, but because it doesn't have a pty the input is treated very differently.
That's what you get setup for you by running tmux, screen, expect, xterm, ssh etc.
"expect" was mentioned, but what are you actually trying to do that cannot be solved with a shell script, either executed the normal way or sourced in at the start of a new shell?
It even seems to work with named pipes, although in my first test it exited after the first command (I suspect I'm accidentally sending an EOF when I echo the command in).
mkfifo testpipe1
<testpipe1 bash # in separate window
echo ls > testpipe1
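One way to avoid that EOF is to hold a write end of the FIFO open from the sending side (fd 3 here is arbitrary):

mkfifo testpipe1
<testpipe1 bash &     # or in a separate window, as above
exec 3>testpipe1      # keep a writer open so bash never sees EOF
echo ls >&3
echo 'echo hi' >&3
exec 3>&-             # closing it delivers the EOF and the shell exits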
Would you also expect "echo something < file.txt" to show the contents of file.txt?
Perhaps you are thinking of cat or some other command, because piping things to echo is such a strange and unexpected thing to do that you normally won't encounter it.
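Compare (file.txt being the hypothetical file from the question):

echo something < file.txt    # prints "something"; file.txt is opened but never read
cat < file.txt               # actually reads and prints the file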