pipe
What is a pipe?
In computer programming, especially on Unix operating systems (OSes), a pipe is a technique for passing information from one program process or command to another. Unlike some other forms of interprocess communication (IPC), the general term for communication between processes, a pipe carries data in one direction only.
Usually, a program accepts input from a keyboard, mouse or some other input mechanism and sends its output to the display screen. At times, however, the output of one command or process needs to be used as the input for a second command or process without passing the data through an input or output device. In these situations, a software pipe is used.
A pipe simply refers to a temporary software connection between two programs or commands. An area of the main memory is treated like a virtual file to temporarily hold data and pass it from one process to another in a single direction.
In OSes like Unix, a pipe passes the output of one process to another process. The second (receiving) process, known as the filter, accepts the output as input. It then performs some operation on that input and writes the result to the standard output, e.g., a display screen.
The system temporarily holds the piped information until it is ready to be read by the receiving process. If the receiving process tries to read the information before it is written to the pipe, the process gets suspended until something is written.
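The buffering described above can be sketched in a few lines of Python, whose os.pipe function wraps the pipe system call. This is only an illustration within a single process; the function name hold_and_read is made up for this example.

```python
import os

def hold_and_read():
    """Hypothetical helper: write to a pipe, then read the data back."""
    # os.pipe() returns two file descriptors: a read end and a write end.
    read_fd, write_fd = os.pipe()

    # The kernel holds these bytes in memory until a reader asks for them.
    os.write(write_fd, b"first ")
    os.write(write_fd, b"second")
    os.close(write_fd)  # nothing more will be written

    # The data comes out in the same order it went in.
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    return data

print(hold_and_read())
```

Both writes complete before anything is read, which shows that the system keeps the data until the receiving side is ready for it.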
System of one-way communication
A pipe works on the first in, first out principle and behaves like a queue data structure. With a pipe, the output of the first process becomes the input of the second. However, the reverse does not happen. This is why a pipe is a form of one-way communication between processes or commands. For two-way communication, two pipes can be set up, one for each direction.
Even so, a limitation of ordinary (unnamed) pipes for IPC is that the processes using them must be related. One process creates the pipe, and the others obtain its file descriptors by inheriting them through a fork system call, so the communicating processes share a common ancestor.
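As a rough sketch of two-way communication, the Python example below creates two pipes, one for each direction, and then forks. The function name ask_child and the uppercase "reply" are assumptions chosen for illustration; the code requires a Unix-like system because it relies on os.fork.

```python
import os

def ask_child(message):
    """Hypothetical helper: send bytes to a child and read its reply."""
    # One pipe per direction: parent -> child and child -> parent.
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child process: keep only the ends it needs.
        try:
            os.close(to_child_w)
            os.close(to_parent_r)
            request = os.read(to_child_r, 1024)
            os.write(to_parent_w, request.upper())
        finally:
            os._exit(0)  # leave immediately, skipping the parent's code

    # Parent process: close the ends that belong to the child.
    os.close(to_child_r)
    os.close(to_parent_w)
    os.write(to_child_w, message)
    os.close(to_child_w)
    reply = os.read(to_parent_r, 1024)  # blocks until the child answers
    os.close(to_parent_r)
    os.waitpid(pid, 0)
    return reply

print(ask_child(b"ping"))
```

Both pipes are created before the fork, which is what lets the child inherit the descriptors it needs.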
How a pipe works in Unix
Pipes are most commonly used in programming on Unix systems. In these systems, programs, processes or commands can be linked together. Thus, instead of creating large or complex programs designed to do many different things, programmers create multiple programs that each do one job well and work well with each other. This programming model is called pipes and filters. The pipe provides the connection that carries one program's output stream to the next program's input, while the filter reads from standard input, does some processing on what it reads and then writes to standard output.
In Unix systems, a pipe is specified on a command line as a vertical bar ( | ) between two command sequences, for example: command1 | command2 | command3. A Unix interactive command interface, or shell, interprets this syntax. The output of the first command sequence becomes the input of the second, and so on down the line. Programs can set up the same kind of connection themselves with the pipe system call.
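To illustrate how a program might build the equivalent of a shell pipeline, here is a minimal Python sketch using the standard subprocess module. It assumes the sort and uniq commands are installed; the function name sort_unique is hypothetical.

```python
import subprocess

def sort_unique(text):
    """Hypothetical helper: roughly what the shell does for `sort | uniq`."""
    sort = subprocess.Popen(
        ["sort"], stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # The output of sort becomes the input of uniq.
    uniq = subprocess.Popen(
        ["uniq"], stdin=sort.stdout, stdout=subprocess.PIPE
    )
    sort.stdout.close()  # this process does not read sort's output itself

    sort.stdin.write(text.encode())
    sort.stdin.close()   # signals end of input to sort

    output = uniq.communicate()[0]
    sort.wait()
    return output.decode()

print(sort_unique("pear\napple\npear\n"))
```

This sketch is meant for small inputs; a production version would need to handle large inputs and command failures.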
Benefits of pipe in Unix
Generally, a pipe is a form of redirecting output to another destination for further processing. It provides a temporary connection between two or more commands, programs or processes. In Unix and Linux systems, a pipe enables more complex processing.
A pipe can also be used to combine two or more commands or programs. Connecting these commands, programs or processes lets them run simultaneously, with each stage processing data as soon as the previous stage produces it, so that simple tools can be combined into more powerful ones. It also enables continuous data transfer between them, so there's no need to pass information through temporary text files or the display screen.
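One way to see that piped commands really do run at the same time is to connect a producer that never stops on its own to a consumer that needs only a few lines. The Python sketch below assumes the yes and head commands are available; the function name first_lines is made up for this example.

```python
import subprocess

def first_lines(count):
    """Hypothetical helper: roughly `yes | head -n COUNT`."""
    # "yes" prints "y" forever, so it could never run to completion first.
    producer = subprocess.Popen(["yes"], stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        ["head", "-n", str(count)],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    # Close our copy of the read end so that "yes" is stopped
    # once "head" exits and nobody is left to read the pipe.
    producer.stdout.close()

    output = consumer.communicate()[0]
    producer.wait()
    return output.decode()

print(first_lines(3))
```

If the two commands ran one after the other, with the output stored in a temporary file, the first command would never finish. With a pipe, the consumer takes what it needs and the producer is then shut down.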
Examples of a pipe in Unix
In Unix, pipes are frequently used to filter, sort and display the text in a list. Another common use is to display only the unique, nonredundant entries in a list.
Example 1: Sort the data in a list and display the sorted data on screen
Suppose there's a list of names in a file called names.txt:
- John
- Sarah
- Darren
- Ahmad
- Lakeisha
- Kumar
- Zack
- Maitreyi
To sort this list, use this command with a pipe:
$ cat names.txt | sort
This command outputs the names in alphabetical order:
- Ahmad
- Darren
- John
- Kumar
- Lakeisha
- Maitreyi
- Sarah
- Zack
Example 2: Display nonredundant values on screen
Suppose there's a list of books in a file called books.txt with multiple redundant values:
- Harry Potter and the Goblet of Fire
- Catch-22
- To Kill a Mockingbird
- Nineteen Eighty-Four
- The Fault in Our Stars
- Gone Girl
- Catch-22
- Nineteen Eighty-Four
- Harry Potter and the Goblet of Fire
- The Fault in Our Stars
Here's the command with pipes used to sort the titles and see only the nonrepeated values on screen. The uniq command removes only adjacent duplicate lines, which is why the list is sorted first:
$ cat books.txt | sort | uniq
Here is the output:
- Catch-22
- Gone Girl
- Harry Potter and the Goblet of Fire
- Nineteen Eighty-Four
- The Fault in Our Stars
- To Kill a Mockingbird
Parent and child in pipe
A process and its child processes can use the same pipe for reading and writing. When a process forks, its open file descriptors are inherited by the child, so they remain open in both the parent and the child. Calling fork after creating a pipe therefore enables the parent and child to communicate via the pipe.
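The order of operations matters: the pipe is created first and the fork happens second. The Python sketch below, which assumes a Unix-like system, also shows the reader being suspended until the writer delivers data. The function name wait_for_child and the 0.2-second delay are arbitrary choices for illustration.

```python
import os
import time

def wait_for_child(delay=0.2):
    """Hypothetical helper: read from a child that writes after a delay."""
    read_fd, write_fd = os.pipe()  # create the pipe first
    start = time.monotonic()
    pid = os.fork()                # then fork, so both processes share it
    if pid == 0:
        # Child process: the writer.
        try:
            os.close(read_fd)
            time.sleep(delay)
            os.write(write_fd, b"ready")
        finally:
            os._exit(0)

    # Parent process: the reader.
    os.close(write_fd)
    data = os.read(read_fd, 1024)  # blocks until the child writes
    waited = time.monotonic() - start
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data, waited

data, waited = wait_for_child()
print(data, round(waited, 2))
```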
A read system call returns either the amount of data requested or the amount currently in the pipe, whichever is less. If the parent and child do not close the ends of the pipe they no longer need once they finish writing and reading, a process can block instead of terminating, and the program appears to hang, or freeze.
If a read system call is made when the pipe is empty and no process has the writing end open, the call returns end of file (return value 0). However, if any process still has the pipe open for writing, a read call will block in anticipation of new data entering the pipe. This is a common cause of hangs: if the writer finishes but another process still holds an inherited copy of the write end open, the reader never sees end of file, so it keeps waiting and the program never terminates.
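The end-of-file behavior can be sketched in Python as follows, assuming a Unix-like system. The function name read_from_child is hypothetical. The important line is the parent closing its own copy of the write end before it starts reading.

```python
import os

def read_from_child(chunks):
    """Hypothetical helper: collect everything a child writes to a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: the writer.
        try:
            os.close(read_fd)
            for chunk in chunks:
                os.write(write_fd, chunk)
            os.close(write_fd)
        finally:
            os._exit(0)

    # Parent process: the reader. It must close its own copy of the
    # write end; otherwise the loop below would never see end of file.
    os.close(write_fd)
    received = []
    while True:
        data = os.read(read_fd, 1024)
        if data == b"":  # empty result means end of file
            break
        received.append(data)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(received)

print(read_from_child([b"a", b"b", b"c"]))
```

Removing the os.close(write_fd) call in the parent would reproduce the hang described above, because the parent itself would count as a process that still has the pipe open for writing.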