Process Basics
Processes
COP-3402
The process abstraction
A process is a running program.
Running programs
Diagram
- Program: bytes on disk, machine code
- Load into memory
- Jump to first instruction
- Program runs!
Problem: we want to run many programs, but we only have one CPU.
What can we do?
There are several solutions.
- Batch processing
- Add more CPUs
- Distributed programming
- Virtualize the entire machine
- Time sharing (the one we'll use in this class)
Time sharing
Virtualize the CPU: give programs the illusion of exclusive CPU access
Operating Systems: Three Easy Pieces
- Remember how we talked about how the kernel idealizes hardware?
- One thing modern OSes do is virtualize the CPU.
Many running programs
Diagram
- Many virtual CPUs
- One physical CPU
- Kernel mediates the sharing
Processes "freeze" the state of the CPU
- Process control block
- Process ID (a unique ID for each running program)
- Saves register values
- Saves current working directory
- Saves pointer to its memory
- File descriptors
- etc.
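On Linux, some of this saved state can be inspected from the shell. The sketch below assumes a Linux system with the /proc filesystem mounted; the variable names are only for illustration.

```shell
# Assumes Linux with /proc mounted. $$ expands to the PID of the current shell.
pid=$$
# The current working directory the kernel has recorded for this process.
cwd=$(readlink "/proc/$pid/cwd")
# One entry per open file descriptor number.
fds=$(ls "/proc/$pid/fd")
echo "pid: $pid"
echo "cwd: $cwd"
echo "open file descriptors:" $fds
```

Descriptors 0, 1, and 2 in the listing are the standard input, output, and error files discussed later in these notes.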
UNIX process creation
Copy existing process with fork()
Replace program code with new program with exec()
We will work with process creation syscalls when we get to systems programming.
LPI Figure 24-1
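The shell itself uses this copy-then-replace pattern, so it can be glimpsed without writing any C. A minimal sketch, assuming an external echo program is installed on the PATH (the variable name is hypothetical):

```shell
# $( ... ) runs its contents in a forked copy of the shell.
# Inside that copy, exec replaces the shell's code with the echo program.
child_output=$(exec echo "hello from the new program")
echo "child printed: $child_output"
# The original shell was only copied, not replaced, so it keeps running.
echo "the original shell (PID $$) is still running"
```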
What about the first process?
- Single initial process (init)
- Many variations (sysv init rc system, systemd)
- e.g., runs /usr/bin/login program among other things
- Running processes form a tree
pstree
Diagram
- init process
- process tree
- process creation
View processes running in this bash session
ps
View all processes on the system
ps aux
View the process tree
ps axjf
Running programs
Type the file name of the executable, hit <enter>
ls
We've already been creating lots of processes from programs such as ls and mkdir.
Find out more about bash syntax.
Command-line arguments
Additional space-delimited strings are arguments passed to the program for processing. They are handled however the program likes, although there are conventions for their format (man 3 getopt).
ls ./ ../
There are special characters bash recognizes that aren't part of arguments, such as pipes |. We will look at these more when we get to redirection and advanced processes.
There are ways to escape special characters, e.g., \|, and to include spaces in a single argument, e.g., with double quotes.
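To see how bash splits a command line into arguments, a small helper that only reports its argument count is enough. The function name count_args is hypothetical, introduced just for this sketch.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

count_args one two three      # three space-delimited arguments
count_args "one two" three    # quotes keep "one two" together as one argument
count_args one \| two         # the escaped | is passed along as an ordinary argument
```

The three calls print 3, 2, and 3.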
Where are programs?
UNIX convention: the PATH environment variable
echo $PATH
Why have a PATH?
Finding a program's path
The which program does the lookup for you
which ls
which which
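The lookup can be approximated in a few lines of shell. This is only a sketch of the idea, not how which or bash is actually implemented; find_in_path is a hypothetical name.

```shell
# Hypothetical helper: print the first executable named $1 found in PATH.
# The body is in ( ) so the IFS and globbing changes stay inside the function.
find_in_path() (
    name=$1
    IFS=:      # split PATH on colons
    set -f     # do not expand wildcards while splitting
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.   # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1
)

find_in_path ls
find_in_path no-such-program-xyz-123 || echo "not found in PATH"
```

The directories are tried in order, which is why a program earlier in PATH shadows one with the same name later in PATH.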
Programs not in the path
hello # program not found
Why does running hello by itself fail? I compiled it. The program is there.
bash only searches the directories listed in PATH for a bare command name, and the current working directory is usually not one of them. This is one reason for using the dot path: it specifies a path to the file without having to type out the absolute path to the current working directory.
./hello
Commands that aren't programs
which cd # nothing returned
cd is a builtin command that bash recognizes
Quick quiz
How do I run a program that I named cd?
Give the path to the program, e.g.,
./cd
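One way to tell a builtin from a program on disk is the standard command -v utility, which is not used elsewhere in these notes. A short sketch:

```shell
# For a builtin, command -v prints just the name.
command -v cd
# For a program found through PATH, it prints the full path
# (the exact path varies from system to system).
command -v ls
```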
Standard I/O
Where does printf's output go?
How does your program know where to send it?
glibc-2.4/stdio-common/printf.c
int
__printf (const char *format, ...)
{
  va_list arg;
  int done;

  va_start (arg, format);
  done = vfprintf (stdout, format, arg);
  va_end (arg);

  return done;
}
UNIX convention
Every process is given three open files when it runs:
- standard input (stdin)
- standard output (stdout)
- standard error (stderr)
man 3 stdio
"At program startup, three text streams are predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output)."
On process creation, the parent process's stdio files are inherited.
- Why standard I/O?
- Why separate stdout and stderr?
Avoids the need to hardcode I/O decisions or hardware specifics in your program. Instead, I/O decisions are made outside of the program, by the system or the user, independently of the application.
One reason, I assume, is related to UNIX's interactive design: running a program from the shell means there is already input and output (the terminal).
Standard I/O is also used for the UNIX philosophy of chaining multiple programs together.
You'll see this discussed more in tonight's reading for homework.
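One practical consequence of keeping stdout and stderr separate is that diagnostics do not contaminate a program's results. A minimal sketch, with a hypothetical report function standing in for a real program:

```shell
# Hypothetical program: writes its result to stdout and a diagnostic to stderr.
report() {
    echo "result: 42"
    echo "warning: something looked odd" >&2
}

# $( ... ) captures stdout only; the warning still reaches the terminal.
captured=$(report)
echo "captured: $captured"
```

Only the result line ends up in captured, so a later program that consumes it never sees the warning.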
Where does standard output/error go?
When running in bash: the terminal itself
echo "hello, world!"
Where does standard in come from?
When running in bash: the terminal itself again
cat # Typing is sent to the cat program's standard in
cat reads a file from stdin and writes it to stdout.
Mark end of input with Ctrl-D
Use Ctrl-D (the ASCII EOT character) on an empty line to mark the end of file input
The program is waiting for input from stdin, which is the terminal.
An example with grep
grep "hello" # Typing is sent to the grep program's standard in
grep reads a file from stdin and writes out lines that contain a given string.
Don't forget to use Ctrl-D if inputting the stdin file from the command-line.
Waiting for standard input is why programs that expect input stop and wait when run from bash.
Redirection
We can use bash to change (redirect) where stdio goes to and comes from.
Benefit of standardizing I/O: control I/O without rewriting the program. No hard-coded files, streams, terminals, etc.
Redirecting stdio to files
File | Suffix |
---|---|
stdin | < infile |
stdout | > outfile |
stderr | 2> errfile |
Example with cat
cat < hello.c
What does cat do?
What does cat < infile do?
How does this differ from running cat infile
?
- With cat < infile, the shell opens the file and cat runs with the file already open as its stdin.
- With cat infile, the cat program reads the filename and opens the file itself.
- The output is the same in both cases (since the file is the same).
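The difference becomes visible with a program that reports file names, such as wc. A sketch using a temporary file (created with mktemp, an assumption about the system; the variable names are hypothetical):

```shell
infile=$(mktemp)
printf 'one\ntwo\n' > "$infile"

# cat produces the same bytes either way.
from_arg=$(cat "$infile")
from_stdin=$(cat < "$infile")

# wc is given the name, opens the file itself, and can print the name.
named=$(wc -l "$infile")
# wc only receives an already-open stdin; it never learns the file's name.
anonymous=$(wc -l < "$infile")

rm "$infile"
echo "with a filename argument: $named"
echo "with redirected stdin:    $anonymous"
```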
Example with grep
grep define < /usr/include/stdio.h
What does grep do?
What does this program do then?
Example of input and output
grep define < /usr/include/stdio.h > grepresults
What does this do? How can I view grepresults?
Example of error
grep define /root > grepresults # The error message is printed to the console
Here grep opens /root itself, so the error comes from grep's stderr. With < /root, bash opens the file, and a failure there is reported before any redirections written later on the line take effect.
Redirecting stderr
grep define /root > grepresults 2> greperror # The error message is saved to greperror
Redirect stdout/stderr to the same file
grep define /root > grepresults 2>&1
2>&1 means redirect stderr (file 2) to file 1 (stdout). It must be placed after the stdout redirect > grepresults (otherwise it will just use the original stdout).
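The effect of the ordering can be checked with a small experiment. In this sketch, noisy is a hypothetical stand-in for a program that writes to both streams, and the temporary file comes from mktemp.

```shell
# Hypothetical program: one line on stdout, one on stderr.
noisy() { echo "out"; echo "err" >&2; }
outfile=$(mktemp)

# stdout is sent to the file first, then stderr is pointed at the same place.
noisy > "$outfile" 2>&1
both=$(cat "$outfile")

# Reversed order: stderr is pointed at the ORIGINAL stdout (captured here by
# the command substitution), and only afterwards is stdout sent to the file.
leaked=$(noisy 2>&1 > "$outfile")
only_out=$(cat "$outfile")

rm "$outfile"
echo "correct order, file holds: $both"
echo "reversed order, file holds: $only_out (and '$leaked' escaped)"
```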
Pipelines
Create complex tools from simple, existing programs.
UNIX philosophy
- Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new "features".
- Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input.
- Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate to throw away the clumsy parts and rebuild them.
- Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you've finished using them.
Example: searching for a file
- find lists each file in a file tree
- grep searches for text
Use stdio to connect these programs
# list all files
find > findresults
# get only files with hello in its path
grep hello < findresults
Use a pipe and avoid making new files
find | grep hello
Separate bash commands with a pipe | symbol to pass stdout to stdin
cat /usr/include/stdio.h | grep head
What are pipes?
- OS creates two "virtual" files
- Writing to first file outputs to second file
- Allows processes to pass data between them
Diagram
The operating system handles the intermediate files for us. This is made possible because all programs have standard in and out.
We'll use pipes to build our own shell command processor later in the semester.
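A named pipe makes the two connected ends visible. This is a related mechanism rather than exactly what | does (the shell's pipes have no name in the filesystem), so treat the sketch below as an illustration; it assumes the mkfifo and mktemp utilities are available.

```shell
dir=$(mktemp -d)
mkfifo "$dir/pipe"     # create a pipe that has a name in the filesystem

# Writer: runs in the background and waits until a reader opens the pipe.
printf 'hello\nworld\n' > "$dir/pipe" &

# Reader: grep receives the writer's output on its stdin.
result=$(grep hello < "$dir/pipe")

wait                   # let the background writer finish
rm -r "$dir"
echo "$result"
```

No data is stored in a regular file along the way; the bytes pass from the writer to the reader through the kernel.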
xargs: executing a command on many arguments
find | xargs file
xargs reads items from stdin (separated by whitespace or newlines) and turns them into arguments for the given command, in this case the file command.
What does file do?
How can we run file on a lot of things? Use xargs.
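Using echo as the command shows exactly what xargs builds. A small sketch with made-up input (the variable names are hypothetical):

```shell
# By default xargs packs as many items as it can into one command line.
together=$(printf 'one\ntwo\nthree\n' | xargs echo "args:")
echo "$together"

# With -n 1 it runs the command once per item instead.
separate=$(printf 'one\ntwo\nthree\n' | xargs -n 1 echo "args:")
echo "$separate"
```

The first form prints a single line, args: one two three, while the second prints three lines, one per item.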
Building complex tools from simpler ones
Print the first few lines of every header file
find /usr/include | grep "\.h$" | xargs head -n3
- head -n3 prints the first 3 lines of a file
- xargs takes each line from stdin and executes the given command on it
- xargs head -n3 takes each line from stdin and executes head -n3 on it
Complex text processing
I want to get all configuration options defined by the Linux build system.
# search for all configuration options that you can choose when building the Linux kernel
find | grep Kconfig | xargs grep "^config"
# deduplicate and sort them as well
find | grep Kconfig | xargs grep -h "^config" | sort | uniq
# trim off the "config " keyword
find | grep Kconfig | xargs grep -h "^config" | sort | uniq \
  | cut -f2 -d' '
# count the results
find | grep Kconfig | xargs grep -h "^config" | sort | uniq \
  | cut -f2 -d' ' | wc -l
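Without a Linux source tree at hand, the same stages can be tried on a small sample. The file contents below are invented for illustration and only mimic the shape of a Kconfig file.

```shell
sample=$(mktemp)
# Invented sample data: two distinct options, one of them listed twice.
cat > "$sample" <<'EOF'
config FOO
    bool "Enable foo"
config BAR
    bool "Enable bar"
config FOO
    bool "Duplicate entry"
EOF

# Same stages as above: select, sort, deduplicate, trim the keyword.
names=$(grep -h "^config" "$sample" | sort | uniq | cut -f2 -d' ')
# Count the remaining lines.
count=$(printf '%s\n' "$names" | wc -l)

rm "$sample"
echo "$names"
echo "distinct options:" $count
```

Building the pipeline one stage at a time, and looking at the output after each stage, is usually easier than writing it all at once.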
Pipe both stdout/stderr
Use |& instead of |
grep define /root |& grep -i Permission
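In bash, |& is shorthand for 2>&1 |. The sketch below uses the long form so it also runs in shells without |&; noisy is a hypothetical stand-in for a program that writes to both streams.

```shell
# Hypothetical program: one line on stdout, one on stderr.
noisy() { echo "out"; echo "err" >&2; }

# A plain pipe carries stdout only (stderr is discarded here to keep the terminal quiet).
stdout_lines=$(noisy 2>/dev/null | wc -l)
# Equivalent to: noisy |& wc -l
all_lines=$(noisy 2>&1 | wc -l)

echo "plain pipe saw" $stdout_lines "line(s)"
echo "with stderr included:" $all_lines "line(s)"
```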
Job management
Managing multiple processes from the shell.
Multiprocessing means we can have multiple programs running concurrently.
Killing jobs
Ctrl-C kills a running program.
Suspending jobs
Ctrl-Z suspends a running program.
find / # type ctrl-z
[1]+ Stopped find /
Resuming a job
fg resumes a suspended program to the foreground.
fg
# find continues running
# type ctrl-z to suspend again
Foreground vs. background
- Foreground means process is running over the shell
- Background means process running separately from the shell
Foreground and background are relative to the interactive shell. The kernel doesn't distinguish foreground from background processes when running them; all of them run concurrently.
The technical details are more subtle. See the JOB CONTROL section of man bash.
Foreground processes
- Will block the user from running additional shell commands
- Can receive keyboard signals, e.g., Ctrl-Z
Background processes
- Keyboard signals from the terminal will not reach them (no Ctrl-C or Ctrl-Z)
- Output still to terminal (if not redirected)
Put process into background with bg
find /
# type ctrl-z
bg
# ctrl-c will not terminate it
fg
# now ctrl-c will work
You won't be able to see your command line, because the find command is emitting text so quickly that the bash prompt, which shares the same terminal for output, is quickly scrolled up the screen.
Have faith and type fg, then Ctrl-Z to suspend the program.
Put process into the background immediately with &
Suffix the command with & to put it in the background immediately.
cat &
# to bring into foreground, use fg
fg
# suspend again with Ctrl-Z
# bring to back again with bg
bg
Warning: the stdout/stderr will still be the terminal if not redirected.
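Background jobs really do run concurrently, which a rough timing experiment can show. This sketch assumes the date utility supports the +%s format (seconds since the epoch); the variable names are hypothetical.

```shell
start=$(date +%s)

# Three one-second sleeps, all started in the background.
sleep 1 &
sleep 1 &
sleep 1 &

wait    # block until every background job has finished
end=$(date +%s)
elapsed=$((end - start))
echo "three 1-second sleeps took about $elapsed second(s)"
```

Run one after another in the foreground, the same three commands would take about three seconds.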
Viewing running jobs
find / > /dev/null 2>&1 &
cat &
grep hello &
jobs
fg 2 # bring up the second job, cat &
Really killing a process
- Ctrl-C basically asks the process to terminate (SIGINT); kill sends SIGTERM by default
- SIGINT and SIGTERM can be caught or ignored by the program
- SIGKILL cannot be
find / > /dev/null 2>&1 &
ps
kill -9 # followed by the PID that ps lists for find
See man kill for more details
See man 7 signal for signals and their codes
echo $! will tell the pid of the last background command, so we can do this:
find / > /dev/null 2>&1 &
ps
kill -9 $!
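The same steps work inside a script, where the exit status also reveals how the process ended. A sketch using sleep as a harmless long-running program (the variable names are hypothetical):

```shell
sleep 30 &        # a long-running background process
pid=$!            # PID of the most recent background command

kill -9 "$pid"    # SIGKILL: cannot be caught or ignored

# wait reports how the process ended; 128 + 9 means "killed by signal 9".
status=0
wait "$pid" || status=$?
echo "sleep exited with status $status"
```

The shell may also print its own "Killed" notice for the job.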
Quick-and-dirty development workflow
- Use job control to keep an editor running
- Don't exit, just use Ctrl-Z to suspend editor
- Compile, test, etc
- Use fg to resume editing
emacs hello.c
# type Ctrl-Z
gcc -o hello hello.c
./hello
fg
There are better workflows, but this is great for simple, quick scripting tasks for instance.
(Optional) Terminal multiplexing
A terminal multiplexer is like a remote desktop for command-line shells.
We'll use byobu in this class.
byobu
A wrapper for managing terminal multiplexers
byobu is really a wrapper around terminal multiplexers and uses tmux by default. GNU screen is an alternative backend for byobu.
Initialization
Set up the multiplexer for bash/emacs' Ctrl-A keybinding.
byobu-ctrl-a # Type 2 and hit enter
This is needed because multiplexers have historically used Ctrl-A to enter multiplexer commands, which conflicts with the Ctrl-A keybinding in bash and emacs. We will use F# commands instead.
Entering byobu
In eustis, run
byobu
You can tell you are in byobu (tmux), because there is a status bar at the bottom of the screen.
If you already have byobu sessions, byobu will prompt you to connect to an existing one or allow you to make a new one. Otherwise, byobu will create a new multiplexer session.
Creating additional windows
Press F2
to create a new terminal "window".
Note the new "tab" with number 1 near the bottom-left of the terminal.
If you can't see the whole status bar, try expanding the terminal window.
Additionally, you can turn off status notifications interactively with F9. Remember, you will only be able to use keyboard (arrow keys, <tab>, <enter>) to navigate.
Detaching and re-attaching
- F6 detaches from the byobu session
- Rerun byobu to reattach
echo "hello, world!" # to show that we indeed are reattaching
# press F6 to detach
# you will return to the original eustis bash session.
exit # you can even exit and reconnect
ssh eustis..
byobu # select (1) or your last byobu session
Navigating byobu
- F3 or Alt-<left> to go to the previous window
- F4 or Alt-<right> to go to the next window
byobu Cheat Sheet
byobu-ctrl-a # Type 2 and hit enter
Command | Description |
---|---|
F6 | Detach from byobu |
F2 | New byobu terminal |
F3 or Alt-<left> | Go to left terminal |
F4 or Alt-<right> | Go to right terminal |
exit or Ctrl-D | exit terminal (not byobu-specific) |
Takeaways
Processes
- Abstracts away CPU resources
- Manages running programs
- Creation: duplicate then replace
Standard I/O
- stdin, stdout, stderr
- Make I/O decisions independent of program
Pipes
Chain multiple programs together, e.g.,
find / | grep bin | wc -l
Manage multiple processes
Suspend and resume processes into the background or foreground to work with multiple programs.
emacs hello.c
# Ctrl-Z to suspend
gcc -o hello hello.c
./hello
fg # to resume