Process Basics
Processes
COP-3402

The process abstraction

A process is a running program.

Running programs

Diagram

  • Program: bytes on disk, machine code
  • Load into memory
  • Jump to first instruction
  • Program runs!

Problem: we want to run many programs, but we only have one CPU.

What can we do?

There are several solutions.

  • Batch processing
  • Add more CPUs
  • Distributed programming
  • Virtualize the entire machine
  • Time sharing (the one we'll use in this class)

Time sharing

Virtualize the CPU: give programs the illusion of exclusive CPU access

Operating Systems: Three Easy Pieces

  • Remember how we talked about how the kernel idealizes hardware?
  • One thing modern OSes do is virtualize the CPU.

Many running programs

Diagram

  • Many virtual CPUs
  • One physical CPU
  • Kernel mediates the sharing

Processes "freeze" the state of the CPU

  • Process control block
    • Process ID (unique ID for all running programs)
    • Saves register values
    • Saves current working directory
    • Saves pointer to its memory
    • File descriptors
    • etc.
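Some of this per-process state can be observed from the shell. The following is only a small sketch (the variable names are made up for the example): $$ is the shell's own process ID, $PPID is its parent's, and the command substitution shows that the current working directory belongs to each process separately.

```shell
# Each process has its own ID and remembers its parent's ID.
echo "my pid: $$"
echo "parent pid: $PPID"

# The working directory is per-process state: a command substitution
# $( ... ) runs in a child process, so its cd does not affect the
# shell that started it.
before=$(pwd)
inside=$(cd / && pwd)   # the cd happens in the child only
after=$(pwd)
echo "child was in: $inside"
echo "parent stayed in: $after"
```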

UNIX process creation

Copy existing process with fork()

Replace program code with new program with exec()

We will work with process creation syscalls when we get to systems programming.
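Until then, the two steps can be approximated from the shell itself. This is a rough sketch rather than the syscalls: a subshell stands in for fork(), and the shell's exec builtin stands in for exec(). The variable names are invented for the example, and it assumes an external echo program is installed (as it is on typical Linux systems).

```shell
# "Fork": ( ... ) runs in a copy of the current shell. The copy starts
# with the same variables, but changes made there do not reach the original.
greeting="from parent"
( greeting="from child" )
echo "$greeting"

# "Exec": replace the program running in the child process with another
# program. The second echo is never reached, because the shell that would
# have run it has been replaced by the echo program.
exec_output=$(exec echo "now running echo"; echo "never printed")
echo "$exec_output"
```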

LPI Figure 24-1

What about the first process?

  • Single initial process (init)
    • Many variations (sysv init rc system, systemd)
    • e.g., runs /usr/bin/login program among other things
  • Running processes form a tree

pstree

Diagram

  • init process
  • process tree
  • process creation

View processes running in this bash session

ps

View all processes on the system

ps aux

View the process tree

ps axjf

Running programs

Type the file name of the executable, hit <enter>

ls

We've already been creating lots of processes from programs such as ls and mkdir.

Find out more about bash syntax.

Command-line arguments

Additional space-delimited strings are arguments passed to the program for processing. They are handled however the program likes, although there are conventions to their format (man 3 getopt).

ls ./ ../

There are special characters bash recognizes that aren't part of arguments, such as pipes |. We will look at these more when we get to redirection and advanced processes.

There are ways to escape special characters, e.g., \|, and to include spaces in a single argument, e.g., with double quotes.
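A quick way to see how the shell splits a command line is to count the arguments a command receives. In this sketch, count_args is a hypothetical helper defined only for the demonstration.

```shell
# count_args is a made-up helper: it prints how many arguments it received.
count_args() { echo "$#"; }

count_args hello world      # 2: two space-delimited arguments
count_args "hello world"    # 1: double quotes keep the space inside one argument
count_args hello\ world     # 1: the backslash escapes the space
count_args a \| b           # 3: \| is a literal | character, not a pipe
```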

Where are programs?

UNIX convention: the PATH environment variable

echo $PATH

Why have a PATH?

Finding a program's path

The which program does the lookup for you

which ls
which which
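Conceptually, the lookup walks the directories listed in PATH, in order, and stops at the first one containing an executable file with the requested name. A simplified sketch of that idea follows. Here lookup is a hypothetical helper, not how which is actually implemented, and it ignores details such as empty PATH entries.

```shell
# lookup is a made-up, simplified stand-in for `which`.
# The ( ... ) function body runs in a subshell, so changing IFS stays local.
lookup() (
  IFS=:                      # split $PATH on colons
  for dir in $PATH; do
    if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
      echo "$dir/$1"         # first match wins
      exit 0
    fi
  done
  exit 1                     # not found in any PATH directory
)

lookup ls
lookup no-such-program || echo "no-such-program: not found in PATH"
```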

Programs not in the path

hello
# program not found

Why does running hello by itself fail? I compiled it. The program is there.

This is one reason for using the dot path, when we want to specify a path to the file, but do not want to have to type out the absolute path to the current working directory.

./hello
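The difference can be reproduced without a compiler. The sketch below uses a tiny shell script named myhello (a made-up name) in place of the compiled program, and assumes the current directory is not listed in PATH and that the scratch directory allows executing files.

```shell
# Work in a scratch directory so nothing else is touched.
cd "$(mktemp -d)"

# myhello stands in for a freshly compiled program.
printf '#!/bin/sh\necho "hello, world!"\n' > myhello
chmod +x myhello

# The bare name is looked up in PATH only, so it is not found
# (the shell uses exit status 127 for "command not found").
bare_status=0
myhello 2>/dev/null || bare_status=$?
echo "bare name status: $bare_status"

# Giving a path (here relative to the current directory) skips the lookup.
dot_output=$(./myhello)
echo "$dot_output"
```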

Commands that aren't programs

which cd
# nothing returned

cd is a builtin command that bash recognizes
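The type command is itself part of the shell, so unlike which it also knows about builtins. A short sketch (the exact wording of the output varies between shells):

```shell
type cd    # reports that cd is a shell builtin
type ls    # reports where the ls program was found
```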

Quick quiz

How do I run a program that I named cd?

Give the path to the program, e.g.,

./cd

Standard I/O

Where does printf's output go?

How does your program know where to send it?

glibc-2.4/stdio-common/printf.c

int
__printf (const char *format, ...)
{
  va_list arg;
  int done;

  va_start (arg, format);
  done = vfprintf (stdout, format, arg);
  va_end (arg);

  return done;
}

UNIX convention

Every process is given three open files when it runs:

  • standard input (stdin)
  • standard output (stdout)
  • standard error (stderr)

man 3 stdio

"At program startup, three text streams are predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output)."

On process creation, the parent process's stdio files are inherited.

  • Why standard I/O?
  • Why separate stdout and stderr?

Avoids the need to hardcode I/O decisions or hardware specifics in your program. Instead, I/O decisions are made outside of the program by the system user, independently of the application.

One reason, I assume, is related to UNIX's interactive design: running a program from the shell means there is already input and output (the terminal).

Standard I/O is also used for the UNIX philosophy of chaining multiple programs together.
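Keeping stdout and stderr separate lets the user send results and diagnostics to different places without changing the program. In the sketch below, report is a hypothetical program written as a shell function, and the variable names are made up for the example.

```shell
# report is a made-up program: results go to stdout, diagnostics to stderr.
report() {
  echo "result: 42"
  echo "warning: disk almost full" >&2
}

# Keep only stdout (stderr is discarded).
only_out=$(report 2>/dev/null)

# Keep only stderr: first point stderr at the capture, then discard stdout.
only_err=$(report 2>&1 >/dev/null)

echo "stdout carried: $only_out"
echo "stderr carried: $only_err"
```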

You'll see this discussed more in tonight's reading for homework.

Where does standard output/error go?

When running in bash: the terminal itself

echo "hello, world!"

Where does standard in come from?

When running in bash: the terminal itself again

cat
# Typing is sent to the cat program's standard in

cat reads a file from stdin and writes it to stdout.

Mark end of input with Ctrl-D

Use Ctrl-D (the ASCII EOT character) on an empty line to mark the end of input

The program is waiting for input from stdin, which is the terminal.

An example with grep

grep "hello"
# Typing is sent to the grep program's standard in

grep reads a file from stdin and writes out lines that contain a given string.

Don't forget to use Ctrl-D if inputting the stdin file from the command-line.

Waiting for standard input is why programs that expect input stop and wait when run from bash.

Redirection

We can use bash to change (redirect) where stdio goes to and comes from.

Benefit of standardizing I/O: control I/O without rewriting the program. No hard-coded files, streams, terminals, etc.

Redirecting stdio to files

File      Suffix
stdin     < infile
stdout    > outfile
stderr    2> errfile

Additional redirection usage

  • Append

    Append to the given file instead of overwriting it.

    ls >> filetoappendto
    
  • Here documents

    Create a file on the fly to pass into stdin.

    grep hello << EOT
    this is my here document
    hello, world!
    hello: here document
    not included
    EOT
    

Example with cat

cat < hello.c

What does cat do?

What does cat < infile do?

How does this differ from running cat infile?

  • With cat < infile, the shell opens the file and cat runs with the file already open as its stdin.
  • With cat infile, the cat program reads the filename and opens the file.
  • The output is the same in both cases (since the file is the same).
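With cat the two forms look identical, but a program that prints file names makes the difference visible. This sketch uses wc -l instead of cat for that reason; infile is created in a scratch directory just for the demonstration.

```shell
cd "$(mktemp -d)"
printf 'one\ntwo\n' > infile

# wc opens the file itself, so it knows the name and prints it
# next to the line count.
wc -l infile

# The shell opens the file; wc only sees an already-open stdin,
# so it has no name to print.
wc -l < infile
```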

Example with grep

grep define < /usr/include/stdio.h

What does grep do?

What does this program do then?

Example of input and output

grep define < /usr/include/stdio.h > grepresults

What does this do?

How can I view grepresults?

Example of error

grep define < /root > grepresults
# The error message is printed to console

Redirecting stderr

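grep define /root > grepresults 2> greperror
# The error message is saved to greperror
# /root is an argument here so that grep, not bash, reports the error;
# bash handles redirections left to right, so a failed < /root is reported before 2> applies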

Redirect stdout/stderr to the same file

grep define /root > grepresults 2>&1

2>&1 means redirect stderr (file 2) to file 1 (stdout). This must be placed after the redirect of stdout, > grepresults (otherwise it will just use the original stdout).
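The effect of the ordering can be checked directly. In this sketch, both is a hypothetical program (a shell function) that writes one line to each stream, and the file names right and wrong are made up for the example.

```shell
# both is a made-up program that writes one line to each stream.
both() { echo "to stdout"; echo "to stderr" >&2; }
cd "$(mktemp -d)"

# Right order: stdout goes to the file, then stderr is pointed at the
# same place, so the file receives both lines.
both > right 2>&1

# Wrong order: stderr is pointed at the *current* stdout (captured in a
# variable here so it can be inspected), and only then is stdout sent
# to the file.
leaked=$(both 2>&1 > wrong)

cat right
cat wrong
echo "escaped the file: $leaked"
```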

Pipelines

Create complex tools from simple, existing programs.

UNIX philosophy

  1. Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new "features".
  2. Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input.
  3. Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate to throw away the clumsy parts and rebuild them.
  4. Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you've finished using them.

Doug McIlroy

Example: searching for a file

  • find lists each file in a file tree
  • grep searches for text

Use stdio to connect these programs

# list all files
find > findresults

# get only files with hello in their path
grep hello < findresults

Use a pipe and avoid making new files

find | grep hello

Separate bash commands with a pipe | symbol to pass stdout to stdin

cat /usr/include/stdio.h | grep head

What are pipes?

  • OS creates two "virtual" files
  • Writing to first file outputs to second file
  • Allows processes to pass data between them

Diagram

The operating system handles the intermediate files for us. This is made possible because all programs have standard in and out.
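The pipe created by | is anonymous, but a named pipe (made with mkfifo) behaves in a similar way and is visible in the file system, which makes the "write on one end, read on the other" idea easy to try. A sketch, where demo_pipe is a made-up name:

```shell
cd "$(mktemp -d)"
mkfifo demo_pipe              # create the named pipe

# Writer: runs in the background and blocks until a reader shows up.
printf 'hello\nworld\n' > demo_pipe &

# Reader: grep sees the writer's output on its stdin.
received=$(grep hello < demo_pipe)
wait                          # let the background writer finish
echo "$received"
```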

We'll use pipes to build our own shell command processor later in the semester.

xargs: executing a command on many arguments

find | xargs file

xargs reads items from stdin (separated by blanks or newlines) and turns them into arguments for the given command, in this case the file command.

What does file do?

How can we run file on a lot of things? Use xargs

Building complex tools from simpler ones

Print the first few lines of every header file

find /usr/include | grep "\.h$" | xargs head -n3

  • head -n3 prints the first 3 lines of a file
  • xargs turns the items it reads from stdin into arguments for the given command
  • xargs head -n3 therefore runs head -n3 on the files named on stdin
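How xargs groups its input can be seen by using echo as the command. A small sketch:

```shell
# By default xargs packs as many inputs as fit onto one command line:
printf 'a\nb\nc\n' | xargs echo          # one echo call, prints: a b c

# -n 1 asks for one input per invocation instead:
printf 'a\nb\nc\n' | xargs -n 1 echo     # three echo calls, one line each
```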

Complex text processing

I want to get all configuration options defined by the Linux build system.

# search for all configuration options that you can choose when building the Linux kernel
find | grep Kconfig | xargs grep "^config"

# deduplicate and sort them as well
find | grep Kconfig | xargs grep -h "^config" | sort | uniq

# trim off the "config " keyword
find | grep Kconfig | xargs grep -h "^config" | sort | uniq \
     | cut -f2 -d' '

# count the results
find | grep Kconfig | xargs grep -h "^config" | sort | uniq \
     | cut -f2 -d' ' | wc -l
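To try the pipeline without a Linux source tree, it can be run on a couple of tiny stand-in files. The file contents below are invented for the demonstration, and find is given an explicit "." for portability.

```shell
cd "$(mktemp -d)"
mkdir drivers
# Two tiny stand-ins for Kconfig files (contents made up for the demo).
printf 'config FOO\n\tbool "foo"\nconfig BAR\n' > Kconfig
printf 'config BAR\nconfig BAZ\n' > drivers/Kconfig

# Same pipeline as above: find the files, pull out the config lines,
# sort and deduplicate them, then keep only the option name.
options=$(find . | grep Kconfig | xargs grep -h "^config" | sort | uniq \
          | cut -f2 -d' ')
echo "$options"
echo "$options" | wc -l
```

BAR is defined in both files but appears only once in the result, so the count is 3.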

Pipe both stdout/stderr

Use |& instead of |

grep define /root |& grep -i Permission

Job management

Managing multiple processes from the shell.

Multiprocessing means we can have multiple programs running concurrently.

Killing jobs

Ctrl-C kills a running program.

Suspending jobs

Ctrl-Z suspends a running program.

find /
# type ctrl-z
[1]+  Stopped                 find /

Resuming a job

fg resumes a suspended program to the foreground.

fg
# find continues running
# type ctrl-z to suspend again

Foreground vs. background

  • Foreground means process is running over the shell
  • Background means process running separately from the shell

Foreground and background are relative to the interactive shell. The kernel doesn't distinguish foreground or background processes, all of which are concurrent.

The technical details are more subtle. See man bash JOB CONTROL

Foreground processes

  • Will block the user from running additional shell commands
  • Can receive keyboard signals, e.g., Ctrl-Z

Background processes

  • Keyboard signals from the terminal do not reach them (no Ctrl-C or Ctrl-Z)
  • Output still to terminal (if not redirected)

Put process into background with bg

find /
# type ctrl-z
bg
# ctrl-c will not terminate it
fg
# now ctrl-c will work

You won't be able to see your command line, because the find command emits text so quickly that the bash prompt, which uses the same terminal for output, quickly scrolls up the screen.

Have faith and type fg then Ctrl-Z to suspend the program.

Put process into the background immediately with &

Suffix the command with & to put it in background immediately.

cat &
# to bring into foreground, use fg
fg
# suspend again with Ctrl-Z
# bring to back again with bg
bg

Warning: the stdout/stderr will still be the terminal if not redirected.

Viewing running jobs

find / > /dev/null 2>&1 &
cat &
grep hello &
jobs
fg 2 # bring up the second job, cat &

Really killing a process

  • Ctrl-C asks the process to terminate (SIGINT); kill asks with SIGTERM by default
  • SIGINT and SIGTERM can be caught or ignored by the program
  • SIGKILL cannot be

find / > /dev/null 2>&1 &
ps
kill -9 PID   # PID is the process ID of find, as shown by ps

See man kill for more details

See man 7 signal for signals and their codes

$! holds the PID of the most recent background command (echo $! prints it), so we can do this:

find / > /dev/null 2>&1 &
ps
kill -9 $!
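The difference between SIGTERM and SIGKILL can be observed with a process that deliberately ignores SIGTERM. The following is a sketch with made-up variable names. It relies on the fact that an ignored signal stays ignored across exec, and it uses short sleeps to give the background process time to start.

```shell
# Start a process that ignores SIGTERM. The ignore setting survives exec,
# so the sleep program itself ignores the signal.
sh -c 'trap "" TERM; exec sleep 5' &
pid=$!
sleep 1                      # give it a moment to install the trap

kill -TERM "$pid"            # polite request: ignored
sleep 1
if kill -0 "$pid" 2>/dev/null; then   # kill -0 only checks the process exists
  survived_term=yes
else
  survived_term=no
fi

kill -KILL "$pid"            # same as kill -9: cannot be caught or ignored
kill_status=0
wait "$pid" || kill_status=$?
echo "survived SIGTERM: $survived_term"
echo "exit status after SIGKILL: $kill_status"   # 137 = 128 + 9
```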

Quick-and-dirty development workflow

  • Use job control to keep an editor running
  • Don't exit, just use Ctrl-Z to suspend editor
  • Compile, test, etc
  • Use fg to resume editing

emacs hello.c
# type Ctrl-Z
gcc -o hello hello.c
./hello
fg

There are better workflows, but this is great for simple, quick scripting tasks.

(Optional) Terminal multiplexing

A terminal multiplexer is like a remote desktop for command-line shells.

We'll use byobu in this class.

byobu

A wrapper for managing terminal multiplexers

byobu is really a wrapper around multiplexers and uses tmux by default. GNU screen is an alternative backend for byobu.

Initialization

Set up the multiplexer for bash/emacs' Ctrl-A keybinding.

byobu-ctrl-a
# Type 2 and hit enter

This is done because multiplexers have historically used Ctrl-A to enter multiplexer commands. We will use F# commands instead.

Entering byobu

On eustis, run

byobu

You can tell you are in byobu (tmux), because there is a status bar at the bottom of the screen.

If you already have byobu sessions, byobu will prompt you to connect to an existing one or allow you to make a new one. Otherwise, byobu will create a new multiplexer session.

Creating additional windows

Press F2 to create a new terminal "window".

byobu_status.png

Note the new "tab" with number 1 near the bottom-left of the terminal.

If you can't see the whole status bar, try expanding the terminal window.

Additionally, you can turn off status notifications interactively with F9. Remember, you will only be able to use the keyboard (arrow keys, <tab>, <enter>) to navigate.

Detaching and re-attaching

F6 detaches from the byobu session

Rerun byobu to reattach

echo "hello, world!" # to show that we indeed are reattaching
# press F6 to detach
# you will return to the original eustis bash session.
exit # you can even exit and reconnect
ssh eustis..
byobu
# select (1) or your last byobu session

Navigating byobu

  • F3 or Alt-<left> to go to the previous window
  • F4 or Alt-<right> to go to the next window

byobu Cheat Sheet

byobu-ctrl-a
# Type 2 and hit enter
Command              Description
F6                   Detach from byobu
F2                   New byobu terminal
F3 or Alt-<left>     Go to left terminal
F4 or Alt-<right>    Go to right terminal
exit or Ctrl-D       Exit terminal (not byobu-specific)

Takeaways

Processes

  • Abstracts away CPU resources
  • Manages running programs
  • Creation: duplicate then replace

Standard I/O

  • stdin, stdout, stderr
  • Make I/O decisions independent of program

Pipes

Chain multiple programs together, e.g.,

find / | grep bin | wc -l

Manage multiple processes

Suspend and resume processes into the background or foreground to work with multiple programs.

emacs hello.c
# Ctrl-Z to suspend
gcc -o hello hello.c
./hello
fg # to resume

Author: Paul Gazzillo

Created: 2025-02-05 Wed 10:54