System Tools
Lecture 2
Table of Contents
Overview
- Great programmers know their tools
- The command line exposes powerful, fast tools
- We'll cover
- Basic shell commands
- Navigating the file system
- Processes
- Useful utilities
- git
- gcc
- make
Command-line crash course
- Follow along with vagrant
Command-line basics
cd ~/
pwd
ls
ls -l
ls -latrh
cd
cd /vagrant
cd ~/
cat
echo
echo "hello world" > newfile.txt
cat newfile.txt
echo "hello world" | cat
tac newfile.txt
less newfile.txt # q to quit
More basics
mkdir newdir
touch emptyfile.txt
touch newdir/empty
cd newdir
pwd
ls
cd ../
cd ~/
rmdir newdir # fails: newdir is not empty
rm newdir/empty
rmdir newdir
mkdir targetdir
cp emptyfile.txt targetdir
ls -R
ls targetdir/
touch filetomove.txt
mv filetomove.txt targetdir
ls -R
cd targetdir
mv filetomove.txt ../
cd ../
ls -R
Tab-completion
ls <tab><tab>n<tab>
touch none.txt
ls <tab><tab>n<tab><tab>o<tab>
Directory hierarchy
tree # to show hierarchy
ls /<tab><tab>v<tab><tab>
pwd
cd subdir # absolute vs relative paths
cd ../ # parent dir
cd ./ # current dir
ls ./ # current dir
cat ./newfile.txt # same as cat newfile.txt
ls -la
Standard I/O and redirection
ls > lsoutput.txt
gcc -o printer printer.c # fprintf(stdout/stderr)
./printer > out # same as ./printer 1> out
./printer 2> err
./printer > out 2> err
./printer > out 2>&1
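The redirections above can be tried without compiling anything. A minimal sketch, assuming a shell function named `printer` as a hypothetical stand-in for `./printer` (it writes one line to stdout and one to stderr); the file name `both` is also made up for illustration:

```shell
# Hypothetical stand-in for ./printer: one line to stdout, one to stderr.
printer() {
    echo "to stdout"
    echo "to stderr" >&2
}

workdir=$(mktemp -d)    # scratch directory so nothing in ~ is overwritten
cd "$workdir"

printer > out 2> err    # stdout goes to out, stderr goes to err
printer > both 2>&1     # stderr is pointed at wherever stdout already goes

cat out                 # to stdout
cat err                 # to stderr
cat both                # both lines
```

Order matters in the last form: `2>&1` copies the current target of stdout, so it has to come after `> both`.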
Pipes
- Redirects standard I/O between processes (instead of just to files)
cat file
cat file | grep "hello"
cat file | grep "hello" | cut -c2-
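A self-contained version of the same pipeline, as a sketch: `printf` supplies made-up sample lines in place of `cat file`, `grep` keeps the lines containing "hello", and `cut -c2-` drops the first character of each remaining line.

```shell
# Sample input is invented for illustration; any text file would do.
result=$(printf 'hello world\ngoodbye\nhello again\n' | grep "hello" | cut -c2-)
echo "$result"
# ello world
# ello again
```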
Processes
ps
ps -aux
ps -aux | grep paul
ls
top
htop
echo $?
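`$?` holds the exit status of the most recent command. A minimal sketch, using `sh -c 'exit N'` as a stand-in child process; the exit codes 0 and 3 are arbitrary choices for illustration:

```shell
sh -c 'exit 0'
ok=$?                          # 0 means success

failed=0
sh -c 'exit 3' || failed=$?    # non-zero means failure; the value is the child's exit code

echo "ok=$ok failed=$failed"   # ok=0 failed=3
```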
Useful tools
man
man cat
man atoi
man -a exec
find
grep
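A short sketch of typical `find` and `grep` usage. The directory layout and file contents below are hypothetical, created in a scratch directory only so the commands have something to search:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p src/lib
printf 'int main(void) { return 0; }\n' > src/main.c
printf 'int helper(void);\n' > src/lib/helper.h
printf 'int helper(void) { return 1; }\n' > src/lib/helper.c

find src -name '*.c' | sort      # every .c file under src, by name
grep -rl 'helper' src | sort     # files whose contents mention "helper"
grep -n 'main' src/main.c        # matching lines, prefixed with line numbers
```

`find` matches on file names and attributes; `grep` matches on file contents.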
tar
tar -cvf packagename.tar file1 file2
tar -tvf packagename.tar
tar -xvf packagename.tar
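The three commands form a round trip: create, list, extract. A sketch in a scratch directory, with made-up file contents; the `unpacked` directory and the `-C` option (extract into another directory) are additions for illustration, and `-v` is left out to keep the output short:

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "one" > file1
echo "two" > file2

tar -cf packagename.tar file1 file2    # c: create an archive
listing=$(tar -tf packagename.tar)     # t: list its contents
echo "$listing"

mkdir unpacked
tar -xf packagename.tar -C unpacked    # x: extract, here into unpacked/
cat unpacked/file1 unpacked/file2
```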
editors
- nano, emacs, vim
diff
- Show differences between text files
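A small sketch of `diff` on two hypothetical three-line files that differ only on line 2. `diff` exits with status 0 when the files match and 1 when they differ, which the `||` below picks up:

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'apple\nbanana\ncherry\n' > old.txt
printf 'apple\nblueberry\ncherry\n' > new.txt

same=yes
diff old.txt new.txt || same=no
# 2c2
# < banana
# ---
# > blueberry
echo "same=$same"
```

`2c2` reads as "line 2 of the first file changed into line 2 of the second".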
od
- Show raw bytes of a file
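For example, `od` makes the trailing newline of a text line visible. A sketch using the made-up input `Hi` followed by a newline; `-An` hides the offset column and `-tx1` prints one hexadecimal byte at a time:

```shell
bytes=$(printf 'Hi\n' | od -An -tx1)
echo "$bytes"           # 48 69 0a  (H, i, newline)

printf 'Hi\n' | od -c   # same bytes shown as characters and escapes
```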
git tutorial
git clone url
ssh-keygen -b 4096 # adding a passphrase is a good idea
eval $(keychain --eval)
ssh-add ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub # .pub is important!
# copy key to GitHub settings under SSH keys
git config --global user.name "Your Name"
git config --global user.email "youremail@yourdomain.com"
Workflow
cd project
git status
git add
git commit
git commit -p
git log
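One pass through this workflow, sketched against a throwaway local repository rather than a clone. The file name and commit message are made up, the name and email are set per-repository instead of with `--global`, and the `if` guard simply skips everything on a machine without git:

```shell
if command -v git > /dev/null; then
    workdir=$(mktemp -d)
    cd "$workdir"
    git init -q project         # stand-in for a cloned project
    cd project
    git config user.name "Your Name"
    git config user.email "youremail@yourdomain.com"

    echo "hello world" > newfile.txt
    git status --short          # ?? newfile.txt (untracked)
    git add newfile.txt         # stage the file
    git commit -q -m "Add newfile.txt"
    git log --oneline           # one line per commit
fi
```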
From source to an executable
- Preprocessor: file.c -> file.i
- Compiler: file.i -> file.asm
- Assembler: file.asm -> file.o
- Linker: file.o, libraries -> file.exe
(Diagram)
Preprocessor (cpp)
cpp -o file.i file.c
Compiler (gcc)
gcc -S -O0 hello.c
file hello.s
Assembler (as)
as -o hello.o hello.s
file hello.o
https://www.agix.com.au/creating-a-hello-world-program-in-assembly-language-in-5-minutes/
Linker (ld)
gcc -o hello.exe -v hello.o # -v shows the linker invocation
file hello.exe
https://systemoverlord.com/2017/03/19/got-and-plt-for-pwning.html
https://stackoverflow.com/questions/5469274/what-does-plt-mean-here
http://dustin.schultz.io/how-is-glibc-loaded-at-runtime.html
Summary
- Preprocessor: file.c -> file.i
- Compiler: file.i -> file.asm
- Assembler: file.asm -> file.o
- Linker: file.o, libraries -> file.exe
From an executable to a running process
- Loader: file.exe -> running process in RAM
Kernel in a nutshell
- (Diagram) https://thecustomizewindows.com/2012/07/kernel-of-operating-system/
- Mediates hardware access (processor, memory, I/O)
- No (intentional) direct access to hardware permitted (hardware protection)
- Kernel provides a library of system calls (syscalls) to applications
- Kernel manages resources among running programs, i.e., processes
- Illusion of simultaneous execution (time-sharing)
Interacting with the kernel
- In short we need to
- Use an agreed-on binary file format, e.g., ELF for *nix
- Provide the location of the first thing to run: _start by convention for C
- Use the exec-family and exit syscalls to start and stop our program
- Lots of other details more appropriate for an OS course
- Dynamic linking
- Signals
- I/O
- Memory management
- Process management
Loader (exec syscall)
- Brings binary file into memory
- Begins execution
- Can take argc/argv and environment variables and pass them along to the process
- Sets up files and I/O
(exec ./hello.exe)
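The shell's `exec` builtin is a thin wrapper over the exec syscall, so it can illustrate the loader's behavior. A minimal sketch; the message text, the `GREETING` variable, and the `child`/`world` arguments are made up for illustration:

```shell
# The child shell replaces itself with echo, so the second command never runs.
out=$(sh -c 'exec echo "now running echo"; echo "never printed"')
echo "$out"     # now running echo

# Arguments and environment variables are handed to the new program.
out2=$(GREETING=hello sh -c 'exec sh -c "echo \$GREETING \$1" child world')
echo "$out2"    # hello world
```

exec does not create a new process; it loads a new program into the current one, which is why nothing after `exec` in the child shell runs.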
The C runtime (crt)
- Defines entrypoint _start
- crt sets up: signals, stdio, args (from loader), exit code (syscall)
- Calls main, passing argc/argv
- Takes main's return value and passes it to the exit syscall
- This is why main has a return value
- The
https://wiki.osdev.org/How_kernel,_compiler,_and_C_library_work_together
https://wiki.osdev.org/Creating_a_C_Library
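The path from main's return value to the exit code can be observed from the shell. A sketch, assuming gcc is installed (the `if` guard skips the example otherwise); the file name `seven.c` and the value 7 are arbitrary:

```shell
if command -v gcc > /dev/null; then
    workdir=$(mktemp -d)
    cd "$workdir"

    # A program whose only job is to return 7 from main.
    cat > seven.c <<'EOF'
int main(void) {
    return 7;   /* the crt passes this value to the exit syscall */
}
EOF

    gcc -o seven.exe seven.c
    status=0
    ./seven.exe || status=$?    # non-zero exit, so capture it via ||
    echo "exit code: $status"   # exit code: 7
fi
```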
Summary
- Compiler toolchain: file.c -> file.exe
- Loader: file.exe -> running process in RAM
Separate compilation
- Multiple .c files
- Compile/assemble each to .o files
- Link into single executable
- (Diagram of several C source files)
- Symbol table
Example
- The symbol tables of .o files have missing (undefined) entries
- Linker resolves missing entries
gcc -c caller.c
gcc -c callee.c
objdump -t caller.o
objdump -t callee.o
gcc -c main.c
objdump -t main.o
gcc -o main.exe main.o caller.o callee.o # link
objdump -t main.exe
readelf -a caller.o
readelf -a main.exe
Header files: organizing multiple .c files
- Header file has no implementation (by convention)
- Just provides function signature for callers
- Compiler/assembler create a table of (missing) symbols
- Linker matches callers/callees
- The preprocessor copies in header declarations
Example
Moving common declarations of external functions to a .h file.
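A sketch of what that move can look like, assuming gcc is installed (the `if` guard skips the example otherwise). The function `callee`, its body, and the header name `callee.h` are hypothetical; the point is only that the declaration lives in one header that every caller includes:

```shell
if command -v gcc > /dev/null; then
    workdir=$(mktemp -d)
    cd "$workdir"

    # callee.h: declaration only, shared by every caller.
    cat > callee.h <<'EOF'
#ifndef CALLEE_H
#define CALLEE_H
int callee(int x);
#endif
EOF

    # callee.c: the one place the function is implemented.
    cat > callee.c <<'EOF'
#include "callee.h"
int callee(int x) { return x + 1; }
EOF

    # main.c: includes the header instead of repeating the declaration.
    cat > main.c <<'EOF'
#include <stdio.h>
#include "callee.h"
int main(void) {
    printf("%d\n", callee(41));
    return 0;
}
EOF

    gcc -c callee.c                     # compile each file separately
    gcc -c main.c
    gcc -o main.exe main.o callee.o     # linker matches the call to its definition
    output=$(./main.exe)
    echo "$output"                      # 42
fi
```

Including the header in callee.c as well lets the compiler check that the declaration and the definition agree.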
Makefiles: automating the build process
- Stores dependencies between files
- e.g., .c files create .o files
- Finds valid order of builds
- Faster recompilation: rebuilds only those .c files that changed
- Useful for complex projects
- We'll use tools that generate C programs
Example
- Conventions: all, clean (phony)
- built-in rules
- (Diagram) dependency graph
SRC := \
    caller.c \
    callee.c \
    main.c
OBJ := $(SRC:%.c=%.o)

.PHONY: all clean

all: main

# $@ is the target, $^ is all prerequisites
main: $(OBJ)
	gcc -o $@ $^

# $< is the first prerequisite
%.o: %.c
	gcc -c -o $@ $<

clean:
	rm -f $(OBJ) main
Wrap up by adding our separate-compilation program to git
Wrap-Up
- We've seen
- Lots of shell commands
- git usage
- gcc and make
- C project organization