Unix Essentials Class

Fundamental UNIX Concepts

UNIX Architecture · UNIX File system · UNIX Processes - ps · Man pages - man · Pagers - more and less

We've already touched on a few fundamental UNIX concepts, particularly the notion of a session. During a session, the user enters commands that the shell interprets and executes. Another name for a shell is command interpreter. The shell is the top layer in a stack of software layers that ultimately switch the millions of transistors in the CPU and memory on and off.

The earliest computers were huge, clunky things that were literally programmed with 0's and 1's - programmers actually told the computer exactly where to stick the 0's and 1's and how to combine them. A big program might be hundreds of lines of 0's and 1's which were completely unreadable to anyone but an expert.

Modern computers are much different - commands and programs are much closer to ordinary languages. Typical programs run into the millions of lines of code.

To make this work at all, let alone keep it manageable, the software on the computer had to be divided into layers, each managing a different level of detail. This division into layers forms a plan called the system architecture that governs everything about the computer. Most users will work with at least two of the four common layers of system architecture; power users will work with all four.

A. UNIX Architecture

The kernel

At the lowest level, the computer has to:

  1. Put 0's and 1's in memory locations initially
  2. Read the 0's and 1's in the memory locations
  3. Calculate using those memory locations
  4. Store results of calculations in new memory locations
  5. Send results of calculations to various hardware devices, like the hard drive
  6. Retrieve 0's and 1's from hardware devices
  7. Manage lots of the above happening very, very fast

There is a specific piece of software, called the kernel, that controls all these low-level actions. On my computer, the kernel is in the /boot directory:

lrwxrwxrwx 1 root root 17 May 16 2000 vmlinuz -> vmlinuz-2.2.12-20
-rw-r--r-- 1 root root 694895 May 17 2000 vmlinuz-2.2.12-20

Usually, the kernel has a special name, like vmlinuz on Linux systems or vmunix on older BSD-derived systems such as SunOS 4 (Solaris itself names its kernel files unix and genunix). In the above case, vmlinuz is actually a symbolic link (like a Windows shortcut) to the "real" kernel, vmlinuz-2.2.12-20.

The point is that there is a specific executable file on the computer that gets loaded into RAM at boot time and literally runs the computer. This program is the "real" UNIX - all the other programs (hundreds, perhaps thousands) are just helper programs that allow us to talk to the computer in something easier to understand than meaningless sequences of 0's and 1's.
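
As a quick check on your own system, the commands below print the name and release of the running kernel and list any vmlinuz entries in /boot. This is only a sketch that assumes a Linux-style layout; the kernel file's name and location differ between UNIX versions, so the second command may find nothing.

```shell
# Print the kernel name and release of the running system.
uname -sr

# List kernel images in /boot, if there are any (the location is system-dependent).
ls -l /boot/vmlinuz* 2>/dev/null || echo "no vmlinuz files found in /boot"
```

If the release printed by uname matches the version in the file name that vmlinuz points to, that file is most likely the kernel that was loaded at boot time.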

Device driver level

The kernel is really rather simple-minded. It just moves data around and does calculations, all in base 2. In fact, the kernel proper does not really talk to all the hardware, just the CPU and memory. Every other piece of hardware has a special piece of software that sends and receives data from the kernel and interprets it for the hardware. These special programs are called device drivers. The collection of all device drivers forms what is called the device driver layer. Most of the device drivers are considered part of the kernel and are in fact built into the kernel.

Application level

User programs form yet another layer of software. These are programs and utilities like word processors, spreadsheets, editors, graphical user interfaces, etc. Usually, these are written in some sort of programming language (like C, C++, Java) that has to be compiled into executable files. There are typically hundreds of these programs on a computer.

Shell level

Tying all of these together is a program that allows the user to talk to the computer and execute these other programs. This program is called the shell, command interpreter or user interface. Without it, there would be no way to navigate the file system, execute other programs, look at files, make files, move files or delete files. The shell is the program that users constantly talk to, and the shell in turn talks to the other programs.

Most of what we will be doing in this class is learning to use the shell.

These four layers (the kernel, the device drivers, user programs and the shell) make up the UNIX system architecture. Ordinary users just use the shell and user programs; system administrators will typically be concerned with all four layers.

[Aside: In Windows, the graphical shell is Windows Explorer (explorer.exe). It has several forms, like the desktop, My Computer, the Explorer, various pop-up windows, but the actions it performs are the same - start and exit programs, manage files, navigate the file system. There is another shell on Windows - the DOS prompt (or the NT version, cmd), which is a command line interface somewhat similar to UNIX shells.]

B. UNIX File system

Files

The UNIX system is organized using a file system. Literally everything in UNIX is a file - all the directories are files, the programs are files, the devices are files. In practice, we divide the files in a UNIX system into categories by type of file:

    1. Ordinary files are just that - files. These might be text files, database files, spreadsheet files, whatever. Executable files are either compiled programs (executable binaries) or text file scripts that call an executable binary program to run their instructions. In either case, UNIX treats them as ordinary files. The difference is that typing the name of an executable file invokes the program and causes it to run. Typing the name of a non-executable ordinary file will just give an error message, typically saying that the shell could not find the command or that permission was denied.
    2. Links are like shortcuts in Windows - they just point to another file. Links are commonly used to allow access to a particular file from different locations in the file system. Links are very small and so are preferred to making extra copies. Links are also used to give a common, standard name to a file that has a different name. Above, I used a link called vmlinuz to refer to my kernel - the operating system expects to find a file of this name as the kernel, but I prefer to give the real kernel a name that says something about it (in this case the version of the kernel) and use a symbolic link so the OS can find it.
    3. Directories are special files that divide the other types of files into more easily managed groups - their purpose is organization.
    4. Device files represent hardware or software constructs that act like hardware.
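
The four file types above can be told apart from the first character of each line of ls -l output. The following is a small sketch, not part of the original class material: it assumes mktemp -d is available for making a scratch directory, and the names ordinary.txt, shortcut and subdir are made up for illustration.

```shell
# Work in a scratch directory so nothing existing is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "some text" > ordinary.txt      # an ordinary file
ln -s ordinary.txt shortcut          # a symbolic link pointing at it
mkdir subdir                         # a directory

# The first character of each ls -l line gives the type:
#   -  ordinary file,  l  symbolic link,  d  directory,  c or b  device file
ls -ld ordinary.txt shortcut subdir /dev/null

# Reading through the link gives the contents of the file it points to.
cat shortcut
```

The link line also shows an arrow (shortcut -> ordinary.txt), just as the vmlinuz link did in the /boot listing earlier.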

We have already been introduced to the root directory and some of the subdirectories of the root directory. We'll come back to the matter of files later on.

Standard Input and Output

Above we considered a program called "cat". We'll use cat as an example program to study the input and output behavior of a typical program. A lot of standard UNIX programs are called filters - they take input from the standard input (by default, the keyboard), do something to it, and write the result to the standard output (by default, the screen). We'll have cat output to a file called "file1" by redirecting the standard output to a file. This is accomplished simply by typing cat, then a greater than sign (>), then the name of the file we want to write:

$ cat > file1
Hello, this is file1. This file is just a single line of text.
$ ls
file1
$ cat file1
Hello, this is file1. This file is just a single line of text.
$

What you can't see here is the ^D (Ctrl-D, which signals end of input) typed on a new line after the text. That caused cat to exit and brought up the second command prompt ($). Then, "ls" was used to look in our directory to see the new file "file1" that was just created.

Then, cat was invoked again, this time with "file1" as an argument, producing the contents of the file "file1". So, the action of "cat" is just to copy its input (the files named as arguments or, if there are none, its standard input) to its standard output.

If that seems a bit unclear, let's try another example. This time, since we already have "file1", let's use it as input for cat (by redirecting again, this time with a "less than" symbol < ).

$ cat < file1
Hello, this is file1. This file is just a single line of text.
$

What happened here? The cat program read the file "file1" instead of keyboard input, and placed its output on the screen (standard output).

Let's try another experiment: this time, let's again use "file1" as redirected input, and make another file, "file2", the redirected output:

$ cat < file1 > file2
$ cat file2
Hello, this is file1. This file is just a single line of text.
$

We then used cat to look at the contents of file2. What is useful to notice here is that we can create files by redirecting the standard output of a program to a file, and that cat is good for looking at the contents of a file.

This might seem a bit confusing, but cat is a very simple program. Let's do another example:

$ cat > file2
Hello, this is file2, another line of text.
$ cat file2
Hello, this is file2, another line of text.
$

What happened here is that I redirected the output of cat to file2, and entered a line of text on the standard input (the keyboard - I just typed it in, hit return, and then hit ^D to exit cat). Then I ran cat again with file2 as the argument, producing the line of text I just entered. What happened to the previous contents of file2? It was overwritten - the new contents replaced the old contents. Running "cat file2" just displayed the contents of file2 by writing it to the standard output.
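
The same experiments can be repeated without typing at the keyboard. The sketch below is an illustration rather than part of the original session: it uses echo to supply the text that was typed by hand above, and it assumes mktemp -d is available so that the files are created in a scratch directory.

```shell
# Work in a scratch directory so existing files are not overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create file1 by redirecting standard output to it.
echo "Hello, this is file1. This file is just a single line of text." > file1

# Redirect both ways: cat reads file1 on standard input and writes file2 on standard output.
cat < file1 > file2
cat file2

# Redirecting to file2 again replaces its previous contents.
echo "Hello, this is file2, another line of text." > file2
cat file2
```

The first cat file2 shows the line copied from file1; the second shows only the new line, because > always starts the target file from empty.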

The null device

There are a lot of devices in the /dev directory - we shall take a look at them later in more detail. Let's look at one of them, a very special device called the null device or /dev/null. This device is empty:

$ cat /dev/null
$

In fact, anything written to it disappears:

$ cat file1
Hello, this is file1. This file is just a single line of text.
$ cat file1 > /dev/null
$ cat /dev/null
$

First, I ran cat on file1 to show its contents. Then, I did it again, this time redirecting the output to /dev/null. You will remember from before that doing that with file2 wrote the contents of file1 to file2. Here, the redirected output was simply discarded - another cat on /dev/null showed no output.

Let's make a new file called file3 by redirecting the output of cat to it, show that it has something in it by doing cat file3, then redirect cat /dev/null to file3:

$ cat > file3
This is file3
$ cat file3
This is file3
$ cat /dev/null > file3
$ cat file3
$

What happened? The file comes up empty! I redirected the contents of an empty file to file3, which replaced (and so erased) its previous contents.

The lesson here is that /dev/null is empty and cannot be filled. It is designed to be an empty device and stay that way. It can be used as a way to write to somewhere and guarantee that the written content disappears, and also as a way to "zero" out files so they are empty.
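
Both uses of /dev/null can be checked in a short script. This is a sketch under the same assumptions as before (a scratch directory from mktemp -d, echo in place of typed input); wc -c, which counts the bytes in its input, is used here only to confirm that the file is empty.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "This is file3" > file3

# Use 1: discard output. Nothing appears on the screen and file3 is unchanged.
cat file3 > /dev/null
cat file3

# Use 2: empty a file by copying the (empty) null device over it.
cat /dev/null > file3

# Count the bytes now in file3; the count is zero.
wc -c < file3
```

Note that file3 still exists after the last step; only its contents are gone.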

C. UNIX Processes - ps

When a UNIX program is running, it occupies RAM (memory) and stays in memory as long as it runs. The memory is reclaimed when the program exits. During the time it runs, the program in memory is called a process. We can look at the programs that are running by using the ps utility:

$ ps
PID TTY TIME CMD
860 pts/1 00:00:00 bash
922 pts/1 00:00:00 ps
$

The output of ps shows two programs running - bash and ps. Each process has a process id or PID. Both are running on the "terminal" pts/1, and neither had used a measurable amount of CPU time when ps took its "snapshot". UNIX keeps a table of running processes called the process table. This table collects the name of the process and useful data about the process such as who is running it, how much CPU time is being consumed and where the process is running (the session or "terminal" or "tty" the process is run from).

Adding some options to the process command shows a more complete picture of what is happening in the computer:

$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 1104 460 ? S May26 0:04 init [3]
root 2 0.0 0.0 0 0 ? SW May26 0:00 [kflushd]
root 3 0.0 0.0 0 0 ? SW May26 0:00 [kupdate]
root 4 0.0 0.0 0 0 ? SW May26 0:00 [kpiod]
root 5 0.0 0.0 0 0 ? SW May26 0:00 [kswapd]
root 6 0.0 0.0 0 0 ? SW< May26 0:00 [mdrecoveryd]
bin 286 0.0 0.1 1196 396 ? S May26 0:00 portmap
root 302 0.0 0.1 1088 464 ? S May26 0:00 /usr/sbin/apmd -p 10
root 355 0.0 0.2 1152 560 ? S May26 0:00 syslogd -m 0
root 366 0.0 0.2 1404 732 ? S May26 0:00 klogd
daemon 382 0.0 0.1 1128 484 ? S May26 0:00 /usr/sbin/atd
root 398 0.0 0.2 1300 600 ? S May26 0:00 crond
root 418 0.0 0.1 1124 484 ? S May26 0:00 inetd
root 434 0.0 0.1 1176 488 ? S May26 0:00 lpd
root 473 0.0 0.4 2104 1108 ? S May26 0:00 sendmail: accepting
root 490 0.0 0.1 1132 444 ? S May26 0:00 gpm -t ps/2
root 506 0.0 0.5 2560 1316 ? S May26 0:00 httpd
nobody 510 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 511 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 512 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 513 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 514 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 515 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 516 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 517 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 518 0.0 0.5 2748 1412 ? S May26 0:00 httpd
nobody 519 0.0 0.5 2748 1412 ? S May26 0:00 httpd
xfs 533 0.0 0.3 1920 996 ? S May26 0:00 xfs -droppriv -daemon
root 573 0.0 0.1 1076 384 tty2 S May26 0:00 /sbin/mingetty tty2
root 574 0.0 0.1 1076 384 tty3 S May26 0:00 /sbin/mingetty tty3
root 575 0.0 0.1 1076 384 tty4 S May26 0:00 /sbin/mingetty tty4
root 576 0.0 0.1 1076 384 tty5 S May26 0:00 /sbin/mingetty tty5
root 577 0.0 0.1 1076 384 tty6 S May26 0:00 /sbin/mingetty tty6
root 700 0.0 0.2 1460 708 ? S May26 0:00 in.telnetd
root 701 0.0 0.4 2216 1116 pts/0 S May26 0:00 login -- student
student 702 0.0 0.3 1728 964 pts/0 S May26 0:00 -bash
root 823 0.0 0.3 2196 1000 tty1 S May26 0:00 login -- root
root 844 0.0 0.3 1764 1000 tty1 S May26 0:00 -bash
root 858 0.0 0.2 1460 708 ? S May26 0:00 in.telnetd
root 859 0.0 0.4 2216 1116 pts/1 S May26 0:00 login -- student
student 860 0.0 0.3 1728 964 pts/1 S May26 0:00 -bash
student 924 0.0 0.3 2496 848 pts/1 R 01:29 0:00 ps aux

From this we can see that the ps command without options shows only the current user's processes on the current terminal (here, student on pts/1; the same user's shell on pts/0 was not listed). Adding the options aux shows all of the processes being run by all users, including root, whether or not they are attached to a terminal. There is a lot going on in the computer even though we can't usually see it.
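
ps can also be asked about one particular process, or for just a few columns of the process table. The sketch below assumes a ps that accepts the POSIX-style options -p, -o and -e (the ps on most Linux and BSD systems does; very small embedded systems may not).

```shell
# $$ expands to the PID of the shell that is running these commands.
echo "this shell's PID is $$"

# Show only that one process, selecting a few columns from the process table.
ps -o pid,tty,time,comm -p $$

# Count the processes of all users: -e selects every process,
# and "-o pid=" prints only the PIDs with no header line.
ps -e -o pid= | wc -l
```

Comparing the count from the last command with the two lines printed by plain ps gives a feel for how much is running out of sight.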

D. Man pages - man

There are hundreds of programs on a typical UNIX system, some of them very complicated. Fortunately, there is almost always documentation for these programs in the form of man (short for manual) pages. Modern systems frequently have other documentation systems as well, but the man pages are the traditional online documentation system. To use the man pages, one types "man <program_name>" like this:

$ man cat
CAT(1) FSF CAT(1)
 
NAME
 
cat - concatenate files and print on the standard output
 
SYNOPSIS
 
cat [OPTION] [FILE]...
 
DESCRIPTION
 
Concatenate FILE(s), or standard input, to standard output.
 
-A, --show-all
equivalent to -vET
 
-b, --number-nonblank
number nonblank output lines
 
-e equivalent to -vE
 
-E, --show-ends
display $ at end of each line
 
-n, --number
number all output lines
 
-s, --squeeze-blank
never more than one single blank line
 
-t equivalent to -vT
 
-T, --show-tabs
display TAB characters as ^I
 
-u (ignored)
 
-v, --show-nonprinting
use ^ and M- notation, except for LFD and TAB
 
--help display this help and exit
 
--version
output version information and exit
With no FILE, or when FILE is -, read standard input.
 
REPORTING BUGS
 
Report bugs to bug-textutils@gnu.org.
 
SEE ALSO
 
The full documentation for cat is maintained as a Texinfo manual. If the info and cat programs are properly installed at your site, the command info cat should give you access to the complete manual.
 
COPYRIGHT
 
Copyright © 1999 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
GNU textutils 2.0 August 1999 1
 
(END)

To get out of the current man page, type "q" - the program exits immediately (no need to hit return).

Suppose you wanted more information on how to use the man pages. There is a man entry for man:

$ man man
 
man(1) man(1)
 
NAME
 
man - format and display the on-line manual pages
manpath - determine user's search path for man pages
 
SYNOPSIS
 
man [-acdfFhkKtwW] [-m system] [-p string] [-C config_file] [-M path] [-P pager] [-S section_list]
[section] name ...
 
DESCRIPTION
 
man formats and displays the on-line manual pages. This version knows about the MANPATH and (MAN)PAGER environment variables, so you can have your own set(s) of personal man pages and choose whatever program you like to display the formatted pages. If section is specified, man only looks in that section of the manual. You may also specify the order to search the sections for entries and which preprocessors to run on the source files via command line options or environment variables. If name contains a / then it is first tried as a filename, so that you can do man ./foo.5 or even man /cd/foo/bar.1.gz.
.
. (omitted)
.
before being printed.
:

(note that much of the screen output has been omitted)

Notice the colon at the bottom of the entry - that is a signal that the man entry was too big to fit in one screen full of text and that there is more info off screen. Hit the spacebar to get another screen full (omitted here).

Suppose you were not sure of the name of a command, but you knew roughly what it should do. The -k option searches the man pages by keyword:

$ man -k edit
ed, red (1) - text editor
edquota (8) - edit user quotas
gEdit (1) - GTK+ based text editor
gnp+ (1) - gnotepad+, a GTK-based notepad/text editor
mcedit (1) - Full featured terminal text editor for Unix-like systems.
netwave_cs (4) - Xircom Creditcard Netwave device driver
pico (1) - simple text editor in the style of the Pine Composer
prompter (1) - prompting editor front-end for nmh
readline (3) - get a line from a user with editing
sed (1) - a Stream EDitor
tcsh (1) - C shell with file name completion and command line editing
tksysv (8) - a runlevel service editor
vim (1) - Vi IMproved, a programmers text editor
vipw, vigr (8) - edit the password or group files
FvwmConsoleC.pl (1) - Command editor for FVWM command input interface
bitmap, bmtoa, atobm (1x) - bitmap editor and converter utilities for the X Window System
editres (1x) - a dynamic resource editor for X Toolkit applications
xedit (1x) - simple text editor for X
(END)

In many systems, the command "apropos" is a synonym for "man -k". This option searches the names and one-line descriptions of the man pages for the argument and lists every page that matches. The match is on any part of a word, which is why "Creditcard" appears in the list above.
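
Here is a small sketch of the two lookup styles side by side. It assumes that man and its keyword database are installed, which is not true of every minimal system, so the commands are wrapped in a check; the variable have_man exists only for this example.

```shell
# Guard: minimal systems may not have man (or its keyword database) installed.
if command -v man >/dev/null 2>&1; then
    have_man=yes
    man -k editor    # keyword search; on many systems "apropos editor" is equivalent
    man -f cat       # one-line description of the page named cat (like "whatis cat")
else
    have_man=no
    echo "man is not installed on this system"
fi
```

Use -k when you know the topic but not the command, and -f when you know the command's name and just want a reminder of what it does.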

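The GNU project maintains the full documentation for its programs as Texinfo manuals rather than man pages - man pages are still included, but they may be less complete (see the SEE ALSO section of the cat man page above). Texinfo documentation is read with the info program, invoked by typing the word info: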

$ info
File: dir Node: Top This is the top of the INFO tree
This (the Directory node) gives a menu of major topics.
Typing "q" exits, "?" lists all Info commands, "d" returns here,
"h" gives a primer for first-timers,
"mEmacs<Return>" visits the Emacs topic, etc.
In Emacs, you can click mouse button 2 on a menu item or cross reference
to select it.
* Menu:
Texinfo documentation system
* Texinfo: (texinfo). The GNU documentation format.
.
. (omitted)
.
* groups: (sh-utils)groups invocation. Print group names a user is in.
-----Info: (dir)Top, 240 lines --Top-------------------------------------------
Welcome to Info version 3.12h. "C-h" for help, "m" for menu item.

The info program is self-documenting. You are welcome to try it yourself. There is a brief online tutorial (type "h" when you get the above screen). We won't go any further than mentioning it here since not all versions of UNIX use it.

E. Pagers - more and less

Try typing the command:

$ ls -l /etc
total 1300
drwxr-xr-x 3 root root 4096 Feb 12 2000 CORBA/
-rw-r--r-- 1 root root 2045 Sep 24 1999 DIR_COLORS
-rw-r--r-- 1 root root 9 May 26 15:27 HOSTNAME
-rw-r--r-- 1 root root 41 May 29 2000 MACHINE.SID
-rw-r--r-- 1 root root 5421 Sep 25 1999 Muttrc
drwxr-xr-x 12 root root 4096 May 16 2000 X11/
.
.
.
-rw-r--r-- 1 root root 1332 Sep 26 1999 up2date.conf
drwxr-xr-x 2 root root 4096 Feb 12 2000 vga/
-rw------- 1 root root 529 May 13 2000 wvdial.conf
-rw-r--r-- 1 root root 361 Feb 12 2000 yp.conf
$

So much information is returned that it flies off the screen faster than you can see it, and only the bottom screen full can be seen. (The "total 1300" line is the number of disk blocks used by the files listed; the listing itself is 138 lines long, most of which have been omitted here for brevity.) This is fixed by using pagers - programs that display only one screen full of information at a time.

Now try writing the output of the same command to a file via

$ ls -l /etc > list
$

Try looking at the contents of list using

$ cat list

The same long output again blows past the screen. Now try

$ more list
total 1300
drwxr-xr-x 3 root root 4096 Feb 12 2000 CORBA/
-rw-r--r-- 1 root root 2045 Sep 24 1999 DIR_COLORS
-rw-r--r-- 1 root root 9 May 26 15:27 HOSTNAME
-rw-r--r-- 1 root root 41 May 29 2000 MACHINE.SID
-rw-r--r-- 1 root root 5421 Sep 25 1999 Muttrc
drwxr-xr-x 12 root root 4096 May 16 2000 X11/
.
.
.
-rw-r--r-- 1 root root 1262 Feb 12 2000 localtime
-rw-r--r-- 1 root root 1180 Sep 22 1999 login.defs
-rw-r--r-- 1 root root 589 Jun 16 1999 logrotate.conf
--More--(52%)

The first part of the output (in fact, 52% of it) is displayed. You know there is more to the file because of the

--More--(52%)

at the bottom of the screen. Hit the spacebar - more of the file is displayed. Keep going until you get back to a command prompt. Instead of the spacebar, you can hit the enter key to scroll through the rest of the file one line at a time.

Now try the same experiments with

$ less list

Notice the similarity to the "more" output. Less is actually a more sophisticated version of more - and its name is a typical UNIX word game ("less is more"). On some systems, there is another pager, "pg", but usually not on Linux. We'll come back to pagers later.
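
Before paging a file it can help to know how long it is. The sketch below rebuilds the list file in a scratch directory and counts its lines with wc -l. Neither wc nor mktemp -d has been covered in this class so far, so treat both as assumptions about your system.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

ls -l /etc > list      # save the long listing, as in the text
wc -l < list           # print how many lines the listing has

# To read it one screen at a time, run one of these at an interactive prompt:
#   more list
#   less list
```

Anything much longer than the height of your screen is a candidate for more or less rather than cat.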