Unix Help - Fundamental Unix Concepts - Hyper-Ad Communications

Unix Essentials Class


Fundamental UNIX Concepts

UNIX Architecture · UNIX File system · UNIX Processes - ps · Man pages - man · Pagers - more and less

We've already touched on a few fundamental UNIX concepts, particularly the notion of a session. During a session, the user enters commands that the shell interprets and executes. Another name for a shell is command interpreter. The shell is actually the top layer in a stack of software layers that ultimately turn the millions of transistors in the CPU and memory on and off.

The earliest computers were huge, clunky things that were literally programmed with 0's and 1's - programmers actually told the computer exactly where to stick the 0's and 1's and how to combine them. A big program might be hundreds of lines of 0's and 1's which were completely unreadable to anyone but an expert.

Modern computers are much different - commands and programs are much closer to ordinary languages. Typical programs run into the millions of lines of code.

To make this work at all, let alone keep it manageable, the software on the computer had to be divided into layers that manage different levels of information. This division into layers forms a plan called the system architecture that governs everything about the computer. Most users will work with at least two of the four common layers of system architecture; power users will work with all four.

A. UNIX Architecture

The kernel At the lowest level, the computer has to:

Put 0's and 1's in memory locations initially
Read the 0's and 1's in the memory locations
Calculate using those memory locations
Store results of calculations in new memory locations
Send results of calculations to various hardware devices, like the hard drive
Retrieve 0's and 1's from hardware devices
Manage lots of the above happening very, very fast

There is a specific piece of software that controls all these low-level actions, called the kernel. On my computer, the kernel is in the /boot directory:

lrwxrwxrwx 1 root root 17 May 16 2000 vmlinuz -> vmlinuz-2.2.12-20

-rw-r--r-- 1 root root 694895 May 17 2000 vmlinuz-2.2.12-20

Usually, the kernel has a special name, like vmlinuz on Linux systems or vmunix on older BSD-derived systems such as SunOS. In the above case, vmlinuz is actually a symbolic link (like a Windows shortcut) to the "real" kernel vmlinuz-2.2.12-20.
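To see how such a link behaves without touching /boot, here is a minimal sketch that builds the same arrangement in a scratch directory. The empty file standing in for the kernel image is purely illustrative, and readlink is assumed to be available (it is on Linux and most modern UNIX systems).

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# An empty stand-in for the real kernel image (illustration only).
touch vmlinuz-2.2.12-20

# Create the symbolic link: ln -s TARGET LINKNAME
ln -s vmlinuz-2.2.12-20 vmlinuz

# ls -l marks a symbolic link with a leading "l" and shows "-> target".
ls -l vmlinuz

# readlink prints just the name the link points to.
target=$(readlink vmlinuz)
echo "$target"
```

The ls -l line should have the same shape as the /boot listing above: a leading "l" in the permissions and an arrow pointing at the versioned file name.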

The point is that there is a specific executable file on the computer that gets loaded into RAM at boot time and literally runs the computer. This program is the "real" UNIX - all the other programs (hundreds, perhaps thousands) are just helper programs that allow us to talk to the computer in something easier to understand than meaningless sequences of 0's and 1's.

Device driver level The kernel is really rather simple-minded. It just moves data around and does calculations, all in base 2. In fact, the kernel proper does not really talk to all the hardware, just the CPU and memory. Every other piece of hardware has a special piece of software that sends and receives data from the kernel and interprets it for the hardware. These special programs are called device drivers. The collection of all device drivers forms what is called the device driver layer. Most of the device drivers are considered part of the kernel and are in fact built into the kernel.

Application level User programs form yet another layer of software. These are programs and utilities like word processors, spreadsheets, editors, graphical user interfaces, etc. Usually, these are written in some sort of programming language (like C, C++, Java) that has to be compiled into executable files. There are typically hundreds of these programs on a computer.

Shell level Tying all of these together is a program that allows the user to talk to the computer and execute these other programs. This program is called the shell, command interpreter or user interface. Without it, there would be no way to navigate the file system, execute other programs, look at files, make files, move files or delete files. The shell is the program the users constantly talk to, and the shell in turn talks to the other programs.

Most of what we will be doing in this class is learning to use the shell.

These four layers, the kernel, the device drivers, user programs and the shell make up the UNIX system architecture. Ordinary users just use the shell and user programs; system administrators will typically be concerned with all four layers.

[Aside: In Windows, the graphical shell is Windows Explorer. It has several forms, like the desktop, My Computer, the Explorer, various pop-up windows, but the actions it performs are the same - start and exit programs, manage files, navigate the file system. There is another shell on Windows - the DOS prompt (or the NT version, cmd), which is a command line interface somewhat similar to UNIX shells.]

B. UNIX File system

Files The UNIX system is organized using a file system. Literally everything in UNIX is a file - all the directories are files, the programs are files, the devices are files. In practice, we divide the files in a UNIX system into categories by type of file:

Ordinary files are just that - files. These might be text files, database files, spreadsheet files, whatever. Executable files are either compiled programs (executable binaries) or text file scripts that call an executable binary program to run their instructions. In either case, UNIX treats them as ordinary files. The difference is that typing the name of an executable file invokes the program and causes it to run. Typing the name of a non-executable ordinary file just produces an error message: typically "Permission denied", or "command not found" if the shell could not find a program of that name in its search path.
Links are like shortcuts in Windows - they just point to another file. Links are commonly used to allow access to a particular file from different locations in the file system. Links are very small and so are preferred to making extra copies. Links are also used to give a common, standard name to a file that has a different name. Above, I used a link called vmlinuz to refer to my kernel - the operating system expects to find a file of this name as the kernel, but I prefer to give the real kernel a name that says something about it (in this case the version of the kernel) and use a symbolic link so the OS can find it.
Directories are special files that divide the other types of files into more easily managed groups - their purpose is organization.
Device files represent hardware or software constructs that act like hardware.
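The shell can tell these categories apart with the test command, written [ ... ]. The sketch below is one possible way to do it; the function name filetype and the scratch file names are made up for this example.

```shell
# Print a one-word description of the type of the path given as $1.
filetype() {
    # Check for a link first: the -d and -f tests follow links to their targets.
    if [ -L "$1" ]; then echo "link"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -c "$1" ] || [ -b "$1" ]; then echo "device"
    elif [ -f "$1" ]; then echo "ordinary"
    else echo "other"
    fi
}

# Build one ordinary file and one link in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/plain"
ln -s plain "$workdir/shortcut"

filetype "$workdir/plain"      # ordinary
filetype "$workdir/shortcut"   # link
filetype "$workdir"            # directory
filetype /dev/null             # device
```

The first character of each line of ls -l output carries the same information: "-" for an ordinary file, "d" for a directory, "l" for a link, and "c" or "b" for a device.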

We have already been introduced to the root directory and some of the subdirectories of the root directory. We'll come back to the matter of files later on.

Standard Input and Output Above we considered a program called "cat". We'll use cat as an example program to study the input and output behavior of a typical program. A lot of standard UNIX programs are called filters - they take input from the standard input (by default, the keyboard), do something to it, and write the result to the standard output (by default, the screen). We'll have cat output to a file called "file1" by redirecting the standard output to a file. This is accomplished simply by typing cat, then a greater than sign (>), then the name of the file we want to write:

$ cat > file1

Hello, this is file1. This file is just a single line of text.

$ ls

file1

$ cat file1

Hello, this is file1. This file is just a single line of text.

What you can't see here is the ^D typed after the line of text. That caused cat to exit and brought up the second command prompt ($). Then, "ls" was used to look in our directory to see the new file "file1" that was just created.

Then, cat was invoked again, this time with "file1" as an argument, producing the contents of the file "file1". So, the action of "cat" is just to write its standard input to its standard output.

If that seems a bit unclear, let's try another example. This time, since we already have "file1", let's use it as input for cat (by redirecting again, this time with a "less than" symbol < ).

$ cat < file1

Hello, this is file1. This file is just a single line of text.

What happened here? The cat program read the file "file1" instead of keyboard input, and placed its output on the screen (standard output).

Let's try another experiment: this time, let's again use "file1" as redirected input, and make another file, "file2", the redirected output:

$ cat < file1 > file2

$ cat file2

Hello, this is file1. This file is just a single line of text.

We then used cat to look at the contents of file2. What is useful to notice here is that we can create files by redirecting the standard output of a program to a file, and that cat is good for looking at the contents of a file.

This might seem a bit confusing, but cat is a very simple program. Let's do another example:

$ cat > file2

Hello, this is file2, another line of text.

$ cat file2

Hello, this is file2, another line of text.

What happened here is that I redirected the output of cat to file2, and entered a line of text on the standard input (the keyboard - I just typed it in, hit return, and then hit ^D to exit cat). Then I ran cat again with file2 as the argument, producing the line of text I just entered. What happened to the previous contents of file2? They were overwritten - the new contents replaced the old contents. Running "cat file2" just displayed the contents of file2 by writing them to the standard output.
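The overwriting behavior is easy to check in a scratch directory. The sketch below uses echo, a command that simply writes its arguments to standard output, so that no typing is needed. It also uses the >> operator, which is not covered in the text above: it appends to a file instead of replacing its contents. The file name notes is arbitrary.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first line"  > notes     # > creates notes
echo "second line" > notes     # > again: the old contents are replaced
cat notes                      # shows only "second line"

echo "third line" >> notes     # >> adds to the end instead of overwriting
cat notes                      # shows "second line" then "third line"

# grep -c . counts the non-empty lines in the file.
lines=$(grep -c . notes)
echo "notes has $lines lines"
```

If a file must not be lost, check for it with ls before redirecting to its name with a single greater than sign.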

The null device There are a lot of devices in the /dev directory - we shall take a look at them later in more detail. Let's look at one of them, a very special device called the null device or /dev/null. This device is empty:

$ cat /dev/null

In fact, anything written to it disappears:

$ cat file1

Hello, this is file1. This file is just a single line of text.

$ cat file1 > /dev/null

$ cat /dev/null

First, I ran cat on file1 to show its contents. Then, I did it again, this time redirecting the output to /dev/null. You will remember from before that doing that with file2 wrote the contents of file1 to file2. Here, the output redirected to /dev/null was simply discarded - another cat on /dev/null showed no output.

Let's make a new file called file3 by redirecting the output of cat to it, show that it has something in it by doing cat file3, then redirect cat /dev/null to file3:

$ cat > file3

This is file3

$ cat file3

This is file3

$ cat /dev/null > file3

$ cat file3

What happened? The file comes up empty! Before running cat, the shell emptied file3 so it could receive the redirected output. Then cat read the empty /dev/null device and wrote nothing, so file3 was left with no contents.

The lesson here is that /dev/null is empty and cannot be filled. It is designed to be an empty device and stay that way. It can be used as a way to write to somewhere and guarantee that the written content disappears, and also as a way to "zero" out files so they are empty.
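Both uses can be sketched in a few lines. The example assumes a scratch directory and uses echo to create the file; the -s test, which is true only when a file exists and is not empty, stands in for looking at the file by eye.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Use 1: throw away output we do not care about. Nothing reaches the screen.
ls -l /etc > /dev/null

# Use 2: "zero" out a file. First give file3 some contents...
echo "This is file3" > file3

# ...then empty it by redirecting the (empty) null device into it.
cat /dev/null > file3

# The file still exists, but -s reports that it holds nothing.
if [ -s file3 ]; then
    echo "file3 has contents"
else
    echo "file3 is empty"
fi
```

Note that zeroing a file this way keeps the file itself, with its name and permissions, in place; only the contents are gone.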

C. UNIX Processes - ps

When a UNIX program is running, it occupies RAM (memory) and stays in memory as long as it runs. The memory is reclaimed when the program exits. During the time it runs, the program in memory is called a process. We can look at the programs that are running by using the ps utility:

$ ps

PID TTY TIME CMD

860 pts/1 00:00:00 bash

922 pts/1 00:00:00 ps

The output of ps shows two programs running - bash and ps. Each process has a process id or PID. Both are running on the "terminal" pts/1, and neither had consumed any measurable CPU time when ps took its "snapshot". UNIX keeps a table of running processes called the process table. This table collects the name of the process and useful data about the process such as who is running it, how much CPU time is being consumed and where the process is running (the session or "terminal" or "tty" the process is run from).

Adding some options to the process command shows a more complete picture of what is happening in the computer:

$ ps aux

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

root 1 0.0 0.1 1104 460 ? S May26 0:04 init [3]

root 2 0.0 0.0 0 0 ? SW May26 0:00 [kflushd]

root 3 0.0 0.0 0 0 ? SW May26 0:00 [kupdate]

root 4 0.0 0.0 0 0 ? SW May26 0:00 [kpiod]

root 5 0.0 0.0 0 0 ? SW May26 0:00 [kswapd]

root 6 0.0 0.0 0 0 ? SW< May26 0:00 [mdrecoveryd]

bin 286 0.0 0.1 1196 396 ? S May26 0:00 portmap

root 302 0.0 0.1 1088 464 ? S May26 0:00 /usr/sbin/apmd -p 10

root 355 0.0 0.2 1152 560 ? S May26 0:00 syslogd -m 0

root 366 0.0 0.2 1404 732 ? S May26 0:00 klogd

daemon 382 0.0 0.1 1128 484 ? S May26 0:00 /usr/sbin/atd

root 398 0.0 0.2 1300 600 ? S May26 0:00 crond

root 418 0.0 0.1 1124 484 ? S May26 0:00 inetd

root 434 0.0 0.1 1176 488 ? S May26 0:00 lpd

root 473 0.0 0.4 2104 1108 ? S May26 0:00 sendmail: accepting

root 490 0.0 0.1 1132 444 ? S May26 0:00 gpm -t ps/2

root 506 0.0 0.5 2560 1316 ? S May26 0:00 httpd

nobody 510 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 511 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 512 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 513 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 514 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 515 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 516 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 517 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 518 0.0 0.5 2748 1412 ? S May26 0:00 httpd

nobody 519 0.0 0.5 2748 1412 ? S May26 0:00 httpd

xfs 533 0.0 0.3 1920 996 ? S May26 0:00 xfs -droppriv -daemon

root 573 0.0 0.1 1076 384 tty2 S May26 0:00 /sbin/mingetty tty2

root 574 0.0 0.1 1076 384 tty3 S May26 0:00 /sbin/mingetty tty3

root 575 0.0 0.1 1076 384 tty4 S May26 0:00 /sbin/mingetty tty4

root 576 0.0 0.1 1076 384 tty5 S May26 0:00 /sbin/mingetty tty5

root 577 0.0 0.1 1076 384 tty6 S May26 0:00 /sbin/mingetty tty6

root 700 0.0 0.2 1460 708 ? S May26 0:00 in.telnetd

root 701 0.0 0.4 2216 1116 pts/0 S May26 0:00 login -- student

student 702 0.0 0.3 1728 964 pts/0 S May26 0:00 -bash

root 823 0.0 0.3 2196 1000 tty1 S May26 0:00 login -- root

root 844 0.0 0.3 1764 1000 tty1 S May26 0:00 -bash

root 858 0.0 0.2 1460 708 ? S May26 0:00 in.telnetd

root 859 0.0 0.4 2216 1116 pts/1 S May26 0:00 login -- student

student 860 0.0 0.3 1728 964 pts/1 S May26 0:00 -bash

student 924 0.0 0.3 2496 848 pts/1 R 01:29 0:00 ps aux

From this we can see that the ps command without options shows only the processes that the current user (here, student) is running from the current terminal. Adding the options aux shows all of the processes being run by all users, including root. There is a lot going on in the computer even though we can't usually see it.
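Between these two extremes, ps can be asked about one particular process and told which columns to print. The sketch below assumes a ps that accepts the standard -p and -o options (the common Linux and BSD versions do); the variable name parent is arbitrary.

```shell
# $$ expands to the PID of the current shell.
echo "This shell is process $$"

# -p selects a process by PID; -o chooses the columns to print.
ps -p $$ -o pid,ppid,comm

# An "=" after a column name suppresses the header, leaving a bare value.
# tr -d ' ' strips the padding spaces ps puts around the number.
parent=$(ps -p $$ -o ppid= | tr -d ' ')
echo "It was started by process $parent"
```

The ppid column is the PID of the process that started this one. In the long listing above, a login shell would report the login process that launched it.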

D. Man pages - man

There are hundreds of programs on a typical UNIX system, some of them very complicated. Fortunately, there is almost always documentation for these programs in the form of man (short for manual) pages. Modern systems frequently have other documentation systems as well, but the man pages are the traditional online documentation system. To use the man pages, one types "man <program_name>" like this:

$ man cat

CAT(1) FSF CAT(1)
NAME
    cat - concatenate files and print on the standard output
SYNOPSIS
    cat [OPTION] [FILE]...
DESCRIPTION
    Concatenate FILE(s), or standard input, to standard output.
    -A, --show-all          equivalent to -vET
    -b, --number-nonblank   number nonblank output lines
    -e                      equivalent to -vE
    -E, --show-ends         display $ at end of each line
    -n, --number            number all output lines
    -s, --squeeze-blank     never more than one single blank line
    -t                      equivalent to -vT
    -T, --show-tabs         display TAB characters as ^I
    -u                      (ignored)
    -v, --show-nonprinting  use ^ and M- notation, except for LFD and TAB
    --help                  display this help and exit
    --version               output version information and exit
    With no FILE, or when FILE is -, read standard input.
REPORTING BUGS
    Report bugs to bug-textutils@gnu.org.
SEE ALSO
    The full documentation for cat is maintained as a Texinfo manual. If the info and cat programs are properly installed at your site, the command info cat should give you access to the complete manual.
COPYRIGHT
    Copyright © 1999 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
GNU textutils 2.0 August 1999 1
(END)

To get out of the current man page, type "q" - the program exits immediately (no need to hit return).

Suppose you wanted more information on how to use the man pages. There is a man entry for man:

$ man man

man(1) man(1)
NAME
    man - format and display the on-line manual pages
    manpath - determine user's search path for man pages
SYNOPSIS
    man [-acdfFhkKtwW] [-m system] [-p string] [-C config_file] [-M path] [-P pager] [-S section_list] [section] name ...
DESCRIPTION
    man formats and displays the on-line manual pages. This version knows about the MANPATH and (MAN)PAGER environment variables, so you can have your own set(s) of personal man pages and choose whatever program you like to display the formatted pages. If section is specified, man only looks in that section of the manual. You may also specify the order to search the sections for entries and which preprocessors to run on the source files via command line options or environment variables. If name contains a / then it is first tried as a filename, so that you can do man ./foo.5 or even man /cd/foo/bar.1.gz.
    .
    . (omitted)
    .
    before being printed.
:

(note that much of the screen output has been omitted)

Notice the colon at the bottom of the entry - that is a signal that the man entry was too big to fit in one screen full of text and that there is more info off screen. Hit the spacebar to get another screen full (omitted here).

Suppose you are not quite sure what a command is called. The -k option searches the man page names and short descriptions for a keyword:

$ man -k edit

ed, red (1) - text editor

edquota (8) - edit user quotas

gEdit (1) - GTK+ based text editor

gnp+ (1) - gnotepad+, a GTK-based notepad/text editor

mcedit (1) - Full featured terminal text editor for Unix-like systems.

netwave_cs (4) - Xircom Creditcard Netwave device driver

pico (1) - simple text editor in the style of the Pine Composer

prompter (1) - prompting editor front-end for nmh

readline (3) - get a line from a user with editing

sed (1) - a Stream EDitor

tcsh (1) - C shell with file name completion and command line editing

tksysv (8) - a runlevel service editor

vim (1) - Vi IMproved, a programmers text editor

vipw, vigr (8) - edit the password or group files

FvwmConsoleC.pl (1) - Command editor for FVWM command input interface

bitmap, bmtoa, atobm (1x) - bitmap editor and converter utilities for the X Window System

editres (1x) - a dynamic resource editor for X Toolkit applications

xedit (1x) - simple text editor for X

(END)

In many systems, the command "apropos" is a synonym for "man -k". This option to the man pages searches for all references to the argument and outputs a list of appropriate man pages that might have something to do with the search topic.
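A keyword search like this often returns more than is wanted. One common way to narrow it, sketched here, is to pass the list through grep, which keeps only the lines containing a given string. The vertical bar (a pipe, not otherwise covered in this section) sends the output of one program to the input of the next. To keep the example repeatable, it filters four lines copied from the listing above instead of a live man -k run.

```shell
# Four lines copied from the "man -k edit" listing above.
sample='ed, red (1) - text editor
edquota (8) - edit user quotas
readline (3) - get a line from a user with editing
sed (1) - a Stream EDitor'

# Keep only the entries in section 1 of the manual (ordinary user commands).
user_cmds=$(printf '%s\n' "$sample" | grep '(1)')
printf '%s\n' "$user_cmds"
```

On a live system the equivalent would be man -k edit | grep '(1)'. The number in parentheses is the manual section; the listing above shows 1, 3, 4 and 8, covering commands, library routines, devices and administration tools respectively.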

The GNU project treats man pages as secondary - they are still included with GNU programs, but the complete, primary documentation is written in the Texinfo format. It is read with the info program, invoked by typing the word info:

$ info

File: dir Node: Top This is the top of the INFO tree

This (the Directory node) gives a menu of major topics.

Typing "q" exits, "?" lists all Info commands, "d" returns here,

"h" gives a primer for first-timers,

"mEmacs<Return>" visits the Emacs topic, etc.

In Emacs, you can click mouse button 2 on a menu item or cross reference

to select it.

* Menu:

Texinfo documentation system

* Texinfo: (texinfo). The GNU documentation format.

. (omitted)

* groups: (sh-utils)groups invocation. Print group names a user is in.

-----Info: (dir)Top, 240 lines --Top-------------------------------------------

Welcome to Info version 3.12h. "C-h" for help, "m" for menu item.

The info program is self-documenting. You are welcome to try it yourself. There is a brief online tutorial (type "h" when you get the above screen). We won't go any further than mentioning it here since not all versions of UNIX use it.

E. Pagers - more and less

Try typing the command:

$ ls -l /etc

total 1300

drwxr-xr-x 3 root root 4096 Feb 12 2000 CORBA/

-rw-r--r-- 1 root root 2045 Sep 24 1999 DIR_COLORS

-rw-r--r-- 1 root root 9 May 26 15:27 HOSTNAME

-rw-r--r-- 1 root root 41 May 29 2000 MACHINE.SID

-rw-r--r-- 1 root root 5421 Sep 25 1999 Muttrc

drwxr-xr-x 12 root root 4096 May 16 2000 X11/

-rw-r--r-- 1 root root 1332 Sep 26 1999 up2date.conf

drwxr-xr-x 2 root root 4096 Feb 12 2000 vga/

-rw------- 1 root root 529 May 13 2000 wvdial.conf

-rw-r--r-- 1 root root 361 Feb 12 2000 yp.conf

There is so much output (the listing runs to 138 lines, most omitted here for brevity; the "total 1300" is the number of disk blocks the listed files occupy) that it scrolls off the screen faster than you can read it. Only the last screenful can be seen. This is fixed by using pagers - programs that display only one screenful of information at a time.

Now try writing the output of the same command to a file via

$ ls -l /etc > list

Try looking at the contents of list using

$ cat list

The same long output again blows past the screen. Now try

$ more list

total 1300

drwxr-xr-x 3 root root 4096 Feb 12 2000 CORBA/

-rw-r--r-- 1 root root 2045 Sep 24 1999 DIR_COLORS

-rw-r--r-- 1 root root 9 May 26 15:27 HOSTNAME

-rw-r--r-- 1 root root 41 May 29 2000 MACHINE.SID

-rw-r--r-- 1 root root 5421 Sep 25 1999 Muttrc

drwxr-xr-x 12 root root 4096 May 16 2000 X11/

-rw-r--r-- 1 root root 1262 Feb 12 2000 localtime

-rw-r--r-- 1 root root 1180 Sep 22 1999 login.defs

-rw-r--r-- 1 root root 589 Jun 16 1999 logrotate.conf

--More--(52%)

the first part of the output (in fact, 52% of it) is displayed. You know there is more to the file because of the

--More--(52%)

at the bottom of the screen. Hit the spacebar - more of the file is displayed. Keep going until you get back to a command prompt. Instead of the spacebar, you can hit the enter key to scroll through the rest of the file one line at a time.

Now try the same experiments with

$ less list

Notice the similarity to the "more" output. Less is actually a more sophisticated version of more - and its name is a typical UNIX word game. Some systems have another pager, "pg", but it is often missing on Linux. We'll come back to pagers later.
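A pager does not need an intermediate file like list. The sketch below feeds the output of ls straight into a pager with a pipe (the vertical bar, which this section has not otherwise covered). The function name page is made up for this example. PAGER is the environment variable mentioned in the man page for man; when it is unset, the sketch falls back to more.

```shell
# Show standard input one screenful at a time when writing to a terminal;
# behave like cat when the output is redirected to a file or another program.
page() {
    if [ -t 1 ]; then
        # Standard output is a terminal: hand over to the user's pager.
        "${PAGER:-more}"
    else
        # Standard output is not a terminal: just copy input to output.
        cat
    fi
}

# Same listing as before, without creating the file "list" first.
ls -l /etc | page
```

Typed at a prompt, the last line behaves like more list. The -t 1 check means the same function can be used in a redirection such as ls -l /etc | page > list without the pager stopping to wait for a keypress.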