Introduction
First, a credit: this tutorial was inspired and enriched by a tutorial
written by M.Stonebank@surrey.ac.uk.
The majority of bioinformatics software, and scientific software in general,
is developed to run on the UNIX platform. For this reason, it is important
that scientists know how to use UNIX. This workshop is designed to give
you an introduction to some basic concepts about how UNIX works, and commands
that you will use frequently.
There are many tutorials for UNIX which can be found on the Web.
Each tutorial will introduce you to UNIX in its own way. This workshop
will introduce UNIX to you from the perspective of bioinformatics.
While we will probably not use these commands frequently, understanding
how an operating system works, under the hood, is useful for scientists.
Besides, you can add UNIX to your resume!
You are welcome to use your UNIX account and the many UNIX tutorials available online to develop these skills for your own use. The good news is that you can do very fine bioinformatics investigations these days without having to care much about which operating system you are using.
UNIX and Servers
UNIX is a common multi-user operating system. By operating system, we
mean the suite of programs which make the computer work. UNIX is used
by some of the workstations and servers within the school. Other workstations
may be running different operating systems such as Linux or Windows.
A multi-user operating system is one which allows multiple users to interact
with the same computer, typically a server, at the same time. Traditionally, a single-user operating system, such
as most versions of Microsoft Windows you have dealt with, is designed to be used by one person at a time. These definitions are blurring, however.
On X terminals and workstations, the X Window System provides a graphical interface
between the user and UNIX. However, knowledge of UNIX is required for
operations which are not covered by a graphical program, or when no
X Window System is available, for example, in a (secure) shell session.
The UNIX operating system
The UNIX operating system is made up of three parts: the kernel, the
shell and the programs.
The kernel
The kernel of UNIX is the heart of the operating system: it allocates
time and memory to programs and handles the filesystems and communications
in response to system calls.
As an illustration of the way that the shell and the kernel work together,
suppose a user types rm myfile (which has the effect of removing
the file called myfile). The shell searches the filesystem for the file
containing the program rm, and then requests the kernel, through system
calls, to execute the program rm on myfile. When the process rm myfile
has finished running, the shell then returns the UNIX prompt to the user,
indicating that it is waiting for further commands.
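The sequence described above can be tried out safely in a scratch directory. The following is a minimal sketch, assuming a POSIX-style shell; the scratch directory and the variable names (scratch, rm_path, rm_status) are chosen here for illustration only, and the path reported for rm varies between systems.

```shell
# Work in a throwaway directory so nothing important is removed.
# (The scratch directory is an assumption made for this sketch.)
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create an empty file called myfile so there is something to remove.
touch myfile

# Ask the shell where it finds the file containing the rm program.
rm_path=$(command -v rm)
echo "rm program: $rm_path"

# The shell asks the kernel to run rm on myfile, then waits for it to finish.
rm myfile

# $? holds the exit status of the last command; 0 means success.
rm_status=$?
echo "rm exit status: $rm_status"

# Confirm that the file is gone.
if [ -e myfile ]; then
    echo "myfile still exists"
else
    echo "myfile has been removed"
fi
```

Once the last command finishes, the shell is ready for the next one, which is what the returning prompt signals in an interactive session.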
The shell
The shell acts as an interface between the user and the kernel. When
a user logs in, the login program checks the username and password, and
then starts another program called the shell. The shell is a command line
interpreter (CLI). It interprets the commands the user types in and arranges
for them to be carried out. The commands are themselves programs: when
they terminate, the shell gives the user another prompt waiting for the
user to enter another command.
The adept user can customise their own shell, and users can use different
shells on the same machine. Staff and students in the school have
ksh, the Korn shell, by default.
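To see which shell an account uses, something like the following may help. It is a sketch only: it assumes the SHELL environment variable is set (as it usually is after login) and that the system keeps a list of login shells in /etc/shells, which is common but not universal. The variable name login_shell is chosen here for illustration.

```shell
# SHELL normally holds the path of your login shell. It may be unset
# in some environments, so fall back to a placeholder in that case.
login_shell=${SHELL:-unknown}
echo "Login shell: $login_shell"

# Many systems list the available login shells in /etc/shells.
# (Assumption: this file may be missing on some systems.)
if [ -r /etc/shells ]; then
    cat /etc/shells
else
    echo "/etc/shells not found on this system"
fi
```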
The Korn shell is the most advanced of the shells that are "officially"
distributed with UNIX systems. Some of the features of Korn shell include:
- Command-line editing, issuing text instructions (and getting text output)
- Integrated programming features: the functionality of several external
UNIX commands, including test, expr, getopt, and echo, has been integrated
into the shell itself, enabling common programming tasks to be done
more cleanly and without creating extra processes.
- Control structures, especially the select construct, which enables
easy menu generation.
- Debugging primitives that make it possible to write tools that help
programmers debug their shell code.
- Regular expressions, well known to users of UNIX utilities like grep
and awk, have been added to the standard set of filename wildcards and
to the shell variable facility.
- Advanced I/O features, including the ability to do two-way communication
with concurrent processes (coroutines).
- New options and variables that give you more ways to customize your
environment.
- Increased speed of shell code execution.
- Security features that help protect against "Trojan horses"
and other types of break-in schemes.
Files and processes
Everything in UNIX is either a file or a process.
A process is an executing program identified by a unique PID (process
identifier).
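PIDs can be inspected directly from the shell. The sketch below assumes a POSIX-style shell; the variable names shell_pid and child_pid are chosen here for illustration, and the actual numbers printed will differ on every run.

```shell
# $$ expands to the PID of the current shell process.
shell_pid=$$
echo "This shell's PID: $shell_pid"

# Start a program in the background; $! holds the PID of that new process.
sleep 1 &
child_pid=$!
echo "Background sleep PID: $child_pid"

# Wait for the background process to finish before continuing.
wait "$child_pid"
echo "sleep exited with status $?"
```

The shell and the sleep program it started are separate processes, so they are given different PIDs.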
A file is a collection of data. Files are created by users using text
editors, running programs, etc.
Examples of files:
- a document (report, essay etc.)
- the text of a program written in some high-level programming language
- instructions comprehensible directly to the machine and incomprehensible
to a casual user, for example, a collection of binary digits (an executable
or binary file)
- a directory, containing information about its contents, which may
be a mixture of other directories (subdirectories) and ordinary files.
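The kinds of file listed above can be told apart from the command line. This is a minimal sketch, assuming a POSIX-style shell; the scratch directory and the names report.txt and classes are made up for illustration, and the location of the ls program varies between systems.

```shell
# Work in a throwaway directory (an assumption made for this sketch).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# An ordinary file holding text, such as a document.
echo "Draft of my report" > report.txt

# A directory, which can hold ordinary files and other directories.
mkdir classes

# In the long listing, the first character of each line shows the type:
# "-" for an ordinary file, "d" for a directory.
ls -l

# The test command, written [ ], can ask the same questions.
[ -f report.txt ] && echo "report.txt is an ordinary file"
[ -d classes ] && echo "classes is a directory"

# An executable file: the program behind the ls command itself.
ls_path=$(command -v ls)
[ -x "$ls_path" ] && echo "$ls_path is executable"
```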
The Directory Structure
All the files are grouped together in the directory structure. The filesystem
is arranged in a hierarchical structure, like an inverted tree. The top
of the hierarchy is traditionally called root.
In this diagram, the directory /users2/joe contains a subdirectory classes. As you will see, on our systems, most or all of you will be listed in a directory called /home.
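The hierarchy can be explored from the shell. The sketch below rebuilds the example path users2/joe/classes under a scratch directory rather than at the real root, which is an assumption made so that it can be run without special permissions; the variable names parent_name and root_path are likewise illustrative.

```shell
# Build a small copy of the example tree under a scratch directory.
# (On the real system the tree starts at /, not at a scratch location.)
scratch=$(mktemp -d)
mkdir -p "$scratch/users2/joe/classes"

# Move into the subdirectory; pwd prints the full path from the root to here.
cd "$scratch/users2/joe/classes" || exit 1
pwd

# .. names the parent directory, one level closer to the root.
cd ..
parent_name=$(basename "$PWD")
echo "Parent directory: $parent_name"

# / is the root, the top of the hierarchy.
cd /
root_path=$(pwd)
echo "Root: $root_path"
```

Every path printed by pwd starts with /, because every file and directory hangs somewhere below the root.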
Page last modified January 9, 2008