Introduction to Linux
and PC Cluster
October 5, 2010
Morris Law, IT Coordinator,
Science Faculty, Hong Kong
Baptist University
Outline - Linux
Introduction to Linux
History of UNIX and Linux
Login, logout and changing the password
Basic Linux commands
Linux hierarchical file system
Linux shell environment
Editors: vi, pico, emacs, joe, nano
Basic shell scripts
Compiling, linking and running C, C++ and Fortran programs
Foreground and background jobs
File transfer from other PCs on different platforms
Linux distributions
Outline – PC Cluster
Introduction to PC clusters
What is a PC cluster
The different kinds of PC clusters
High Performance Computing (HPC) cluster vs Single System Image (SSI) cluster
How to build your own PC cluster
Introduction to the existing HPC clusters in the Faculty of Science, HKBU
Introduction to UNIX/Linux







UNIX and Linux are multi-tasking, multi-user operating systems.
UNIX originated in 1969 as UNICS, which grew out of Bell Labs' experience with MULTICS.
They provide a time-sharing environment.
UNIX/Linux commands are compact and reusable.
The hierarchical file system has an easy-to-manage file permission scheme.
In 1991, Linus Torvalds released the first version of the Linux kernel for PCs.
Linux is an open source system. It has grown into a powerful and competitive operating system on PCs, Macs and even some brand-name workstations.
History of UNIX / Linux
1969  Unics, by Ken Thompson at Bell Laboratories, runs on a Digital Equipment PDP-7. Multics was developed by Bell Labs, MIT and General Electric.
1970  Unics moved to the PDP-11/20. Ritchie designed and wrote the first C compiler for UNIX.
1973  Ritchie and Thompson rewrote the UNIX kernel in C.
1975  Sixth Edition (V6) was released.
1978  The first version of BSD was built by Bill Joy at the University of California, Berkeley (UCB).
1979  Seventh Edition (V7) was released and implemented on the DEC PDP-11, the Interdata 8/32 and the VAX. The first VAX version of BSD (3BSD) was released.
1980  Bill Joy ported the 32V version of UNIX to DEC's VAX machine. 4BSD was released.
1981  4.1BSD was released.
1983  System V, developed by AT&T and based on V7, was first released. 4.2BSD was released.
1984  AT&T started to market UNIX hardware and software. System V Release 2 was released. X was developed by MIT as part of Project Athena.
1986  System V Release 3 was released.
1987  4.3BSD was released.
1988  BSD Networking Release 1 was released. The X Consortium was formed to formulate a generally accepted standard for X.
1989  System V Release 4 (SVR4), largely written by Sun Microsystems, included many BSD features.
1990  AT&T established UNIX System Laboratories (USL) to market System V and to handle licensing and further development.
History of UNIX/Linux
1991  BSD Networking Release 2 led to the development of 386BSD.
1991  Linus Torvalds released Linux version 0.02.
1993  USL was acquired by Novell. Novell gave the UNIX trademark to X/Open and added NetWare support to System V.
1993  Slackware, the oldest Linux distribution still maintained, was first released. The Debian project was established.
1994  Linux kernel version 1.0 was released. Red Hat and SUSE published version 1.0 of their Linux distributions.
1995  Linux was ported to the DEC Alpha and the Sun SPARC.
1996  Linux kernel 2.0 was released, with support for multiple CPUs.
1998  Major companies such as IBM, Compaq and Oracle announced their support for Linux. The KDE graphical environment released version 1.0.
1999  The GNOME graphical environment released version 1.0.
2003  Linux kernel 2.6 was released on 18 December 2003.
2004  The XFree86 team split up and joined the existing X Window standards body to form the X.Org Foundation.
2005  The openSUSE project began as a free distribution from Novell's community. The OpenOffice.org project introduced version 2.0.
What is Linux





While Linus Torvalds was a student at the University of Helsinki, he took up Minix, a small UNIX-like system, as a hobby and decided to develop a system that exceeded the Minix standards. He began his work in 1991, when he released version 0.02, and worked steadily until 1994, when version 1.0 of the Linux kernel was released.
The current full-featured version is 2.6 (released 18 December 2003) and development continues.
Linus developed only the Linux kernel, so GNU software played an important role in making Linux the popular operating system it is today. The GNU project was started in 1984 by Richard Stallman to develop free software. The decision to develop Linux under the GNU General Public License accelerated the growth of the GNU project after Linux was released in 1991. Some GNU software is even developed on Linux first and ported to other platforms later.
Over the same period, the Internet grew and became a solid foundation for collaborative work by volunteers all over the world.
Many Linux distributions have been released for different hardware, such as PCs, PowerPC, Macintosh and even brand-name UNIX workstations. Although the Linux kernel is free and open source, these distributions may not be free, since the packaged software may include commercial software.
Login and Logout




To log in to a UNIX/Linux system, you first have to find a terminal. There are two kinds of terminal: ASCII terminals and graphical terminals.
An ASCII terminal supports command-line input only, while a graphical terminal accepts input from both mouse and keyboard and also supports graphical display.
On an ASCII terminal, you will see a login prompt like the following.
Fedora release 13 (Goddard)
kernel 2.6.33.3-85.fc13.x86_64 on an x86_64
(tty3)
cf8200-07 login:
Basic Linux commands –
working with files & directories
Command  Example          Description
cat      cat f1 f2        display the contents of files f1 and f2
cd       cd $HOME         change to the home directory
cp       cp f1 f2 dir1    copy files f1 and f2 into directory dir1
ls       ls -la           list all files (including hidden ones) in long format
mkdir    mkdir abc        make a new directory abc
more     more a1 a2       display files a1 and a2 one page at a time
mv       mv f1 dir1       move file f1 into directory dir1 (mv also renames files)
pwd      pwd              display the current working directory
rm       rm -rf lab1      delete lab1 and everything in it, without confirmation
rmdir    rmdir lab1       delete the empty directory lab1
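The commands above can be tried together in one short session. The following is only a sketch: it works inside a throwaway directory created with mktemp, and the names lab1, dir1, f1 and f2 are illustrative, not files that exist on any particular system.

```shell
# Work inside a temporary directory so nothing in the home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir lab1                 # make a new directory
cd lab1 || exit 1
echo "first file"  > f1    # create two small files to work with
echo "second file" > f2
mkdir dir1
cp f1 f2 dir1              # copy both files into dir1
mv f1 f1.old               # rename f1 within the same directory
cat dir1/f1 dir1/f2        # display the copied files
listing=$(ls dir1)         # names of the files now in dir1
pwd                        # show where we are
ls -la                     # long listing, hidden entries included

cd "$workdir" || exit 1
rm -rf lab1                # delete lab1 and everything inside it
```

Note that rm -rf asks no questions, so it is worth checking the current directory with pwd before using it.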
Basic Linux commands –
working in the shell (1/2)
Command   Example                      Description
cal       cal 11 2010                  display the calendar for November 2010
compress  compress file1               compress file1 into file1.Z
date      date                         display the current date and time
df        df                           report the disk space used on each file system
diff      diff f1 f2                   compare the text of two files
du        du                           summarize the disk usage under the current directory
find      find ./ -name .cshrc -print  search from the current directory for the file .cshrc and print its path
grep      grep student *               search all files in the current directory for the word student
history   history 50                   list the last 50 commands stored by the shell
kill      kill -9 2036                 terminate the process with PID 2036
Basic Linux commands –
working in the shell (2/2)
Command     Example               Description
logout      logout                leave the system
lpr         lpr -h f1 f2          print f1 and f2 without a header page
man         man tar               display the on-line manual page, e.g. for tar
nohup       nohup matlab < a &    run matlab with input file a so that it keeps running after logout
ps          ps -ef                list all processes running on the system
sort        sort -r -n studno     sort the file studno in reverse numerical order
tar         tar cvf abc.tar abc/  create the archive file abc.tar from directory abc
uncompress  uncompress file1.Z    restore file1 (the opposite of compress)
wc          wc -l f1              count the number of lines in f1
who         who                   show who is logged in
whoami      whoami                show your own user name
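A few of these commands can be tried on a small test file. This is a sketch only: studno here is a made-up file of student numbers, one per line, created in a temporary directory.

```shell
# Build a small sample file in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 1003 1001 1010 1002 > studno

lines=$(wc -l < studno)          # count the lines in studno
sorted=$(sort -r -n studno)      # reverse numerical order, largest first
matches=$(grep -c '100' studno)  # count the lines that contain 100

echo "number of lines: $lines"
echo "sorted: $sorted"
echo "lines containing 100: $matches"
date                             # current date and time
```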
Linux file system


Linux is a file-oriented system. In Linux, files can be regular files, directories or special files such as devices and sockets.
The directories form a hierarchical structure that resembles an inverted tree.
/ (root)
    usr
    lib
    users
        staff
            guest
                gu09    (/users/staff/guest/gu09)
        student
        visitor
    tmp
    bin
    var
    dev
        null            (/dev/null)
    sbin
File




Files are identified by their file names, which can be up to 255 characters long.
Hidden files are files whose names begin with a dot (.).
Each file in UNIX/Linux has its own ownership and permissions, which can be shown by listing the directory contents in long format (ls -l).
The following shows a file, stafflist, 34 bytes in size, which was last modified on 19 September 1997. It is owned by a user called morris, and its group is dean (staff of the Dean's Office).
-rw-rw-r-- 1 morris dean 34 Sep 19 1997 stafflist
The ownership can be changed with the commands chown and chgrp:
chown cwyeung stafflist; chgrp math_stf stafflist
The first field in the listing above represents the permission bits of the file.
The first character shows its kind: `d' represents a directory, `-' represents a regular file.
The remaining nine characters form three groups showing the user, group and other permissions respectively. Each group can have read (r), write (w) and/or execute (x) permission bits. A `-' denies the corresponding permission.
In the listing above, stafflist is a regular file which can be updated (rw- in the user and group bits) by morris and by members of the dean group. It can be read by other users (r-- in the other bits). The file cannot be executed by anybody, since a `-' is found in each execute bit.
The permission bits can be changed with chmod, in one of two ways.
Use the u, g, o and a flags with + or - to add or remove permissions:
chmod g-rw,o-r stafflist    deny rw permission to users in the same group and r permission to other users
chmod a+x stafflist         add execute permission for all users
Use three octal digits, each calculated by adding 4 for `r', 2 for `w' and 1 for `x':
chmod 700 stafflist         give the owner rwx permission and deny all permissions to group and other users
Use ls -l to check the result.
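The two chmod methods can be compared on a scratch copy of stafflist. The sketch below assumes GNU stat (as found on Linux), whose -c '%a' and -c '%A' options print the permission bits in octal and in ls -l form; the file is created in a temporary directory, not taken from a real system.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "staff data" > stafflist

chmod 664 stafflist                  # rw-rw-r--, the same bits as in the listing
before=$(stat -c '%a' stafflist)

chmod g-rw,o-r stafflist             # symbolic method: leaves rw-------
symbolic=$(stat -c '%a' stafflist)

chmod 700 stafflist                  # octal method: 4+2+1 for the owner, 0 for group and other
octal=$(stat -c '%A' stafflist)

echo "before=$before after symbolic=$symbolic after octal=$octal"
ls -l stafflist                      # check the result
```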
Path





To locate a file, use either its absolute path or a relative path.
An absolute path describes the location starting from root (/).
    /users/staff/guest/gu01/sampledir/sample.txt
A relative path describes the location starting from the current working directory (.).
    sampledir/sample.txt refers to the same file when gu01's current working directory is /users/staff/guest/gu01.
Use pwd to find the current working directory.
In a path,
    the current directory is written as `.',
    the parent directory is written as `..',
    the home directory is written as `~' or as the environment variable `$HOME'.
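As a small sketch of these rules, the script below reads one file through an absolute path, a relative path, and a path that uses `.' and `..'. The directory layout is recreated under a temporary directory (held in the variable base, an assumption of this example) rather than under /users.

```shell
# Recreate gu01/sampledir/sample.txt under a temporary base directory.
base=$(mktemp -d)
mkdir -p "$base/gu01/sampledir"
echo "sample" > "$base/gu01/sampledir/sample.txt"

cd "$base/gu01" || exit 1
via_absolute=$(cat "$base/gu01/sampledir/sample.txt")   # absolute path
via_relative=$(cat sampledir/sample.txt)                # relative to the current directory
via_dots=$(cat ./sampledir/../sampledir/sample.txt)     # same file through . and ..

cd sampledir || exit 1
cd ..                      # back up to the parent directory
here=$(pwd)                # current working directory
echo "now in $here; home directory is $HOME"
```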
Linux shell environment


The shell is the front end through which users interact with the Linux kernel.
Commands can be typed at the shell prompt to
manipulate files (copying, renaming and deleting),
start a text editor, or
compile and run a program, etc.
Different shells can be found in Linux. The most common shells are
Bourne Again shell (bash),
Bourne shell (sh, old and standard),
Korn shell (ksh, the default on some UNIX systems),
C shell (csh, with C-like commands).
These shells support foreground and background processes, pipes, filters and other standard features of Linux. Besides handling Linux commands, they can execute batch files called shell scripts.
The default prompt of the Bourne Again, Bourne and Korn shells is ($), and that of the C shell is (%).
A typical command line has the following syntax,
command [-options] arg1 arg2 arg3 ...
where arg1, arg2, arg3, etc. are arguments whose meaning depends on the command.
The shell locates the command as follows.
Built-in commands are interpreted directly by the shell.
If the command contains a path, the shell looks for the command only at that path.
If no path is given, the shell searches the directories listed in the search path ($PATH) for the command.
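The lookup rules can be observed with the standard command -v utility, which reports how the shell would resolve a name. This is a sketch under the assumption that ls is an external program on the system (true on typical Linux installations); /nonexistent is a deliberately empty search path used only for illustration.

```shell
builtin_cd=$(command -v cd)      # a built-in: only the name is printed
ls_path=$(command -v ls)         # found through $PATH: the full path is printed
echo "cd resolves to: $builtin_cd"
echo "ls resolves to: $ls_path"

"$ls_path" / > /dev/null         # a command given with a path is run without searching $PATH
with_path_status=$?

# With an empty search path, the same name can no longer be found.
( PATH=/nonexistent; command -v ls > /dev/null )
without_path_status=$?
echo "status with explicit path: $with_path_status, with empty PATH: $without_path_status"
```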
Linux editor

The most frequently used program in Linux is an editor. Choosing an editor that suits your needs is crucial for most program developers.
Common editors in Linux are
vi (the standard Linux full-screen editor)
emacs (macro rich)
pico/nano (command-driven full-screen editors)
joe (a WordStar-like editor)
Since vi is installed on all UNIX systems, UNIX experts learn vi.
Emacs is rich in macros for formatting text, which makes it a good choice for program developers writing code in different programming languages.
Pico, nano and joe support full-screen and cursor editing. They are good for novices.
X Window editors support window and mouse editing. XEmacs and gedit are two examples.
Shell script examples (1/3)
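A bash shell script (CheckTemp.sh) that reports a high
temperature, given a temperature as input.
#!/bin/bash
# Usage: ./CheckTemp.sh TEMPERATURE   (an integer)
high=33    # threshold for the high temperature signal
if [ "$1" -ge "$high" ]
then
    echo "*** High temperature signal!"
else
    echo "Normal temperature!"
fi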

Shell script examples (2/3)
A bash script (hosts.sh) that generates the hostnames
and IP addresses for 256 nodes in a cluster
#!/bin/bash
for i in $(seq 0 255)
do
    k=$(( i / 16 + 1 ))    # third octet: one value per group of 16 nodes
    l=$(( i % 16 + 1 ))    # fourth octet: position within the group
    echo "compute-0-$i 10.1.$k.$l"
done
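The address arithmetic in hosts.sh can be checked on a few node numbers. The helper node_ip below is a hypothetical function written for this sketch, not part of the original script; it applies the same quotient and remainder rule.

```shell
# node_ip N: print the address that hosts.sh assigns to node N.
node_ip() {
    i=$1
    echo "10.1.$(( i / 16 + 1 )).$(( i % 16 + 1 ))"
}

first=$(node_ip 0)      # node 0 is the first address of group 1
second_group=$(node_ip 16)   # node 16 starts group 2
last=$(node_ip 255)     # node 255 is the last address of group 16
echo "$first $second_group $last"
```

Each group of 16 consecutive nodes shares a third octet, so 256 nodes occupy 16 groups of 16 addresses.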

Shell script examples (3/3)

A csh script (fingerall.sh) that lists the finger
information of all users on the Linux workstation
#!/bin/csh
# Collect every user name (the first field of /etc/passwd)
set username = `cat /etc/passwd | awk -F':' '{print $1}'`
foreach i ( $username )
    echo $i
    finger $i
end
Compiling, linking and running C, C++,
Fortran programs

Compiling C programs
cc [-o a.exe] a.c
Without the -o option, the executable file is named a.out.
Compiling Fortran programs
f77 [-o t1] a.f
The name of the executable can be set freely with -o.
Compiling C++ programs
g++ [-o t1] a.C
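The C case can be tried end to end with a one-line program. This is a sketch that assumes a C compiler is installed under the name cc; hello.c and the executable name hello are made up for the example, and the script skips the compile step if no compiler is found.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a minimal C source file.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello from C"); return 0; }
EOF

if command -v cc > /dev/null 2>&1; then
    cc hello.c             # no -o option: the executable is named a.out
    cc -o hello hello.c    # -o sets the executable name
    output=$(./hello)      # run the program from the current directory
    echo "$output"
else
    output="(no C compiler found)"
    echo "$output"
fi
```

The program is started as ./hello because the current directory is normally not part of the search path.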
Background Jobs



A program with a long running time should be placed in the background.
UNIX/Linux lets a program keep running in the background after you log out when it is started with the nohup command.
Start the program with nohup in front and an `&' at the end.

nohup abc &
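A slightly fuller sketch of the same pattern follows. A short sleep stands in for the long-running program abc, and the output is sent to a log file; the variable names logfile and pid are choices made for this example.

```shell
logfile=$(mktemp)

# Start the job in the background; output goes to the log file instead of the terminal.
nohup sh -c 'sleep 1; echo finished' > "$logfile" 2>&1 < /dev/null &
pid=$!                               # process ID of the background job
echo "started background job with PID $pid"

# Other work can be done here while the job runs.

wait "$pid"                          # block until the background job ends
status=$?
echo "job exited with status $status"
cat "$logfile"
```

The saved PID can also be passed to ps or kill while the job is still running.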
File transfer from other PCs on
different platforms

Files can be transferred between MS Windows and UNIX by running a secure file transfer program, such as WinSCP, on Windows.
Install and run WinSCP, downloadable from winscp.net.
Open a session to the host with your username and password.
The left pane lists your Windows desktop and the right pane lists your Linux files and directories.
Drag and drop files or directories between the panes to transfer them.
Assorted Linux distributions














SuSE (commercially supported, with the open source variant openSUSE)
Red Hat (commercially supported)
Caldera OpenLinux (SCO OpenServer)
Turbolinux (Japanese, supports HA)
Red Flag (Chinese based)
Xandros (commercial, suited to netbooks and handheld devices)
Slackware (earliest distribution still maintained)
Debian (first community-based Linux)
Mandriva (derived from Mandrake, good desktop interface)
Ubuntu (sponsored by Canonical, whose founder is from South Africa)
Gentoo Linux (highly optimized)
Fedora (open source variant sponsored by Red Hat)
CentOS (enterprise level, community supported)
Knoppix (live CD/DVD)
What is a PC cluster?

An ensemble of networked, stand-alone, commodity off-the-shelf computers used together to solve a given problem.
Different kinds of PC cluster



High Performance Computing cluster (Beowulf cluster)
Load balancing cluster
High availability cluster
High Performance Computing
Cluster (Beowulf)
Started in 1994
Donald Becker of NASA assembled the first Beowulf cluster from 16 DX4 PCs connected by 10 Mb/s Ethernet
Also called a Beowulf cluster
Built from commodity off-the-shelf hardware
Applications include data mining, simulations, parallel processing, weather modelling, computer graphics rendering, etc.
Examples of Beowulf cluster

Scyld Cluster O.S., originated by Donald Becker
http://www.scyld.com
ROCKS from NPACI
http://www.rocksclusters.org
OSCAR from the Open Cluster Group
http://oscar.sourceforge.net
OpenSCE from Thailand
http://www.opensce.org
SCore from the PC Cluster Consortium, Japan
http://www.pccluster.org/
Load Balancing Cluster



A PC cluster that spreads the incoming load across its nodes
Commonly used for busy FTP and web servers with a large client base
A large number of nodes share the load
High Availability Cluster




Avoids downtime of services
Avoids a single point of failure
Always built with redundancy
Almost all load balancing clusters have HA capability
Examples of Load Balancing
and High Availability Cluster

Red Hat Cluster Suite
http://www.redhat.com/cluster_suite/
Turbolinux Cluster Server
http://www.turbolinux.com/products/middleware/tlcs8.html
Linux Virtual Server Project
http://www.linuxvirtualserver.org/
Single System Image Cluster for Linux
http://www.openssi.org
Screenshots 1
An example of a Beowulf cluster: ROCKS (http://www.rocksclusters.org)
ROCKS snapshots (images not reproduced):
The schematic diagram of a ROCKS cluster
Installation of a compute node
Ganglia monitoring tools
HPCC Cluster and parallel
computing applications





Message Passing Interface
    MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/)
    LAM/MPI (http://lam-mpi.org)
Mathematical
    fftw (fast Fourier transform)
    pblas (parallel basic linear algebra subprograms)
    atlas (automatically tuned linear algebra software)
    sprng (scalable parallel random number generator)
    MPITB (MPI toolbox for MATLAB)
Quantum chemistry software
    gaussian, qchem, gamess
Molecular dynamics software
    NAMD, gromacs, amber
Weather modelling
    MM5 (http://www.mmm.ucar.edu/mm5/mm5-home.html)
NAMD2 – Software for
Molecular Dynamics
Single System Image
(SSI) Cluster
MOSIX
openMosix
MOSIX and openMosix

MOSIX: MOSIX is a software package that enhances the Linux kernel with cluster capabilities. The enhanced kernel supports clusters of any size built from x86/Pentium-based machines. MOSIX migrates processes to other nodes in the cluster automatically and transparently, while standard Linux process control utilities such as 'ps' show all processes as if they were running on the node where they originated.

openMosix: openMosix is a spin-off of the original MOSIX. The first version of openMosix is fully compatible with the last version of MOSIX, but the project is developing in its own direction.
OpenMosix installation


Install Linux on each node
Download and install
openmosix-kernel-2.4.26-openmosix1.i686.rpm
openmosix-tools-0.3.6-2.i386.rpm
and related packages such as those at www.openmosixview.com
Reboot with the openMosix kernel
Screenshots 2
OpenMosix cluster management
openMosix cluster
management tools



openMosixView
openMosixmigmon
3dmosmon
Advantage of SSI cluster




No need to parallelize code
Automatic process migration, i.e. load balancing
Nodes can be added or removed at any time
Well aware of hardware and system resources
PC clusters in Faculty
of Science, HKBU
PII 4-node clusters, started in 1999 (obsolete)
PIII 16-node cluster, purchased in 2001 (obsolete)
Plan for grid
For test base
HKBU 64-node P4 Xeon cluster, ranked #300 in the Top500
TDG cluster configuration

Master node:
    Dell PE2650, P4 Xeon 2.8GHz x 2
    4GB ECC DDR RAM
    36GB x 2 internal HD running RAID 1 (mirror)
    73GB x 10 HD array running RAID 5 with hot spare
Compute nodes x 64, each with
    Dell PE2650, P4 Xeon 2.8GHz x 2
    2GB ECC DDR RAM
    36GB internal HD
Interconnect configuration
    Extreme BlackDiamond 6816 Gigabit Ethernet switch
16-node P4 Xeon Cluster for
computational research from 2005

16 compute nodes, each with
    P4 Xeon 3.2GHz x 2
    2GB RAM
    36GB SCSI hard disk
    ROCKS 4.0.0
Sciblade Cluster
256-node cluster supported by funding from the RGC in 2009
Hardware Configuration of the
newest PC cluster -- sciblade

Master node
    Dell PE1950, 2x Xeon E5450 3.0GHz (quad core)
    16GB RAM, 73GB x 2 SAS drives
I/O nodes (storage)
    Dell PE2950, 2x Xeon E5450 3.0GHz (quad core)
    16GB RAM, 73GB x 2 SAS drives
    3TB storage, Dell PE MD3000
Compute nodes x 256, each
    Dell PE M600 blade server with InfiniBand network
    2x Xeon E5430 2.66GHz (quad core)
    16GB RAM, 73GB SAS drive
Hardware Configuration





Blade chassis x 16
    Dell PE M1000e
    Each hosts 16 blade servers
Management network
    Dell PowerConnect 6248 (Gigabit Ethernet) x 6
Interconnect fabric
    QLogic SilverStorm 9120 switch
Console and KVM switch
    Dell AS-180 KVM
    Dell 17FP rack console
Emerson Liebert NXa 120kVA UPS
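The node and core counts of sciblade follow from the figures above, and can be worked out with shell arithmetic. This is only a sketch of the arithmetic: it assumes every compute node carries the two quad-core CPUs listed for the blade servers and counts compute nodes only.

```shell
chassis=16               # blade chassis
blades_per_chassis=16    # blade servers hosted by each chassis
cpus_per_node=2          # 2x Xeon E5430 per blade
cores_per_cpu=4          # quad core

nodes=$(( chassis * blades_per_chassis ))
cores=$(( nodes * cpus_per_node * cores_per_cpu ))
echo "compute nodes: $nodes, compute cores: $cores"
```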
PC cluster nowadays

Node hardware
    Multi-core CPUs with L2 and L3 cache
    DDR RAM
    Large hard disks (over 500GB per disk)
    Blade / rack mount servers
Storage
    SAN, I/O nodes, parallel file systems
PC cluster nowadays (cont)
Interconnect
    Gigabit Ethernet, Myrinet, InfiniBand, Quadrics
Thank you!
Welcome to visit HPCCC, HKBU
http://www.sci.hkbu.edu.hk/hpccc/