PERL
Variables and data structures
Andrew Emerson, High Performance Systems, CINECA
The “Hello World” program
Consider the following:
#
# Hello World
#
$message = "Ciao, Mondo";
print "$message \n";
exit;
Perl Variables
$message is called a variable, something with a
name used to hold one or more pieces of
information.
All computer languages have the ability to create
variables to store and manipulate data.
Perl differs from many other languages because you do
not specify the “type” (e.g. integer, real, character,
etc.), only the “complexity” of the data.
Perl Variables
Perl has 3 ways of storing data:
1. Scalars
   For single data items, like numbers or strings.
2. Arrays
   For ordered lists of scalars. Scalars indexed by numbers.
3. Associative arrays or “hashes”
   Like arrays, but use “keys” to identify the scalars.
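A minimal sketch of the three kinds of variable side by side. The names and values are illustrative assumptions, and `use strict`, `use warnings` and `my` are additions that these slides do not cover.

```perl
use strict;
use warnings;

# Scalar: a single number or string.
my $codon = 'TCA';

# Array: an ordered list of scalars, indexed by numbers starting at 0.
my @bases = ('A', 'C', 'G', 'T');

# Hash: scalars identified by keys instead of numbers.
my %genetic_code = ('TCA' => 'ser', 'CCC' => 'pro');

print "first base: $bases[0]\n";
print "$codon codes for $genetic_code{$codon}\n";
```

The prefix character ($, @ or %) tells you which kind of variable you are looking at.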
Scalar Variables
Examples
#
$no_of_chrs = 24;                   # integer
$per_cent_identity = 0;             # also integer
$per_cent_identity = 99.50;         # redefined as real
$pi = 3.1415926535;                 # floating point (real)
$e_value = 1e-40;                   # using scientific notation
$dna = "GCCTACCGTTCCACCAAAAAAAA";   # string - double quotes
$dna = 'GCCTACCGTTCCACCAAAAAAAA';   # string - single quotes
Scalar Variables
CASE is important, $DNA ≠ $dna;
(true for all variables)
Scalars must be prefixed with a $ whenever they are used (is
there a $? Yes → it is a scalar). The next character should
be a letter or an underscore, not a number (true for all variables).
Scalars can be happily redefined at any time (e.g. integer →
real → string):
# unlikely example
$dna = 0;                       # integer
$dna = "GGCCTCGAACGTCCAGAAA";   # now it's a string
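The naming rules above can be sketched as follows. The variable names and values are illustrative assumptions; `use strict`, `use warnings` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my $dna = 'GATTACA';    # lower-case name
my $DNA = 'CCCGGG';     # a different variable: names are case sensitive

my $value = 42;         # holds an integer
$value = 42.5;          # now holds a real number
$value = 'forty-two';   # now holds a string

print "$dna $DNA $value\n";
```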
Doing things with scalars..
#
$a = 1.5;
$b = 2.0; $c = 3;
$sum = $a + $b*$c;    # multiply $b by $c, add to $a
#
$j = 0;
while ($j < 100) {
    $j++;             # means $j=$j+1, i.e. add 1 to $j
    print "$j\n";
}
#
$dna1 = "GCCTAAACGTC";
$polyA = "AAAAAAAAAAAAAAAA";
$dna1 .= $polyA;      # add one string to another
                      # (equiv. $dna1 = $dna1.$polyA)
$no_of_bases = length($dna1);   # length of a scalar
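A short sketch of the same operations with the results worked out. The names $x, $y and $z are used instead of $a and $b because Perl reserves $a and $b for its sort function; the `x` repetition operator, `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1.5, 2.0, 3);

my $sum     = $x + $y * $z;      # * binds tighter than +: 1.5 + 6 = 7.5
my $grouped = ($x + $y) * $z;    # parentheses change the order: 3.5 * 3 = 10.5

my $seq = 'GCCTAAACGTC';         # 11 bases
$seq .= 'A' x 16;                # append a 16-base poly-A tail
my $no_of_bases = length($seq);  # 11 + 16 = 27

print "$sum $grouped $no_of_bases\n";
```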
More about strings..
There is a difference between strings in single quotes (‘) and
double quotes (“):
#
$nchr = 24;
# double quotes
$message = "chromosomes in human cell =$nchr";
print $message;      # OUTPUT: chromosomes in human cell =24
# single quotes
$message = 'chromosomes in human cell =$nchr';
print $message;      # OUTPUT: chromosomes in human cell =$nchr
exit;
More about strings
Double quotes “ interpret variables, single quotes ‘
do not:
$dna = 'GTTTCGGA';
print "sequence=$dna";   # OUTPUT: sequence=GTTTCGGA
print 'sequence=$dna';   # OUTPUT: sequence=$dna
Normally you would want double quotes
when using print.
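As a sketch of how interpolation behaves, the example below builds the strings first so the results can be inspected. The variable names are illustrative assumptions, and the `${name}` form and `use strict`/`my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my $dna    = 'GTTTCGGA';
my @codons = ('GTT', 'TCG');

my $double  = "sequence=$dna";              # variable is replaced by its value
my $single  = 'sequence=$dna';              # text is kept exactly as written
my $element = "first codon=$codons[0]";     # array elements interpolate too
my $braced  = "${dna}AAAA";                 # braces mark where the name ends

print "$double\n$single\n$element\n$braced\n";
```

Without the braces, Perl would look for a variable called $dnaAAAA in the last string.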
Arrays
Collections of numbers, strings etc can be stored in arrays.
In Perl arrays are defined as ordered lists of scalars and
are represented with the @ character.
@days_in_month = (31,28,31,30,31,30,31,31,30,31,30,31);
@days_of_the_week = ('mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun');
@bases = ('adenine', 'guanine', 'thymine', 'cytosine', 'uracil');
@GenBank_fields = ('LOCUS',
                   'DEFINITION',
                   'ACCESSION',
                   ...
                  );
Initializing arrays with lists
Arrays - elements
To access the individual array elements you use [ and ] :
@poly_peptide = ('gly','ser','gly','pro','pro','lys','ser','phe');
# now mutate the peptide
$poly_peptide[0] = 'val';
$i = 0;
# print out what we have
while ($i < 8) {
    print "$poly_peptide[$i] ";
    $i++;
}
The numbers used to identify the elements (the array index, written
between [ and ]) are called indices.
Arrays - elements
When accessing array elements you use $. Why?
Because array elements are scalars and scalars must
have $:
@poly_peptide=(..);
$poly_peptide[0] = 'val';
This means that you can have a separate variable
called $poly_peptide because $poly_peptide[0] is part
of @poly_peptide, NOT $poly_peptide.
“This may seem a bit weird, but that's okay, because it is weird.”
(Unix Perl Manual)
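A minimal sketch showing that a scalar and an array with the same name are separate variables. The values are illustrative assumptions; `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @poly_peptide = ('gly', 'ser', 'gly');   # the array
my $poly_peptide = 'a separate scalar';     # an unrelated scalar, same name

$poly_peptide[0] = 'val';   # the [0] means this is an element of @poly_peptide

print "scalar: $poly_peptide\n";
print "array:  @poly_peptide\n";
```

Sharing a name like this is legal but easy to misread, so it is usually avoided in real programs.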
Array elements
Array indices start from 0, not 1:
$poly_peptide[0] = 'val';
$poly_peptide[1] = 'ser';
$poly_peptide[7] = 'phe';
The last index of the array can be found from
$#name_of_array, e.g. $#poly_peptide. You can
also use negative indices, which count back from
the end of the array. Therefore, for the 8-element peptide,
$poly_peptide[-1], $poly_peptide[$#poly_peptide] and
$poly_peptide[7] all refer to the same element.
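The indexing rules can be checked with a short sketch, assuming the 8-element peptide from the earlier slide (with the first residue already mutated to 'val'). `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @poly_peptide = ('val','ser','gly','pro','pro','lys','ser','phe');

my $last_index = $#poly_peptide;               # 7 for an 8-element array
my $last       = $poly_peptide[-1];            # counts back from the end
my $same       = $poly_peptide[$last_index];   # the same element as above
my $second_last = $poly_peptide[-2];           # 'ser'

print "last index $last_index, last element $last\n";
```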
Array properties
Length of an array:
$len = $#poly_peptide+1;
The size of the array does not need to be defined – it can grow
dynamically:
# begin program
$i=0;
while ($i<100) {
    $polyA[$i] = 'A';
$i++;
}
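A sketch of dynamic growth, including what happens when you assign beyond the current end of an array. The @sparse array is a hypothetical example, and the `for` loop with a range, `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @polyA;                       # starts empty, no size declared
for my $i (0 .. 99) {
    $polyA[$i] = 'A';            # the array grows as elements are assigned
}
my $len = $#polyA + 1;           # 100

my @sparse = ('A');              # one element
$sparse[4] = 'T';                # assigning past the end extends the array
my $sparse_len = $#sparse + 1;   # 5: elements 1 to 3 exist but are undefined

print "$len $sparse_len\n";
```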
Useful Array functions
PUSH and POP
Functions commonly used for manipulating a stack:
push adds an element to the end of an array.
pop removes the last element of an array and returns it.
F.I.L.O. = First In, Last Out
Very common in computer programs.
Array functions – PUSH and POP
# part of a program that reads a database into an array
# open database etc first..
@dblines = ();                # resets @dblines
while ($line = <DB>) {
    push @dblines, $line;     # push $line onto array
}
...
while (@dblines) {
    $record = pop @dblines;   # pop line off and use it
    # .... do something
}
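The first-in, last-out behaviour can be seen in a self-contained sketch that needs no database file. The strings pushed here are illustrative assumptions; `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @stack;
push @stack, 'first';        # stack is now: first
push @stack, 'second';       # first second
push @stack, 'third';        # first second third

my @popped;
while (@stack) {             # loop until the stack is empty
    my $item = pop @stack;   # removes and returns the LAST element
    push @popped, $item;
}

print "@popped\n";           # prints: third second first
```

The element pushed first comes off last, which is why the loop in the slide above processes the database lines in reverse order.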
Scalar Contexts
If you provide an expression (e.g. an array) when Perl
expects a scalar, Perl attempts to evaluate the expression
in a scalar context. For an array this is the length of the
array:
$length=@poly_peptide;
This is equivalent to
$length=$#poly_peptide+1;
Hence:
while (@dblines) {    # array in scalar context = length of array
..
}
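A brief sketch contrasting scalar and list context for the same array. It assumes the 8-element peptide from the earlier slides; the `scalar` function, the parenthesised `my ($first)` form and `use strict` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @poly_peptide = ('gly','ser','gly','pro','pro','lys','ser','phe');

my $length   = @poly_peptide;           # scalar context: number of elements (8)
my $explicit = scalar(@poly_peptide);   # scalar() forces scalar context
my ($first)  = @poly_peptide;           # parentheses give list context: 'gly'

print "$length $explicit $first\n";
```

Forgetting the parentheses in the last assignment would store the length, not the first element.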
Special variables
Perl defines some variables for special purposes,
including:
$_      Set in many situations, such as reading from a file or in a
        foreach loop.
$0      Name of the file currently being executed.
$]      Version of Perl being used.
@_      Contains the parameters passed to a subroutine.
@ARGV   Contains the command line arguments passed to the program.
Some are read-only and cannot be changed: see man
perlvar for more details.
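A small sketch that touches each of the special variables listed above. The data and the subroutine name count_args are hypothetical, chosen only for illustration; `use strict`, `my`, `chomp` and `sub` are additions not covered in these slides.

```perl
use strict;
use warnings;

my @lines = ("GATTACA\n", "CCGG\n");
my $total = 0;
foreach (@lines) {           # no loop variable named, so each element goes in $_
    chomp;                   # with no argument, chomp works on $_
    $total += length($_);    # 7 + 4 = 11
}

# hypothetical helper: returns how many parameters it was given
sub count_args {
    return scalar(@_);       # @_ holds the parameters passed to the subroutine
}
my $n = count_args('a', 'b', 'c');

print 'script name:  ', $0, "\n";
print 'perl version: ', $], "\n";
print "command line arguments: @ARGV\n";
print "total bases $total, arguments counted $n\n";
```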
Associative Arrays (Hashes)
Similar to normal arrays but the elements are identified by
keys and not indices. The keys can be more complicated,
such as strings of characters.
Hashes are indicated by % and can be initialized with lists
like arrays:
%hash = (key1,val1,key2,val2,key3,val3..);
Associative Arrays (Hashes)
Examples
%months = ('jan', 31, 'feb', 28, 'mar', 31, 'apr', 30);
Alternatively,
%months = ('jan' => 31,     # key => value
           'feb' => 28,
           'mar' => 31,
           'apr' => 30);
=> is a synonym for ,
Associative Arrays (Hashes)
Further examples
#
%classification = ('dog' => 'mammal', 'robin' => 'bird', 'snake' => 'reptile');
%genetic_code = (
    'TCA' => 'ser',
    'TTC' => 'phe',
    'TTA' => 'leu',
    'TGA' => 'STOP',
    'CCC' => 'pro',
    ...
);
Associative Arrays (Hashes) - elements
The elements of a hash are accessed using curly
brackets, { and } :
$genetic_code{TCA} = 'ser';
$genetic_code{CCC} = 'pro';
$genetic_code{TGA} = 'STOP';
Note the $ sign: the elements are scalars
and so must be preceded by $, even
though they belong to a % (just as for
arrays).
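A minimal sketch of reading, adding and overwriting hash elements. The month data is an illustrative assumption; `use strict`, `my` and the use of `keys` to count entries are additions not covered at this point in the slides.

```perl
use strict;
use warnings;

my %months = ('jan' => 31, 'feb' => 28);

$months{'mar'} = 31;             # a new key adds a new element
$months{feb}   = 29;             # an existing key overwrites the old value;
                                 # simple keys may be written without quotes

my $days_in_jan = $months{jan};  # read one element: note the $ and the { }
my $count = keys %months;        # in scalar context, keys gives the number of keys

print "jan has $days_in_jan days, $count months stored\n";
```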
Associative Arrays (Hashes) – useful functions
exists
indicates whether a key exists in the hash
if (exists $genetic_code{$codon}) {
    ...
} else {
    print "Bad codon $codon\n";
    exit;
}
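A runnable sketch of the exists test, assuming a small three-codon table and a made-up invalid codon 'XYZ'. The messages are collected in an array rather than calling exit so that both branches can be seen; `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my %genetic_code = ('TCA' => 'ser', 'CCC' => 'pro', 'TGA' => 'STOP');

my @results;
foreach my $codon ('TCA', 'XYZ') {
    if (exists $genetic_code{$codon}) {
        push @results, "$codon => $genetic_code{$codon}";   # key is present
    } else {
        push @results, "Bad codon $codon";                  # key is absent
    }
}

print "$_\n" foreach @results;
```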
Associative Arrays (Hashes) – useful functions
keys and values
makes arrays from the keys and values of a
hash.
@codons = keys %genetic_code;
@amino_acids = values %genetic_code;
Often you will see code like the following:
foreach $codon (keys %genetic_code) {
    if ($genetic_code{$codon} eq 'STOP') {
        last;    # i.e. stop translating
    } else {
        $protein .= $genetic_code{$codon};
    }
}
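Note that keys returns the keys in no guaranteed order, so a loop over keys %genetic_code visits the codons in an arbitrary sequence. The sketch below shows one way to translate an actual sequence instead, reading it three bases at a time. The DNA string and the five-codon table are illustrative assumptions, and `substr`, `sort`, `use strict` and `my` are additions not covered in these slides.

```perl
use strict;
use warnings;

my %genetic_code = (
    'TCA' => 'ser', 'TTC' => 'phe', 'TTA' => 'leu',
    'CCC' => 'pro', 'TGA' => 'STOP',
);

my @codons = sort keys %genetic_code;   # sort gives a repeatable order

my $dna     = 'TTCCCCTCATGATTA';        # TTC CCC TCA TGA TTA
my $protein = '';
for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
    my $codon = substr($dna, $pos, 3);               # next three bases
    last unless exists $genetic_code{$codon};        # unknown codon: give up
    last if $genetic_code{$codon} eq 'STOP';         # stop translating
    $protein .= $genetic_code{$codon};
}

print "codons:  @codons\n";
print "protein: $protein\n";            # prints: protein: pheproser
```

Translation ends at TGA, so the final TTA codon is never read.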