Tuesday, January 1, 2013

Effective Perl Programming - The Basics of Perl - Items 1 to 7

I'm starting a series on my blog named Effective Perl Programming, with my notes on the items from the book of the same name and notes from other sources on the same topic. The book's table of contents can be found on the author's blog, along with more topics on Effective Perl Programming.

The first chapter is "The Basics of Perl", with 14 items, which I'll start in this post. The other chapters are: Idiomatic Perl, Regular Expressions, Subroutines, Files and Filehandles, References, CPAN, Unicode, Distributions, Testing, Warnings, Databases and Miscellaneous. If you cannot wait for me to post my notes here, buy the book or check the author's blog.






Item #1: Find the documentation for Perl and its Modules.

Perl documentation can be read with perldoc. A few interesting documents to read are:
# Perldoc documentation
$ perldoc perldoc
# Perl table of content
$ perldoc perltoc
# Perl syntax
$ perldoc perlsyn
# Perl built-ins
$ perldoc perlfunc
# A specific Perl function
# perldoc -f <function name>
# Example:
$ perldoc -f split
# A Perl module
# perldoc <module>
# Example:
$ perldoc Pod::Simple
# FAQ about any subject/word
# perldoc -q <word>
# Example:
$ perldoc -q random
It is also a good idea to document your own scripts, programs and modules. That documentation can then be read with perldoc:
$ perldoc program.pl
$ perldoc module.pm
# Read perlpod to know how to document your perl program or module
$ perldoc perlpod
The documentation usually goes at the end of the file, after __END__, and is organized in POD sections. This is an example:
__END__

=head1 NAME

MyProgram.pl - This is a program to explain POD

=head1 SYNOPSIS

MyProgram.pl [--help or -h] [--verbose] 

 Options:
   --help    or -h                         Brief help message

=head1 OPTIONS

=over 8

=item B<--help> or B<-h>

Prints a brief help message and exits.

For more information enter: B<perldoc MyProgram.pl>

=back

=head1 DESCRIPTION

This is a program to explain POD.

B<author>: "Johandry Amador" < johandry@gmail.com >

=cut
Other sources of documentation can be found online.
For local documentation you can install Pod::POM::Web and start its web server:
$ perl -MPod::POM::Web -e "Pod::POM::Web->server"

Item #2: Enable new Perl features when you need them.

Starting with Perl 5.10, new features are not turned on by default; you must enable them. For example, Perl 5.10 adds the say built-in function, and to use it you need to enable it with the use directive or, for one-liners, with the -E switch (which takes code, not a file name).
$ perl -E 'say "Hello!"'
Or
use 5.010; # use new features up to 5.10

say "Hello!";

Writing use 5.012; (or a later version) also turns on strict automatically. If you want to enable just a few features, use the feature pragma.
use feature qw(switch say);
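
As a minimal sketch of how narrowly a feature can be enabled: the feature pragma is lexically scoped, so say is available only inside the block that turns it on. The greeting subroutine is a made-up helper for illustration, not something from the book.

```perl
use strict;
use warnings;

# Hypothetical helper that builds the text to print.
sub greeting {
    my ($name) = @_;
    return "Hello, $name!";
}

{
    use feature 'say';         # enabled only until the closing brace
    say greeting('World');     # say appends the newline for us
}

# Outside the block the feature is off again, so fall back to print.
print greeting('again'), "\n";
```

Keeping the pragma inside a small scope is one way to adopt a new built-in in part of an older program without affecting the rest of the file.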

For a list of new and upcoming features, review the History/Changes section of the Perl 5 documentation.

Item #3: Enable strictures to promote better coding.

Perl is a very permissive language, and that permissiveness becomes a liability in longer programs. The strict pragma makes Perl much less permissive and catches most of the common errors. Enable it by adding this line:
use strict;

Or, if you are using Perl 5.12 or later, strict is enabled automatically when you require that version or later:
use 5.012;

Or, with the -Mstrict switch if the program does not use the strict pragma:
$ perl -Mstrict program_without_strict.pl

If you do not want the program to run on a specific version of Perl (or later), use no followed by the version number:
no

There are 3 parts of strictures: vars, subs and refs.

Declare your variables.
A common error is misspelling a variable name. The strict 'vars' pragma catches and prevents such errors by forcing you to declare your variables. There are 3 ways to declare variables:

Declaring them with my or our:
use strict 'vars';

my @temp;
our $temp;
Using the full package specification:
use strict 'vars';

$main::name = 'Johandry';
Listing the variables with use vars:
use strict 'vars';
use vars qw($lastName);

$lastName = 'Amador';

Special variables such as $_, %ENV and so on do not need to be declared. strict also ignores the global versions of $a and $b, which sort uses.
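
To see strict 'vars' catch a misspelling without crashing the whole script, the snippet below compiles a small piece of code with a string eval and prints the resulting error. This is only a sketch; the variable names $total and $totl are invented for the example.

```perl
use strict;
use warnings;

# Code with a typo, kept in a string so the error can be captured.
my $snippet = q{
    use strict 'vars';
    my $total = 10;
    $totl = $total + 1;   # typo: $totl was never declared
    1;
};

my $ok    = eval $snippet;   # undef, because compilation fails
my $error = $@;              # the compile-time message from strict

print "compiled: ", ( $ok ? "yes" : "no" ), "\n";
print "error: $error";
```

The message names the undeclared variable, which is usually enough to spot the typo.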

Be careful with barewords.
Poetry mode, another source of errors, is Perl's default treatment of identifiers with no other interpretation as strings (barewords). For example:
for ($i = 0; $i < 10; $i++) {
  print $a[i];   # Meant to say $a[$i]
}

Using strict 'subs' turns off poetry mode. So, the previous code will throw an error.

strict 'subs' still lets you leave off the quotes for hash keys inside the hash-key braces or to the left of the fat arrow.
use strict 'subs';

$a{name} = 'ok';
$a{-name} = 'ok';
my %h = (
  last  => 'Amador',
  first => 'Johandry'
);

Avoid soft references
The strict 'refs' pragma disables soft (symbolic) references. Without strictures, Perl uses the value of the string as the name of the variable:
no strict 'refs';

$var_name = 'list';
@{$var_name} = qw(1 2 3);
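
Two common replacements for a soft reference are a hash keyed by the name, or a hard reference taken with the backslash operator. The sketch below shows both under full strictures; %data and $list_ref are names chosen for the example.

```perl
use strict;
use warnings;

# Option 1: use the string as a hash key instead of a variable name.
my %data;
my $var_name = 'list';
@{ $data{$var_name} } = qw(1 2 3);   # autovivifies an array reference

# Option 2: take a hard reference to a real, declared array.
my @list     = qw(1 2 3);
my $list_ref = \@list;
push @{$list_ref}, 4;                # modifies @list through the reference

print "@{ $data{list} }\n";          # 1 2 3
print "@list\n";                     # 1 2 3 4
```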

Item #4: Understand what sigils are telling you.

Sigils are the characters in front of Perl variable names and in dereferencing. They relate to the data you are accessing, not necessarily to the variable type.
$ means you are working with a single value, a scalar or a single value in an array or hash.
$scalar
$array[3]
$hash{'key'}
@ means you are working with multiple values, such as a whole array or a slice of an array or hash.
@array
@array[0, 2, 6]
@hash{ qw(key1 key2) }
% means you are treating something as a hash.
%hash
Perl also has sigils for subroutines (&) and typeglobs (*).

To know what sort of variable you are looking at, there are 3 factors to consider: the sigil, the identifier and possible indexing syntax for arrays or hashes.

SIGIL  IDENTIFIER  INDEX
$      name        [3]
$      name        {'id'}

For:
$name[3]
$name{'id'}
You know that you are working with an array or hash because you use the [] or {} for indexing. The sigil $ tells you that this variable contains a single element from the array or hash.
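
The following sketch puts the sigil rules side by side, using sample data invented for the example. The sigil describes what comes back (one value or several), while the brackets or braces say which kind of variable is being indexed.

```perl
use strict;
use warnings;

my @array = qw(a b c d e f g);
my %hash  = ( key1 => 1, key2 => 2, key3 => 3 );

my $element = $array[3];                # $ : one element of @array   -> 'd'
my @some    = @array[ 0, 2, 6 ];        # @ : array slice             -> ('a', 'c', 'g')
my @values  = @hash{ qw(key1 key2) };   # @ : hash slice              -> (1, 2)
my $value   = $hash{'key3'};            # $ : one value of %hash      -> 3

print "$element | @some | @values | $value\n";   # d | a c g | 1 2 | 3
```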

Item #5: Know your variable namespace.

There are 7 kinds of package variables or variable-like elements in Perl: scalar variables, array variables, hash variables, subroutine names, format names, filehandles and directory handles. Each has its own namespace. For example, the scalar variable $a is independent of the array variable @a:
my $a = 1;
my @a = (1, 2, 3, 4);  # $a is still 1
Also, $a in package main is independent of $a in package foo:
$a = 1;
package foo;
$a = 2;  # $foo::a is 2; $main::a is still 1
You have to look at the right as well as the left of an identifier to determine what kind of variable the identifier refers to. The $ means that the result is a scalar value, not that you are referring to a scalar variable.
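
As a small illustration of separate namespaces, the same identifier can be a scalar, an array, a hash and a subroutine at once. The identifier name and its values are arbitrary choices for this sketch.

```perl
use strict;
use warnings;

my $name = 'scalar';
my @name = ( 'array', 'of', 'names' );
my %name = ( id => 'hash' );
sub name { return 'sub' }

# What follows the identifier decides which variable is meant.
my @seen = (
    $name,        # the scalar $name
    $name[0],     # element 0 of @name
    $name{id},    # value for key 'id' in %name
    name(),       # call to the subroutine name
);

print join( ', ', @seen ), "\n";   # scalar, array, hash, sub
```

Reusing one identifier like this is legal but hard to read, so it is better treated as a demonstration than as a style to copy.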

Subroutine names can be prefixed with &, but the prefix is optional, as are the parentheses, so there are 3 ways to call a subroutine.
sub hi {
  my $name = shift;
  return "hi, $name\n";
}

print &hi("Johandry");  # old style
print hi("Johandry");   # with parentheses
print hi "Johandry";    # only if hi has been declared or defined before use
Filehandles and directory handles are not prefixed either; Perl recognizes them from context.
open TEST, '>', "$$.test";
print TEST "test data\n";   # no comma after the filehandle

opendir TEST, ".";
format TEST =
@<<<<<<<<<<<< @<<<< @<<<<
$name, $lo, $hi
.
If this is confusing, store filehandles and directory handles in objects from IO::File and IO::Dir.
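
A minimal sketch of that suggestion, using the core IO::File and IO::Dir modules. The scratch directory from File::Temp and the file name example.test are assumptions made so the example can run anywhere without leaving files behind.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::File;
use IO::Dir;

my $dir  = tempdir( CLEANUP => 1 );   # scratch directory, removed at exit
my $path = "$dir/example.test";

# The filehandle lives in an ordinary scalar, so it cannot clash with other names.
my $fh = IO::File->new( $path, 'w' ) or die "Cannot write $path: $!";
$fh->print("test data\n");
$fh->close;

# Same idea for the directory handle.
my $dh = IO::Dir->new($dir) or die "Cannot open $dir: $!";
my @entries;
while ( defined( my $entry = $dh->read ) ) {
    next if $entry eq '.' or $entry eq '..';
    push @entries, $entry;
}
$dh->close;

print "found: @entries\n";   # found: example.test
```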

Item #6: Know the difference between string and numeric comparisons.

Perl has 2 sets of comparison operators: one for strings and one for numbers.

String comparison operators use letters and look like words. Strings are compared character by character. These operators should not be used to compare numbers.
'a' lt 'b'                 # TRUE
'a' eq 'A'                 # FALSE
"johandry" eq "johandry "  # FALSE
'10' gt '2'                # FALSE
"10.0" eq "10"             # FALSE
Numeric comparison operators use punctuation and look like algebra. They should not be used to compare strings.
0 < 5           # TRUE
10 == 10.0      # TRUE
'abc' == 'def'  # TRUE: both look like 0 to ==
Perl's sort operator uses string comparison by default, so do not rely on the default to sort numbers. You can also let Perl choose between eq and == with the smart match operator (~~), which looks at both sides to figure out what to do.
use 5.010;

if (123     ~~ '456') { ... } # Number and numish: FALSE
if ('123.0' ~~ 123)   { ... } # Numish and number: TRUE
if ('Mimi'  ~~ 456)   { ... } # String and number: FALSE ('Mimi' == 456 is 0 == 456)
if ('Mimi'  ~~ 'Mimi'){ ... } # String and string: TRUE
if ('123.0' ~~ '123') { ... } # Numish and numish: FALSE

my $var  = '123';
my $var2 = 123;
$var2 = "$var2";

if (('123' + 0) ~~ '123.0') { ... } # Number and numish: TRUE
if (($var + 0)  ~~ '123.0') { ... } # Number and numish: TRUE
if ($var2       ~~ '123.0') { ... } # String and numish: FALSE
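
On the sorting advice above, this short sketch contrasts the default string sort with an explicit numeric sort using the <=> operator. The list of numbers is made up for the example.

```perl
use strict;
use warnings;

my @numbers = ( 10, 9, 100, 2 );

my @as_strings = sort @numbers;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @numbers;   # explicit numeric comparison

print "string sort:  @as_strings\n";   # 10 100 2 9
print "numeric sort: @as_numbers\n";   # 2 9 10 100
```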

Item #7: Know which values are false and test them accordingly.

Since numeric and string data share the same scalar type, and Boolean operations can be applied to any scalar, logical truth works for both numbers and strings.

The basic test is: 0, '0', undef and '' (empty string) are false.
Everything else is true.

More precisely, every value in Boolean context is first converted to a string, and Perl tests the resulting string. If it is the empty string or '0', the result is false; otherwise it is true. undef evaluates as false, since it looks like 0 or the empty string to everything except the defined operator.
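
The rule can be checked with a few edge cases. In this sketch, truth is a hypothetical helper written for the example; the interesting rows are the strings that look like zero but are not exactly '0'.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl sees a value in Boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

my @cases = (
    [ q{0},     0     ],
    [ q{'0'},   '0'   ],
    [ q{''},    ''    ],
    [ q{undef}, undef ],
    [ q{0.0},   0.0   ],    # the number 0.0 stringifies to '0'
    [ q{'0.0'}, '0.0' ],    # non-empty string other than '0'
    [ q{'00'},  '00'  ],
    [ q{' '},   ' '   ],
);

for my $case (@cases) {
    my ( $label, $value ) = @{$case};
    printf "%-6s is %s\n", $label, truth($value);
}
```

The first five cases report false; '0.0', '00' and a single space report true.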

A common source of problems is testing a value to see whether it is false when you meant to test whether it is defined. In the first loop below, the intent is to stop when glob returns undef (no more files in the current directory), which is treated like the empty string and is therefore false. But if there is a file named '0', the loop will also end. To avoid this, use the defined operator to test for undef.
# WRONG
while ( my $file = glob('*') ) {
 ...
}

# CORRECT
while ( defined( my $file = glob('*') ) ) {
 ...
}
Same here:
# WRONG
while (  ) {
 ...
}

# CORRECT
while ( defined( $_ =  ) ) {
 ...
}

The end of the array
Something similar happens with arrays. Since an array can contain undef elements, it is not correct to loop until you find an undef value. Instead, use foreach and skip the undefined elements.
my @cats = qw( Buster Roscoe Mimi );
$cats[1] = undef;  # R.I.P Roscoe

# WRONG
while ( defined( my $cat = shift @cats ) ) {
  ...
}

# CORRECT
foreach my $cat (@cats) {
  next unless defined $cat;
  ...
}
To find the last element of an array, do not look for an undefined value; use the $# syntax, which gives the index of the last element. Example: $#cats.
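
Continuing with the same @cats array, a short sketch of the usual ways to find where an array ends. None of them depends on the values stored in it.

```perl
use strict;
use warnings;

my @cats = qw( Buster Roscoe Mimi );
$cats[1] = undef;

my $last_index = $#cats;         # index of the last element: 2
my $count      = scalar @cats;   # number of elements: 3
my $last_cat   = $cats[-1];      # the last element itself: 'Mimi'

# Walk every index, including the one that holds undef.
for my $i ( 0 .. $#cats ) {
    my $cat = defined $cats[$i] ? $cats[$i] : '(undef)';
    print "$i: $cat\n";
}
```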

Hash values
undef can also be the value of a hash element. Use defined and exists, rather than a test for falseness, to check whether a key has a value or is present at all.
my %hash;

if ( $hash{'foo'} )         { ... } # FALSE
if ( defined $hash{'foo'} ) { ... } # FALSE
if ( exists $hash{'foo'} )  { ... } # FALSE

$hash{'foo'} = undef;

if ( $hash{'foo'} )         { ... } # FALSE
if ( defined $hash{'foo'} ) { ... } # FALSE
if ( exists $hash{'foo'} )  { ... } # TRUE

$hash{'foo'} = '';

if ( $hash{'foo'} )         { ... } # FALSE
if ( defined $hash{'foo'} ) { ... } # TRUE
if ( exists $hash{'foo'} )  { ... } # TRUE

To be continued ...
