OSCON 2005: Best Practice Perl, Damian Conway, second half
- Tue Aug 2 2005
The second half starts off with a discussion of built-in functions. Damian Conway’s most important recommendation is simply to use them: there are a lot of them, they’re incredibly useful, and they’re generally better than anything you can write yourself. In modern versions of Perl, the values function returns aliases to the hash’s values rather than copies, so you can modify the hash in place through them (for example in a for loop), something I didn’t know (I think my Perl books are too old). Always use a block with functions like map and grep because the transform/test stands out more clearly.
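As a rough sketch of those two points (the %price hash and its contents are made up for illustration, not taken from the tutorial):

```perl
use strict;
use warnings;

# Hypothetical data, just for the sketch.
my %price = (apple => 100, pear => 250, plum => 40);

# values() hands back aliases, so changing the loop variable
# changes the value stored in the hash.
for my $amount (values %price) {
    $amount *= 2;
}

# Block form of grep and map: the test and the transform stand out.
my @expensive = grep { $price{$_} > 100 } sort keys %price;
my @labels    = map  { sprintf '%s costs %d', $_, $price{$_} } @expensive;

print "$_\n" for @labels;
```

After the loop every price in the hash has doubled, without ever assigning to $price{...} directly.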
There are also loads of incredibly useful functions that aren’t built in but live in modules. Use things like List::Util and Scalar::Util instead of reinventing the wheel.
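To give a flavour of what those two modules offer, here is a small sketch; the @readings data is invented, and the functions shown are only a handful of what the modules export:

```perl
use strict;
use warnings;
use List::Util   qw(first max min sum);
use Scalar::Util qw(looks_like_number);

# Hypothetical data, just for the sketch.
my @readings = (12, 7, 31, 4, 18);

my $total     = sum @readings;
my $highest   = max @readings;
my $lowest    = min @readings;
my $first_big = first { $_ > 15 } @readings;   # stops at the first match

# looks_like_number() saves writing (and debugging) a number regex.
my @numeric = grep { looks_like_number($_) } ('42', '3.14', 'abc', '1e3');

print "total=$total highest=$highest lowest=$lowest first_big=$first_big\n";
```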
Subroutines are the single most important problem decomposition tool available in Perl. They should be used to abstract out behaviour and be short enough to fit in the human brain (Conway says ten to twelve lines). Always unpack @_ first; don’t use things like $_[0] and $_[1] in the body. Use named arguments if a subroutine takes more than three arguments. It’s perfectly okay to mix positional and named arguments; use a hash reference to pass in the named (or optional) ones. Don’t use prototypes. Always return with an explicit return. Use a bare return to signal failure; don’t return undef, because in list context that produces a one-element list containing undef, which counts as true.
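A minimal sketch of a subroutine written along those lines. The pad_text name, its arguments, and its options are all made up for the example:

```perl
use strict;
use warnings;

# pad_text() is a hypothetical helper: two positional arguments, then an
# optional hash reference of named options.
sub pad_text {
    my ($text, $width, $opt_ref) = @_;    # unpack @_ on the first line

    return if !defined $text;             # bare return signals failure

    $opt_ref = {} if !defined $opt_ref;
    my $fill  = exists $opt_ref->{fill}  ? $opt_ref->{fill}  : q{ };
    my $align = exists $opt_ref->{align} ? $opt_ref->{align} : 'left';

    my $gap = $width - length $text;
    $gap = 0 if $gap < 0;
    my $padding = $fill x $gap;

    return $align eq 'right' ? $padding . $text : $text . $padding;
}

my $dotted = pad_text('ab', 5, { fill  => '.' });
my $right  = pad_text('ab', 5, { align => 'right' });

my @failed = pad_text(undef, 5);   # empty list, so false in list context
my $failed = pad_text(undef, 5);   # undef in scalar context
```

Had the failure case used return undef, @failed would hold one element and a test like if (@failed) would wrongly succeed.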
Moving on to I/O: input and output are especially important. Don’t use bareword filehandles. Ever. They’re a bad, bad idea because they clobber any other filehandle of the same name opened anywhere else in the same package. Use lexical filehandles instead. Use three-argument open because it can’t be subverted by bizarre filenames (say, filenames that start with a >). Putting braces around your lexical filehandle makes it stand out in print statements.
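Put together, those I/O recommendations look roughly like this. The temporary file is only there so the sketch can run anywhere; the variable names are my own:

```perl
use strict;
use warnings;
use Carp       qw(croak);
use File::Temp qw(tempfile);

# A scratch file so the sketch is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

# Lexical filehandle plus three-argument open: the mode is a separate
# argument, so an odd filename cannot change how the file is opened.
open my $out, '>', $filename
    or croak "Can't open '$filename' for writing: $!";

# Braces make the filehandle stand out from the things being printed.
print {$out} "first line\n";
print {$out} "second line\n";
close $out
    or croak "Can't close '$filename': $!";

open my $in, '<', $filename
    or croak "Can't open '$filename' for reading: $!";
my @lines = <$in>;
close $in;
```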
Regular expressions: always use the /x flag on every regex you write (it lets you break a regex across multiple lines and put comments in it), which makes regexes more maintainable and readable. Always use the /m flag (it makes ^ and $ match at the start and end of each line, instead of only the start and end of the string). If you want to match the start and end of the string, use \A and \z (or \Z, which also allows a trailing newline before the end). Always use the /s flag (it makes . match any character, instead of any character except newline). If you want to bundle all of these together, use the Regexp::DefaultFlags module. If you only want to group, without capturing, use (?: … ). Don’t use $1, $2, etc.; capture in list context instead, with something like my ($statement) = $source =~ m{\G ([^;]+) ;}gcxs;.
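Here is a small sketch of the three flags, the string anchors, and list-context capture working together. The sample strings and variable names are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical input, just for the sketch.
my $config = "name = perl\nversion = 5.8\n";

# Capture straight into a named variable instead of reading $1 afterwards.
my ($version) = $config =~ m{
    ^ version \s* = \s*    # line start, because of the m flag
    ( [^\n]+ )             # capture everything up to the newline
    $                      # line end, again because of the m flag
}xms;

# \z is the very end of the string; \Z also tolerates one trailing newline.
my $strict_end  = ("42\n" =~ m{ \A \d+ \z }xms) ? 1 : 0;
my $lenient_end = ("42\n" =~ m{ \A \d+ \Z }xms) ? 1 : 0;

# (?: ... ) groups the optional prefix without creating a capture.
my ($unit) = '15 km' =~ m{ \A \d+ \s* ( (?: k | c | m )? m ) \z }xms;

print "version=$version strict=$strict_end lenient=$lenient_end unit=$unit\n";
```

One thing to watch with /x comments: they are still subject to variable interpolation, so it is safest to keep $ and @ out of them.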
Error handling is necessary because programs are built by humans and operated by humans. When you detect an error, the only safe thing to do is to go up-scope. Prefer exceptions over special return values, because it’s incredibly easy in Perl to ignore return values. Throw an exception as soon as you have the chance. Use croak (from the Carp module) instead of die. Write your error messages in English so that the user can understand what’s gone wrong.
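A minimal sketch of throwing with croak and catching with eval. The Temperature package and its to_kelvin subroutine are hypothetical, inlined here so the example is self-contained:

```perl
use strict;
use warnings;

package Temperature;    # hypothetical module, inlined for the sketch
use Carp qw(croak);

sub to_kelvin {
    my ($celsius) = @_;

    # Throw as soon as the problem is detected.
    croak "Temperature below absolute zero: $celsius"
        if $celsius < -273.15;

    return $celsius + 273.15;
}

package main;

my $call_line = __LINE__ + 1;    # line number of the call below
my $kelvin = eval { Temperature::to_kelvin(-300) };
my $error  = $@;

print "Caught: $error" if !defined $kelvin;
```

The message that comes back names the line where to_kelvin was called, not the line inside Temperature where the check failed.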
Use CPAN.
When writing a module, design the interface first. When refactoring code, it’s always okay to cut; it’s never okay to paste. Perl 5.10 will come with a core module called version that will handle versioning much more easily than before. Consider the Perl6::Export::Attrs module for exporting subroutines from modules. Have a standard module template for any modules you write.
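As a small illustration of why a dedicated version module helps, here is a sketch comparing two made-up version numbers. It assumes a reasonably recent release of the version module, since version->parse may not exist in the oldest ones:

```perl
use strict;
use warnings;
use version;

# Hypothetical version numbers, just for the sketch.
my $old = version->parse('1.9.0');
my $new = version->parse('1.10.0');

# Plain string comparison gets this wrong, because '1' sorts before '9'.
my $string_says_newer  = ('1.10.0' gt '1.9.0') ? 1 : 0;

# version objects compare component by component.
my $version_says_newer = ($new > $old) ? 1 : 0;

print "string: $string_says_newer, version: $version_says_newer\n";
```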
That’s the end of the tutorial.

2 Responses to “OSCON 2005: Best Practice Perl, Damian Conway, second half”
Wed Aug 10 2005
10:53 am
thanks for the writeup. i like/noticed these ones..
1. don’t use bareword filehandles
2. use croak. i always use die though and can’t see why it’s bad.
3. the regex notes are a pretty nice trick.
Wed Aug 10 2005
11:09 am
Using ‘die’ instead of ‘croak’ is fine if all you’re writing is small scripts. If you’re writing a module and you use ‘die’, then when the ‘die’ is thrown you’ll get back an error message reporting where in the module the error was, or more accurately, which line in your module the ‘die’ is on. If you use ‘croak’ then the error reports where in the script the error occurred.
Nobody (except you) cares where in your code the error was detected; what they care about is where in their code the error was caused. They want to be told where in their code the fatal subroutine was used, not at which line in that subroutine the error occurred. Using ‘croak’ means you give useful information back to the user; using ‘die’ means you don’t.