OSCON 2005: Best Practice Perl, Damian Conway, first half


Is there actually a right way to do things in Perl? Of course not, but Damian Conway just wants to give his impressions of what the best practices might be. “Best practices” refers to writing code that is maintainable, robust, efficient and concise. Best practices are a set of rules to combat “Intuitive Programmer Syndrome,” where a programmer programs by using The Force instead of following some standard set of rules.

Damian Conway wants us to become lady and gentleman programmers who, six months down the line, don’t hurt anybody unintentionally. That’s what best practices are about: not hurting the person who’s going to have to maintain the code we write today.

Uh-oh. “Nobody wants Perl programs more than 10,000 lines long.” ORAC-DR sits at over 300,000 lines of code, but a lot of that is documentation.

None (or at least, very few) of the rules Conway will be presenting are absolute. Consider them guidelines. Use them as a starting point, not a finishing point. Negotiate the rules with your fellow developers, and enforce them strictly and rigidly once they’ve been established.

Starting with the most controversial part of coding standards: layout. It doesn’t matter which style you choose, as long as you stick to it once you’ve picked it. Conway suggests bracketing and parenthesizing in K&R style, separating control keywords from the following opening bracket, not separating subroutine or variable names from the following opening bracket, and separating complex key/index computations from the surrounding brackets. Put a semicolon after every statement and a comma after every list element, and indent in 4-column steps. Code in paragraphs, meaning put an empty line between distinct chunks of code that each do one thing. Each chunk might be a few lines of code, but if you separate the chunks with blank lines, the code becomes easier to read. If you’re breaking up long expressions into multiple lines, Conway suggests breaking before the operator (which is something I don’t do).
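To make those layout rules concrete for myself, here is a minimal sketch of how I understand them. The data and variable names (@readings, $limit and so on) are made up for illustration and are not from Conway's slides.

```perl
use strict;
use warnings;

# Hypothetical sample data; note the comma after every list element.
my @readings = (
    3,
    8,
    15,
    4,
);
my $limit = 10;

# A complex index computation is separated from its surrounding brackets.
my $second_last = $readings[ $#readings - 1 ];

# Paragraph: accumulate the total and collect readings over the limit.
# The control keywords (for, if) are separated from the opening parenthesis,
# and the braces follow K&R style.
my $total = 0;
my @over_limit;
for my $reading (@readings) {
    $total += $reading;

    if ($reading > $limit) {
        push @over_limit, $reading;
    }
}

# Paragraph: build the report, breaking the long expression before each operator.
my $report = 'total: ' . $total
           . ', over limit: ' . scalar(@over_limit)
           . "\n";

print $report;
```

Whether the line break goes before or after the operator is exactly the kind of thing to negotiate with your team; the sketch just follows Conway's preference.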

Only slightly less controversial: naming. Conway claims that a consistent naming convention will make your code at least twice as maintainable. Using underscores_in_variable_names insteadOfInterCaps is considered easier to read, as is abbreviating names by prefix instead of removing vowels. Name hashes in the singular and arrays in the plural.
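A small sketch of those naming rules as I understood them. All of the identifiers here (@clients, %quota_for, $max_quota) are hypothetical examples of the style, not names from the talk.

```perl
use strict;
use warnings;

# Arrays are named in the plural: they are usually handled as a collection.
my @clients = ('alice', 'bob');

# Hashes are named in the singular: each lookup refers to a single entry.
my %quota_for = (
    alice => 10,
    bob   => 25,
);

# Underscores rather than interCaps, and abbreviation by prefix
# ("max" for "maximum") rather than by dropping vowels ("mxm").
my $max_quota = 0;
for my $client (@clients) {
    if ($quota_for{$client} > $max_quota) {
        $max_quota = $quota_for{$client};
    }
}

print "Largest quota: $max_quota\n";
```

The singular hash name reads naturally at the point of use: $quota_for{$client} is "the quota for this client".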

“Thinking causes problems.” So don’t think, use templates for forming identifiers.

Moving on to variables, watch out for the so-called “punctuation variables” — nobody is comfortable with all of them. Don’t use package variables unless you absolutely have to. One thing I didn’t know is that $a and $b (variables used for sorting) are completely immune to use strict. Lexical variables should be used within modules, and subroutines used to modify those variables. Watch out for modifying $_.
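Here is a rough sketch of the "lexical variable plus accessor subroutines" idea, along with the $a and $b surprise. The Counter package and its subroutine names are hypothetical, chosen only to illustrate the pattern.

```perl
use strict;
use warnings;

{
    package Counter;    # hypothetical module name

    # A lexical variable: invisible outside this block, so the only way
    # to read or change it is through the subroutines below.
    my $count = 0;

    sub increment {
        $count++;
        return $count;
    }

    sub current {
        return $count;
    }
}

for my $attempt (1..3) {
    Counter::increment();
}

# $a and $b are never declared here, yet "use strict" does not complain,
# because they are the special package variables used by sort.
my @sorted = sort { $a <=> $b } (10, 2, 33);

print Counter::current(), " | @sorted\n";
```

Compared with a package variable such as $Counter::count, nothing outside the block can change the counter behind the module's back.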

Conway doesn’t like postfix-style control structures (do this if that), and prefers it the other way around (if that do this). He doesn’t like the C-like three-statement for loop, preferring the more Perlish for loop (for my $n ( 4..$max )), because he doesn’t have to think. Don’t subscript within loops; if you’re going through values in an array then ideally do something like for my $client ( @clients ) { ... }, or at worst take a copy of that value within the loop (suppose you need the iterator variable to print out client numbers), then do operations on it. The same goes for the values of a hash, as long as you’re using Perl 5.6 or higher. (Note to self for when I get back to work: look into Data::Alias)
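The loop advice, sketched out under my own assumptions. The data and names (@clients, %score_for) are invented for the example.

```perl
use strict;
use warnings;

my @clients = ('alice', 'bob', 'carol');    # hypothetical data

# Ideal case: iterate over the values themselves, no subscripting at all.
my @greetings;
for my $client (@clients) {
    push @greetings, "Hello, $client";
}

# When the index is genuinely needed (here, to print client numbers),
# subscript once, copy the value, and work with the copy from then on.
my @numbered;
for my $n (0..$#clients) {
    my $client = $clients[$n];
    push @numbered, ($n + 1) . ": $client";
}

# The same approach for hash values: each $score is an alias to the
# stored value, so assigning to it updates the hash without any lookup.
my %score_for = (
    alice => 1,
    bob   => 2,
);
for my $score (values %score_for) {
    $score *= 10;
}

print join("\n", @greetings, @numbered), "\n";
```

Note that the iterator variable is an alias in the first loop too, so assigning to $client there would change @clients itself.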

When you write a for loop, don’t use explicit $_, use an explicit iterator instead (i.e. for my $candidate (@candidates) { ... }). Never leave off the my when naming an explicit iterator. Also, don’t trust use strict. Use it, but don’t trust it. “Perl will shoot you in the head if it possibly can. Doesn’t believe in the body shot.”

This coming section would be Tim’s favourite: when to use map instead of for (I always use for and Tim always changes them to map). You get more compact code, use less memory, and it’s probably going to be faster. Use map when you want to generate a new list from an old one, and use for to transform a list in-place.
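So that I remember the distinction, a minimal sketch of the two cases. The price data is made up.

```perl
use strict;
use warnings;

my @prices = (10, 20, 30);    # hypothetical data

# Generating a new list from an old one: map.
# @prices is left untouched; @doubled is a separate list.
my @doubled = map { $_ * 2 } @prices;

# Transforming a list in place: for.
# $price is an alias to each element, so the assignment changes @prices.
for my $price (@prices) {
    $price += 1;
}

print "doubled: @doubled\n";
print "prices:  @prices\n";
```

The first form builds the new list in one expression; the for version of it would need a separate declaration and a push inside the loop.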

When producing a value that depends on a series of tests, use the ternary operator, and line things up so the question marks line up in a column and the colons line up in another column.
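Lined up like that, a chain of ternaries reads like a small lookup table. A sketch, with a hypothetical size_label subroutine and thresholds that I picked arbitrarily:

```perl
use strict;
use warnings;

# Hypothetical helper: classify a byte count by a series of tests.
# Tests in the first column, question marks in the second,
# results in the third, and the colons in a column of their own.
sub size_label {
    my ($bytes) = @_;

    return $bytes < 1_000     ? 'small'
         : $bytes < 1_000_000 ? 'medium'
         :                      'large';
}

for my $bytes (500, 50_000, 5_000_000) {
    print "$bytes bytes is ", size_label($bytes), "\n";
}
```

The final line, with no test, acts as the default case.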

Documentation! Development programmers hate documentation, maintenance programmers love it. Since we’re both, just do documentation. “Documentation is a love letter we write to our future self.” Create a standard POD template and just fill it in: h2xs -X, ExtUtils::ModuleMaker and Module::Starter (or Module::Starter::PBP) can do this for you. The DIAGNOSTICS section should be pulled directly from every die, croak, and carp statement in your module, as well as every output to STDERR. Document what’s already broken. Remember: “the primary use of user documentation is so you don’t have to interact with the user.” A disclaimer of warranty is probably the most important thing to put in these days. Acknowledgements are a good thing to put in as well, because they make people happy.
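As a reminder of what such a template might look like, here is a cut-down sketch. The module name My::Example, its double subroutine, and the exact choice of section headings are my own assumptions; the templates generated by the tools above contain more sections than this.

```perl
package My::Example;    # hypothetical module name

use strict;
use warnings;
use Carp;

# Double a number, complaining if none was supplied.
sub double {
    my ($number) = @_;

    if (!defined $number) {
        croak 'No number supplied';
    }

    return $number * 2;
}

1;

=head1 NAME

My::Example - one-line description of what the module does

=head1 SYNOPSIS

    use My::Example;
    my $result = My::Example::double(21);

=head1 DESCRIPTION

A fuller description of the module and its interface.

=head1 DIAGNOSTICS

=over

=item C<< No number supplied >>

double() was called without an argument.

=back

=head1 BUGS AND LIMITATIONS

Document what is already broken here.

=head1 ACKNOWLEDGEMENTS

Thank the people who helped.

=head1 DISCLAIMER OF WARRANTY

The warranty disclaimer text goes here.

=cut
```

The single entry under DIAGNOSTICS matches the one croak message in the code, which is the correspondence Conway was asking for.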

Develop a consistent commenting system and use it. I was the only person to stick their hand up when Conway asked who has colleagues who write good comments (I’m thinking primarily of Malcolm’s code here, although most of the Starlink code is extremely well-commented). Use =for sections for larger discussions. A comment belongs anywhere something has puzzled you or tricked you. “If it fooled you once, it will fool you, or whoever comes after you, again.”
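A quick sketch of the two kinds of comment, as I understand the suggestion. The Rationale label, the report_number_for subroutine and the scenario are hypothetical.

```perl
use strict;
use warnings;

=for Rationale:
    Reports number clients from 1 because that is what the users expect,
    while the client list itself is an ordinary zero-based array. The
    conversion is kept in one subroutine so that nobody has to remember
    which convention applies where.

=cut

sub report_number_for {
    my ($index) = @_;

    # This one fooled me once: indices start at 0, report numbers at 1.
    return $index + 1;
}

print 'First client is number ', report_number_for(0), "\n";
```

The =for block is skipped by perl up to the =cut, so the longer discussion costs nothing at run time, and the short inline comment marks the spot that caused the confusion.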

Thus ends the first half.

[link to second half]
