Friday, December 4, 2009

Joinery

The join command has some useful options.

Hal Pomeranz has a nice example of using join to combine the output of two different commands in this week's Command-Line Kung Fu column.

After some discussion, he ends up with this:

$ join -1 1 -2 2 <(openssl sha1 * | sed -r 's/SHA1\((.*)\)= (.*)/\1 \2/') <(wc -c *) \
| awk '{print $2 " " $1 " " $3}'

One reason he uses openssl is to help teach that, in process substitution, the contents of <( ) can be a pipeline. If you relax his didactic requirement, join's options let you do the job with a lot less typing.

$ join -j2 -o 1.1,0,2.1 <(sha1sum *) <(wc -c *)

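Note that sha1sum has the same output format as wc -c (a value, then the filename), which makes the join easier: a single -j2 names the join field for both inputs, and -o puts the fields in the order you want without awk.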
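To see what the options do without depending on whatever files happen to be in your directory, here is a minimal sketch using canned input. The hashes and sizes are made up, and the two temp files stand in for the two process substitutions. It also shows that the "total" line wc prints for multiple files has no partner in the other input, so join drops it by default.

```shell
# Canned stand-ins for `sha1sum *` and `wc -c *` output (values are made up).
sums=$(mktemp)
sizes=$(mktemp)
printf '%s\n' 'aaa111  a.txt' 'bbb222  b.txt' > "$sums"
printf '%s\n' ' 10 a.txt' ' 20 b.txt' ' 30 total' > "$sizes"

# -j2           join on field 2 of both inputs (the filename)
# -o 1.1,0,2.1  print input 1's field 1 (hash), the join field (filename),
#               then input 2's field 1 (size)
result=$(join -j2 -o 1.1,0,2.1 "$sums" "$sizes")
echo "$result"

rm -f "$sums" "$sizes"
```

Each output line is hash, filename, size, the same order Hal's awk step produces.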

Non-Linux boxes might not have sha1sum, but if I didn't have it, I'd see if I had md5sum, which has the same output format. Their Wikipedia entries say these commands are available on many non-Linux OSs.
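If you wanted to script that check, one way is sketched below. pick_sum is a hypothetical helper name of my own, not a standard command; it just prints the first of the two checksum tools it finds on the PATH.

```shell
# pick_sum: hypothetical helper, prints the first available checksum command.
pick_sum() {
  for cmd in sha1sum md5sum; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  return 1   # neither command was found
}
```

With that defined, the one-liner could become something like join -j2 -o 1.1,0,2.1 <($(pick_sum) *) <(wc -c *), keeping in mind that falling back to md5sum gives you MD5 hashes rather than SHA-1.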
