Monday, March 31, 2008

Old Dogs, New Tricks: Brace Expansion

I revisit my habits a lot.

It's a sentinel. You can't teach an old dog new tricks, so when I start being able to learn new things, I'll know I'm entering my second childhood.

For years, I used square brackets to match filenames. Now, I use the newer, Berkeley-esque, curlies more often.
$ echo x.[ab]
x.a x.b
$ echo x.{a,b}
x.a x.b
They're different. For curlies, order counts:
$ echo x.[ba]
x.a x.b
$ echo x.{b,a}
x.b x.a
Also, x.[ba] must match something.
$ mkdir /tmp/FOO; cd /tmp/FOO
$ ls # nothing there
$ touch x.[ba]
$ ls
x.[ba] # one file!
$ rm -f *
$ touch x.{b,a}
$ ls
x.a x.b
They're conceptually different. While square brackets are a file-name-match operation, curlies are text-substitution. When you use square brackets, you're asking for existing files that match the pattern; when you use curlies, it's a typing shorthand that the shell expands for you before it does anything else.

The items can nest, and they don't even have to have anything in them.
$ echo x{,a,b{s,t}}
x xa xbs xbt
Curly-expansion is officially called brace expansion, and square-bracket expansion is part of pathname expansion, or (more commonly) globbing.

Friday, March 28, 2008

(a)Time Is On My Side

When I'm developing and debugging, it helps me to know when files were read and commands were executed.

I take debugging advice from the Rolling Stones.

Linux, like Unix, gives you this information as a file's "atime" (access time). I do ls -ulrt several times a week -- often several times a day -- to find out what files were used most recently.

Think of "-u" as the "useful" flag.

Here's an example experiment:
$ echo hello > foo
$ sleep 120; cat foo; date; ls -l foo; ls -ul foo
$ sleep 120; cat foo; date; ls -l foo; ls -ul foo
Note that ls -l reports the modification time -- when the file was last written -- but ls -ul matches the output of date.

Executing a command requires that the system read it, so that, too, updates the command's atime.

I find things being used in a subtree like this:
$ touch /tmp/sentinel
# do some stuff
$ find . -anewer /tmp/sentinel
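Here's a self-contained rerun of that sentinel trick. I've used -newer (modification time) so it behaves even where relatime masks atime updates; -anewer, as above, is the access-time version:

```shell
# the sentinel pattern: anything touched after the sentinel shows up
dir=$(mktemp -d)
cd "$dir"
touch before.txt
touch sentinel
sleep 1                       # let the clock tick past the sentinel
touch after.txt               # "do some stuff"
find . -newer sentinel -name '*.txt'    # only ./after.txt appears
```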
When I was trying to find out which files were read by kickstart, I just ran kickstart and did an ls -ulrt on my kickstart server ... and found out that Fedora-8 has broken atime.

It's added relatime (relative atime), a hack that updates a file's atime only when it's older than the modification time. The result: the *first* ls -ul after a change works, but later ones don't. This gives better performance on running systems, but takes away a traditional, and handy, debugging tool.


Reading documentation, Googling, and begging for help on the BLUG mailing list all failed. Finally, Kevin Fenzi cheerily gave me the solution:
$ sudo bash -c 'echo 0 > /proc/sys/fs/default_relatime'
$ sudo mount -o remount,atime /
Nice. Thanks Kevin.

How did Kevin figure it out? He read the kernel code.

Thursday, March 27, 2008

Checking a Serial Cable -- The Coolest Trick Ever

Yesterday afternoon, Jim Black showed me how to check a serial cable.

I was having trouble connecting to a board over a serial port, and thought it might be my cable. Jim showed me that if you connect pins 2 & 3, it will connect transmit to receive, and if you type at it, the characters will echo to the screen.

Here, Jim's connecting the pins with a screwdriver. When he first showed me, he used a pair of scissors that were lying around.

Jim says it's one of his "old man tricks."

He also taught me not to touch the case with the connector, since that will ground the connector and send my transmissions to /dev/dirt .

"Serial cables: the most non-standard standard, ever." -- Jim Black

Wednesday, March 26, 2008

Mounting File Systems with Ssh and Fuse

Yesterday, I mounted my laptop, at home, from my desktop, at work, with ssh. Not NFS, not Samba: ssh. Different domain, different location. Secure, fast, hassle-free.

I started with a recipe out of a book I got for Christmas, Carla Schroeder's Linux Networking Cookbook, then Googled around until I found a mount table entry that would do the trick. Now, I can just say mount /mnt/sshfs from my desktop, at work, and my home directory from my laptop at home appears.

Here's the fstab entry from the work box:

sshfs#user@laptopname:/home/user /mnt/sshfs fuse port=portnumber,user,noauto 0 0

You also need to install FUSE, the "filesystem in user space" module, (apt-get install fuse), and, of course, ssh, but I'd already done those for other reasons.

The whole shebang goes by the name sshfs.

(Wikipedia's FUSE entry, linked to above, says there's even a user-space filesystem built on Gmail.)

Monday, March 24, 2008

Freezing Lines in Oocalc

We got our bikes serviced, so they're ready for the season, and I added a new column to Vital_statistics.ods, the spreadsheet I'm keeping my daily steps and weight data in: Bike Miles.

She was watching over my shoulder as I was putting in the first entry -- the trip we took to Baseline and Broadway -- and she said, "You know, you can make the column headers stay in the window when you scroll."

She showed me how. Click on the number of the blank row below the column headers, to mark it, then Window->Freeze. Now, as I scroll, the headers stay visible.

Learning one trick at a time.

Sunday, March 23, 2008

Constant As a Northern Star: Read-only Variables in Bash

I didn't know bash had read-only variables.

I found out by peeking at the code for s3-bash.

Well, I guess I did know, since I can't make assignments to variables like $1, or $UID. I just didn't know I could create them. Well, I can.
$ PI=3.14159
$ echo $PI
3.14159
$ PI=3
$ echo $PI
3
$ readonly PI=3.14159
$ PI=3.14159
bash: PI: readonly variable
$ echo $PI
3.14159
$ PI=3
bash: PI: readonly variable
$ echo $PI
3.14159
Caveat programmor: In scripts, they're great for creating constants; however, once you make a variable readonly, there's no way to change it back to read/write. If you make a variable readonly in an interactive shell, you have to exit the shell to get rid of it.

"Adult supervision required. Don't try this at home, kids."

Friday, March 21, 2008

PMS Bad. RPMS, Worse.

I have been playing with Red Hat Package Manager: creating rpms, installing them, and so on. I've done this before but had nearly blocked out the experience.

The process is gawky. It feels like a robot arm with extra elbows.

(In the early days of robotics, at MIT, researchers built robot arms with ever-increasing numbers of elbows. In the end, they discovered that an elbow and a shoulder sufficed. The cost of building, debugging, and maintaining software that handled more wasn't worth it.)

I wonder if making Debian packages is easier.

Thursday, March 20, 2008

Here's a header.

Now that I can use Google Docs as my word processor, for blogging, I'll take a little time to explore things it can do -- get comfy with it.

First, I need to organize what's there. I have a long list of documents, by now (I started using it when it was Writely), but I've never really lived in it. It has "folders" now, so I suppose I should make some. Lessee ...

Creating a folder and moving a document to it requires a little clickery, but isn't awful.

What all's in the menus? I'll go left-to-right. First, File.

Control-s saves. That's good.

What else? File->Document settings? Oooh. I can set a default font. Time for a little, sample Font book:

I was able to switch fonts yesterday, but not, it seems, today. Perhaps that's what they mean by Beta.

Still, it does look like I can do color, and size. Here's some blue text.

Here's a link to Google Docs and Spreadsheets. And here is my email address.

Now to do kinkier stuff. What's Insert->Comment do?

I'm not sure. Looks like it puts it into a little dotted box. I wonder if it'll appear in the blog post.
(Also, it seems to have some trouble switching back to normal font sizes when I tell it to. Maybe that's a beta thing, or maybe I just have to get used to it.)

And a bookmark? I'll set some bookmarks up in the red-backgrounded text. Hmm.

Insert->Special Characters? ¡Hola! Not a great inverted bang, but usable. Headers and footers? Sure. See for yourself.

Okay, enough playing for a bit. I'll post the sucker, and see how it looks.

Wednesday, March 19, 2008

Blogging With a Word Processor

Can I make blog entries through Google Docs ? The last time I tried this, it went to the wrong blog. The Settings interface looks like it's fixed now. This text is in Comic Sans. Here comes a bunch of quoted courier:

for i in *
do
    echo $i
done

What Happened to Atime? (And What Do I Do Without It?)

On Fedora 8, atime's gone.

Historically, the access time (atime) of a file changes whenever you read or execute it. You change the atime of a directory by doing an ls. All of these are requests for the system to go out and look at the contents of something in the file system.

You can look at the atime of a file with the -u ("useful") option of ls. "Is my application actually reading its configuration file? I'll just do ls -ul app.conf."

As the accompanying screenshot shows, this no longer works in Fedora 8. The terminal in the southwest corner is Fedora, the one in the northeast corner is Ubuntu. [Click on the screenshot to enlarge it.]

I think I can still get the information I need with strace, but it's clumsy. I wish I knew how to turn atime back on. I thought I could do it with chattr -A, or with mount -o remount,atime, but neither seems to do the trick.

If you know how to get Fedora 8 to give me atimes, I'd love to know.

Tuesday, March 18, 2008

A Mystery: fgconsole and check-foreground-console

Linux, like Unix before it, lets me have several virtual terminals.

On my Ubuntu box, I switch between these with Control-Alt-F[1-7], where Control-Alt-F7 is my default, GUI environment, and the others are garden-variety, full-screen consoles -- tty[1-6].

(The terminals I put up with gnome-terminal are pseudo-ttys: /dev/pts/1 and so on.)

I spend all of my time in the windowing system. The others are primarily useful when my GUI is mis-behaving, though they can also show useful information at boot time.

I can see that Gnome is running on tty7 with ps.
$ ps ajax | grep '[t]ty7'
4879 4884 4884 4884 tty7 4884 Ss+ 0 10:56 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7
If I'm writing a script, can I see which terminal is in the foreground? Yep. fgconsole does the trick.
$ fgconsole
7
$ sleep 10; fgconsole   # switch to a virtual terminal during the sleep
1
$ sleep 20              # time to switch back to the GUI
While I'm on virtual terminal 1, fgconsole runs, and prints a 1.

There's another command in /bin that seems like it could be related, but I can't figure out what it does: check-foreground-console.

There's no man page or usage message, Google turns up nothing useful, and no experiment I've done seems to have any effect on it. It always exits silently with an exit code of 0, though it requires I be root to run it.

If you know what it's about, tell me.

Monday, March 17, 2008

What's in /bin? bzip2 and friends, for starters.

What's in /bin? /bin contains, in theory, everything the system needs to come up.

On some systems, /usr is a mounted filesystem, so nothing in /usr/bin is around until the system gets far enough up to do mounts.

Other than that constraint, it's up in the air. I've seen systems where /bin has almost nothing, and others where /usr/bin is just a symlink to /bin.

On my Ubuntu box? Let's look.
$ ls /bin | wc -l
That's a lot. How about just the first 10%?
$ ls /bin | head
There's bash and ... a whole lot of bzip2 utilities?

The next three are also bzip2 utilities, so that's over 10%. What are they?

bzip2 is a compression algorithm that's (much) slower but (slightly) tighter than gzip. Why's it in /bin and what are all these related things?
$ bzexe
compress executables. original file foo is renamed to foo~
usage: bzexe [-d] files...
-d decompress the executables
A quick look at the man page says that bzexe produces compressed, but self-decompressing versions of executables. If you have a tiny disk, you can run bzexe on all your executables to compress them, but still use them as though they were uncompressed.


Let's try it.
$ cp /bin/date .
$ bzexe date ; ls -l
date: 1.993:1, 4.014 bits/byte, 49.82% saved, 45648 in, 22906 out.
$ ls -l date*
-rwxr-xr-x 1 jsh jsh 23561 2008-03-17 06:44 date
-rwxr-xr-x 1 jsh jsh 45648 2008-03-17 06:44 date~
(bzexe leaves the original version in date~)

And now, I'll show off the result.
$ ./date
./date: 22: /usr/bin/bzip2: not found
Cannot decompress ./date
Oops. Sure enough, a quick peek inside the file shows it's invoking /usr/bin/bzip2 to unzip itself. Changing it to invoke /bin/bzip2, instead, fixes the problem.

Do any other bzip2-related utils have this problem? I won't bother to find out; I'll just file a bug report.

When someone tells me he's fixing bugs in my products, I'll look puzzled and say, "I thought all the bugs were fixed." It usually gets a laugh.

Sunday, March 16, 2008

CLUE Installfest

Dave Anselmi has just put together another successful CLUE installfest. He's doing them every three months or so.

I went to the first one just to put faces to names: two guys who said they were going, Dave and Collins Ritchie, are so consistently helpful and sane on mailing lists that I wanted to see what they looked like.

Here they are, in the flesh. Okay, "in the pixels." To see them in the flesh, come to an installfest.

The other picture is Mike, Richard, and Kathryn. Richard's in the middle, helping Mike and Kathryn install Linux on their boxes. Mike's putting Fedora 7 on his desktop, and Kathryn's putting Ubuntu on her laptop.

It's fun to help folks with stuff. When they thank me for helping them, I sometimes say, "Well, it's not free. I do charge my standard fee ...."

When they start to look concerned, I say, "You have to help someone else."

Richard came over to Caffe Sole a few weeks ago, to ask for help installing Linux on his laptop. Now, he's passing it on.

Setting Deadlines: cutoff-after

When a command runs too long, I give it the hook.

If it's running too long but I want it to continue, I suspend it and put it in the background.
$ sleep 100; echo hello
[1]+ Stopped sleep 100
$ bg
[1]+ sleep 100 &
Other times, though, I just kill it.
$ sleep 100
Something's gone wrong and the job has hung, or I misjudged and I need to re-design something to speed up my code.

All fine, if I'm watching, but what about when the job's running at night, from a cron job? What if I come in the next morning, nothing's finished, my job's still running, and my machine's on its knees?

For these, I have a tool, cutoff-after, that sets deadlines. It runs my jobs for a certain amount of time, but cuts them off if they don't finish. [Click on the screenshot to enlarge.]

The example illustrates that it offers three outcomes: success, failure, and timeout. If a command completes on time, its exit status is preserved. [Here, I'm showing you success or failure by coloring my prompt.] If the command times out, cutoff-after fails, but announces the timeout. All three outcomes are announced on standard error.

Here's the code:
$ cat cutoff-after
#!/usr/bin/perl
# adapted from the Perl Cookbook, pp. 594-595

# play nice
use warnings;
use strict;

# parse and sanity-check args
sub usage {
    $0 =~ s(.*/)();
    die "usage: $0 Nsecs cmd-with-args\n";
}

@ARGV > 1 or usage;

my ( $deadline, @cmd ) = @ARGV;
( $deadline =~ /^\d+$/ ) && ( $deadline > 0 ) or usage;

sub try {

    eval {

        # set the deadline
        $SIG{ALRM} = sub { die; };
        alarm $deadline;

        # execute the command
        system "@_";    # command and arguments
        alarm 0;

        # say what happened
        die( ( $? ? "failure" : "success" ) . ": @cmd\n" );
    };

    catch();
}

sub catch {
    if ( $@ =~ /^success: / ) {
        warn $@;
        exit 0;
    }
    elsif ( $@ =~ /^failure: / ) {
        die "$@";
    }

    # a timeout: kill off the whole process group
    # cleanliness is next to godliness
    local $SIG{TERM} = 'IGNORE';
    kill "TERM", -$$;

    die "timeout: @cmd\n";
}

try @cmd;
There's probably a shorter implementation in other languages.
If you want to try, here are test cases encoded in a shell script:
$ cat cutoff-after.t
#!/bin/bash -x

cutoff-after --help
echo bad usage returns $?
cutoff-after 2 sleep 1
echo success returns $?
cutoff-after 1 ls nonexistentfilename
echo failure returns $?
cutoff-after 1 sleep -1
echo different failure returns $?
cutoff-after 1 sleep 2
echo timeout returns $?
cutoff-after 1 'false; echo hello'
echo compound statement returns $?

Saturday, March 15, 2008

Ned McClain

Ned McClain is CTO of Applied Trust, which does IT infrastructure consulting.

Ned's interested in Green IT -- he's calling it Ecoinfrastructure and came over to the Boulder Linux Users Group to talk about it, night before last. It would, especially in Boulder, have been easy to make it a management-level, Political Correctness talk; instead, Ned had lots of useful, pragmatic, non-trivial tips.

This weekend, for example, I plan to try out Amazon's EC2 and S3 facilities because of his talk.

Ned is also one of my Barbie's Guide to Linux co-conspirators.

Friday, March 14, 2008

Ninh Nguyen

Here's a picture of Ninh at his going-away lunch, day before yesterday. Ninh has been at Aztek a couple of times, and I hope he comes back again. He's leaving to trade his 80-minute commute for a 15-minute one.

Ninh's dad was imprisoned by the communists in North Vietnam, where Ninh was born. He escaped to freedom, but had to flee again, with his family, when the communists invaded South Vietnam in turn.

His dad's now 100 years old. Ninh sometimes worries about him. "He eat too much fat."

Thursday, March 13, 2008

Free Usenix Proceedings

Usenix conference proceedings are now free, here.

There's a lot of good stuff in these, and some odd stuff, too.

You don't have to be a member to see them any more, but join anyway. It's a good organization. If you're in a big company, ask them to buy you a membership.

And, speaking of organizations, there's a BLUG meeting tonight.

Wednesday, March 12, 2008

Color-Coding My Prompt

Unless you're the 1-in-12 -- a colorblind white male -- color is a great cue.

I use stoplight colors to tell me when commands succeed and fail: red for failure, green for success. [Click on the image to enlarge it.]

It's less jarring than a flash or a beep.

It's also more persistent, like logging. If I get distracted, the next time I look at the screen, I still see the outcome; there doesn't even need to be an error message.

This is often a good companion to logging, since I can send all the error messages to a log and still know whether the job succeeds or fails, so long as I'm careful to set exit codes correctly in my scripts.

Here's code from my .bashrc that does the trick. $PROMPT_COMMAND is documented in the bash man page.

I also use the usual trick to put stuff in the window decoration, so I've thrown it in to make the whole code block easy to cut-and-paste.
# put the current directory into the X window decoration
echo -ne "\e]0;${USER}@${HOSTNAME}: ${PWD/$HOME/~}\007"

# set up a pair of prompts -- one for success, one for failure
prompt ()
{
    # grab the last command's exit status before anything here changes it
    local status=$?

    # color names
    local _GREEN='\[\e[0;32m\]'
    local _RED='\[\e[0;31m\]'
    local _NONE='\[\e[0m\]'

    # color success and failure differently
    # for the shell, success is 0, failure 1
    # I use the directory name for my indicator
    local _DIR
    _DIR[0]="$_GREEN\W$_NONE"
    _DIR[1]="$_RED\W$_NONE"

    # color constant pieces of the prompt
    local _MACH="\h"
    local _PCH="\$ "

    # the array of prompts
    local _PS
    _PS[0]="$_MACH:${_DIR[0]}$_PCH "
    _PS[1]="$_MACH:${_DIR[1]}$_PCH "

    # choose the prompt based on the exit status of the last command
    if [ $status -eq 0 ]
    then
        PS1="${_PS[0]}"
    else
        PS1="${_PS[1]}"
    fi
}

PROMPT_COMMAND=prompt

Aw heck. I cleaned up this code to post it. Last night, I stuck it into my .bashrc, and immediately found a bug. I'll fix it tonight. Grumble.

Update 2:

Okay. Fixed now. Sorry about that.

Unfortunately, Blogger seems not to be very good about indenting code the way I want. I'll have to investigate that further.

Tuesday, March 11, 2008

Distro-specific code: lsb_release

I like to write portable code. Sometimes, I can't. lsb_release helps me solve this problem.

To find out what distro I'm on, I've always written ad-hoc code that looks for some distribution-specific feature, like /etc/redhat-release, and, if necessary, parses the output.

Now, the Linux Standards Base has a command that'll do the job for me.
$ lsb_release -a 2>/dev/null
Distributor ID: Ubuntu
Description: Ubuntu 7.10
Release: 7.10
Codename: gutsy
(There's an error message saying "No LSB modules are available." on this distro, which I'm discarding.)

This is roughly analogous to "uname -a", but about the distro, not the box.
$ uname -a
Linux jsh-laptop 2.6.22-14-generic #1 SMP Tue Feb 12 07:42:25 UTC 2008 i686 GNU/Linux
It lets me sequester distro-specific actions in code like this:
case $(lsb_release -i 2>/dev/null) in
*Ubuntu*) echo This is Ubuntu ;;
*) echo This is NOT Ubuntu ;;
esac
Old pal, Nick Stoughton, our Usenix Standards Rep, points me at the relevant standard, here.

Monday, March 10, 2008

Nancy Parker (bumped)

Nancy Parker works at ProtoTest, headhunting for testers. Who'd have thought that was a business? They have, Nancy says, "fewer than 100 employees."

Her business card says "Technical Recruiter," but she's actually a geek. Back when she was trying to find me a job, she gave me a quiz, and asked me questions about vi. (Which, I'll add, she knew the answers to.)

Nancy's also the PR person for the Software Quality Association of Denver (SQuAD).

Friday afternoon, Marcia Derr and Nancy and I went to coffee at Pete's. I wanted to introduce them because they're both saxophone players.


Nancy was nervous about having her pictures up on the web, so she asked me to take 'em off. No problem!

If you want to see Nancy, you'll have to see her in person. Call her up and ask her to find you a job, for example.

Easy Debugging for Shell Scripts

To fix things that aren't working, I want to know where and when they failed.

Here's how you can get that information for shell scripts, without much work:
$ cat
# -- turn on command timestamping

timestamps() {
    if [ "$1" = "off" ]
    then
        set +x
    else
        PS4='== $(date)\n'
        set -x
    fi
}
$ cat example


timestamps on
echo hello
timestamps off
echo goodbye
$ ./example 2>/dev/null
hello
goodbye
$ ./example
=== Mon Mar 10 06:55:06 MDT 2008
echo hello
hello
=== Mon Mar 10 06:55:06 MDT 2008
timestamps off
=== Mon Mar 10 06:55:06 MDT 2008
'[' off = off ']'
=== Mon Mar 10 06:55:06 MDT 2008
set +x
goodbye
This trick has nice features:
  1. All commands are printed out before they're executed.
  2. Each command is timestamped, so you know when it's executed. (This helps when you're logging.)
  3. All the debug information is sent to stderr; you can get rid of it and just see normal output with 2>/dev/null
  4. You can choose sections of your code to track or ignore.
  5. It's a trivial amount of code: a pauci-line module, and a single line added to your script.

Sunday, March 9, 2008

Spanish Ladies


Here's my first attempt to make and post a video in a blog. Kristina took this with my cell: Vergil Mueller and I are playing Spanish Ladies. Yes, it's distorted. Yes, the audio and video aren't synced. Yes, the phone couldn't even record the tune once through.

It's a video, taken on a cell phone and posted on a blog.

One of my friends, BA, said of Paul MacReady's Solar Challenger, "That's not just a solar powered airplane, it's a Solar. Powered. Airplane."

Gotta start somewhere.

His Master's Voice

Just now, I got bluetooth proximity detection working.

When I sit down at my laptop, the laptop unlocks. When I walk away, it locks.

It's detecting the presence of the cell phone in my pocket, via bluetooth.

If it's locked and I don't have my cell, I just log in normally.

It's Nipper-in-a-box. Or another case of Enterprise computing.

(Why would someone even want this? Oh, come on. Next question?)

Saturday, March 8, 2008

Amazon Primery

I'm an Amazon Prime member.

I do a lot of my shopping at Amazon, so it's a huge win. For years, I've bought all my gifts from them and just had them shipped. Lately, I'm buying household items, like coffee filters, from Amazon, too. I have free, two-day shipping, so why not?

It's $80/year, but I share it with four other people, so that's really $16 apiece.

It can be a little time consuming to look for the Amazon-Prime-eligible items, but John Hernandez just sent me a bookmarklet that winnows my Amazon searches down to Amazon Prime items.
Simple, elegant, and it works. John's a minimalist.

Friday, March 7, 2008

Free Shipping from Amazon

I've been an Amazon Prime member since Amazon first offered it. I buy almost all my Christmas and birthday presents from them, so the service pays for itself.

I can also share it with up to four other people, so my sibs are Amazon Prime members, too.

Not everybody is. My pal, Spider, asked me the other day what she could buy at Amazon that would bring her total high enough to get free shipping. Here's the tool for her: the Amazon Filler Item Finder.

This is cool technology and interesting design: a micro-service that does one thing, well. The web site is, itself, cool because it's so minimalist.

Thursday, March 6, 2008

Exporting Shell Variables "Upwards"

A common question is "How can I set a variable in a shell script and then use it after the shell script finishes executing?"

You can't.
$ cat anotherscript
x=foo
$ x=69; ./anotherscript; echo $x
69

"Can't I do this with export?" Nope, export only exports downwards in the process tree.

When you want to modularize your shell scripts, by wrapping pieces in little modules, how can you do it?

One way is to write the result to standard out:
$ cat anotherscript
x=foo
echo $x
$ x=69; x=$(./anotherscript); echo $x
foo
This works, but if you want to pass up more than one variable, it gets awkward.

A second is to source the module.
$ cat anotherscript
x=foo
y=bar
$ source anotherscript
$ echo $x, $y
foo, bar
This runs everything in the same shell, so it exposes everything the module's doing. That's not exactly "modular."

Here's a third approach that lets you "export" variables "upwards." You have to use eval, but it gives you what you want with a reasonable syntax.
$ cat example
source # a magic module

raise foo bar # mark variables to be "raised"

foo=hello # do stuff with them in the script
bar=world

$ eval $(example) # now use the script
$ echo $foo, $bar # and see that they're available.
hello, world
$ cat
raise() {
    trap "4eval $*" EXIT
}

4eval() {
    for i in $*
    do
        echo $i=${!i}
    done
}

Tuesday, March 4, 2008

Rob Savoye

Yes, I haven't posted pictures for a few days. I have recent ones, but some idiot keeps sticking his finger in front of my lens.

In penance, here's a picture of an intense Rob Savoye, from dinner before the last BLUG meeting.

This isn't even the best picture I have of Rob, but it's the best one from the dinner. I'll wait to post pictures from his talk until we get the YouTube video up.

(It'll go up as soon as Ted Logan, our videographer, can free up the 30G of disk space he needs for the data conversions. Disk space isn't quite free yet.)

Great talk, too. Once it's up, watch it.

Logging: How Did Things Turn Out?

A post, today, to the Boulder Linux Users' Group mailing list asks this interesting question:
I have a bash script that is setting a series of processes in the background and I want to see what their exit codes are. I have played around trying to find a way to do it and I haven't found a reliable way, yet. Any suggestions?
Here's my solution, which contains a generally useful trick: using a filename to contain information you can use at a glance.
The biggest problem is saving $? for each command, especially when you're throwing some or all of them into the background, so they run in parallel.

An easy solution, if practical, is to wrap each process in a shell script that saves its exit status.

$ cat foo
ls bogus-filename
echo $? > $0.status
$ foo &> /dev/null &
$ cat foo.status
2

As a variant on that, I sometimes use a logging module, which redirects all output to a logfile and renames the logfile on completion. Some variant on this will do the trick:

$ cat foo
source ./
ls $* # your command goes here
$ cat
# save a log, with the error status

LOG=$0.out
rm -f $0.[0-9]* $0.out
exec &>$LOG
trap 'mv $LOG ${LOG/out/$?}' EXIT
$ foo; ls
$ foo bogusfile; ls

Hope this helps.
You can see from the logfile name whether the job's still running or how it exited. Moreover, while the job is still running, you can watch its progress with a tail -f on the logfile.

Encapsulating tricks like this in shell modules means you don't have to re-invent the wheel every time.

More Head

The last post lamented the lack of an easy way to get "all but the last N lines" of a file without programming.

Here, just in case you need it, is such a program:
$ cat ~/bin/allbut
#!/usr/bin/perl -ws
use strict;

die "usage: $0 [-h] [-n=N] [filename]\n" if our $h;

our $n ||= 10; # how many on the end to omit; default=10;

@_ = (undef) x $n;

while (<>) {
    push @_, $_;
    $_ = shift @_;
    print $_ if defined $_;
}
$ for i in {1..20}; do echo $i; done | allbut -n=13
Like head and tail, allbut defaults to 10, so allbut foo prints all but the last 10 lines of foo.

I was tempted to call it callipygian; however, that would be "all butt," and risk confusion with tail.

Heads or Tails

If you want the beginning or end of a file, head and tail work fine.

Before head, I used sed. These two commands do the same thing:

$ sed 5q foo
$ head -5 foo
I don't know a convenient sed synonym for tail, which does several, non-trivial things.
$ tail -5 # last 5 lines of the file
$ tail -n +5 # all but the first 5 lines
$ tail -f # start with the last 10 lines of the file
# but keep on spitting out lines as they appear
I use tail -f all the time, for glancing at output that I've been saving, from long running processes.
$ make &> make.OUT &
... # do some other stuff
$ tail -f make.OUT
... # watch it for a minute, to see how it's progressing
$ # go back to other stuff
But what about the upside-down of tail -n +5? What if I want all but the last five lines of the file? I don't know how to do this without writing a program.
(If you do, tell me.)
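It turns out GNU head accepts a negative count, which means "all but the last N lines":

```shell
# GNU head: a negative count trims lines off the end
seq 1 20 | head -n -13      # prints 1 through 7
```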

An exception to this, at my fingertips, is "all but the last one line."

If I have a directory full of logs, and I want to get rid of all but the most recent, here's how:
$ ls -rt

$ ls -rt | sed '$d'

$ rm -rf $(ls -rt | sed '$d')
$ ls
Because rm -rf won't complain if you invoke it without arguments, this last idiom is idempotent: if there's only one file in the directory, it leaves it alone, so it's still safe to use.

Sunday, March 2, 2008

Old Dogs, New Tricks: Spreadsheets and Running Medians

I look for excuses to use new tools.

A few months ago, I needed to report SVN use at work. Ah, an excuse.

I wrote code to extract the number of revisions per week, massaged it into a comma-separated-value (CSV) file, pulled that into an OpenOffice spreadsheet (OOCalc), graphed it, printed the graph as a PDF, turned the PDF into PNG, and posted it on a blog that my boss could go to, whenever he wanted to look.

A lot of work.

Except that now I know how to convert PDF to PNG, how to import a graph into a blog, and a lot more about how to use spreadsheets.

The data extraction is a script that runs from a weekly cron job, and I now know how to use OOCalc templates and keyboard shortcuts to turn the data into a posted graph with just a few keystrokes.

Even Debi, our HR maven, can understand the graphs, and she's impressed.

The hard part, now, is just having a year's worth of data to analyze.

I've been recording my daily weight and pedometer readings, in hopes of using them as an excuse to learn something else about spreadsheets.

Today, it paid off.

Looking through my bookshelf, just now, I came on Mosteller & Tukey's Data Analysis and Regression. This book is quirky, but the authors may be America's most famous statisticians (if there is such a thing). John Tukey invented the words "software" and "bit." Their advice is good.

They say to try smoothing data with successive running medians, until the data stabilize.

"Can I do that with OOCalc?"

You bet. Google says it has a MEDIAN() function built right in.

Here are the steps I tried, in less time than it took to type this post.
  • Pick a column to hold the running medians. Click on a cell in that column, and type =median .
  • Go to the column of raw data, and highlight the three cells you want a median of with the mouse. They appear in the formula, as if by magic, with the correct syntax. (Don't forget to close the parens.)
  • Press Enter. Ta-da, the median appears in the selected cell.
  • Click the little button in the lower-right-hand corner of that cell and drag it down the page. All the cells in the column are instantly filled in with the right formulae and values. Now, I have an entire column of running medians.
  • Move over one column, and repeat, to get the running medians of the running medians.
  • Continue this until the data stabilize -- in my case, just one more time.
  • Graph it.

Tukey and Mosteller was written in 1977. I was still using a Monroe.

I'd just taken an elementary statistics class. Everyone ooh'd and ah'd over the HP-35 pocket calculator one of the students had bought. We'd never seen one before.

To calculate running medians, I wrote a program, in FORTRAN, on cards.

From seeing the book on my shelf, through discovering OOCalc's MEDIAN function, to getting a slick graph of the stabilized running medians took me perhaps ten minutes.

Okay, I had to collect two months' of data to have something to analyze, but it was worth it.