Sunday, March 16, 2008

Setting Deadlines: cutoff-after

When a command runs too long, I give it the hook.

If it's running too long but I want it to continue, I suspend it and put it in the background.
$ sleep 100; echo hello
^Z
[1]+ Stopped sleep 100
hello
$ bg
[1]+ sleep 100 &
Other times, though, I just kill it.
$ sleep 100
^C
$
Either something's gone wrong and the job has hung, or I misjudged and need to redesign something to speed up my code.

All fine, if I'm watching, but what about when the job's running at night, from a cron job? What if I come in the next morning, nothing's finished, my job's still running, and my machine's on its knees?

For these, I have a tool, cutoff-after, that sets deadlines. It runs my jobs for a certain amount of time, but cuts them off if they don't finish.

The example in the screenshot illustrates the three outcomes it offers: success, failure, and timeout. If a command completes on time, its exit status is preserved. [There, I'm showing you success or failure by coloring my prompt.] If the command times out, cutoff-after fails, but announces the timeout. All three outcomes are announced on standard error.
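The prompt coloring is separate from cutoff-after itself. Here's a minimal bash sketch of one way to do it, assuming a terminal that understands ANSI color escapes. The names prompt_color and set_prompt are hypothetical, not taken from my own setup files.

```shell
#!/bin/bash
# Hypothetical sketch: color the prompt by the last command's exit status.
# Assumes an ANSI-capable terminal; the function names are made up.

# emit the ANSI color escape for an exit status: green for 0, red otherwise
prompt_color() {
    if [ "$1" -eq 0 ]; then
        printf '\033[32m'
    else
        printf '\033[31m'
    fi
}

# rebuild PS1 before each prompt; $? is read first, before anything overwrites it
set_prompt() {
    local status=$?
    PS1="\[$(prompt_color "$status")\]\$ \[\033[0m\]"
}

# bash runs PROMPT_COMMAND just before printing each primary prompt
PROMPT_COMMAND=set_prompt
```

Sourced from an interactive shell's startup file, this should turn the `$` green after a success and red after a failure or a timeout.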

Here's the code for cutoff-after:
$ cat cutoff-after
#!/usr/bin/perl
# adapted from the Perl Cookbook, pp. 594-595

# play nice
use warnings;
use strict;

# parse and sanity-check args
sub usage {
    $0 =~ s(.*/)();
    die "usage: $0 Nsecs cmd-with-args\n";
}
@ARGV > 1 or usage;

my ( $deadline, @cmd ) = @ARGV;
( $deadline =~ /^\d+$/ ) && ( $deadline > 0 ) or usage;

sub try {

    eval {

        # set the deadline
        $SIG{ALRM} = sub { die; };
        alarm($deadline);

        # execute the command
        system "@_";    # command and arguments

        # say what happened
        die( ( $? ? "failure" : "success" ) . ": @cmd\n" );
    };

}

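sub catch {
    if ( $@ =~ /^success: / ) {
        warn $@;    # announce success on standard error, too
        exit(0);
    }
    elsif ( $@ =~ /^failure: / ) {
        warn $@;

        # preserve the command's exit status; use 1 if a signal killed it
        exit( ( $? >> 8 ) || 1 );
    }

    # cleanliness is next to godliness
    local $SIG{TERM} = 'IGNORE';
    kill "TERM", -$$;

    die "timeout: @cmd\n";
}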

try @cmd;
catch;
There's probably a shorter implementation in other languages.
If you want to try, here are test cases encoded in a shell script:
$ cat cutoff-after.t
#!/bin/bash -x

cutoff-after --help
echo bad usage returns $?
cutoff-after 2 sleep 1
echo success returns $?
cutoff-after 1 ls nonexistentfilename
echo failure returns $?
cutoff-after 1 sleep -1
echo different failure returns $?
cutoff-after 1 sleep 2
echo timeout returns $?
cutoff-after 1 'false; echo hello'
echo compound statement returns $?
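
To give a feel for what an implementation in another language might look like, here's a rough bash sketch written as a shell function. It is not a drop-in replacement for the Perl version. The function name cutoff_after, the exit codes 2 (bad usage) and 124 (timeout, borrowed from GNU timeout), and the tenth-of-a-second polling are all my assumptions, and it relies on a sleep that accepts fractional seconds.

```shell
#!/bin/bash
# Hypothetical bash sketch of cutoff-after; a polling approximation,
# not a translation of the Perl alarm-based version.

cutoff_after() {
    # parse and sanity-check args: a positive whole number, then a command
    # (stricter than the Perl: a deadline with a leading zero is rejected)
    local bad=0
    [ "$#" -ge 2 ] || bad=1
    case ${1-} in
        '' | 0* | *[!0-9]*) bad=1 ;;
    esac
    if [ "$bad" -eq 1 ]; then
        echo "usage: cutoff_after Nsecs cmd-with-args" >&2
        return 2    # assumed exit code for bad usage
    fi
    local deadline=$1
    shift

    # run the command through a shell, in the background
    sh -c "$*" &
    local pid=$!

    # poll ten times a second until the command exits or the deadline passes
    local ticks=$(( deadline * 10 ))
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$ticks" -le 0 ]; then
            kill -TERM "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            echo "timeout: $*" >&2
            return 124    # assumed convention, borrowed from GNU timeout
        fi
        sleep 0.1    # assumes sleep accepts fractional seconds
        ticks=$(( ticks - 1 ))
    done

    # the command finished on time: collect and preserve its exit status
    wait "$pid"
    local status=$?
    if [ "$status" -eq 0 ]; then
        echo "success: $*" >&2
    else
        echo "failure: $*" >&2
    fi
    return "$status"
}
```

One difference worth knowing about: the TERM here goes only to the sh process that runs the command, while the Perl version signals its whole process group. A compound command that times out may leave children running.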
