Friday, October 23, 2009

Google's Bulgarian Picture Dictionary

Language translation is serious business, but playing with automatic translation tools is just fun.

I use Google Translate a lot, along with their translation bots. I've played with the Word Monkey gadget for iGoogle.

I've even built spreadsheets to do translation with Google's spreadsheet functions; I'd bet even beginning corpus-linguistics students could do some cool experiments with these.

Today, my pal, Kevin Cohen, put up this Google Chat status message:
My favorite Bulgarian word is хляб. What's yours?
I put it into Google Image Search, to see what would happen, and got this.

I think I can guess what хляб means.

Thursday, October 22, 2009

A GUI Replacement for sshfs

I've been using sshfs to remote-mount my home machine onto my work machine. Now, I've stumbled on an alternative that feels better integrated with my Gnome desktop: gvfs (gnome virtual file system) in Nautilus.

The short version:
$ nautilus sftp://test.com
brings up a file-browser window with test.com in it, which I can then browse and click around in. Also, test.com is mounted as ~/.gvfs/sftp on test.com/

Yes, it has all those blanks. Oh well. You can still get to it on the command line and in scripts.
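For scripts, the only trick is quoting. Here's a minimal sketch, assuming the mount point really is named as shown above; the variable name mount_dir is just something I made up for the example.

```shell
# Assumption: gvfs mounted the host under this exact name, blanks and all.
mount_dir="$HOME/.gvfs/sftp on test.com"

# Double quotes keep the blanks from splitting the path into three arguments.
if [ -d "$mount_dir" ]; then
    ls "$mount_dir"
else
    echo "not mounted: $mount_dir" >&2
fi
```

Without the quotes, ls would be handed three separate words and would complain about each of them.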

If you already have a file-browser window up, just use the URI sftp://test.com

If you need to go in through a port other than 22, say 666, you'll first need to add a pair of lines to your .ssh/config file that look like this:
Host test.com
Port 666
You can test whether they work or not by using ssh by hand.
$ ssh test.com

Monday, October 19, 2009

Bugs

All programs have bugs. A "mature program" is a program with more obscure bugs.

When I first started programming, I was convinced that half my programs had uncovered bugs in the compiler.

They never were. "Oh. 'Missing semicolon on line 71' goes away if I put a semicolon in."

Because compilers are more mature than my code, the probability is higher that bugs are mine.

When a shell script I wrote last week failed, I began carving it down to see what I'd done wrong. This is usually fast and easy--chop pieces out of the script until I get a very simplified statement that doesn't do what I thought it should, then go back to the man page to see why.

Here's the simplified test case I ended up with:
set -x # this is required
PS4='$(true)$ ' # must be a command in the prompt, but any command will do

false || A=3 # first expression must fail, second must be a variable assignment
echo $? # Should be 0, but isn't. Odd.
Much to my surprise, it's a bug in bash, in both versions 3 and 4.

Obscure? Sure. You have to have a lot of special things going on, and the symptom is a bad exit code. I reported it and used a workaround (if .. fi instead of the shortcut logical or).
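The workaround has roughly this shape. It's a sketch built from the test case's own false and A=3 rather than the real script, and it leaves out the set -x and PS4 lines, so it shows the rewrite, not the bug. The variable name status is only for illustration.

```shell
# Same logic as `false || A=3`, spelled out with if .. fi.
if ! false; then
    A=3
fi
status=$?      # exit status of the if statement as a whole
echo "A=$A status=$status"
```

On a shell without the bug this prints A=3 status=0, since the if statement returns the status of the last command it ran, the assignment.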

Always remember: every program has bugs. They're only almost always yours.

Saturday, October 3, 2009

Hello, World Again

I'd like to try saying "We sometimes take the shell too much for granted" another way.

Advice from masters is often good advice.

Here's my favorite chunk of the greatest of all programming texts, Kernighan and Ritchie's The C Programming Language (Prentice-Hall, 1978).

1.1 Getting Started

The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages:

Print the words
hello, world
This is the basic hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where your output went. With these mechanical details mastered, everything else is comparatively easy.

In C, the program to print "hello, world" is
#include <stdio.h>
main()
{
printf("hello, world\n");
}

Just how to run this program depends on the system you are using. As a specific example, on the UNIX operating system you must create the program in a file whose name ends in ".c", such as hello.c, then compile it with the command
cc hello.c
If you haven't botched anything, such as omitting a character or misspelling something, the compilation will proceed silently, and make an executable file called a.out. Running that by the command
a.out
will produce
hello, world
as its output. On other systems, the rules will be different; check with a local expert.

Exercise 1-1. Run this program on your system. Experiment with leaving out parts of the program, to see what error messages you get.

Fine advice. Let's do exercise 1-1, but in the shell.

Run the analogous program? Okay.
$ echo hello, world
hello, world
Leave out parts? Let's leave out a part of the string.
$ echo hell, world
hell, world
Or part of the command.
$ eco hello, world
bash: eco: command not found
How about whitespace? It's okay to leave it out of the string,
$ echo hello,world
hello,world
but you need some between a command and its arguments.
$ echohello, world
bash: echohello,: command not found
My point: Programming in the shell is quick and easy. You just type. There's no editing, no special naming, no compiling, no a.out file, no loading and running, no need to consult a local expert.

If you type something incomprehensible, the shell gives you an error message and lets you try again, right away.

Thursday, October 1, 2009

The Shell Enters a Beauty Contest

I'd never tout the shell as the be-all and end-all of programming languages, but it gets less attention and respect than it deserves.

For example, folks will remark, casually, that shell syntax is ugly. Who would design a language that doesn't even let you put spaces around the '=' in an assignment?
$ x=3
$ y = 3
-bash: y: command not found
$ z= 5
-bash: 5: command not found
Eeew. Real programs are in C. Or Perl. Or Python. Or Haskell. Or ...
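There is a reason for those error messages, for what it's worth. The shell splits words on blanks, so "y = 3" is the command y with two arguments, and "z= 5" is an empty assignment to z that applies only to the command 5. A small sketch of that second form doing something useful (child_saw is a name invented for the example):

```shell
unset z

# `name=value command` puts z into the environment of that one command only.
# Here the command is a child shell; in `z= 5` the command was `5`.
child_saw=$(z=5 sh -c 'echo "$z"')
echo "child saw z=$child_saw"

# The current shell never got a z of its own.
echo "this shell has z=${z:-<unset>}"
```

The child shell sees z=5, while the shell that launched it still has no z at all.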

Yep, the shell syntax has some design flaws all right. But let's run another beauty contest.

First, contestant #1:
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(void)
{
int     fd[2], nbytes;
pid_t   pid;

pipe(fd);

if ((pid = fork()) == 0) {
  dup2(fd[0], 0);
  close(fd[1]);
  execlp("/bin/grep", "/bin/grep", "^z", NULL);
} else {
  dup2(fd[1], 1);
  close(fd[0]);
  execlp("/bin/ls", "/bin/ls", "-1", "/bin", NULL);
  wait(NULL);
  exit(0);
}

return(0);
}

Next, contestant #2:

ls /bin | grep ^z
And contestant #1 doesn't even have normal error checking, which would make it much longer, uglier, and harder to follow.

Programmers get so used to the shell that they focus on its flaws, but take its virtues for granted.

Don't it always seem to go that you don't know what you've got till it's gone? -- Joni Mitchell
What tastes of paradise does the shell offer besides pipes? I/O redirection. Ease of process creation. Multi-process programming. Parallelism. Command-line editing. For that matter, the entire idea of a CLI, a "command-line interface."
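A few of those, sketched in a handful of lines. The scratch directory and the file names in it are arbitrary choices for the example, not anything special.

```shell
workdir=$(mktemp -d)            # scratch directory for the example

# Pipe plus output redirection: sort the listing and save it in a file.
ls /bin | sort -r > "$workdir/listing.txt"

# Input redirection: read the first line back out of that file.
first=$(head -n 1 < "$workdir/listing.txt")
echo "last name in /bin, alphabetically: $first"

# Process creation and parallelism: '&' starts each job in the background,
# and 'wait' blocks until all of them have finished.
(echo one > "$workdir/one") &
(echo two > "$workdir/two") &
wait

cat "$workdir/one" "$workdir/two"
```

Each background job is a separate process, and the only synchronization needed is the one-word wait.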

Once I start listing things, it's hard to stop.

All this in a language you can use in scripts, or just by doing nothing harder than typing at a prompt.
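To make that last point concrete, here's a sketch that takes a line you could type at the prompt, drops it into a file unchanged, and runs it as a script. The temporary file is just a stand-in for whatever script name you'd pick.

```shell
script=$(mktemp)                # stand-in for a script file of your own

# The script's entire text is the same line you would type at the prompt.
printf '%s\n' 'echo hello, world' > "$script"

output=$(sh "$script")
echo "$output"

rm -f "$script"
```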