Ever since the (alas failed, due to my botching one of the sub-interviews) Google interview, I've had this hunger for programming puzzles. About a month ago I posted a chunk of numbers on LJ from one of those efforts, now I'm trying to figure out how to most efficiently solve, given a large cluster of computers with a low-latency interconnect, a paraphrase of a problem just posed by the QI Elves on twitter: determine the 10 longest placenames in Britain that do not have any repeated letters.
On a single computer, a first approach might be to use perl's regex engine, or to implement something that would do what it would do in some other language, but getting the biggest 10 out of that would be tricky, and there's a lot of presumably-repeated work that could be pruned. Plus we're missing out on some potential (not necessarily worthwhile) optimisations that come from knowing how words are put together in English (that might be violated by some Gaelic, Norman, or other placenames). Having midsized or large clusters of computers available to us changes things.