I never thought that Bradford Uni was anything more than an overblown technical college, and it looks like I'm right. If one of their "professors" thinks that identifying 80% of the words in a novel with all the punctuation and spaces removed is a valid test for an algorithm meant to find things in the genetic code that the brain hasn't spotted, then it's even worse than I thought. Just to test this, take the same text and give it to a ten-year-old - I'll wager that they'll get more than an 80% hit rate. Sorry Si ... come back when your algorithm can do something a child can't, eh?