Randomly-generated passphrases offer a major security upgrade over user-chosen passwords. Estimating the difficulty of guessing or cracking a human-chosen password is very difficult. It was the primary topic of my own PhD thesis and remains an active area of research. (One of many difficulties when people choose passwords themselves is that people aren’t very good at making random, unpredictable choices.)
Measuring the security of a randomly-generated passphrase is easy. The most common approach to randomly-generated passphrases (immortalized by XKCD) is to simply choose several words from a list of words, at random. The more words you choose, or the longer the list, the harder it is to crack. Looking at it mathematically, for k words chosen from a list of length n, there are kn possible passphrases of this type. It will take an adversary about kn/2 guesses on average to crack this passphrase. This leaves a big question, though: where do we get a list of words suitable for passphrases, and how do we choose the length of that list?
Several word lists have been published for different purposes; thus far, there has been little scientific evaluation of their usability. The most popular is Arnold Reinhold’s Diceware list, first published in 1995. This list contains 7,776 words, equal to the number of possible ordered rolls of five six-sided dice (7776=65), making it suitable for using standard dice as a source of randomness. While the Diceware list has been used for over twenty years, we believe there are several avenues to improve the usability and are introducing three new lists for use with a set of five dice (as part of its Summer Security Reboot Campaign, EFF is providing a dice set to donors).
Enhancements over the Diceware list
The Diceware list can provide strong security, but offers some challenges to usability. In particular, some of the words on the list can be hard to memorize, hard to spell, or easy to confuse with another word.
- It contains many rare words such as buret, novo, vacuo
- It contains unusual proper names such as della, ervin, eaton, moran
- It contains a few strange letter sequences such as aaaa, ll, nbis
- It contains some words with punctuation such as ain’t, don’t, he’ll
- It contains individual letters and non-word bigrams like tl, wq, zf
- It contains numbers and variants such as 46, 99 and 99th
- It contains many vulgar words
- Diceware passwords need spaces to be correctly decoded, e.g. in and put are in the list as well as input.
Note that several of these problems are exacerbated for users with a soft keyboard or other typing systems that relies on word recognition. Using only valid dictionary words makes this setup much easier.