Next Previous Contents

4. Password Cracking

In most cases, computer access is protected by username and password. Usually it is not too difficult to find out some or all user names on a given computer. Names leak as email addresses and in usenet posts. Utilities like finger or rwho may give some. There are many standard user names, root being the most obvious one. System logs and similar may be visible on the web, and found using Google.

Finding the password is not so simple. Usually one has to brute-force, trying all words in a dictionary, a list of first names, or just all strings of at most six printable symbols. A good password cracker is John the Ripper. Given the passwd file of some Unix machine, say with two or three dozen user names and passwords, one normally finds two or three vulnerable ones within a day or two.

For Windows NT there is the very fast L0phtCrack password cracker. Later versions also work on W2000.

How does one obtain the passwd file? On a local machine it is just readable. Sometimes one can obtain it remotely via anonymous ftp, or via a CGI script, using a .../../../../../etc/passwd parameter. Of course, nowadays people often use shadow password files, and these may be more difficult to obtain.

On most Unix systems, passwords are at most 8 characters long. Picking control characters or non-ASCII characters is bound to give problems when logging in remotely via other systems, so it is reasonable to expect characters in the range 32-126. Now 95^6 = 0.74 . 10^12 and 95^8 = 0.66 . 10^16 so if one can check one password in a microsecond then nine days suffice to check all strings of length at most six. (On my computer a DES-type check takes 10 microseconds.)

Of course, it is not necessary to try all possible strings. Trying all words in a fat dictionary takes only a few minutes.

Exercise Crack some or all of the following passwords.

aap:$1$ucQls3qa$sUbnWL2cEHtjM5qnGGs1q/:501:501::/home/aap:/bin/bash
noot:$1$rPntDw4c$kf5jBMhdI7JfT7C2FlkBs1:502:502::/home/noot:/bin/bash
mies:$1$0AMQsYdo$bLi7rpEK7IaYzmt3bVYH70:503:503::/home/mies:/bin/bash
wim:$1$PRrAVTSy$3xDb1kZi9Rz/pTG4KecQK/:504:504::/home/wim:/bin/bash
zus:$1$ou8y1Y/R$NTwWWHtWN.TBYEi.1ul.l.:505:505::/home/zus:/bin/bash

4.1 Common passwords

I find that common passwords include '' (the empty string), 'secret', 'password' (and in Holland the Dutch versions 'geheim', 'wachtwoord'), strings of consecutive digits or letters like '123', '12345', '1234567', 'abc', and proper names like 'eric', 'kevin', 'sandra', 'melissa', 'Nikita'.

On 2010-12-12 hackers published about 750000 encrypted passwords of users of Gawker blogs such as Lifehacker, Consumer, Gizmodo, Gawker, Jezebel, io9, Jalopnik, Kotaku, Deadspin, Fleshbot. (This explains the occurrence of these words below.) I cracked a bit more than 425000 of these passwords in about 12 hours. The list below gives the 350 most frequent passwords with their frequencies in this cracked set.

   2516 123456        124 swordfis       89 bubbles        74 tennis         62 fuckyou2
   2190 password      124 summer         88 startrek       74 scooby         62 fender
   1205 12345678      122 asdf           88 diamond        74 naruto         62 butter
    782 lifehack      121 matthew        88 coffee         74 mercedes       61 wolverin
    696 qwerty        121 asdfgh         88 butterfl       74 maxwell        61 samsung
    499 abc123        120 mustang        88 brooklyn       74 fluffy         61 rush2112
    459 12345         119 yankees        88 amanda         74 eagles         61 podnTN76
    439 monkey        117 hannah         88 adidas         74 11111111       61 pa55word
    413 111111        117 asdfghjk       87 test           73 penguin        61 lalala
    385 consumer      117 1qaz2wsx       86 wordpass       73 muffin         61 christin
    376 letmein       116 cookie         86 sparky         73 bullshit       60 yourmom
    351 1234          115 midnight       86 morgan         73 6Lthpku5       60 westside
    318 dragon        115 123qwe         86 merlin         72 steelers       60 rocket
    307 trustno1      114 scooter        86 maverick       72 jasper         60 melissa
    303 baseball      114 purple         86 elephant       72 flower         60 icecream
    302 gizmodo       114 banana         86 Highlife       72 ferrari        60 casper
    300 whatever      113 matrix         85 poopoo         71 slipknot       59 spooky
    297 superman      113 jezebel        85 nirvana        71 pookie         59 random
    276 1234567       113 daniel         85 love           71 murphy         59 prince
    266 sunshine      111 hunter         85 liverpoo       71 joseph         59 mookie
    266 iloveyou      111 freedom        85 lauren         71 calvin         59 greenday
    262 fuckyou       110 secret         84 stupid         71 apples         59 cooper
    256 starwars      110 redsox         84 chelsea        71 159753         59 arsenal
    255 shadow        108 spiderma       84 8FNFYgx1       70 tucker         58 hello1
    241 princess      108 phoenix        83 compaq         70 martin         58 guinness
    234 cheese        108 joshua         83 boomer         70 11235813       58 gamecube
    231 123123        108 jessica        82 yellow         69 whocares       58 diablo
    229 computer      108 asshole        82 sophie         69 pineappl       58 999999
    226 gawker        108 asdf1234       82 q1w2e3r4       69 nicholas       58 98765432
    223 football      107 william        82 fucker         69 jackass        58 333333
    204 blahblah      107 qwertyui       82 coolness       69 goober         58 131313
    203 nintendo      107 jackson        82 cocacola       69 chester        58 00000000
    199 000000        107 foobar         82 blink182       69 8675309        57 xbox360
    198 soccer        106 nicole         81 zxcvbnm        69 222222         57 school
    195 654321        106 123321         80 snowball       68 winston        57 iceman
    193 asdfasdf      105 peanut         80 snoopy         68 somethin       57 goldfish
    184 master        104 samantha       80 rachel         68 please         57 friends
    182 passw0rd      104 mickey         80 gundam         68 dakota         57 defamer
    182 michael       104 booger         80 alexande       68 112233         57 555555
    175 hello         103 poop           79 jasmine        67 wonkette       56 winter
    170 kotaku        102 hockey         79 danielle       67 rosebud        56 welcome1
    167 pepper        100 thx1138        79 basketba       67 dallas         56 qaz159
    165 jennifer      100 121212         79 7777777        67 696969         56 porsche
    165 666666         99 ashley         78 thunder        66 shithead       56 monkey1
    164 welcome        98 silver         78 snickers       66 popcorn        56 kermit
    164 buster         98 gizmodo1       78 patrick        66 peaches        56 jackie
    161 Password       98 chocolat       78 darkness       66 pakistan       56 hardcore
    159 batman         98 booboo         78 boston         66 dexter         56 donkey
    158 1q2w3e4r       97 metallic       78 abcd1234       66 canada         55 success
    157 maggie         97 1q2w3e         77 pumpkin        65 victoria       55 richard
    154 michelle       96 bailey         77 creative       65 rockstar       55 redrum
    153 killer         95 google         77 88888888       65 qwe123         55 rainbow
    153 andrew         95 babygirl       76 smokey         65 newyork        55 poohbear
    151 pokemon        94 thomas         76 sample12       65 nathan         55 jordan23
    151 internet       94 simpsons       76 godzilla       65 lovely         55 heather
    150 biteme         94 remember       76 december       65 benjamin       55 fireball
    148 orange         94 gateway        76 corvette       64 robert         55 dietcoke
    148 jordan         93 oliver         76 bandit         64 pickle         55 access
    147 ginger         93 monster        76 123abc         64 passport       54 warcraft
    145 123            92 qazwsx         75 voodoo         64 mother         54 skippy
    144 aaaaaa         91 taylor         75 turtle         64 harley         54 ranger
    138 tigger         91 madison        75 spider         64 forever        54 radiohea
    137 charlie        91 guitar         75 london         64 falcon         54 qwerty1
    136 chicken        91 anthony        75 jonathan       63 trinity        54 qwer1234
    135 nothing        90 justin         75 hello123       63 barney         54 michael1
    132 fuckoff        90 elizabet       75 hahaha         62 willow         54 drpepper
    130 deadspin       90 1111           75 chicago        62 ncc1701        54 destiny
    125 valleywa       89 november       75 brandon        62 kitten         54 cowboy
    125 qwerty12       89 monkey12       75 austin         62 jesus          54 changeme
    125 george         89 drowssap       74 yvyfCbI6       62 hacker         53 xxxxxx

4.2 Unix password algorithms

Having passwords in cleartext in a file is a bad idea - they will be compromised. Unix introduced the idea of feeding the password to some one-way hash function and storing the result. Now the password file /etc/passwd can (and does) have general read permission.

M-209

Unix V5 and V6 used a simulation of the M-209 cipher machine, and encrypted (the first 8 characters of) the password to an 8-char string.

Anecdote In the Unix V6 days I once gave a Polish colleague a username and password, and told him his username and said that he could guess the password. He sat down and logged in, and was surprised that it worked. `How did you know I was going to try "ladne"?' But I had given him the password "aline".

(Thus, we found a collision. Rechecking:

# passwd ka
Usage: passwd user password
# passwd ka aline
# grep ka /etc/passwd
ka:ugiTjezp:11:1::/usr/ka:
# passwd ka ladne
# grep ka /etc/passwd
ka:ugiTjezp:11:1::/usr/ka:
#
We see another weakness here: this version of passwd required the password on the command line. This means that it would be visible to someone who did ps at the same time.)

Modified DES

As Morris & Thompson wrote, the encryption algorithm was too fast, and brute force search was too easy. So, in Unix V7 the algorithm was replaced by a modified DES, repeated 25 times. DES because it was slower and safer, repeated to make it even slower, and modified in order to protect against hardware implementations of the actual DES. (Moreover, there has always been some uncertainty: could it be that there is some backdoor in DES? Maybe a modified DES is more secure than the actual DES.)

The input to this encryption consists of a 12-bit salt concatenated with the user's password. The 64-bit output is converted to an 11-char string and compared to the entry in /etc/passwd, which has a 13-char string representing salt and encrypted password. (DES has two inputs: key and data. Here salt plus password is used as 64-bit key, and the initial data is the constant zero.)

Bits are converted to printing characters in groups of 6, using the alphabet ./0-9A-Za-z (in this order).

The salt makes sure that one cannot precompute encryptions for all dictionary words, say - each word has 4096 different encryptions. It is chosen at random when the user sets his password.

These passwords are recognized by their length of 13 characters.

root:VwL97VCAx1Qhs:0:1::/:
Exercise Crack this.

This is what the standard Unix routine crypt() does. Today it is fairly insecure. Exhaustive search is feasible with special purpose hardware, and the speed of 100000 attempts/second is too high. Only 8 characters of the password are used. The salt is too small - it is quite feasible to precompute the encryption for all possible 4096 salts and all words in a large dictionary or word list and store the result on disk.

(Also Windows NT uses a form of DES. It is even weaker and allows 800000 attempts/second. There is no salt.)

What to do about the weakness of crypt()? The main defense is now the use of shadow password files, that is, the hiding of the password file from the users. But that has all the problems that caused Unix to abandon a plaintext password file. It is better to replace crypt().

Various cryptographic hash functions are designed to be fast, and such that constructing collisions or finding preimages is infeasible. That latter property is precisely what is needed for password encryption, but a password hash must be slow. Brute force cracking of raw MD5 is very easy.

FreeBSD-MD5 and bcrypt

According to US law, exporting cryptographic software was a form of munitions export. This caused a lot of stupid annoyances. Of course everybody in the whole world had DES source code, but nevertheless distribution was restricted.

In order to overcome this difficulty, FreeBSD 4.2 switched to a complicated algorithm based on MD5. That had several advantages: it is a bit stronger, with 128-bit output instead of 64-bit, it uses the entire password instead of only the first 8 characters, and it is slower (the digest is rehashed a thousand times), so brute force takes longer. (On my machine 2000 attempts/sec, against 100000 for modified DES.) Also RedHat 6.0 and up uses MD5 (but SuSE does not by default - ach). These FreeBSD-type MD5 passwords can be recognized as 34-char encrypted passwords starting with $1$. The first 8 characters following are the salt. Poul-Henning Kamp described his design criteria.

Niels Provos and David Mazières developed bcrypt(), the best choice for a password hash today. It is based on Blowfish, and contains facilities for making the algorithm arbitrarily expensive. It is used by OpenBSD, and has passwords starting with $2$, $2a$, $2x$, or $2y$. Brute force is even slower here, at 100 attempts/sec.

8-bit input

Various implementations of crypt() have suffered from problems in an 8-bit environment since the programmers expected ASCII input. What to do with non-ASCII bytes? In some implementations they were replaced by '?', so that a strong password turned into the constant string "????????". In some implementations the high order bit was masked off, so that 0x80 became end-of-string. In 2011 a sign-extension bug was discovered in the Openwall implementation of Blowfish. The $2y$-prefix in bcrypt()-generated passwords indicates that they were generated by a post-fix algorithm.

4.3 MySQL passwords

Before version 4.1 MySQL had a very weak password algorithm. Here it is.

void hash_password(ulong *result, const char *password, uint password_len) {
        ulong nr=1345345333L, add=7, nr2=0x12345671L;
        ulong tmp;
        const char *password_end = password + password_len;

        for (; password < password_end; password++) {
                if (*password == ' ' || *password == '\t')
                        continue;
                tmp = (ulong) (uchar) *password;
                nr ^= (((nr & 63)+add)*tmp) + (nr << 8);
                nr2 += (nr2 << 8) ^ nr;
                add += tmp;
        }
        result[0]=nr & (((ulong) 1L << 31) -1L);
        result[1]=nr2 & (((ulong) 1L << 31) -1L);
//      printf("%08lx%08lx\n", result[0], result[1]);
}
The input is an arbitrary string. The output is a 16-hexdigit hash, 62 bits. This is weak for many reasons. It is fast, so brute force cracking is easy. There is no salt, so precomputation is possible. The value of nr2 is not used in the computation of nr, so a cracker can forget about the second half of the hash and work with the first 31 bits only. On the other hand, nr2 provides valuable information. Since the final nr and nr2 are known (except for their high order bit) one can find the value of nr2 before the last step, so that incorrect candidate passwords can be discarded without looking at their last byte, greatly speeding up the search. And with a bit of work one can also derive information about the stages after all but two or all but three bytes of the password, making the search even faster. This means that cracking 8-symbol passwords is quick. See also Philippe Vigier's poc.c.

Recent versions of MySQL use a double application of SHA1, and do not have obvious weaknesses. However, many sites still use the old scheme, e.g. for compatibility reasons.

Exercise Crack some or all of the following passwords.

Joe:4d432f8b35591ab8
Edv:0918419960333840
Bas:0692c5e37db23ab9
Jam:18bd29ea2b511d53

4.4 ZIP passwords

The PKZIP utility is used to create compressed archives. The format of the outputfile is well-documented. One can protect archives with a password. In the Microsoft world many (usually commercial) brute force ZIP password crackers are available, the most famous being Elcomsoft's AZPR. In the Unix world one has zipcracker (for distributed cracking over a Beowulf network) and fcrackzip (for simple and fast brute force), that come with source code. There is also pkcrack that implements the algorithm described by Eli Biham and Paul Kocher and uses some (at least 13 bytes) known plaintext. Altogether, it is usually feasible to find the password of a traditional ZIP archive. Recognizing that the password protection had become too weak, PKZIP 5.0 introduced stronger encryption.

Concerning the cracking speed one can expect: a moment ago I needed to crack a ZIP password and found that zipcracker did approximately 1000000 tries/sec on a 1400MHz machine.

Exercise In my mailbox I found the password protected zip file Message.zip. What is the password? What does this file contain? (File deleted. People had safety concerns.)

4.5 PDF passwords

Adobe's Portable Document Format is one of the more popular formats in which to distribute files representing printed material. Such files are commonly viewed with Acrobat Reader or with xpdf. The format allows the creator of the file to set certain protections. The protection comes in two flavours: protection bits and password protection. There may be two passwords: the owner's password and the user password.

The permission bits are not enforced in any way. But Adobe asks implementers of PDF readers to respect their settings. The Linux xpdf indeed respects the bits. Of course it is trivial to modify the source removing the tests on okToPrint(), etc.

Revision 2 knows about four permission bits (in a 16-bit short):

Bit Permission
2 Print
3 Change
4 Copy
5 Annotate

The 16-bit protection mode has the leading ten bits set to 1, the trailing two bits set to 0, and the remaining four permission bits Rev 2: Modifying / Copying text & graphics, accessibility support / Adding text annotations / Printing Rev 3: Filling in forms and signing the document / Assembling the document: adding or deleting pages, bookmarks, thumbnail images / Printing in a way that allows one to obtain a digital copy.

Encryption dictionary entries for the standard security handler R Revision (2, 3 or 4) O String related to Owner and User Password U String related to User Password P Permission mask

For people who don't know how to do this themselves, Password Crackers Inc. will remove permission bit protection of some document for $40, and password protection for $500. They write:

Our Acrobat .pdf recovery service does not brute-force check for passwords.
We search for the encryption key that Acrobat used to encrypt the file.
There are many fewer keys than possible passwords, hence we are able to
search all of the possible keys in less than 25 days.

4.6 Avoiding brute force

On the other hand, brute force may not be required.

Default passwords

Many services come with a default password and instructions to change that immediately, but often the default is left. See (the old and outdated) Default Password List to find, e.g., the default Debian LILO password, or the old Slackware user names without password. See also this webzcan list.

Example WWWBoard is a threaded World Wide Web discussion forum and message board. It comes with a default password file

WebAdmin:aepTOqxOi4i8U
(or WebAdmin:aeb/uHhRv6x2LQvxyii4Azf1, or WebAdmin:$1$ae$eVdFF2d.W9C3JSO3qluZ70) where the password is WebBoard. It is easy to find instances of WWWBoard with the default password untouched, for example in /bbs/passwd.txt. Now in /cgi-bin/wwwadmin.pl one might find the admin script and help the sysadmin by discarding unwanted messages.

Cisco reports (April 2004): A default username/password pair is present in all releases of the Wireless LAN Solution Engine (WLSE) and Hosting Solution Engine (HSE) software. A user who logs in using this username has complete control of the device. This username cannot be disabled. There is no workaround.

Three days later we see: Backdoor in X-Micro WLAN 11b Broadband Router: The following username and password works in every case, even if you set an other password on the web interface: Username: super, Password: super. By default the builtin webserver is listening on all network interfaces (if connected to the internet, then it is accessible from the internet too). Using the webinterface one can install new firmware, download the old, view your password, etc., etc. (This is a funny one. The X-Micro people soon "fixed" this problem, and released new firmware. The new firmware has backdoor username and password "1502".)

There are lots and lots of such cases. Sometimes planned, sometimes by accident. Access paths needed in the testing phase are not removed when the product is released.

Snooping

Some systems allow read-only access to kernel memory (e.g. in order to allow the ps utility to read system tables), and one can read tty input buffers and snoop passwords. (Nostalgia - this worked beautifully on Unix V6.)

On a local network (and local may be the wireless network in the building you stand in front of) one can sniff passwords using an ethernet sniffer.

Reading in the news

Some passwords are sufficiently interesting to be published in the news. Usually such posts are removed rather quickly again, but internet has a memory, and it is very difficult to erase what was published once. For example, recently (2007-02-11) arnezami published that the processing key for HD DVD discs is 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0. Today Google gives more than a million hits on this string.

Asking Google

Many sites have lists of cracked MD5 passwords. So if you find one, say c3875d07f44c422f3b3bc019c23e16ae, then ask Google before trying to crack. Immediately a dozen sites will tell you that this is denis.

4.7 Time-memory tradeoff

If one is going to do brute force, a lot of time is needed. But if passwords have to be cracked repeatedly, it is possible to do an (expensive) precomputation with the result that subsequent password cracking will be fast.

The original idea is due to Hellman. Later many refinements have been proposed. A primitive version goes as follows.

Suppose one has a known algorithm, known plaintext and known ciphertext, and wants to find the key used. (This is often the case with passwords: the password is used as the key, and a constant is enciphered, and the password file contains the result of enciphering.) Say C = E(K,P), with obvious abbreviations.

It is inconvenient when C is longer than K, so we invent a reduction R that shortens C to have the same length as K, maybe just by throwing out some bits. Put f(K) = R(E(K,P)). Then f is a map from the space of keys to itself.

Starting from a key K0, put K1=f(K0), K2=f(K1), ... until we reach Kn, for some very large n. Store the pair (K0,Kn) in memory, and forget the intermediate steps.

The idea is to cover the space of all keys by such "threads" of keys, and to remember start and end of each thread. Either give all threads the same length n, or choose ends in such a way that they have some recognizable property, maybe having the first ten bits 0 or so.

Now the password cracking works as follows: given f(K) we would like to find K. Apply f many times to f(K) until we find the end Kn of the thread. Find the pair (K0,Kn) from the table. Apply f many times to K0 until we reach f(K). The step before that we had K.

This approach is best when there are no collisions, no pairs of distinct keys K, K' with f(K) = f(K'). If there are, then the threads merge and there wil be several possible K0 for a given Kn, leading to larger tables and larger cracking time. There is nothing one can do about the function E, it is given, but Oechslin proposes to vary the reduction function R along the thread. That speeds up things by a factor of at least 2.

On lasecpc13.epfl.ch an "instant NT password cracker" is available, that uses a 0.95 GB precomputed table to crack alphanumerical WinNT passwords in 5 sec average (and returns "not found" when the password contains non-alphanumerics). The precomputation took about six CPU days. For the theory, see Oech03.pdf.

Oechslin-type precomputed tables are known as rainbow tables. If the password algorithm uses a b-bit random salt, precomputation is made a factor 2b more difficult. See also Wikipedia.

Now that GPU's have become so fast, rainbow tables are obsolete.

4.8 Side channels and timing

Sometimes it is possible to recover a password or key by observing the timing or other external characteristics of a server or password checker.

Tenex

There are several versions of the story referred to earlier that it used to be possible on the Tenex system in the early seventies to recover a password by laying out an attempted password across a page boundary, and observing whether the checker incurred a page fault.

The bug was that you could align the given password string so that it was at the end of a page boundary, and have the next page in the address space mapped to a non-existant page of a write-protected file. Normally, Tenex would create a page when you tried to access a non-existant page, but in this case it couldn't (since the file was write-protected).

So, you did a password-checking system call (e.g. the system call which tried to obtain owner access to another directory) set up in this way and you would get one of two failures: Incorrect Password meant that your prefix was wrong, Page Fault meant that your prefix was right but that you need more characters. -- Mark Crispin

Blind SQL injection

One can successfully use timing in attacks using blind SQL injection. For example, give queries that will conditionally do a BENCHMARK or sleep/delay, and in this way discover user names and passwords in essentially the same way as in the Tenex example, where now one sees whether or not a delay occurs.

Cache effects

The time needed for a memory access depends on whether the looked for data was cached already. This non-uniformity can be exploited. For example, Dan Bernstein demonstrated that one can recover an AES key by statistics over timing data. And Colin Percival shows that when hyperthreading is enabled, one process can determine which cache lines were used by the other process, and thus get strong information on an SSL key used. This means that data-dependent table lookup is suspect now.

4.9 Captchas - protection by image

A somewhat different situation is that where a large database is meant to be freely accessed by people (using a browser), but not by robots who might copy the entire contents. Frequently the protection consists of a picture with more or less clearly visible text, where the user is requested to type the text in an input field.

 

Such images are now known as captchas, and used everywhere where it is desirable to verify that the software is dealing with an actual human. An important use is that against spam robots. Long ago I wrote

Project Write (or find) a program that recognizes the text in such images.

but there are many such programs now, and captchas that cannot be read by software tend to be difficult to parse for humans as well. There exist audio versions for blind users. See also Wikipedia.


Next Previous Contents