Smashing Hashes with Token Swapping Attacks
Last Updated: 5-15-2023.
Note: Password data mentioned in this article was obtained through public resources to improve overall password security posture. Information shared in public breaches helps improve security recommendations.
Introduction #
There are many ways to crack passwords, and offensive tooling for password security has provided a lot of new and exciting attack methods to reveal plaintext. In my previous post, I stated the following:
Not surprisingly, the best wordlists come from actual passwords as the human element in setting passwords tends to permeate through to create predictable patterns that are often targeted.
Some of my favorite attack techniques feed existing found material back into the cracking process. There are a few ways to do this, but today I want to introduce my personal favorite: Token Swapping.
Token Swapping Concept #
To best understand the attack, I first want to introduce two principles that password crackers abuse to recover plaintext.
First, human passwords often share patterns. When people develop a new password, they tend to follow patterns to make it easier to remember. However, they don’t realize that other people also follow similar patterns.
For example, if I said I needed a password that included the following:
- An uppercase character
- A lowercase character
- A digit
- A special character
You might have come up with something like Word + Digits + Special (e.g., Summer2023!) because that is the order I provided the requirements in, or because most English speakers read left to right and that structure reads more naturally.
Second, secret material is often shared or reused, especially within a shared user pool. People use similar words when setting passwords, such as years, first names, pet names, religious items, places, and phrases. Other material, like the product or the company/service name, will also likely appear when a series of hashes comes from the same source.
These principles show that seemingly unrelated passwords may have more in common than initially seen.
If the best wordlists come from actual passwords and passwords from the same source share many patterns, then the most logical step is to incorporate other founds from the same/similar source into new wordlists.
The Token Swapping Attack #
The logic we want to replicate is:
- Split plaintext into segments (tokens)
- Convert each token into a hashcat mask
- Find other plaintexts that contain the same mask
- Swap the token into those plaintexts using the mask
This provides two benefits:
- Plaintext material is preserved
- Plaintext masks/structure is preserved
This means we can feed existing founds back into the process exactly as they were seen and place them directly into known patterns without guessing the password structure.
This process is also highly non-deterministic: the number of new candidates you can generate grows quickly with the token size and the number of available plaintexts to swap.
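Maskcat handles all of this internally, but to make the idea concrete, here is a minimal bash sketch of the token-to-mask step; the function name and the swap example are mine for illustration, not part of maskcat:

# Convert a token into a hashcat mask, one character class at a time.
# Illustrative only; maskcat performs this (and the swapping) itself.
token_to_mask() {
    local mask="" c i
    for ((i = 0; i < ${#1}; i++)); do
        c="${1:i:1}"
        case "$c" in
            [a-z]) mask+="?l" ;;  # lowercase
            [A-Z]) mask+="?u" ;;  # uppercase
            [0-9]) mask+="?d" ;;  # digit
            *)     mask+="?s" ;;  # everything else -> special
        esac
    done
    echo "$mask"
}

token_to_mask "Summer"   # -> ?u?l?l?l?l?l
# "Winter2019!" masks to ?u?l?l?l?l?l?d?d?d?d?s, which contains ?u?l?l?l?l?l,
# so swapping the token in produces a new candidate: Summer2019!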
To instrument this, I use Maskcat with the mutate mode, which reads input from stdin, splits it into tokens, and begins swapping. Because it reads from stdin, we can use shuf to randomize the input and get different candidates on every run.
cat found.lst | shuf | maskcat mutate 6 >> mutate.lst
cat found.lst | shuf | maskcat mutate 5 >> mutate.lst
cat found.lst | shuf | maskcat mutate 4 >> mutate.lst
usort mutate.lst
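From there, the deduplicated wordlist can be fed into hashcat as a standard dictionary attack. The exact invocation depends on the target; the hash mode and file names below are placeholders, not from the original runs:

# -m 0 and the file names are illustrative placeholders
hashcat -a 0 -m 0 target.left mutate.lst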
Testing On Data #
To test this, we will use a few publicly available breaches. No names or hash types will be shared; breaches will be referred to by their ID numbers. We will only be working with the left (uncracked) hashes, so all of the low-hanging fruit should already be picked off.
Test 1 #
To kick it off, we return to 1865, where I previously found a pattern in the remaining hashes but could not enumerate it with rule, hybrid, or mask attacks. Returning to this list with the new technique makes sense, as many of the previous founds were generated into seemingly random patterns with the same character set.
rrRRRR3R
UUU4UUUk
fffKK1KK
kkkkk7kf
TT3TTTTE
eeee5eSS
mNNNN2NN
yyyyyRR3
SS3SSJJJ
hNNNN2NN
If we look at the mask structure, we can see some patterns but nothing that would be super easy to enumerate:
$ mode -c maskcat-masks.tmp | head
107 ?u?u?d?u?u?u?u?u:8:2:192
103 ?u?u?u?d?u?u?u?u:8:2:192
103 ?l?l?l?l?l?d?l?l:8:2:192
103 ?d?u?u?u?u?u?u?u:8:2:192
100 ?u?d?u?u?u?u?u?u:8:2:192
98 ?l?l?l?l?l?l?d?l:8:2:192
73 ?l?l?l?l?l?l?l?d:8:2:192
72 ?l?l?l?l?d?l?l?l:8:2:192
42 ?l?l?l?l?l?l?d?u:8:3:192
32 ?u?u?d?l?l?l?l?l:8:3:192
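As an aside, the maskcat-masks.tmp file above is not generated in any shown step; presumably it came from piping the founds through maskcat's mask output, something along these lines:

cat founds.lst | maskcat > maskcat-masks.tmp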
When I first found the pattern, I could enumerate 618 new founds, and when I returned, 994 were left. To get started, we can repeat the token swapping attack to make a new wordlist, then run it with 5,000 rules to see how it works.
cat founds.lst | shuf | maskcat mutate 9 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 8 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 7 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 6 >> mutate.lst
...
usort mutate.lst
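The cracking command itself is not shown, but judging by the session output below (loopback induction files, the 5k rule set), it was roughly the following, with the hash mode as a placeholder:

hashcat --session tokenswap -a 0 -m 0 1865.left mutate.lst -r /files/5k.rule --loopback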
Then we can test it…
Session..........: tokenswap
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 1865.left
Time.Started.....: Wed Dec 21 04:05:03 2022 (0 secs)
Time.Estimated...: Wed Dec 21 04:05:03 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (/root/.local/share/hashcat/sessions/2711.induct/hashcat.loopback.1671595495_1860)
Guess.Mod........: Rules (/files/5k.rule)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 345.6 MH/s (0.61ms) @ Accel:256 Loops:256 Thr:128 Vec:1
Recovered........: 991/994 (99.70%) Digests (total), 0/994 (0.00%) Digests (new), 991/994 (99.70%) Salts
Progress.........: 4925270000/4925270000 (100.00%)
Rejected.........: 0/4925270000 (0.00%)
Restore.Point....: 991/991 (100.00%)
Restore.Sub.#1...: Salt:993 Amplifier:4864-5000 Iteration:0-256
Candidate.Engine.: Device Generator
Candidates.#1....: Wzzzzzz06 -> rrrGGG1G7
Hardware.Mon.#1..: Temp: 54c Fan: 0% Util: 64% Core:2055MHz Mem:9251MHz Bus:16
…and 99.70% of the remaining hashes were recovered with this method.
Test 2 #
Okay, so what if that was a fluke? Let us test again on 465, but this time we will directly compare looping over the existing founds with rules against token swapping. This way, we can get an idea of the relative effectiveness.
To start, we can take the existing founds and run an attack over them with the same 5,000 rules from before. For this one, we can include --loopback; note that the final runtime was ~2 hours.
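In other words, the baseline run looks something like this, again with the hash mode as a placeholder:

hashcat --session ruleloopback -a 0 -m 0 465.left founds.lst -r /files/5k.rule --loopback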
Session..........: ruleloopback
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 465.left
Time.Started.....: Tue Dec 20 21:58:17 2022 (22 mins, 29 secs)
Time.Estimated...: Tue Dec 20 22:20:46 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (/root/.local/share/hashcat/sessions/10.induct/hashcat.loopback.1671566861_7605)
Guess.Mod........: Rules (/files/5k.rule)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 61521.6 kH/s (0.58ms) @ Accel:128 Loops:256 Thr:512 Vec:1
Recovered........: 161/111762 (0.14%) Digests (total), 161/111762 (0.14%) Digests (new), 142/107691 (0.13%) Salts
Remaining........: 111601 (99.86%) Digests, 107549 (99.87%) Salts
Recovered/Time...: CUR:1,N/A,N/A AVG:0.31,N/A,N/A (Min,Hour,Day)
Progress.........: 82922070000/82922070000 (100.00%)
Rejected.........: 0/82922070000 (0.00%)
Restore.Point....: 154/154 (100.00%)
Restore.Sub.#1...: Salt:107690 Amplifier:4864-5000 Iteration:0-256
Candidate.Engine.: Device Generator
Candidates.#1....: * -> *
Hardware.Mon.#1..: Temp: 62c Fan: 49% Util: 62% Core:1965MHz Mem:9251MHz Bus:16
Then we can compare it to a token swap attack using token sizes 4-7, no rules, and ~6.5 hours of runtime.
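The generation step mirrors the earlier pipeline, just with the smaller token sizes; the cracking side drops the rules (hash mode again a placeholder):

for n in 7 6 5 4; do
    cat founds.lst | shuf | maskcat mutate "$n" >> mutate.lst
done
usort mutate.lst
hashcat --session tokenswap -a 0 -m 0 465.left mutate.lst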
Session..........: tokenswap
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 465.left
Time.Started.....: Tue Dec 20 22:32:45 2022 (6 hours, 26 mins)
Time.Estimated...: Wed Dec 21 04:59:16 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (mutate.lst)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 4400.3 MH/s (0.06ms) @ Accel:512 Loops:1 Thr:128 Vec:1
Recovered........: 2386/111762 (2.13%) Digests (total), 2225/111762 (1.99%) Digests (new), 2181/107691 (2.03%) Salts
Remaining........: 109376 (97.87%) Digests, 105510 (97.97%) Salts
Recovered/Time...: CUR:0,289,N/A AVG:5.76,345.40,N/A (Min,Hour,Day)
Progress.........: 196338334616547/196338334616547 (100.00%)
Rejected.........: 131969935950/196338334616547 (0.07%)
Restore.Point....: 1823163817/1823163817 (100.00%)
Restore.Sub.#1...: Salt:107690 Amplifier:0-1 Iteration:0-1
Candidate.Engine.: Device Generator
Candidates.#1....: zzz298756zzz -> саша9999
Hardware.Mon.#1..: Temp: 70c Fan: 65% Util: 64% Core:1950MHz Mem:9251MHz Bus:16
While we can't quite compare these one to one, the token swap attack had 2,386 total cracks, roughly 13x more than the loopback's 161. Additionally, if we compare the time investment, loopback yielded around 80 founds/hr, while the token swap yielded around 367 founds/hr.
If we also look at the submission, we see that many of these are also unique founds:
| Hash List ID | Found Hashes | New Plains |
| --- | --- | --- |
| 465 | +2,386 | +1,374 |
Test 3 #
For the last test, we will examine this attack's sustainability. Theoretically, this attack is exceptionally non-deterministic because of how the swaps are generated. Let's put that to the test by repeating it again and again on the same target, 2221.
Instead of using the 5k rule again, we will make a tiny rule to see if the candidates are doing the heavy lifting or if the rules are.
$ cat rule.tmp
:
]
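For reference, : is hashcat's no-op rule (try each candidate unchanged) and ] deletes the last character, so the generated candidates are doing nearly all of the work here.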
The target hashes are very slow to calculate, so this attack will be repeated three (3) times with mutated wordlists of a few different sizes to see how size impacts the results.
First, we can make a wordlist using all of the existing founds. The final size for the first found wordlist was 35 GB:
Recovered........: 54016/8900340 (0.61%) Digests (total), 0/8900340 (0.00%) Digests (new), 17/1048387 (0.00%) Salts
For the second round, we can make a wordlist using only the new founds from the previous round. The final size for the second wordlist was 5.3 GB:
Recovered........: 65472/8900340 (0.74%) Digests (total), 0/8900340 (0.00%) Digests (new), 19/1048387 (0.00%) Salts
For the final round, we can make another wordlist using only the new founds, this time larger but with the token size limited to six (6), five (5), and four (4). The final size of the last wordlist was 14 GB, but the run ended up being stopped about 13.4% of the way through:
Recovered........: 68038/8900340 (0.76%) Digests (total), 0/8900340 (0.00%) Digests (new), 20/1048387 (0.00%) Salts
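Between rounds, the new founds need to be collected. The post does not show this step, but one common approach is to diff hashcat's --show output against the previous round's founds; a rough sketch, assuming unsalted hash:plain potfile entries and placeholder file names:

# plains cracked so far (-m 0 and file names are placeholders)
hashcat -m 0 --show 2221.left | cut -d: -f2- > round.founds
LC_ALL=C sort -u round.founds -o round.founds
# keep only plains not present in the previous (sorted) round
comm -13 prev.founds round.founds > new.founds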
Finally, let us look into the masks of the new founds to see the diversity of patterns we cover with this method.
3325 ?d?d?d?d?d?d?d?d?d?d:10:1:100
3039 ?u?l?l?l?l?s?d?d?d?d:10:4:203
2712 ?u?l?l?l?s?d?d?d?d:9:4:177
2242 ?l?l?l?l?d?d?d?d:8:2:144
2161 ?l?l?l?l?l?d?d?d?d:9:2:170
2067 ?u?l?l?l?l?l?s?d?d?d?d:11:4:229
1845 ?l?l?l?l?l?l?d?d?d?d:10:2:196
1771 ?l?l?l?l?l?l?l?l:8:1:208
1738 ?l?l?l?l?l?l?l?l?l:9:1:234
1460 ?u?l?l?l?l?s?d?d?d:9:4:193
1272 ?l?l?l?l?l?l?l?l?l?l:10:1:260
1239 ?u?l?l?l?l?l?s?d?d?d:10:4:219
1104 ?l?l?l?l?s?d?d?d?d:9:3:177
1091 ?l?l?l?l?l?l?d?d:8:2:176
1046 ?l?l?l?l?l?l?d?d?d:9:2:186
1016 ?l?l?l?l?l?s?d?d?d?d:10:3:203
919 ?l?l?l?l?l?l?l?d?d:9:2:202
917 ?l?l?l?l?l?l?l?l?l?l?l:11:1:286
879 ?l?l?l?l?l?d?d?d:8:2:160
821 ?u?l?l?l?l?l?s?d?d:9:4:209
| Hash List ID | Found Hashes | New Plains |
| --- | --- | --- |
| 2221 | +65,796 | +60,664 |
Application #
I love this technique, and it comes early in my methodology. It is great because it combines multiple password-cracking aspects into a repeatable methodology. From additional testing, this strategy works well, can be repeated several times without exhausting, and can surface interesting patterns.
I would recommend it for the following:
- Hash lists that have few known plains and where other attacks are exhausted. This attack can quickly generate a massive number of new candidates from very little source material.
- Fast algorithms where you can exhaust your wordlists quickly and need to generate new ones.
- Hash lists where options like loopback, mask attacks, and hybrid attacks cannot enumerate the pattern.
- "Random" or generated candidates where the character set is consistent.
I would not recommend it for the following:
- Slow hashes where the wordlist size becomes a significant factor in completion time.
- Hash lists where there is little shared material between plaintexts, or random collections of hashes.
Reference #
The following are aliases referenced above:
# unique sort file
usort() {
    if [[ $# -ne 1 ]]; then
        echo 'unique sort file in place'
        echo 'EXAMPLE: usort <FILE>'
    else
        LC_ALL=C sort -u "$1" -T ./ -o "$1"
    fi
}
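The mode alias used in Test 1 is not defined in the original post; a functionally equivalent reconstruction (the -c flag handling is an assumption) might be:

# frequency-count lines, most common first (hypothetical reconstruction)
mode() {
    if [[ "$1" == "-c" ]]; then shift; fi  # assumed: -c toggles counts
    sort "$1" | uniq -c | sort -rn
}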