
Smashing Hashes with Token Swapping Attacks

·9 mins
passwords hashcat hashcracking

Last Updated: 5-15-2023.

Note: Password data mentioned in this article was obtained through public resources to improve overall password security posture. Information shared in public breaches helps improve security recommendations.

Introduction #

There are many ways to crack passwords, and offensive tooling for password security has provided a lot of new and exciting attack methods to reveal plaintext. In my previous post, I stated the following:

Not surprisingly, the best wordlists come from actual passwords as the human element in setting passwords tends to permeate through to create predictable patterns that are often targeted.

Some of my favorite attack techniques feed existing found material back into the cracking process. There are a few ways to do this, but today I want to introduce my personal favorite: Token Swapping.

Token Swapping Concept #

To best understand the attack, I first want to introduce two principles that password crackers abuse to recover plaintext.

First, human passwords often share patterns. When people develop a new password, they tend to follow patterns to make it easier to remember. However, they don’t realize that other people also follow similar patterns.

For example, if I said I needed a password that included the following:

  • Upper Case Character
  • Lower Case Character
  • Digits
  • Special Character

You might have come up with something like Word + Digits + Special, either because that is the order I provided them in, or because most English speakers read left to right and that order reads more naturally.

Second, secret material is often shared or reused, especially among a shared user pool. For example, people use similar words when setting passwords, such as years, first names, pet names, religious terms, places, phrases, and similar material. Other items, like the product or company/service name, will likely appear when a series of hashes comes from the same source.

These principles show that seemingly unrelated passwords may have more in common than initially seen.

If the best wordlists come from actual passwords and passwords from the same source share many patterns, then the most logical step is to incorporate other founds from the same/similar source into new wordlists.

The Token Swapping Attack #

The logic we want to replicate is:

  1. Split plaintext into segments
  2. Convert each token into a hashcat mask
  3. Find other plaintexts that contain the same mask
  4. Swap the token into other plaintexts using the mask

This provides two benefits:

  • Plaintext material is preserved
  • Plaintext masks/structure is preserved

This means we can feed existing founds back into the process exactly as they were seen, placing them directly into known patterns without guessing the password structure.

graph TD
    A[Password1234] --> Mask
    B[Bears4567] --> Mask
    Mask --> Tokenize
    Tokenize --> Swap
    subgraph Mask [Make Masks]
        ?u?l?l?l?l?l?l?l?d?d?d?d
        ?u?l?l?l?l?d?d?d?d
    end
    subgraph Tokenize [Tokenize 4]
        Pass
        1234
        Bear
        4567
    end
    subgraph Swap [Swap on Masks]
        Bearword1234
        Password4567
        Passs4567
        Bears1234
    end
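The swap logic in the diagram can be sketched in a few lines of Python. This is my own illustration of the concept; maskcat's actual tokenization and swapping implementation will differ:

```python
def to_mask(s: str) -> str:
    """Convert a plaintext into a hashcat-style mask."""
    out = []
    for c in s:
        if c.islower():
            out.append("?l")
        elif c.isupper():
            out.append("?u")
        elif c.isdigit():
            out.append("?d")
        else:
            out.append("?s")
    return "".join(out)

def swap_tokens(plains, size):
    """Splice every token of `size` characters into every window of a
    plaintext that shares the token's mask, preserving structure."""
    # Extract every token of the requested size from every plaintext.
    tokens = {p[i:i + size] for p in plains
              for i in range(len(p) - size + 1)}
    candidates = set()
    for p in plains:
        for i in range(len(p) - size + 1):
            window_mask = to_mask(p[i:i + size])
            for t in tokens:
                # Swap only where the mask (structure) matches.
                if to_mask(t) == window_mask:
                    candidates.add(p[:i] + t + p[i + size:])
    return candidates - set(plains)

print(to_mask("Password1234"))  # ?u?l?l?l?l?l?l?l?d?d?d?d
cands = swap_tokens(["Password1234", "Bears4567"], 4)
print("Bearword1234" in cands, "Bears1234" in cands)  # True True
```

Note that this naive version produces more candidates than the four shown in the diagram, because every same-mask window is a valid swap point, not just the leading and trailing tokens.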

This process is also highly non-deterministic, as the number of new candidates you can generate grows quickly with the token size and the number of available plaintexts to swap.
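As a back-of-envelope illustration of that growth, we can count an upper bound on token-window pairings. All numbers here are hypothetical, purely to show the scaling:

```python
# Upper bound on candidates: every extracted token can potentially be
# spliced into every same-mask window of every plaintext.
plains = 1_000_000   # hypothetical number of plaintexts to swap
avg_len = 10         # hypothetical average plaintext length

for size in (4, 5, 6, 7):
    windows = avg_len - size + 1        # swap windows per plaintext
    tokens = plains * windows           # tokens extracted (upper bound)
    pairs = tokens * plains * windows   # token x window pairings
    print(f"token size {size}: up to {pairs:.1e} candidates")
```

Even for modest inputs, the ceiling lands in the tens of trillions, which is consistent with the candidate counts in the hashcat sessions later in this post.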

To instrument this, I use Maskcat with the mutate mode, which will read input from stdin, then split it into tokens and begin swapping. Because it is reading from stdin, we can use shuf to randomize the input and get different candidates every run.

cat found.lst | shuf | maskcat mutate 6 >> mutate.lst
cat found.lst | shuf | maskcat mutate 5 >> mutate.lst
cat found.lst | shuf | maskcat mutate 4 >> mutate.lst

usort mutate.lst

Testing On Data #

To test this, we will use a few publicly available breaches. No names or hash types will be shared; breaches will be referred to by their ID number. We will only be working with the left (uncracked) hashes, so all the low-hanging fruit has already been picked off.

Test 1 #

To kick it off, we return to 1865, where I had previously found a pattern in the remaining hashes but could not enumerate it with rule, hybrid, or mask attacks. Revisiting this list with the new technique makes sense, as many of the previous founds appear to be generated in seemingly random patterns over the same character set.


If we look at the mask structure, we can see some patterns but nothing that would be super easy to enumerate:

$ mode -c maskcat-masks.tmp | head
    107 ?u?u?d?u?u?u?u?u:8:2:192
    103 ?u?u?u?d?u?u?u?u:8:2:192
    103 ?l?l?l?l?l?d?l?l:8:2:192
    103 ?d?u?u?u?u?u?u?u:8:2:192
    100 ?u?d?u?u?u?u?u?u:8:2:192
     98 ?l?l?l?l?l?l?d?l:8:2:192
     73 ?l?l?l?l?l?l?l?d:8:2:192
     72 ?l?l?l?l?d?l?l?l:8:2:192
     42 ?l?l?l?l?l?l?d?u:8:3:192
     32 ?u?u?d?l?l?l?l?l:8:3:192
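For anyone without these helpers, the same frequency view can be approximated in a few lines of Python. This is a sketch of the counting step only (`mode` above is a local counting alias, and the sample plaintexts here are hypothetical):

```python
from collections import Counter

def to_mask(s: str) -> str:
    """Convert a plaintext into a hashcat-style mask."""
    return "".join("?l" if c.islower() else "?u" if c.isupper()
                   else "?d" if c.isdigit() else "?s" for c in s)

# Hypothetical sample founds; in practice, read these from a file.
founds = ["Passw0rd", "Bears456", "Wolfs789", "TIGER789", "lions123"]

counts = Counter(to_mask(p) for p in founds)
for n, mask in ((n, m) for m, n in counts.most_common()):
    # The shared ?u?l?l?l?l?d?d?d mask tops the list with a count of 2.
    print(f"{n:>7} {mask}")
```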

When I first found the pattern, I could enumerate 618 new founds, and when I returned, 994 were left. To get started, we can repeat the token swapping attack to make a new wordlist, then run it with 5,000 rules to see how it works.

cat founds.lst | shuf | maskcat mutate 9 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 8 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 7 >> mutate.lst
cat founds.lst | shuf | maskcat mutate 6 >> mutate.lst
usort mutate.lst

Then we can test it…

Session..........: tokenswap
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 1865.left
Time.Started.....: Wed Dec 21 04:05:03 2022 (0 secs)
Time.Estimated...: Wed Dec 21 04:05:03 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (/root/.local/share/hashcat/sessions/2711.induct/hashcat.loopback.1671595495_1860)
Guess.Mod........: Rules (/files/5k.rule)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:   345.6 MH/s (0.61ms) @ Accel:256 Loops:256 Thr:128 Vec:1
Recovered........: 991/994 (99.70%) Digests (total), 0/994 (0.00%) Digests (new), 991/994 (99.70%) Salts
Progress.........: 4925270000/4925270000 (100.00%)
Rejected.........: 0/4925270000 (0.00%)
Restore.Point....: 991/991 (100.00%)
Restore.Sub.#1...: Salt:993 Amplifier:4864-5000 Iteration:0-256
Candidate.Engine.: Device Generator
Candidates.#1....: Wzzzzzz06 -> rrrGGG1G7
Hardware.Mon.#1..: Temp: 54c Fan:  0% Util: 64% Core:2055MHz Mem:9251MHz Bus:16

…and 99.70% of the remaining hashes were recovered with this method.

Test 2 #

Okay, so what if that was a fluke? Let us test again on 465, but this time we will directly compare looping over the existing founds with rules against token swapping. This way, we can get an idea of the relative effectiveness.

To start, we can take the founds and run an attack with the same 5,000 rules from before. For this one, we can include --loopback and note that the final runtime was ~2 hours.

Session..........: ruleloopback
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 465.left
Time.Started.....: Tue Dec 20 21:58:17 2022 (22 mins, 29 secs)
Time.Estimated...: Tue Dec 20 22:20:46 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (/root/.local/share/hashcat/sessions/10.induct/hashcat.loopback.1671566861_7605)
Guess.Mod........: Rules (/files/5k.rule)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 61521.6 kH/s (0.58ms) @ Accel:128 Loops:256 Thr:512 Vec:1
Recovered........: 161/111762 (0.14%) Digests (total), 161/111762 (0.14%) Digests (new), 142/107691 (0.13%) Salts
Remaining........: 111601 (99.86%) Digests, 107549 (99.87%) Salts
Recovered/Time...: CUR:1,N/A,N/A AVG:0.31,N/A,N/A (Min,Hour,Day)
Progress.........: 82922070000/82922070000 (100.00%)
Rejected.........: 0/82922070000 (0.00%)
Restore.Point....: 154/154 (100.00%)
Restore.Sub.#1...: Salt:107690 Amplifier:4864-5000 Iteration:0-256
Candidate.Engine.: Device Generator
Candidates.#1....: * -> *
Hardware.Mon.#1..: Temp: 62c Fan: 49% Util: 62% Core:1965MHz Mem:9251MHz Bus:16

Then we can compare it to a token swap attack with token sizes 4-7 with no rules and ~6.5 hours of runtime.

Session..........: tokenswap
Status...........: Exhausted
Hash.Mode........: *
Hash.Target......: 465.left
Time.Started.....: Tue Dec 20 22:32:45 2022 (6 hours, 26 mins)
Time.Estimated...: Wed Dec 21 04:59:16 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (mutate.lst)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:  4400.3 MH/s (0.06ms) @ Accel:512 Loops:1 Thr:128 Vec:1
Recovered........: 2386/111762 (2.13%) Digests (total), 2225/111762 (1.99%) Digests (new), 2181/107691 (2.03%) Salts
Remaining........: 109376 (97.87%) Digests, 105510 (97.97%) Salts
Recovered/Time...: CUR:0,289,N/A AVG:5.76,345.40,N/A (Min,Hour,Day)
Progress.........: 196338334616547/196338334616547 (100.00%)
Rejected.........: 131969935950/196338334616547 (0.07%)
Restore.Point....: 1823163817/1823163817 (100.00%)
Restore.Sub.#1...: Salt:107690 Amplifier:0-1 Iteration:0-1
Candidate.Engine.: Device Generator
Candidates.#1....: zzz298756zzz -> саша9999
Hardware.Mon.#1..: Temp: 70c Fan: 65% Util: 64% Core:1950MHz Mem:9251MHz Bus:16

While we can’t quite compare these one to one, the token swap attack had 2,386 total cracks (2,225 of them new), while the loopback had just 161, roughly 13x more new cracks for the token swap. If we also compare the time investment, the loopback yielded around 80 founds/hr, while the token swap yielded 367 founds/hr.

If we look at the submission, we see several are also unique founds:

Hash List ID    Found Hashes    New Plains
465             +2,386          +1,374

Test 3 #

For the last test, we will look at this attack’s sustainability. Because the swaps are generated from shuffled input, the attack is highly non-deterministic, so repeated runs should keep producing new candidates. Let’s put that to the test by repeating it again and again on the same target, 2221.

Instead of using the 5k rule again, we will make a tiny rule to see if the candidates are doing the heavy lifting or if the rules are.

$ cat rule.tmp

The target hashes are very slow to calculate, so this attack will be repeated three (3) times with mutating lists of a few different sizes to see how it impacts the results.

First, we can make a wordlist using all of the existing founds. The final size for the first found wordlist was 35 GB:

Recovered........: 54016/8900340 (0.61%) Digests (total), 0/8900340 (0.00%) Digests (new), 17/1048387 (0.00%) Salts

We can make a wordlist for the second round only using the new founds we got from the previous round. The final size for the second found wordlist was 5.3 GB:

Recovered........: 65472/8900340 (0.74%) Digests (total), 0/8900340 (0.00%) Digests (new), 19/1048387 (0.00%) Salts

For the final round, we can make another wordlist using only all the new founds, and we will make it larger but limit the token sizes to six (6), five (5), and four (4). The final size of the last wordlist was 14 GB, but the run was stopped about 13.4% of the way through:

Recovered........: 68038/8900340 (0.76%) Digests (total), 0/8900340 (0.00%) Digests (new), 20/1048387 (0.00%) Salts

Finally, let us look into the masks of the new founds to see the diversity of patterns we cover with this method.

   3325 ?d?d?d?d?d?d?d?d?d?d:10:1:100
   3039 ?u?l?l?l?l?s?d?d?d?d:10:4:203
   2712 ?u?l?l?l?s?d?d?d?d:9:4:177
   2242 ?l?l?l?l?d?d?d?d:8:2:144
   2161 ?l?l?l?l?l?d?d?d?d:9:2:170
   2067 ?u?l?l?l?l?l?s?d?d?d?d:11:4:229
   1845 ?l?l?l?l?l?l?d?d?d?d:10:2:196
   1771 ?l?l?l?l?l?l?l?l:8:1:208
   1738 ?l?l?l?l?l?l?l?l?l:9:1:234
   1460 ?u?l?l?l?l?s?d?d?d:9:4:193
   1272 ?l?l?l?l?l?l?l?l?l?l:10:1:260
   1239 ?u?l?l?l?l?l?s?d?d?d:10:4:219
   1104 ?l?l?l?l?s?d?d?d?d:9:3:177
   1091 ?l?l?l?l?l?l?d?d:8:2:176
   1046 ?l?l?l?l?l?l?d?d?d:9:2:186
   1016 ?l?l?l?l?l?s?d?d?d?d:10:3:203
    919 ?l?l?l?l?l?l?l?d?d:9:2:202
    917 ?l?l?l?l?l?l?l?l?l?l?l:11:1:286
    879 ?l?l?l?l?l?d?d?d:8:2:160
    821 ?u?l?l?l?l?l?s?d?d:9:4:209
Hash List ID    Found Hashes    New Plains
2221            +65,796         +60,664

Application #

I love this technique, and it comes early in my methodology. It is great because it combines multiple password-cracking concepts into a repeatable process: additional testing shows the strategy works well, it can be repeated several times without exhausting, and it can surface interesting patterns.

I would recommend it for the following:

  • Hash lists that have few known plains where other attacks are exhausting. This attack can quickly make a massive amount of new candidates from very little source material.
  • Fast algorithms where you can exhaust your wordlists quickly and need to generate new ones.
  • Hash lists where options like loopback, mask attacks, and hybrid attacks cannot enumerate the pattern.
  • “Random” or generated candidates where the character set is consistent.

I would not recommend it for the following:

  • Slow hashes where the wordlist size becomes a significant factor in completion time.
  • Hash lists where there is little shared material between plaintexts, or random collections of hashes.

Reference #

The following are aliases referenced above:

# unique sort file

usort() {
	if [[ $# -ne 1 ]]; then
		echo 'unique sort file inplace'
		echo 'EXAMPLE: usort <FILE>'
		return 1
	fi
	LC_ALL=C sort -u "$1" -T ./ -o "$1"
}