Spooky Scary Ampersands

Author: Jake Wnuk

Last Updated: 10-22-2023.

Note: Password data mentioned in this article was obtained through public resources to improve overall password security posture. Information shared in public breaches helps improve security recommendations.

One late October night, while the moon was full and my GPU fans howled, I was working through the MD5 and SHA1 left lists (both unsalted algorithms) when I noticed an interesting pattern:

b6cf9f1b17cb057ad6d36be1443ab5ef:Jule&Julchen
a5128df5ef096a96ac2ab6ad2d4aac65:Spike&Penny1
36f69ab0b422ce72c84b927a72aee84a:messi&dare
92902db58778422cf81ab2570064c8ea:jalen&janiyah
4f65ba94577359d1a6bf2fccdf786727:mnbsf&marie
89ccf500309d0ac525951814e580d101:brocchetta&basta
110800eae12c507a96ac0765b57a47dc:PEr(h&K006
76e0cdef2fe8a4211600ca8d99b7e78c:tessa&marlee
cb55bdb6242fd4fa9d4c92f468e8d982:Andreas&Sabine

There was clearly a common pattern, but the interesting part is the format the plaintexts were found in. Further investigation turned up other unusual input, such as:

93498a0a29ab387b00c098164604b5cc:naya&praline
913c044fae051ce6e5d293071d06c935:D'FELLaN
748093df0bf8a4af348478c613a78611:l'éritier
c3ca16f4789cb7551063832ff965f97f:Arise & Shine
c44e2c441c37db319ace13e7bf6d8e02:"fgu."
6eb3c920815b06575ca9d1c07fb16611:dfd56$f5&gD
8981ddde39b6e7524a2cd9654d471299:TO%C3%91OMANZA
707f7cda51785d63a932c540209c059d:GOD'SENT12

So what gives? It seems strange at first, but it makes more sense when we step back and consider how user-supplied input is handled from the point of input (source) to its point of use (sink).

Input Validation and Output Encoding

When you think about user-supplied input, you should envision little ⚠️ warning signs everywhere near it, because incorrectly parsed or deserialized user input can mean a bad time for an application. This generally takes the form of injection-class vulnerabilities like XSS or SQLi, especially when looking at authentication portals. The best defense is a combination of input validation, which ensures that the data provided is valid and safe, and output encoding, which encodes data before it is returned to the client.

In the context of password cracking, this encoding will only sometimes show up, because it depends on the logic implemented between the source and the sink, which here is the database storing authentication material. However, some applications apply the same input validation and output encoding controls to every area of user-supplied input, including password fields, so the password is encoded before it is hashed and stored.
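To make the consequence concrete, here is a minimal sketch of why an encoded password produces a different hash. The password `alice&bob` and the idea that the application calls an HTML-escaping function before hashing are assumptions for illustration, not details taken from any real breach.

```python
import hashlib
import html


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of the UTF-8 bytes of text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


typed = "alice&bob"           # hypothetical password as the user typed it
stored = html.escape(typed)   # what a hypothetical app hashes after encoding

print(stored)                              # alice&amp;bob
print(md5_hex(typed) == md5_hex(stored))   # False
```

A candidate list containing only `alice&bob` would never crack the stored hash, because the hashed input was `alice&amp;bob`.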

When looking at standard output encoding formats in this context, there are a few traditional methods:

  • HTML Entity Encoding: Replaces key characters with their HTML entity equivalents and is commonly used to prevent reflected XSS issues.
  • URL Encoding: Replaces characters outside the unreserved set with a percent sign followed by the two hexadecimal digits of each byte value, also called percent-encoding.

[Figure: Encoding examples for text. Two examples encoding the same string with different results.]
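As a rough illustration of the two traditional formats, the sketch below encodes one string with Python's standard library. The choice of Python and of these particular functions is mine; an application may use a different encoder with slightly different rules (for example, which characters count as safe).

```python
import html
from urllib.parse import quote, quote_plus

text = "<hello World!>"

# HTML entity encoding: only the HTML-significant characters change.
print(html.escape(text))        # &lt;hello World!&gt;

# URL (percent) encoding: the space becomes %20.
print(quote(text, safe=""))     # %3Chello%20World%21%3E

# Form-style URL encoding: the space becomes + instead.
print(quote_plus(text))         # %3Chello+World%21%3E
```

Note that even within URL encoding there are two plausible outputs for a space, so both are worth trying when building candidates.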

There are also other formats which can be used depending on the use case:

  • Unicode Escaping: Replaces Unicode characters with an ASCII escape code to display them in formats that do not support Unicode.
  • Base64 Encoding: Binary-to-text encoding scheme that displays binary data in an ASCII string format. Easily reversible and commonly used to encode data.

Some examples of these encodings in recovered plaintexts:

a5f2ee230ca47bead83ba62c7a3c601d9b393560:whynot%3F%21
afe8ae54d77d2477c73469384b7bfb3a547b4c5f:02%2F02%2F2032
e7808b791127ce36b5d1160b766c839f29a9a9ad:02%2F09%2F2024
d0ab7df752787b37dd12f44211013a8611d6ca28:15%2F07%2F2025
9ef192bfdefc3f77ddb7416048a7faaadd26b631:19%2F12%2F2024
6e8e12f9280bf4d3fc4c2f880cbf942284efe780:M%40hmoud1992
ad3fb562f2c56d7536e07f7f49daeb864bfe90ec:Solidworks%402001
64eacd35bd4508a0b822943531299033116274ce:Lobit%402020
87c83488736c8f494e2c1c755fb28346ad05599b:z+welt2525
7def339555118d57765e4226c873b1a5147036ec:zc45%mehdi
721563cbb2a11f9c2a06e94d5f55dd318f8bbccd:RR%40whk64
c1d83d75b94b6a29149d180c990e1729d034de53:X03MO1qnZdYdgyfeuILPmQ%3D%3D
78337316cbe95ab1c10ded541a49c9848b1695c0:RR%40wh64
0c71553bdbc556d262faf79347bed0d5474e4f20:Spots&stripes1
48ecb8161d3650d23cb06c00088c884cee4ba6c0:ananyan\\u0131mda
a762df272c97d19b6db41be2cb26f82991a1adde:Infy1722%24
2509b44ac05856cc3adb40e174cf6822e7354bd8:hugo1%2C%2C%2C
6df6cb8931e63ffa5613559e7ce23c4172612c3a:aaaé'è
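Decoding a few of the plaintexts above shows what the user most likely typed. This is a small sketch using Python's standard library; the Unicode-escape line assumes the stored value contains a single backslash before `u0131`.

```python
import base64
from urllib.parse import unquote

# Percent-encoded plaintexts taken from the list above.
for sample in ["M%40hmoud1992", "02%2F02%2F2032", "TO%C3%91OMANZA"]:
    print(sample, "->", unquote(sample))

# This one is URL encoding wrapped around a Base64 string.
b64_text = unquote("X03MO1qnZdYdgyfeuILPmQ%3D%3D")
raw = base64.b64decode(b64_text)
print(b64_text, "->", len(raw), "bytes")

# Unicode escaping (assumes one literal backslash in the stored value).
escaped = r"ananyan\u0131mda"
unescaped = escaped.encode("ascii").decode("unicode_escape")
print(escaped, "->", unescaped)
```

The percent-encoded samples decode to `M@hmoud1992`, `02/02/2032`, and `TOÑOMANZA`, and the Base64 value decodes to 16 bytes.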

Application

In the context of hash cracking, it is worth accounting for cases where this has happened and the data was modified before being stored. A simple way to handle it is to apply the same encodings to candidates before hashing them. Hits will not be common, since the password has to contain characters that get encoded and the application has to transform it before storing it, but it is an excellent trick to have handy.
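The idea of applying the same encodings to candidates can be sketched in a few lines. This is not how any particular tool implements it; `encoded_variants` is a hypothetical helper, and the set of encoders is an assumption that should be adjusted to match the target application.

```python
import html
from urllib.parse import quote, quote_plus


def encoded_variants(candidate: str) -> list[str]:
    """Return encoded forms of candidate that differ from the original."""
    encoders = (
        html.escape,                   # HTML entity encoding
        quote_plus,                    # URL encoding, space -> +
        lambda s: quote(s, safe=""),   # URL encoding, space -> %20
    )
    variants: list[str] = []
    for encode in encoders:
        encoded = encode(candidate)
        # Skip candidates the encoder left untouched, and duplicates.
        if encoded != candidate and encoded not in variants:
            variants.append(encoded)
    return variants


for word in ["password", "Spots&stripes1", "its a day"]:
    print(word, "->", encoded_variants(word))
```

Candidates without any encodable characters yield nothing, which keeps the extra wordlist small.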

To do this, I use rulecat with the encode mode:

$ cat test.lst
<hello World!>
Testing$!@%!\*()
its a 😊 day

$ cat test.lst | rulecat encode
%3Chello+World%21%3E
&lt;hello World!&gt;
Testing%24%21%40%25%21%5C%2A%28%29
its+a+%F0%9F%98%8A+day
its a \u1f60a day

You could also use the insert and overwrite modes to generate hashcat rules that place common encoded patterns such as &lt;3 (instead of <3) at a given position:

$ cat encoded.lst
 &amp;
 &lt;
 &lt;3

$ cat encoded.lst | rulecat insert 2
i2& i3a i4m i5p i6;
i2& i3l i4t i5;
i2& i3l i4t i5; i63

$ cat encoded.lst | rulecat overwrite 2
o2& o3a o4m o5p o6;
o2& o3l o4t o5;
o2& o3l o4t o5; o63

...

0ff776642791e741d5541b198317d438:c5:4myFun!&lt;3
71052a6fd7030c99ebf079c5a2d73245:99:9.21.98&lt;3
0888f688f69c65cc2d632d4fde7bdbfb:6c:_Chips&lt;3
14355395cd90945431eb7e576d62c3b2:61:zerty937&lt;3
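For readers without rulecat, the rule format shown above can be reproduced with a short sketch. It relies on hashcat's `iNX` (insert character X at position N) and `oNX` (overwrite position N with X) rule functions, where positions 10 and above are written as A to Z. `positional_rule` is a hypothetical helper name, not part of any tool.

```python
# hashcat encodes positions 0-35 as the characters 0-9 followed by A-Z.
POSITIONS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def positional_rule(op: str, text: str, start: int) -> str:
    """Build a rule that inserts ('i') or overwrites ('o') text at start."""
    if op not in ("i", "o"):
        raise ValueError("op must be 'i' (insert) or 'o' (overwrite)")
    if start < 0 or start + len(text) > len(POSITIONS):
        raise ValueError("text does not fit in the addressable positions")
    # One rule function per character, each at the next position.
    return " ".join(
        f"{op}{POSITIONS[start + offset]}{char}"
        for offset, char in enumerate(text)
    )


print(positional_rule("i", "&lt;3", 2))   # i2& i3l i4t i5; i63
print(positional_rule("o", "&amp;", 2))   # o2& o3a o4m o5p o6;
```

Looping `start` over a range of positions yields a small rule file that covers the pattern wherever it appears in a candidate.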

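Another way to enumerate the pattern is a combinator attack (-a1) with the encoded symbol placed between the two words. The -j flag applies a rule to each word from the left list (-k does the same for the right list), so appending &amp; on the left inserts it in the middle: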

$ hashcat -m 0 -a1 0.left 3m.lst 3m.lst -j $\&$\a$\m$\p$\; --bitmap-max=28

...

7274a3b89b5f2a65e141dbb75dd54cb7:shaparra&amp;shaparro
da9bc2f58d93af259db0aece3192f439:sutanto&amp;suyatno
35d00f67f6ca263534c81247add705fe:nahyan&amp;hedaya
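What the combinator attack computes can be illustrated with a slow, CPU-only sketch. The word list and target hash below are made up for the demonstration, and `crack_md5` is a hypothetical helper; hashcat does the same work far faster on a GPU.

```python
import hashlib
from itertools import product


def combinator_candidates(left, right, joiner="&amp;"):
    """Yield every left word + joiner + right word combination."""
    for left_word, right_word in product(left, right):
        yield f"{left_word}{joiner}{right_word}"


def crack_md5(targets, left, right, joiner="&amp;"):
    """Return {digest: plaintext} for candidates whose MD5 is in targets."""
    found = {}
    for candidate in combinator_candidates(left, right, joiner):
        digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
        if digest in targets:
            found[digest] = candidate
    return found


# Hypothetical word list and a target built from it for the demo.
words = ["alice", "bob"]
target = hashlib.md5(b"alice&amp;bob").hexdigest()
print(crack_md5({target}, words, words))
```

Swapping the `joiner` for `%26` or another encoded form covers the URL-encoded variant of the same pattern.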

It’s a cool trick and worth keeping in the toolkit when you can identify the encoding behavior from source code or are up against a custom implementation.
