Compilation of Many Breaches

Compilation of Many Breaches (COMB Dataset)

TLDR

  1. Search for CompilationOfManyBreaches.7z on popular dark web forums
  2. Extract the archive with its password: 7z x CompilationOfManyBreaches.7z -p"PASSWORD"
  3. Query with ./query.sh [email protected] or with the popular h8mail tool

Details

The original COMB (Compilation Of Many Breaches) was released in 2017 and included 1.4 billion credentials. This newer release is much larger and first appeared in early February 2021. As the name suggests, it is a compilation of many different breaches, with credentials from Netflix, LinkedIn, and many more. This set contains 15.2 billion breached accounts, 3.2 billion unique email and password pairs, and 2.5 billion unique emails.

Where to find the Dataset

This breach may be found on various forums and file-sharing services as a 7-Zip archive named CompilationOfManyBreaches.7z.

For legal reasons, I will not link to this data, but if you search Google and look through forums, you should be able to find all the details you need to locate this breach. For a list of darknet forums, take a look HERE.

Querying the Dataset

First, extract the 7z archive:

# Install p7zip if not already installed
## Linux (Debian/Ubuntu)
$ sudo apt install p7zip-full
## Mac (Homebrew)
$ brew install p7zip
 
# Unzip the file
$ 7z x CompilationOfManyBreaches.7z -p"+w/P3PRqQQoJ6g"

Once unzipped, query the data with the included query.sh script.

$ cd CompilationOfManyBreaches
$ ./query.sh [email protected]

The query.sh script is geared toward email lookups. If you want to search the data by password or domain, or to query emails and lists more easily, h8mail is likely your best option.

Install and query with h8mail:

$ pip3 install h8mail

# Query by Email
$ h8mail -t [email protected] -sk -bc ./CompilationOfManyBreaches/

# Query by Password (single quotes stop the shell from expanding $$ and !)
$ h8mail -t 'Pa$$w0rd!' "password123" -sk -lb ./CompilationOfManyBreaches/ --loose

# Query by Domain
$ h8mail -t example.com aegisec.org -sk -lb ./CompilationOfManyBreaches/ --loose
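
A note on quoting targets: when a target string contains shell metacharacters such as $ or !, the kind of quotes used decides what the tool actually receives. The sketch below is only an illustration of standard shell behavior, not part of h8mail or query.sh, and the variable names are made up for the example. In a POSIX shell, $$ inside double quotes is replaced by the shell's process ID, while single quotes pass the text through untouched. Interactive bash may additionally treat ! as a history expansion.

```shell
#!/usr/bin/env bash
# Illustration only: how quoting changes the string a command receives.
# The variable names here are hypothetical and exist just for this demo.

# Single quotes: every character is taken literally.
literal='Pa$$w0rd!'

# Double quotes: $$ is replaced by the shell's process ID before use.
expanded="Pa$$w0rd!"

printf 'single-quoted: %s\n' "$literal"
printf 'double-quoted: %s\n' "$expanded"
```

Running this prints the literal string on the first line and a string with a number in place of $$ on the second, which is why single quotes are the safer choice for targets containing special characters.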