Hackers steal 15,000 cloud credentials from exposed Git config files

30 Oct 2024, 14:00 by Bill Toulas · BleepingComputer

A large-scale malicious operation named "EmeraldWhale" scanned for exposed Git configuration files to steal over 15,000 cloud account credentials from thousands of private repositories.

According to Sysdig, which discovered the campaign, the operation uses automated tools that scan IP ranges for exposed Git configuration files, which may include authentication tokens.

These tokens are then used to download repositories stored on GitHub, GitLab, and BitBucket, which are scanned for further credentials.

The stolen data was exfiltrated to Amazon S3 buckets of other victims and was subsequently used in phishing and spam campaigns and sold directly to other cybercriminals.

Exposed Git authentication tokens enable data theft and can also lead to full-blown data breaches, as recently seen with the Internet Archive.

Exposed Git configuration files

Git configuration files, such as /.git/config or .gitlab-ci.yml, are used to define various options like repository paths, branches, remotes, and sometimes even authentication information like API keys, access tokens, and passwords.

Developers might include these secrets in private repositories for convenience, making data transmissions and API interactions easier without configuring or performing authentication each time.

This is not risky as long as the repository is appropriately isolated from public access. However, if the /.git directory containing the configuration file is mistakenly exposed on a website, threat actors using scanners could easily locate and read it.

If these stolen configuration files contain authentication tokens, they can be used to download associated source code, databases, and other confidential resources not intended for public access.
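For defenders, one way to see whether a local .git/config carries this kind of secret is to look for remote URLs with credentials embedded in them. The following Python sketch is illustrative only: the function name, the regular expression, and the sample host and account names are assumptions made for this example, not tooling described by Sysdig.

```python
import re
from urllib.parse import urlsplit

# Matches "url = <value>" lines as they appear under [remote "..."] sections.
URL_LINE = re.compile(r"^[ \t]*url[ \t]*=[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def remotes_with_credentials(config_text):
    """Return (host, username) pairs for remote URLs that embed a secret.

    An HTTP(S) remote is flagged when it carries a password, or a bare
    username (tokens are often placed in the username slot).
    """
    findings = []
    for match in URL_LINE.finditer(config_text):
        try:
            parts = urlsplit(match.group(1))
        except ValueError:
            continue  # malformed URL, nothing to inspect
        if parts.scheme in ("http", "https") and (parts.password or parts.username):
            findings.append((parts.hostname, parts.username))
    return findings


# Hypothetical config: one HTTPS remote with an embedded token, one SSH remote.
SAMPLE_CONFIG = (
    '[remote "origin"]\n'
    "\turl = https://deploy-bot:EXAMPLE_TOKEN@github.com/example-org/example-repo.git\n"
    '[remote "backup"]\n'
    "\turl = git@gitlab.com:example-org/example-repo.git\n"
)
print(remotes_with_credentials(SAMPLE_CONFIG))
```

Only the HTTPS remote is reported; the SSH-style remote carries no secret in the URL itself, so it is skipped.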

The threat actors behind EmeraldWhale use open-source tools like 'httpx' and 'Masscan' to scan websites hosted on an estimated 500 million IP addresses divided into 12,000 IP ranges.

Sysdig says the hackers even created files listing every possible IPv4 address, spanning over 4.2 billion entries, to streamline future scans.
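The "over 4.2 billion" figure is simply the size of the 32-bit IPv4 address space. A quick check in Python, using only the standard library (the per-range average is plain arithmetic on the estimates quoted above, not a number reported by Sysdig):

```python
import ipaddress

# 0.0.0.0/0 covers every IPv4 address: 2**32 of them.
total_ipv4 = ipaddress.ip_network("0.0.0.0/0").num_addresses
print(f"{total_ipv4:,}")  # 4,294,967,296

# Rough average size of the scanned ranges, from the article's estimates.
avg_per_range = round(500_000_000 / 12_000)
print(f"{avg_per_range:,}")  # 41,667
```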

The scans simply check whether the /.git/config file and, in Laravel applications, environment files (.env) are exposed. The .env files may also contain API keys and cloud credentials.
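Site owners can run the same kind of check against their own domains to confirm that neither file is reachable. Below is a minimal self-audit sketch in Python. The helper names, the "[core]" marker used to recognize a Git config, and the KEY=value pattern used to recognize a .env file are heuristics assumed for this example; only scan hosts you are authorized to test.

```python
import re
from urllib.error import URLError
from urllib.request import urlopen

SENSITIVE_PATHS = ("/.git/config", "/.env")


def fetch(url, timeout=5):
    """Return the first 64 KiB of the response body, or None on any error."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read(65536).decode("utf-8", errors="replace")
    except (URLError, OSError, ValueError):
        return None


def looks_exposed(path, body):
    """Heuristic: does the body look like the real file rather than an error page?"""
    if body is None:
        return False
    if path.endswith(".git/config"):
        return "[core]" in body
    # .env files consist of KEY=value lines such as APP_KEY=... in Laravel.
    return re.search(r"^[A-Z][A-Z0-9_]*=", body, re.MULTILINE) is not None


def audit_site(base_url, fetcher=fetch):
    """Return the sensitive paths that appear to be publicly readable."""
    base = base_url.rstrip("/")
    return [p for p in SENSITIVE_PATHS if looks_exposed(p, fetcher(base + p))]
```

Calling audit_site("https://your-own-site.example") returns an empty list when both paths are blocked. The fetcher parameter exists so the logic can be exercised without network access.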

Once an exposure is identified, the tokens are verified using 'curl' commands to various APIs and, if valid, are used to download private repositories.

These downloaded repositories are scanned again for authentication secrets for AWS, cloud platforms, and email service providers. The threat actors used the exposed authentication tokens for email platforms to conduct spam and phishing campaigns.
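The same kind of secret scanning can be turned around and used defensively on a team's own code before it is pushed. The sketch below looks only for AWS access key IDs, which start with a known prefix such as AKIA or ASIA followed by 16 uppercase alphanumeric characters. The function name is hypothetical, and dedicated secret scanners cover far more credential types than this single pattern.

```python
import re

# AWS access key IDs: "AKIA" (long-term) or "ASIA" (temporary) + 16 characters.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text):
    """Return (line_number, key_id) for every AWS access key ID found in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for key_id in AWS_ACCESS_KEY_ID.findall(line):
            hits.append((line_no, key_id))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = "[default]\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = us-east-1\n"
print(find_aws_key_ids(sample))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```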

Sysdig observed the use of two commodity toolsets to streamline this large-scale process, namely MZR V2 (Mizaru) and Seyzo-v2.

The EmeraldWhale attack chain
Source: Sysdig

For Laravel, the Multigrabber v8.5 tool was used to check domains for .env files, steal them, and then classify the information based on its usability potential.

Laravel attack overview
Source: Sysdig

Evaluating the stolen data

Sysdig examined the exposed S3 bucket and found one terabyte's worth of secrets in it, including stolen credentials and logging data.

Based on the collected data, EmeraldWhale stole 15,000 cloud credentials from 67,000 URLs that exposed configuration files.

Of the exposed URLs, 28,000 corresponded to Git repositories, 6,000 were GitHub tokens, and a notable 2,000 were validated as active credentials.

Besides major platforms like GitHub, GitLab, and BitBucket, the hackers also targeted 3,500 smaller repositories belonging to small teams and individual developers.

Stolen credentials by platform
Source: Sysdig

Sysdig says that mere lists of URLs pointing to exposed Git configuration files are sold on Telegram for about $100, but those who exfiltrate and validate the secrets have far more significant monetization opportunities.

The researchers note that this campaign isn't particularly sophisticated and relies on commodity tools and automation, yet it still managed to steal thousands of secrets that can potentially lead to catastrophic data breaches.

Software developers can mitigate the risk by using dedicated secret management tools to store their secrets and using environment variables to configure sensitive settings at runtime instead of hardcoding them in Git configuration files.
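As a rough illustration of the environment-variable approach, an application can read its secrets at startup and fail fast when one is missing, so nothing sensitive needs to live in the repository or its configuration. The helper and the variable name EXAMPLE_API_TOKEN below are hypothetical.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Example usage (hypothetical variable name); the value is supplied by the
# deployment environment or a secret manager, never committed to Git:
# api_token = require_env("EXAMPLE_API_TOKEN")
```

Raising an error instead of falling back to a default keeps a misconfigured deployment from silently running with a placeholder credential.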