Intern allegedly messed with ByteDance's LLM training cluster

No losses caused – except the intern's job – says TikTok parent

The Register

ByteDance has terminated an intern for "maliciously interfering" with a large language model training project.

In a statement posted on its news aggregation platform Toutiao, the parent company of TikTok addressed rumors that had been circulating online.

"Recently, some media said that 'ByteDance's large model training was attacked by interns', and after internal verification by the company, there were indeed serious violations of discipline … in the commercial technical team, and the intern has been dismissed," wrote ByteDance, as auto-translated from Chinese.

The Chinese giant denied claims that the intern disrupted training on a cluster of over 8,000 H100 GPUs and caused tens of millions of dollars in losses.

According to ByteDance, the incident "did not affect the official project and online business of commercialization, nor did it involve other businesses such as ByteDance's large model."

The intern was terminated in August, and the incident was reported to their university.

According to information provided on a GitHub page detailing the incident, the male intern was working towards a master's degree in computer science at Peking University.

The page – which The Reg will not link to as it names the accused man – claims that the student carried out his attack for over two months, "causing great harm to nearly 30 employees at all levels of the company and making your colleagues' work in vain for nearly a quarter."

Posts on the GitHub page allege that the intern:

  • Modified PyTorch source code, including random seeds and optimizers;
  • Randomly stopped multi-machine processes;
  • Opened a login backdoor, automatically launching attacks and randomly stopping processes;
  • Participated in daily cluster troubleshooting meetings, continuously modifying the attack code based on colleagues' troubleshooting ideas;
  • Modified colleagues' model weights, making experimental results unreproducible.

ByteDance's Doubao is the most used GenAI app in China. According to one tracker, in September it had 47 million monthly active users (MAUs), compared to Baidu's Ernie Bot with just 12 million users. ®