Why send a message when you can get your Zoom digital video clone to read the script?

We're sure colleagues will find your lookalike, soundalike avatar's missive very warm and human

The Register

Zoom Video Communications intends to offer office workers the ability to communicate with colleagues using an AI lookalike that speaks from a script using a lip-synced, cloned voice.

"Imagine being able to create video clips for demos or training sessions using just a text script, featuring a digital avatar that looks and sounds like you – no retakes needed," mused chief product officer Smita Hashim in a blog post on Wednesday.

Whether coworkers will appreciate being addressed in this way is another matter.

In conjunction with the Zoomtopia 2024 conference, the comms biz talked up its Zoom AI Companion 2.0, a generative AI assistant (formerly Zoom IQ) that works across the Zoom platform.

Available in the coming weeks for paid customers of the Zoom Workplace productivity suite at no additional cost, the latest iteration of the company's generative AI service takes the form of a side panel input menu in the Zoom Workplace app.

Instead of clicking on icons and manipulating apps, as people have done for the past four decades, users of AI Companion can direct the underlying chatbot model in natural language to carry out tasks for them.

An example cited during the conference showed an employee named "Craig" typing: "Schedule a meeting to discuss the global strategy plan and rollout at 10:00 with all channel members." Zoom's AI Companion responded by creating a meeting invite button for the Team Chat channel and suggesting a meeting agenda.

AI Companion can also do things like summarize meetings, which calls into question the need to attend at all.

Zoom Workplace has various other functions besides Team Chat, including Calendar, Docs, Tasks, Phone, and Clips.

Launched last year, Zoom Clips provides a way to record short video clips for asynchronous communication. Instead of emailing coworkers, Zoom Workplace users can subject colleagues to prepared videos – for those times when making a personal effort to communicate is just too much.

At some point next year, users will have the option to purchase, for an additional $12 per user per month, an add-on to AI Companion that expands the apps accessible to Zoom's AI beyond the calendar and email services of Microsoft and Google to include Atlassian (Jira & Confluence), Glean, Workday, Zendesk, ServiceNow, Box, Asana, and HubSpot, among others.

As part of this add-on, users of Zoom Clips will be able to create and send video clips containing a photorealistic digital simulacrum that speaks in a digital, soundalike voice from a text script.

"The ability to use custom avatars for Zoom Clips is a new feature that will launch in the first half of 2025 as part of the custom AI Companion add-on," a Zoom spokesperson told The Register. "Custom avatars for Zoom Clips will help people communicate asynchronously with their colleagues in a faster, more productive way, saving them precious time and effort recording clips by using a personalized AI-generated avatar to create clips with a transcript. Users will record a 'seed video' to create their personal avatar and the avatar video clip will sync with the audio generated from the transcript."

Microsoft Azure is thinking along similar lines with an avatar generator that provides photorealistic avatars that speak in synthesized voices from text scripts.

This capability is already showing up in commercial products, as seen in a video produced by Photo AI of an AI-generated woman posing as a CNN reporter – which, if distributed with the intent to deceive, would qualify as a deepfake.

Despite ongoing efforts to limit the potential harm of deepfakes, the US First Amendment guarantee of free speech poses a problem for overly broad prohibitions on AI-created content. In California earlier this month, a judge temporarily blocked state law AB 2839, which prohibits the distribution of deceptive election-related deepfakes.

Zoom, for its part, insists its avatar service will be defended against abuse and that mechanisms like watermarking will prevent videos from being useful for deception – and thus from being classified as deepfakes.

"We've built in numerous safeguards to help protect against misuse and will continue to review and add safeguards in the future," the company's spokesperson said. "We employ advanced authentication and watermarking technology to make it obvious when a clip is generated with an avatar, and strict usage policies to help ensure the integrity of avatar-generated content and prevent misuse or deepfake creation." ®