DeepL, a translation company best known for its text tools, today released a voice-to-voice translation suite that includes custom apps for use cases such as meetings, mobile and web conversations, and group conversations for frontline workers. The company is also releasing an API that lets outside developers and businesses build on top of DeepL’s tech for custom use cases, such as call centers.
“After spending so many years in text translation, voice was a natural step for us,” DeepL CEO Jarek Kutylowski told TechCrunch in an interview. “We’ve come a long way when it comes to text translation and document translation. But we thought there was no perfect product for real-time voice translation.”
Challenges in building a real-time translation product center on striking a balance between reducing latency — the delay between someone speaking and the translated audio playing — and maintaining accurate results, Kutylowski said.
DeepL is releasing add-ons for platforms like Zoom and Microsoft Teams, where participants can speak in their own languages while listeners either hear real-time translations or follow the translated text on screen. The program is currently in early access, and the company is inviting organizations to join the waiting list. The company also has a product for mobile and web-based conversations, whether in person or remote.
DeepL also supports group conversations in settings such as training sessions or workshops, with participants joining via a QR code.
DeepL said its voice-to-voice tech can learn custom vocabulary, including industry-specific terms and company and personal names.
Kutylowski said AI is reimagining what customer service will look like in the coming years. He noted that the translation layer helps companies provide support in languages where qualified staff are scarce and expensive to hire.
The company said it controls the entire stack from voice to voice. However, its current system converts speech to text, translates that text, then converts the result back to speech. DeepL believes that since it has worked on text translation for years, it has an edge in translation quality. Going forward, the company wants to develop an end-to-end voice translation model that skips the text step entirely.
DeepL faces competition from several well-funded startups operating in adjacent corners of the space. Sanas, which last year raised $65 million from Quadrille Capital and Teleperformance, uses AI to modify a speaker’s accent in real time, a tool aimed primarily at call center agents.
Dubai-based Camb.AI focuses on speech synthesis and translation for media and entertainment companies and works with Amazon Web Services, helping to dub and localize video content at scale.
Palabra, backed by Reddit co-founder Alexis Ohanian’s firm Seven Seven Six, is building a real-time speech translation engine designed to preserve both meaning and the speaker’s original voice, putting it in direct competition with what DeepL is building now.




