DeepL, a translation company best known for its text tools, released a voice-to-voice translation suite today. This suite covers use cases like meetings, mobile and web conversations, and group conversations for frontline workers through custom apps.
The Natural Progression to Voice Translation
DeepL's CEO, Jarek Kutylowski, told TechCrunch:
"Having dedicated many years to text translation, venturing into voice was a logical progression for us. We have made significant strides in text and document translation. However, we felt there was a lack of a robust solution for real-time voice translation."
Challenges in Real-Time Translation
Kutylowski said that the challenges in creating a real-time translation product center on striking a balance between reducing latency — the delay between someone speaking and the translated audio playing back — and maintaining accurate results.
Platform Integration and Early Access
DeepL is releasing add-ons for platforms like Zoom and Microsoft Teams, where listeners can either hear a real-time translation while others speak in their native languages or follow real-time translated text on screen. This program is currently in early access, and the company is inviting organizations to join a waitlist. The company also has a product for mobile and web-based conversations, which can take place in person or remotely.
Group Conversations and Custom Vocabulary
DeepL also supports group conversations in settings like training sessions or workshops, where participants can join by scanning a QR code. DeepL said that its voice-to-voice tech can also learn and adapt to custom vocabulary, such as industry-specific terms and company and personal names.
The Future of Customer Service
Kutylowski said that AI is reimagining what customer service will look like in the coming years. He noted that a translation layer helps companies provide support in languages where qualified staff are scarce and expensive to hire.
DeepL is working to transform itself from a specialized translator into an agentic AI platform for enterprises, emphasizing automation, custom LLMs, and real-time voice, powered by advanced NVIDIA GPU technology.
As of 2026, DeepL is expanding its language portfolio (now more than 30 languages) and moving beyond text into voice-to-voice translation and deeper API integrations, aiming for "end-to-end language intelligence."