Initial project scope: 10 languages and dialects
Project completed in eight weeks
Project since expanded into additional languages
Dubber is the world’s leading provider of cloud-based call recording and voice AI. Businesses and government organizations around the world use Dubber’s unified call recording and voice intelligence solution to record business calls at scale and unlock insights in calls, videos and messages.
Dubber captures conversations in the cloud and uses artificial intelligence to transcribe the conversations as well as provide sentiment, behavioral analytics and alerts. Thousands of organizations use Dubber to boost employee productivity, enhance customer relationships, improve their compliance and more.
In addition, over 170 mobile networks and service providers worldwide partner with Dubber to provide scalable conversational recording and intelligence to their customers. As international expansion is a key strategic goal for Dubber, ensuring that its products work effectively in multiple languages is a top priority.
Dubber was working on a new addition to its product suite, Notes by Dubber, which was announced at Mobile World Congress in early 2022. Notes by Dubber provides real-time transcription to capture phone or video-conference meetings and create summaries that allow for the sharing of meeting highlights.
Using AI, Notes by Dubber ties action items to specific points in the conversation. Natural Language Processing (NLP) determines the various speakers in a meeting to create a record of what was said by whom. Integrations with tools such as Asana, Jira and Slack allow actions, notes and recordings to be sent to colleagues, providing enhanced collaboration for organizations of all sizes.
Given the company’s goal to expand its global footprint and support a variety of languages beyond Australian, UK and US English, it needed language datasets to benchmark the product’s transcription accuracy in several target languages. It was important for the company to use highly accurate and bias-free benchmarking data, collected and annotated by a reliable partner that could ensure on-time delivery given the upcoming launch at Mobile World Congress.
Dubber chose to work with LXT to collect and transcribe data in 10 languages and dialects. As quality was critical to Dubber, developing the initial project guidelines was an important first step. Drawing on its multiple years of experience in speech data collection and transcription, LXT worked hand in hand with the Dubber team to develop thorough speech collection and transcription guidelines that created a strong foundation for the project. Dubber set clear quality expectations up front to ensure that the data it received would meet its requirements.
LXT proceeded to screen project participants for language fluency and business experience, as the data collection task simulated a meeting environment that required participants to have a strong business background. Meeting recordings are inherently complex, as business meetings often contain interruptions, crosstalk, filler words and noise. The meeting transcriptions needed to account for all of these nuances, which added to the complexity of the project.
Sample datasets were provided to the Dubber team for each language and dialect to establish data quality before the full datasets were completed. The LXT team responded to feedback quickly and delivered the high-quality data that Dubber needed to complete its benchmarking across all languages and dialects in less than six weeks. The total project time was eight weeks, which included development of the project guidelines.
The Dubber team was able to launch Notes by Dubber in multiple languages with confidence, knowing that the data they used for benchmarking was high in quality. The team had a very successful launch at Mobile World Congress and is forging agreements with some of the world’s largest telecommunications companies to distribute Notes by Dubber. Dubber highly values LXT as a partner to support its expansion efforts and has since requested support for additional languages.