AI & Machine Learning Blog

Audio Data Collection for Agentic AI: What Has Changed for ASR in High-Resource Languages

Audio data collection for agentic AI focuses on capturing messy, context-rich speech data that fine-tunes existing ASR models rather than building new ones from scratch. In 2026, high-resource languages like English have achieved baseline parsability, meaning the traditional focus on demographic coverage (female vs male, old vs young, city vs country) has shifted. Instead, the emphasis is now on niche

Feb 09, 2026

Written by Tania Strahan

Featured posts

LXT Acquires clickworker to Deliver Industry-Leading AI Data Solutions

AI data collection methods & techniques: Four approaches for effective algorithm training

Generative AI: A brief overview of its history and impact

Explore more from LXT

Agentic AI Voice: Why High-Quality Transcription Data Is Essential

Agentic AI voice systems are AI-powered tools that listen, understand, and take autonomous action based on spoken input. Unlike basic voice assistants, agentic AI voice technology reasons through problems, makes decisions, and executes multi-step workflows, all triggered by natural speech.

Introducing The ROI of High-Quality AI Training Data 2025

Earlier this year, we released The Path to AI Maturity 2025, our annual executive survey that tracks the evolution of AI maturity and the rise of generative AI. Over the past four years, AI maturity has surged across U.S. enterprises. In 2025, 83% of organizations report traditional AI in production, and 16% have reached transformational adoption, where AI is embedded

LLM benchmarks in 2026: What they prove and what your business actually needs

Models that dominate leaderboards often underperform in production. Learn why benchmark saturation and data contamination undermine predictive power, and how to build evaluation programs that actually predict real-world success.

AI agent evaluation: comprehensive framework for measuring agent performance

AI agents are rapidly becoming central to enterprise operations, with 60% of organizations now deploying agents. However, despite widespread adoption, 39% of AI projects in both 2024 and 2025 continue to fall short of expectations. The difference between success and failure isn’t the technology – it’s systematic evaluation. Learn how enterprise leaders are using comprehensive frameworks to measure not just what their agents produce, but how they think, ensuring safer deployments and measurable ROI across performance, safety, and user experience.

Interspeech 2025 – Speech data at the center of fair and inclusive AI

The annual Interspeech 2025 conference in Rotterdam carried the theme “Fair and Inclusive Speech Science and Technology.” While the research covered everything from low-resource ASR to mental health detection, one idea kept resurfacing: progress in speech AI is bottlenecked by the data we collect, curate, and use to train models. Unlike in past years, when model architectures dominated the headlines, 2025

LXT Completes Integration of Clickworker to Deliver Single Platform for Industry-Leading AI Data Solutions

Managed, secure and crowd-based solutions power generative and agentic AI applications for top 10 global technology companies, the Fortune 500 and innovative startups