It’s been a busy year at LXT, with exciting customer engagements across many different regions, languages and applications, important additions to our executive team and, lastly, a new brand and website for our company.
We’re looking forward to continued growth in 2022, but as the year comes to a close, we’d also like to look back at some of the interesting developments in both artificial intelligence (AI) generally, and in the training data space more specifically.
The exponential growth in AI investment continues
According to CB Insights, AI startups had another record-breaking quarter in Q3 2021, bringing year-to-date funding to $50 billion, which already exceeds the total for all of 2020 by 55%. This level of investment has also been borne out in the public markets, with 3 SPACs and 8 IPOs taking place in the third quarter alone, along with PayPal’s $2.7 billion acquisition of Paidy.
Demand has never been greater. To scale globally and take advantage of this opportunity, organizations need to train AI technology on data that captures the unique cultural and linguistic nuances of every region, culture and language.
But AI still has a diversity challenge, and a long way to go
In the 2021 AI Index Report, Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) reported that AI still has a diversity challenge. As of 2019, 45% of new US-based AI PhD graduates were white, compared with just 2.4% who were African American and 3.2% who were Hispanic. Meanwhile, female graduates of AI PhD programs in North America have accounted for less than 18% of all PhD graduates on average, according to an annual survey from the Computing Research Association (CRA).
Statistics for 2020 and 2021 have yet to be released, but we hope for progress and will keep working toward it. As Annie Jean-Baptiste, head of product inclusion at Google, said at CES 2021 on a panel about gender and racial bias, “inclusive inputs lead to inclusive outputs.” That is well said, and LXT is committed to helping organizations that are building inclusive AI by providing access to a diverse crowd that generates AI training data from all corners of the world.
Massive language models get even bigger, and more capable
This fall, Microsoft and Nvidia teamed up to train what they claim is the largest and most capable AI-powered language model to date: Megatron-Turing Natural Language Generation (MT-NLG). This latest model contains 530 billion parameters and achieves “unmatched” accuracy in a wide array of language tasks, including reading comprehension, commonsense reasoning and natural language inferences.
Nvidia’s senior director of product management and marketing for accelerated computing, Paresh Kharya, and group program manager for the Microsoft Turing team, Ali Alvi, wrote in a blog post that it’s “a big step forward in the journey towards unlocking the full promise of AI in natural language. MT-NLG will shape tomorrow’s products and motivate the community to push the boundaries of natural language processing (NLP) even further. The journey is long and far from complete, but we are excited by what is possible and what lies ahead.”
The continued evolution of representation learning and speech processing
In a related effort, LXT was chosen as the exclusive data collection and annotation partner for the SUPERB program, working alongside leading researchers from National Taiwan University, MIT, Carnegie Mellon University, Johns Hopkins University, and Facebook AI. SUPERB’s stated goal is to fuel research in representation learning and general speech processing.
According to Hung-yi Lee, associate professor in the Department of Computer Science & Information Engineering at National Taiwan University, “SUPERB is a unique effort to create a benchmark for models across a wide variety of tasks and will benefit the broader speech industry by enabling the detection of emotion, intent, content and other semantic information. High-quality data is key to its success, and LXT was chosen as the exclusive partner based on its flexibility, reliability, and collaborative culture.”
As we look to the new year ahead, we are excited to continue working with AI pioneers, to expand our industry partnerships and to provide even more flexible work opportunities. Best wishes to all of our colleagues, customers and crowd for a healthy and happy 2022!