“The inclusion of humans in all parts of the data lifecycle is crucial – from annotation, collection, and evaluation to checking that the data is not harmful, fake, or dishonest and does not perpetuate stereotypes.”

LXT is a diverse group of industry leaders and experts, committed to empowering our teams and delivering value to our customers. To achieve that, we are asking our LXT AI Data Experts about developments and trends in artificial intelligence. To kick us off, we chatted with Jessica Fernando, LXT’s Solutions Manager, about responsible AI, data bias, and some best practices to ensure responsible and ethical AI.

What recent trends in AI are you most excited about?

The artificial intelligence landscape is dynamic, and there are always new breakthroughs and developments to watch and follow as they evolve. One of the things that excites me most is the potential for AI, and more specifically speech and language technology, to be used by speakers of minority and low-resource languages. We do a lot of work in this space at LXT in collaboration with our clients. Our commitment to providing high-quality data is key to ensuring responsible AI practices, so that speakers of a wide range of languages and dialects can access new technologies.

Can you elaborate on the meaning of ‘responsible AI’?

As AI becomes more integrated into society, it is critical to ensure that AI development and implementation are respectful, consider human rights and diversity, and ensure privacy and fairness. Responsible AI practices take these factors into account to mitigate potential bias and ensure that AI remains human-centric. It is imperative that the human factor remains in generative AI. The inclusion of humans in all parts of the data lifecycle is crucial – from annotation, collection, and evaluation to checking that the data is not harmful, fake, or dishonest and does not perpetuate stereotypes. As sophisticated as AI is now, we aren’t at the stage where we can leave generative AI to run without any human intervention.

What are some best practices to ensure responsible and ethical AI?

There’s quite a bit we need to consider to make sure AI is responsible and ethical, and a lot of it comes down to the data and how it’s collected, annotated, and implemented. 

Firstly, ensuring that AI systems don’t discriminate or exhibit bias against individuals or groups based on characteristics such as race, gender, ethnicity, or socioeconomic status is critical. Systems should be made as transparent as possible, with clear insight into their decision-making processes, the data they use, and how they operate. Knowing as much as possible about who annotated the data is also really important, particularly the annotators’ demographic information and native languages. Providing metadata like this along with the data for future reference should always be a consideration, so that any potential bias annotators may bring to the data is accounted for and documented.
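As a minimal sketch of what that might look like in practice, here is one illustrative way to attach annotator metadata to each labeled record. The schema and field names are hypothetical, not an LXT or industry standard:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class AnnotatorProfile:
    """Annotator context shipped with the data (all field names illustrative)."""
    annotator_id: str        # pseudonymous ID, never a real name
    native_languages: list   # e.g. ["es", "en"]
    region: str              # kept coarse to limit re-identification
    age_band: str            # bucketed, e.g. "25-34"

@dataclass
class AnnotationRecord:
    item_id: str
    label: str
    annotator: AnnotatorProfile

record = AnnotationRecord(
    item_id="utt-00042",
    label="positive",
    annotator=AnnotatorProfile("ann-117", ["es", "en"], "LATAM", "25-34"),
)

# Exporting the metadata alongside the label lets later audits check for annotator bias.
print(json.dumps(asdict(record), indent=2))
```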

Secondly, it’s essential to ensure that the privacy and anonymity of the training data are maintained, minimizing the retention of personal data and obtaining informed consent from the individuals whose data is being used to train AI models. Data security should absolutely be at the forefront of all AI practices, and we have a responsibility to make sure sensitive data both in transit and at rest is completely secure.
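On the data-minimization side, here is a rough sketch of the idea, assuming a simple text pipeline. The regex patterns below are deliberately naive and are no substitute for proper PII detection:

```python
import hashlib
import re

# Illustrative patterns only; real PII detection needs much more than regex.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def pseudonymize(user_id: str, salt: str) -> str:
    """Replace a direct identifier with a salted one-way hash before retention."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:12]

def redact(text: str) -> str:
    """Strip obvious contact details from a transcript before it is stored."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(pseudonymize("jane.doe", salt="per-project-secret"))
print(redact("Reach me at +1 416 555 0199 or jane@example.com"))
```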

Aside from this, we have a responsibility to minimize the environmental impact of training and maintaining generative AI so that it remains a sustainable practice. The computational resources and energy needed to train and run large language models (LLMs) are significant, and that environmental cost is easy to overlook. Controlling the size and complexity of the training data needed for generative AI and aiming for a balanced dataset can help reduce the massive computing power these models require.
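To make that last point concrete, here is a small sketch of one way to balance a dataset by downsampling over-represented groups, which also caps its overall size. The group key and data are invented for illustration:

```python
import random
from collections import defaultdict

def balance(examples, key, per_group=None, seed=0):
    """Downsample so each group (e.g. dialect or label) is equally represented.

    `key` extracts an example's group; `per_group` caps each group's size and
    defaults to the smallest group's count so no group dominates.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for ex in examples:
        groups[key(ex)].append(ex)
    cap = per_group or min(len(g) for g in groups.values())
    sampled = [ex for g in groups.values() for ex in rng.sample(g, min(cap, len(g)))]
    rng.shuffle(sampled)
    return sampled

data = [{"dialect": "en-IN", "text": "..."}] * 900 + [{"dialect": "en-NG", "text": "..."}] * 100
subset = balance(data, key=lambda ex: ex["dialect"])
print(len(subset))  # 200 examples: 100 per dialect instead of a 900/100 split
```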

How can data help reduce bias in AI?

Data plays a pivotal role in addressing bias within AI systems by providing a foundation for robust and equitable model training. By diversifying datasets to include representative samples from various demographics, cultures, and backgrounds, AI developers can mitigate biases that are often present in algorithms. Using techniques such as data augmentation and adversarial training can help expose and correct biases within the data. Ongoing monitoring and analysis of AI systems’ outputs against real-world outcomes allow for continual refinement and adjustment, ensuring fairness and inclusivity. Ultimately, leveraging comprehensive and unbiased datasets empowers AI models to make more informed and equitable decisions, fostering trust and reliability in their applications across diverse contexts.
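As one concrete flavor of that monitoring, here is a minimal sketch of a demographic-parity check over model outputs. The data, groups, and acceptable gap are invented for illustration; real audits use richer fairness metrics:

```python
from collections import defaultdict

def positive_rate_by_group(predictions, groups):
    """Share of positive predictions per demographic group."""
    counts, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        counts[group] += 1
        positives[group] += pred
    return {g: positives[g] / counts[g] for g in counts}

# Toy audit: 1 = favorable outcome, grouped by a (hypothetical) protected attribute.
preds  = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = positive_rate_by_group(preds, groups)
gap = max(rates.values()) - min(rates.values())
print(rates, f"demographic parity gap: {gap:.2f}")
# A gap this large (0.50) would flag the model for review; the threshold is a policy choice.
```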

As we learn more about this technology and how to manage data for responsible and ethical AI, how do you stay in tune with the latest updates?

One way I like to keep up with the latest information, legislation, and government actions surrounding responsible AI is by attending local meetup groups in Toronto with like-minded people who are invested in the topic. I also follow people on LinkedIn focused on the intersection of AI and human rights, which helps keep me updated with the latest news and developments in the space. I definitely encourage those who are interested in responsible and ethical AI to seek out a meetup group in their local area and start reading insightful articles and blogs to stay up to date.

Learn more about Jess’s expertise here.