Appen combats skewed AI data to ensure end-users have the same experience – TechRepublic

Posted: May 4, 2021 at 8:10 pm

The company launched diverse training data sets for natural language processing initiatives.

Image: iStock/metamorworks

Training data provider Appen recently launched diverse training data sets for natural language processing initiatives, in an effort to ensure end users receive the same experience regardless of language variety, dialect, ethnolect, accent, race or gender.

Appen said it recognized that AI projects based on biased or incomplete data don't work for everyone. The company is enabling organizations to launch, update and operate unbiased AI models through a variety of projects and partnerships focused on the diversity of languages and dialects, it announced on its website.

In March, a study published in the Proceedings of the National Academy of Sciences found that popular automated speech-recognition systems used for virtual assistants, closed captioning, hands-free computing and more "exhibit significant racial disparities in performance."


The report concludes "that more diverse training datasets are needed to reduce these performance differences and ensure speech recognition technology is inclusive. Language interpretation and natural language processing systems suffer from the same challenge and require the same solution."
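The disparities the study describes are typically quantified as a gap in word error rate (WER) across speaker groups. As a rough illustration only, and not code from the study or from Appen, the following Python sketch compares average WER per speaker group, assuming you already have reference transcripts and ASR hypotheses tagged with a hypothetical group label such as dialect or accent:

from collections import defaultdict

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

def wer_by_group(samples):
    """samples: iterable of (group, reference, hypothesis) tuples.
    Returns the average WER per speaker group."""
    per_group = defaultdict(list)
    for group, reference, hypothesis in samples:
        per_group[group].append(word_error_rate(reference, hypothesis))
    return {group: sum(rates) / len(rates) for group, rates in per_group.items()}

# Hypothetical toy data: group label, reference transcript, ASR output.
samples = [
    ("group_a", "turn on the kitchen lights", "turn on the kitchen lights"),
    ("group_a", "set a timer for ten minutes", "set a timer for ten minutes"),
    ("group_b", "turn on the kitchen lights", "turn on the chicken lights"),
    ("group_b", "set a timer for ten minutes", "set the time for ten minutes"),
]
print(wer_by_group(samples))  # a large gap between groups signals biased performance

A persistent gap of this kind is the symptom the PNAS authors point to; broadening the training data across groups is the remedy they, and Appen, propose.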

"The quality and diversity of training data directly impacts the performance and bias present in AI models," said Mark Brayan, CEO at Appen, in a press release. "As a data partner, we can supply complete training data for many use cases to ensure AI models work for everyone. It's critical that we engage a diverse group of individuals to produce, label, and validate the data to ensure the model being trained is not only equitable, but also built responsibly."

With the goal of creating AI for everyone, Appen has developed a variety of projects and partnerships focused on the diversity of languages and dialects.

As an example, the Appen website explained:

Even when unintentional, biased AI data can set off a wave of results that are not only of little value to research, but can actually be detrimental.

"Biased AI data leads to projects that can fail to deliver the expected business results and harm individuals they are supposed to benefit," said Dr. Judith Bishop, senior director of AI specialists at Appen. "The scale and complexity of AI projects makes it impossible for most companies to acquire sufficient unbiased high-quality data without partnering with an AI data expert." She added, "Developing the most diverse and expert crowd of data annotators provides the industry with a clearly differentiated resource for building fair and ethical AI projects."


