Why NLP, ChatGPT Require Guardrails and Content Moderation to Combat Bias

By Toni Witt | March 10, 2023 (Updated April 10, 2024) | 7 min read

With Microsoft’s recent announcement of GPT-3 integration into its Bing search engine, and Google’s subsequent frenzy to build its own chat-search tool, it’s clear that natural language processing (NLP) is here to stay. These developments make it much easier to imagine a future where digital information is managed through NLP tools like ChatGPT.

As we move into that future, however, we need to ensure these powerful tools are being used properly. I’ve written before about the blatant errors that ChatGPT can make, but the problem goes much deeper. Just as YouTube and Facebook consistently weed out violent, biased, false, sexual, or illegal content, we need to make sure our artificial intelligence (AI) chat tools aren’t filling the world’s minds with bad ideas. This can be as obvious as ChatGPT using expletives when used in grade school classrooms or as subtle as hidden biases within systems used to grant loans to applicants.

The Importance of Guardrails and Content Moderation

Most of the time, when you ask ChatGPT to provide information about violence, profanity, criminal behavior, race, or other unsavory topics, you’ll be met with a cookie-cutter refusal citing OpenAI’s content policy. Key phrase: most of the time.

There are countless workarounds to this thin layer of protection. For example, you could ask ChatGPT to pretend to be an actor in a play about [insert unsavory topic], and it will speak freely. This is because it was trained on ebooks, articles, social media posts, and more that were pulled from the Internet, including from the deepest and darkest corners of the web. Unlike content moderation on YouTube or Facebook, however, it’s extremely difficult for NLP developers to moderate what ChatGPT actually outputs. All it does is generate and connect sequences of words that are often found connected in the text it was trained on, with no moral compass or understanding of its output.
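The point that a language model “generates and connects sequences of words that are often found connected in the text it was trained on” can be made concrete with a toy sketch. The bigram model below is a deliberately tiny stand-in for a real LLM, but it illustrates the same property: it reproduces statistical patterns from its training text and has no built-in judgment about whether its output is appropriate. Any filtering has to be bolted on afterward.

```python
import random
from collections import defaultdict

# Toy bigram "language model": records which words follow which in the
# training text, then generates by sampling continuations. Like a real
# LLM (at a vastly smaller scale), it has no moral compass -- it only
# reproduces word-sequence patterns from whatever it was trained on.
def train_bigrams(corpus):
    follows = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)
    return follows

def generate(follows, start, n_steps, seed=0):
    rng = random.Random(seed)
    out = [start]
    for _ in range(n_steps):
        candidates = follows.get(out[-1])
        if not candidates:  # dead end: no observed continuation
            break
        out.append(rng.choice(candidates))
    return " ".join(out)

corpus = "the model repeats what the model has seen in the training text"
model = train_bigrams(corpus)
print(generate(model, "the", 5))
```

Every word the generator emits comes from its training corpus; if the corpus includes “the deepest and darkest corners of the web,” so, eventually, will the output.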


Unlike traditional software that operates through rules defined by the developer, deep learning models that are the basis for NLP define their own rules. AI engineers can carefully select the training data or modify the model’s architecture, but they can’t predict with 100% accuracy what the output of the model will be. By their nature, AI systems can also handle situations that the developers may never have anticipated. All of this leads to occasional outliers in how the system responds.

Gary Marcus, the author of Rebooting AI: Building Artificial Intelligence We Can Trust, has shared a few grim examples in which ChatGPT users found workarounds that led the chat tool to produce responses that could inspire acts of violence. Another user generated a list of realistic-sounding (yet scientifically false) claims about the Covid-19 vaccine, complete with made-up citations and authors.

If knowledge workers are going to access information, and have their thinking shaped, through NLP tools, we clearly have some work to do. If these tools replace traditional search engines, they will hold the ability to impact our worldview: they can produce misinformation and tilt public perception.

The Impacts of Bias on NLP Systems

Besides the obvious threat of poor content moderation, NLP tools face additional, subtler issues. As AI and NLP are embedded deeper into the functioning of our society, these tools increasingly nudge outcomes in particular directions: they help determine which demographics receive mortgages and shape how children learn in school. If companies using AI don’t take bias mitigation and responsible development seriously, that raises real points of concern, exacerbated by the fact that there is limited to no AI regulation.

From automated hiring systems to customer service bots, there are many ways that natural language processing can help your organization cut costs, improve the customer experience, and drive revenue. However, many experiments in automated hiring have been shot down due to the obvious bias of the models. Researchers at Berkeley discovered that models like GPT-3 tended to associate men with higher-paying jobs than women. When asked what gender a doctor is, the NLP system would reply “male” more often than “female.” While this may simply reflect the actual ratio of male to female doctors, it becomes a problem if hospitals start using NLP to automatically read and score resumes.
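A probe like the one the Berkeley researchers ran can be sketched in a few lines: prompt the model repeatedly and count the gendered pronouns in its completions. The `model_complete` function below is a canned stub standing in for a real NLP model, and its example completions are invented for illustration; in an actual audit you would query the model under test.

```python
from collections import Counter

# Sketch of an occupation-bias probe in the spirit of the GPT-3
# findings described above. `model_complete` is a hypothetical stub
# returning canned completions; a real audit would sample many
# completions from the actual model being evaluated.
def model_complete(prompt):
    canned = {
        "The doctor said that": ["he", "he", "she"],
        "The nurse said that": ["she", "she", "she"],
    }
    return canned.get(prompt, ["they"])

def gender_counts(prompt):
    # Count gendered pronouns across the sampled completions.
    pronouns = Counter(model_complete(prompt))
    return pronouns["he"], pronouns["she"]

for occupation in ("doctor", "nurse"):
    he, she = gender_counts(f"The {occupation} said that")
    print(f"{occupation}: he={he} she={she}")
```

A skewed he/she ratio for an occupation is exactly the kind of signal that should block a model from being wired into resume screening without further mitigation.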

DataRobot conducted a study and discovered that one in three organizations using AI struggled due to bias within their models regarding race, gender, sexual orientation, and religion. Of those organizations, 62% lost revenue as a result, 61% lost customers, 43% lost employees, and 35% racked up legal fees due to legal action. Notably, existing safeguards are only somewhat effective: 77% of the organizations DataRobot surveyed already had bias-detection algorithms in place when these problems occurred.

How to Mitigate Bias

Unfortunately, there is no silver bullet (yet) when it comes to mitigating bias. There are good practices, though. The best models come from high-quality data sets that include all populations and have accurate labeling. Here are a few things you can do to mitigate bias in your AI systems:

  • Develop a ‘testing model’ or work with one of the Hyperautomation Top 10 companies to benchmark your AI system to check for bias
  • Actively seek out external audits
  • Open source your data sets to boost accountability
  • Consider using crowdsourced or synthetic data
  • As painful as it can be, running development work through an ethics review board can save you a lot of money and pain later on

As a side note, studies have shown that biased models aren’t just caused by low-quality training data, but also the diversity of the development team. Having a development or ethics review board with diverse backgrounds decreases the chance that the human biases of your team are translated into the model.

In fact, this brings home a point often made in Acceleration Economy: automation success results from humans working in combination with artificial intelligence. Both have flaws and biases that can be mitigated by the other. For example, Amazon, whose AI recruiting tool turned out to favor male candidates over female ones, decided to use AI to detect flaws in its human-driven recruiting approach instead.

Finally, make your practices around developing models transparent. Ironically, OpenAI has not been proactive in this regard, as a TIME article recently exposed. To build the content moderation model keeping GPT-3’s outputs in check, it sent thousands of snippets of text pulled from the darkest corners of the internet to an outsourcing firm in Kenya. The workers at the firm, who manually labeled the text, were paid between $1.32 and $2 per hour for 9-hour shifts reading dark material often involving extreme violence, bestiality, and trauma. OpenAI contracted the same firm to review disturbing images to improve the output of its DALL-E 2 image generator, work that left many of the workers mentally scarred.

I’m also curious to see if NLP models like ChatGPT continue down proprietary paths or become open source. This is similar to the early 2000s when there was tension between company-run and open-source software — and AI is facing a similar turning point. To me, transparent community-run oversight of these models seems like a healthier approach. To shareholders, as always, the opposite is true.

Ultimately, NLP lets you boil down large bodies of information, compile them, and summarize them, just like a teacher. That teacher might not be the best one, however, and any biases or errors can potentially influence a lot of people. Tackling bias must be the number-one priority of anyone entering the world of natural language processing, whether you’re a startup or you’re Google.



About the Author
Toni Witt

Co-founder, Sweet
Cloud Wars analyst

Areas of Expertise
  • AI/ML
  • Entrepreneurship
  • Partners Ecosystem

In addition to keeping up with the latest in AI and corporate innovation, Toni Witt co-founded Sweet, a startup redefining hospitality through zero-fee payments infrastructure. He also runs a nonprofit community of young entrepreneurs, influencers, and change-makers called GENESIS. Toni brings his analyst perspective to Cloud Wars on AI, machine learning, and other related innovative technologies.

