Large Language Models (LLMs) have quickly captured broad attention, joining the conversation at family dinners thanks to their ability to engage users with compelling, impactful results that evolve based on user input. LLMs’ power comes from their ability to distill information and make meaningful connections that lead to recommended content for users to consume.
One commonly discussed defect of LLMs is their tendency to produce ‘hallucinations’: asserting a relationship between two pieces of information that does not exist in the real world.
Hallucinations are unexpected outputs, often triggered by edge cases in data and relationships. Beyond random manifestations in the data, there is the risk of user input leading to hallucinations. This input can be intentional or unintentional and becomes a form of ‘pollution’ in the model: interactions by users that lead to adverse outputs from the LLM and its associated applications.
Pollution can come from a variety of sources, including incorrect or incomplete training data, user interaction in a low-resource language, and incorrect domain-specific terminology, and it can lead to unexpected behavior such as hallucinations and other incorrect results. To keep models from being polluted by these and other sources, your organization must first define what constitutes pollution by setting standards for model outcomes, language usage, answers and outputs, responses, and accuracy levels.
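One way to make those standards actionable is to codify them as a policy object the pipeline can check against. The sketch below is a minimal, illustrative example: the field names, thresholds, and violation checks are assumptions for demonstration, not a standard schema.

```python
# Minimal sketch of codifying "pollution" standards as a checkable policy.
# All field names and thresholds are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class PollutionPolicy:
    """Organization-defined standards for acceptable model outputs."""
    allowed_languages: set = field(default_factory=lambda: {"en"})
    banned_terms: set = field(default_factory=set)  # disallowed domain terminology
    max_response_words: int = 1024                  # cap on response length

    def violations(self, response_text: str, detected_language: str) -> list:
        """Return a list of policy violations for a single model response."""
        problems = []
        if detected_language not in self.allowed_languages:
            problems.append(f"language '{detected_language}' not allowed")
        lowered = response_text.lower()
        for term in self.banned_terms:
            if term.lower() in lowered:
                problems.append(f"banned term '{term}' present")
        if len(response_text.split()) > self.max_response_words:
            problems.append("response exceeds length cap")
        return problems

policy = PollutionPolicy(banned_terms={"deprecated_api"})
print(policy.violations("Use the deprecated_api endpoint.", "en"))
```

A clean response returns an empty list, so the same check can gate outputs in production or score batches during evaluation.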
Once these policies are defined, we have multiple options for where and how we monitor for and correct pollution within our model training and deployment processes.
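As one hypothetical example of deployment-time monitoring, a lightweight hook can track the rate of policy-flagged responses over a sliding window and raise an alert when it crosses a threshold. The window size, alert rate, and flagging mechanism below are illustrative assumptions.

```python
# Sketch of one monitoring option: alert when the fraction of flagged
# responses in a recent window exceeds a tolerated rate.
# Window size and alert rate are illustrative assumptions.
from collections import deque

class PollutionMonitor:
    def __init__(self, window_size: int = 100, alert_rate: float = 0.05):
        self.window = deque(maxlen=window_size)  # recent pass/fail flags
        self.alert_rate = alert_rate             # tolerated fraction of flagged outputs

    def record(self, flagged: bool) -> bool:
        """Record one response; return True if the alert threshold is crossed."""
        self.window.append(flagged)
        rate = sum(self.window) / len(self.window)
        return rate > self.alert_rate

monitor = PollutionMonitor(window_size=10, alert_rate=0.2)
alerts = [monitor.record(flagged) for flagged in
          [False, False, True, False, True, True]]
print(alerts)  # alerting begins once flagged responses exceed 20% of the window
```

A True result would feed whatever alerting channel the organization already uses, prompting review or rollback before polluted outputs reach more users.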
LLMs evolve faster than other forms of analytical models because they interact directly with users, and that interaction can intersect with training data in unpredictable ways. Through both intentional and unintentional user actions, models can become polluted, leading to incorrect or incomplete results, including hallucinations.
Through clear policies that lead to the implementation of quality controls, model monitoring, and alerting, we can catch pollution before it manifests in ways that break business processes or erode user trust.
Ready to start your AI journey?