Bias - The idea that the programmers, the material used in training, and the output all have biases that affect what the AI model does and produces. AI output is never neutral; it has a point of view.
- The programmers / trainers - Programmers and trainers are people. All people have biases and points of view, and a person's bias is difficult to overcome or eliminate. Companies also have goals, and those goals establish direction and bias, whether the goal is social justice, profit, or something else. The issue is that those biases influence the training and output of the AI. Instead of being neutral, the AI's output will be shaped toward the intentions and desires of the programmers / trainers.
- The materials used for training - Training material is selected and curated by the company involved. Only desired materials are used as input for training an AI; undesirable materials are left out. If a company wants certain outcomes, it can choose training material that shapes the output toward them. One example is Magisterium AI, an AI trained on Catholic materials (Bible, history, and canon law). Its output can be expected to carry a pro-Roman Catholic bias.
There is an old programmers' saying that fits this idea: garbage in, garbage out.
- Output - How an AI replies to a prompt is guided by its training material and its alignment. The alignment is adjusted to match the desires and plans of the company and programmers. Depending on the alignment and training materials, the output may be true or factually incorrect. An AI's output could be made false on every occasion, if that were the programmer's desire, or written entirely in the voice of a pirate, if that were the training.
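The point about curated training material can be illustrated with a toy example. The sketch below is not how a real AI model is built; it is a tiny word-counting "model" that assumes the next word is simply the one that most often followed the previous word in training. The two training sets and the function names (train_bigram, complete) are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which word follows each word in the training sentences."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def complete(model, prompt, max_words=5):
    """Extend the prompt by repeatedly picking the most frequent next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the model never saw anything after this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Two hypothetical, hand-curated training sets about the same subject.
corpus_a = [
    "the product is excellent",
    "the product is excellent value",
    "the product is reliable",
]
corpus_b = [
    "the product is overpriced",
    "the product is overpriced junk",
    "the product is fragile",
]

model_a = train_bigram(corpus_a)
model_b = train_bigram(corpus_b)

print(complete(model_a, "the product"))  # the product is excellent value
print(complete(model_b, "the product"))  # the product is overpriced junk
```

The same prompt produces praise from one model and criticism from the other. Nothing in the code differs between the two; only the selected training material does.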
Truth and Fact vs. Hallucinations
- Truth is something that is the case; it is believed and often verifiable.
- A fact is something backed by research and often quantifiable.
- Hallucinations are outputs that are not true but are presented as truth or fact.
Hallucinations are often dismissed as a problem with output, but they are actually problems with training and alignment. Training materials often include novels (including works of fiction and myth) and social media (home to misinformation, disinformation, and outright conspiracy theories). With that as its training, it is easy to understand why some output is going to be factually wrong. Another aspect is that many novels use deception as an important plot point, whether to solve a problem or to save the hero. Even in the Bible, Abraham passes his wife off as his sister, Isaac does the same, and Jacob deceives his father in order to get his blessing. Deception, or intentional lying, is one way that some "good guys" get what is needed. So an AI, acting as a composer or writer, may use this same method when it writes. In that sense, some "hallucinations" are actually intentional.