Vocabulary to Know
There is a huge amount of vocabulary associated with computer science and artificial intelligence. What appears here is a curated list of some of the most important terminology. At the bottom of the page are links to other AI vocabulary lists.
To save time and energy, most of these definitions come straight from Wikipedia.
Alignment - A process within the training cycle that improves the responses of an AI. In a nutshell, either a human or another AI scores the response the AI produces: a higher score acts as a reward and a lower score as a penalty. That feedback aligns the AI to respond in the desired manner.
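To make the reward/penalty idea concrete, here is a minimal Python sketch. The score_response function is a made-up stand-in for a human rater or a reward model, not a real library call.

    # Hypothetical scorer standing in for a human rater or reward model.
    def score_response(response: str) -> float:
        """Higher means more desirable; the rule here is a toy example."""
        return 1.0 if "please" in response.lower() else -1.0

    responses = ["Here is the answer.", "Please find the answer below."]
    for r in responses:
        reward = score_response(r)
        # A real training loop would nudge the model's parameters toward
        # high-reward responses and away from low-reward ones.
        print(f"{r!r} -> reward {reward:+.1f}")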
Artificial General Intelligence (AGI) - a type of artificial intelligence (AI) that matches or surpasses human capabilities across a wide range of cognitive tasks. This is in contrast to narrow AI, which is designed for specific tasks.
Artificial Super Intelligence (ASI) - a hypothetical agent that possesses intelligence surpassing that of the brightest and most gifted human minds. "Superintelligence" may also refer to a property of problem-solving systems (e.g., superintelligent language translators or engineering assistants) whether or not these high-level intellectual competencies are embodied in agents that act in the world.
Generative Artificial Intelligence - (should not be abbreviated to "GenAI") - Generative artificial intelligence is artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.
Large Language Model (LLM) - A large language model (LLM) is a computational model capable of language generation or other natural language processing tasks. As language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a self-supervised and semi-supervised training process.
Machine Learning (ML) - Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.
Natural Language Processing (NLP) - Natural language processing (NLP) is an interdisciplinary subfield of computer science and artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. Typically data is collected in text corpora, using either rule-based, statistical or neural-based approaches in machine learning and deep learning. Major tasks in natural language processing are speech recognition, text classification, natural-language understanding, and natural-language generation.
Neural Network (NN) - In machine learning, an artificial neural network is a mathematical model used to approximate nonlinear functions. Artificial neural networks are used to solve artificial intelligence problems.
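A tiny Python/NumPy sketch of the idea above: a one-hidden-layer network whose nonlinear activation is what lets it approximate nonlinear functions. The weights here are random and untrained, purely for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 1)), np.zeros((4, 1))   # input -> hidden
    W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))   # hidden -> output

    def forward(x):
        h = np.tanh(W1 @ x + b1)      # the nonlinear activation is what lets the
        return W2 @ h + b2            # network approximate nonlinear functions

    print(forward(np.array([[0.5]])))  # training would adjust W1, b1, W2, b2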
Niche AI or Narrow AI - Narrow AI can be classified as being "limited to a single, narrowly defined task" - e.g., PANDA (Pancreatic Cancer Detection with Artificial Intelligence) can detect pancreatic cancer earlier and more accurately than human doctors.
One-Shot Prompt - Prompting an AI for an answer after "training" (prompting) it with one example of the same type of task. These prompts are often steps toward learning logic or reasoning.
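A minimal sketch of what a one-shot prompt might look like as plain text; the arithmetic example is made up for illustration.

    # One worked example precedes the real question.
    example = "Q: What is 2 + 2?\nA: 4"
    question = "Q: What is 7 + 5?\nA:"
    one_shot_prompt = example + "\n\n" + question
    print(one_shot_prompt)  # this text would be sent to the model as-is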
RAG (Retrieval Augmented Generation) - Retrieval augmented generation (RAG) is a type of information retrieval process. It modifies interactions with a large language model (LLM) so that the model responds to user queries with reference to a specified set of documents, using this information in preference to information drawn from its own vast, static training data. This allows LLMs to use domain-specific and/or updated information. Many models now connect to the internet to complete complex prompts, which can reduce hallucinations.
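A toy Python sketch of the RAG flow: retrieve the most relevant document, then place it in the prompt ahead of the question. The keyword-overlap retriever and the sample documents are illustrative stand-ins; real systems typically use vector search.

    documents = {
        "returns": "Items may be returned within 30 days with a receipt.",
        "shipping": "Standard shipping takes 3-5 business days.",
    }

    def retrieve(query: str) -> str:
        # Real systems use vector search; simple keyword overlap stands in here.
        words = set(query.lower().split())
        return max(documents.values(),
                   key=lambda d: len(words & set(d.lower().split())))

    query = "How many days do I have to return an item?"
    context = retrieve(query)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    print(prompt)  # the augmented prompt is what actually reaches the LLM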
Reinforcement Learning - Shaping an AI's output by adjusting its training and alignment. One approach is RLHF (Reinforcement Learning from Human Feedback), in which a human scores the responses. Another is RLRW (Real World), in which a robot interacts with the real world, learns from it, and improves its training.
Synthetic Data - Data generated artificially, often by an AI itself, that is used to train a model. It offers another way to train a model with plausible data.
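A small illustrative sketch in Python of producing synthetic question/answer pairs; a simple template rule stands in for the AI generator here.

    import random

    random.seed(0)
    synthetic_examples = []
    for _ in range(3):
        a, b = random.randint(1, 9), random.randint(1, 9)
        # In practice an existing AI model often writes these pairs;
        # a template stands in for that generator here.
        synthetic_examples.append({"prompt": f"What is {a} + {b}?", "answer": str(a + b)})
    print(synthetic_examples)  # these pairs could then be used to train or fine-tune a model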
Training - Imagine a conveyor belt feeding data into your model. Each item on the belt represents a single training example. The model processes it, learns from it, and adjusts its internal parameters accordingly. This loop continues through multiple epochs (training sessions), gradually refining the model’s understanding. - from AI Models.org
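Here is that conveyor-belt loop as a few lines of Python: several epochs over a tiny dataset, with each example nudging a single parameter. The data and learning rate are made up for illustration.

    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # inputs x with targets y = 2x
    w, lr = 0.0, 0.05                              # model parameter and learning rate

    for epoch in range(20):                        # each epoch replays the "conveyor belt"
        for x, y in data:                          # one training example at a time
            error = (w * x) - y
            w -= lr * error * x                    # adjust the parameter based on the error
    print(round(w, 3))                             # w approaches 2.0 as training proceeds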
Transformer - A deep learning architecture built around the attention mechanism. Text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism, allowing the signal for key tokens to be amplified and less important tokens to be diminished.
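A rough NumPy sketch of those steps for a single attention head: token ids are looked up in an embedding table, then attention weights decide how much each token's signal is amplified or diminished. The sizes and weights are arbitrary, untrained values chosen only to show the shape of the computation.

    import numpy as np

    np.random.seed(0)
    vocab_size, d = 50, 8
    embedding_table = np.random.randn(vocab_size, d)

    token_ids = np.array([3, 17, 42])             # "tokens" in a tiny context window
    x = embedding_table[token_ids]                # lookup: each token becomes a vector

    Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d)                 # how strongly each token attends to the others
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    contextualized = weights @ V                  # amplified/diminished mix of the other tokens
    print(contextualized.shape)                   # (3, 8): one contextualized vector per token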
Zero-Shot Prompt - Prompting an AI for an answer without having "trained" (prompted) it with any examples of the same type. These prompts often serve as checks on how the AI is progressing toward learning logic or reasoning.
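For contrast with the one-shot sketch above, a zero-shot prompt contains only the question itself:

    zero_shot_prompt = "Q: What is 7 + 5?\nA:"   # no worked example precedes the question
    print(zero_shot_prompt)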
Additional Vocabulary Lists