Glossary

AI Detector

Definition/Description: Software that attempts to identify whether content (text, images, or other media) was generated by AI rather than by humans.

Examples / Notes: GPTZero, Turnitin AI detection, Copyleaks

AI Governance

Definition/Description: Policies, frameworks, and practices to ensure responsible and ethical development, deployment, and use of AI systems.

Examples / Notes: EU AI Act, corporate AI ethics boards, model risk assessments

Definition/Description: Informal term for AI systems or human reviewers that moderate or monitor model outputs for compliance and safety.

Examples / Notes: Moderation pipelines combining automated filters + human reviewers

AI Literacy

Definition/Description: The ability to understand, use, and critically evaluate AI technologies and their impacts.

Examples / Notes: Workshops teaching prompt basics, company training on safe AI use

AI Slop

Definition/Description: Low-quality, mass-generated AI content that clutters the internet and reduces information quality.

Examples / Notes: Spammy SEO articles auto-generated at scale

Agentic AI

Definition/Description: AI systems that can plan, reason, and take multi-step actions autonomously, often chaining tools and memory.

Examples / Notes: AutoGPT, BabyAGI, task-executing agents

Alignment

Definition/Description: The process of ensuring AI models behave in accordance with human values, ethics, and intentions.

Examples / Notes: RLHF pipelines to align chatbots with human preferences

Alignment Tax

Definition/Description: The perceived drop in capability or creativity of a model after applying safety/alignment measures.

Examples / Notes: Stricter filters that reduce creative or edgy outputs

Definition/Description: Concept connected to aligning AI with human values; also associated with work by Anthropic on safe model behavior.

Examples / Notes: Discussions and research papers from Anthropic on AI safety

Benchmark

Definition/Description: Standardized test or dataset used to measure AI performance across tasks.

Examples / Notes: MMLU, GLUE, SuperGLUE

Chain-of-Thought (CoT)

Definition/Description: Prompting technique encouraging models to reason step by step before answering.

Examples / Notes: ‘Let’s think step by step…’ prompts that improve reasoning accuracy

Context Window

Definition/Description: The maximum number of tokens (subword units of text, not whole words) an AI model can process at once, typically counting both the prompt and the generated output.

Examples / Notes: 4k, 8k, 32k token windows in LLM variants
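
Sketch: one simple way to keep a prompt inside the window is to drop the oldest tokens. This is a minimal illustration, assuming the window is shared between prompt and output; `fit_to_window` and its parameters are hypothetical names, not part of any library.

```python
def fit_to_window(prompt_tokens, window_size, reserved_for_output):
    """Keep only the most recent prompt tokens that fit in the window."""
    budget = window_size - reserved_for_output
    if budget <= 0:
        raise ValueError("window too small for the reserved output")
    # Keep the tail of the prompt: recent tokens are assumed most relevant.
    return prompt_tokens[-budget:]


tokens = list(range(10))  # stand-in for 10 token IDs
kept = fit_to_window(tokens, window_size=8, reserved_for_output=3)
print(kept)  # [5, 6, 7, 8, 9]
```

Real systems often summarize or retrieve rather than simply truncate, since truncation silently discards early context.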

Contextual Compression

Definition/Description: Reducing long inputs into compact, semantically relevant snippets before passing to LLMs.

Examples / Notes: In a RAG pipeline, trimming or summarizing retrieved documents down to the passages relevant to the query before passing them to the LLM

Distillation (Knowledge Distillation)

Definition/Description: Training a smaller ‘student’ model to mimic a larger ‘teacher’ model for efficiency.

Examples / Notes: DistilBERT distilled from BERT for faster inference
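
Sketch: a common form of the distillation objective compares temperature-softened teacher and student output distributions. The code below is a minimal pure-Python illustration of that idea, not DistilBERT's actual training code; the function names and the default temperature of 2.0 are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [3.0, 1.0, 0.2]
print(distillation_loss(teacher, [2.5, 1.2, 0.1]))  # small: student is close
print(distillation_loss(teacher, [0.1, 0.2, 3.0]))  # larger: student disagrees
```

In practice this term is usually combined with the ordinary supervised loss on the true labels.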

Embedding

Definition/Description: A numerical vector representation of text, images, or data that captures semantic meaning.

Examples / Notes: Sentence-transformers embeddings used for semantic search
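
Sketch: semantic similarity between embeddings is commonly measured with cosine similarity. The three-dimensional vectors below are made-up values for illustration only; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, chosen so related words point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))   # close to 1
print(cosine_similarity(cat, invoice))  # close to 0
```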

Evaluation Benchmark

Definition/Description: Standardized test or dataset used to evaluate AI model performance.

Examples / Notes: HELM, BIG-bench used for broad evaluation

Fine-Tuning

Definition/Description: The process of training a pretrained model further on specific data to specialize it.

Examples / Notes: Fine-tuning GPT on legal docs to make a legal assistant

Foundation Model

Definition/Description: A large pretrained model (like GPT, PaLM, LLaMA) that can be adapted to many downstream tasks.

Examples / Notes: GPT-4, PaLM, LLaMA

Frugal AI

Definition/Description: Designing AI systems that are efficient in compute, energy, and data usage.

Examples / Notes: TinyML models for on-device inference to save energy

Generative AI

Definition/Description: AI systems capable of creating text, images, audio, video, or code from prompts.

Examples / Notes: ChatGPT (text), DALL·E/Stable Diffusion (images)

Grounding

Definition/Description: Linking AI outputs to verifiable sources of truth (knowledge bases, documents, or real-world data).

Examples / Notes: Using RAG to cite company policies when answering support questions

Guardrails-as-a-Service

Definition/Description: External filtering/safety layers applied to AI models to enforce compliance or reduce harmful outputs.

Examples / Notes: NeMo Guardrails, Azure AI Content Safety

Hallucination

Definition/Description: When an AI model generates information that is plausible but false or fabricated.

Examples / Notes: An LLM inventing a nonexistent study or citation

Humanizer

Definition/Description: A tool or process that makes AI-generated content more natural, human-like, and less detectable as machine-made.

Examples / Notes: ‘AI humanizer’ tools that rewrite ChatGPT output to improve tone and reduce detector flags

Inference

Definition/Description: Running a trained model to generate predictions or outputs (as opposed to training, which updates the model's parameters).

Examples / Notes: Calling an LLM to generate an answer via an API

Inference Endpoint

Definition/Description: A cloud-hosted API that serves a trained model for real-time predictions or generations.

Examples / Notes: OpenAI API, Hugging Face Inference API

Jailbreak (LLM)

Definition/Description: A prompt or trick designed to bypass restrictions, filters, or safety rules of an AI system.

Examples / Notes: DAN (‘Do Anything Now’) style prompts that try to override safety layers

LoRA (Low-Rank Adaptation)

Definition/Description: A lightweight fine-tuning method that adapts large models by training small low-rank update matrices while the original weights stay frozen, so only a fraction of the parameters is trained.

Examples / Notes: LoRA adapters to customize Stable Diffusion styles or LLM behaviors
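
Sketch: LoRA keeps a weight matrix W (d × k) frozen and learns a low-rank update B·A, with B of shape d × r and A of shape r × k. The code below is a pure-Python illustration under those assumptions; the function names and the `scale` parameter are hypothetical, and the layer sizes are example values.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, scale=1.0):
    """y = W x + scale * B (A x); W stays frozen, only A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d, k, r):
    """Return (parameters in full W, trainable parameters in A and B)."""
    return d * k, r * (d + k)


full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

With B initialized to zeros, the adapted layer starts out identical to the frozen one, which is why that initialization is a common choice.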

Memory (in Agents/LLMs)

Definition/Description: Persistent storage of interactions or states so AI can recall past conversations or events.

Examples / Notes: A chatbot remembering user preferences across sessions

Meta-prompting

Definition/Description: Using prompts that generate new prompts dynamically to optimize model behavior.

Examples / Notes: An LLM generating follow-up prompts to refine a search

Mixture of Experts (MoE)

Definition/Description: Architecture activating only a subset of expert subnetworks per input to improve efficiency.

Examples / Notes: Google’s Switch Transformer uses MoE to scale efficiently
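
Sketch: a router scores every expert for an input, and only the top-scoring experts are run. The code below is a toy illustration with scalar "experts"; the function names are hypothetical, and k=2 is an arbitrary choice (setting k=1 gives single-expert routing of the kind Switch Transformer uses).

```python
import math


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected experts only, so their weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - m) for i in top}
    total = sum(exps.values())
    return [(i, exps[i] / total) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts: each is just a scalar function here.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 4 * x, lambda x: 0.0]
print(moe_forward(3.0, experts, router_logits=[0.1, 2.0, 2.0, -1.0]))  # 9.0
```

Experts 0 and 3 are never evaluated for this input, which is where the compute saving comes from.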

Multimodal AI

Definition/Description: AI models that process and generate across multiple modalities, such as text, images, and audio.

Examples / Notes: GPT-4V (text + images), CLIP (image + text embeddings)

Overfitting

Definition/Description: When a model fits its training data too closely (including noise), so that it performs well on that data but generalizes poorly to unseen data.

Examples / Notes: A classifier scoring high on training set but failing on new inputs

Parameter

Definition/Description: Trainable values (weights, biases) in a model that determine its behavior.

Examples / Notes: GPT-3 has about 175B parameters; parameter count is a rough indicator of model capacity

Prompt

Definition/Description: User input that directs a model’s behavior/output.

Examples / Notes: Instructional text like ‘Summarize this article in 3 bullets’

Prompt Chaining

Definition/Description: Combining multiple prompts in sequence to achieve a complex task or reasoning pipeline.

Examples / Notes: LangChain flows that extract facts → reformulate → summarize
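
Sketch: at its simplest, a chain feeds each step's output into the next prompt template. This is a framework-free illustration, not LangChain's API; `run_chain`, the `{input}` placeholder, and the `call_model` callable are assumptions, and the model here is a stub.

```python
def run_chain(templates, text, call_model):
    """Feed text through each prompt template in order, reusing the output."""
    for template in templates:
        text = call_model(template.format(input=text))
    return text


def fake_model(prompt):
    """Stub standing in for a real LLM call; it just wraps the prompt."""
    return f"<{prompt}>"


steps = ["Extract facts: {input}", "Summarize: {input}"]
print(run_chain(steps, "doc", fake_model))
# <Summarize: <Extract facts: doc>>
```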

Prompt Engineering

Definition/Description: Crafting inputs (prompts) to guide model outputs effectively.

Examples / Notes: Few-shot examples, role prompts, and constraint instructions

Prompt Injection

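Definition/Description: An attack that embeds malicious instructions in a model's input, often inside untrusted content such as documents or web pages, to override the developer's instructions or restrictions.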

Examples / Notes: Embedding ‘ignore previous directions’ in user-uploaded text

Pruning

Definition/Description: Removing unnecessary weights/neural connections to shrink models and improve speed.

Examples / Notes: Magnitude pruning to drop small weights and reduce model size
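
Sketch: magnitude pruning zeroes the weights with the smallest absolute values. The code below is a minimal illustration on a flat list of weights; `magnitude_prune` and the `sparsity` parameter are hypothetical names.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


print(magnitude_prune([0.5, -0.01, 0.3, 0.02, -0.9, 0.001], sparsity=0.5))
# [0.5, 0.0, 0.3, 0.0, -0.9, 0.0]
```

Zeroed weights only save memory or time when the storage format or hardware can exploit the sparsity.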

Quantization

Definition/Description: Compressing AI models by storing weights (and sometimes activations) at lower numerical precision, making them smaller and faster, usually at the cost of a small accuracy loss.

Examples / Notes: INT8 quantization to deploy models on CPU
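
Sketch: symmetric INT8 quantization maps floats to integers in [-127, 127] using a single scale factor. This is a simplified per-tensor illustration, not the scheme of any particular framework; the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [x * scale for x in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, scale)
print(restored)  # close to the original weights, but not identical
```

The rounding error per value is at most half the scale, which is the source of the accuracy loss.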

RAG (Retrieval-Augmented Generation)

Definition/Description: Technique that combines LLMs with external retrieval for grounded, factual responses.

Examples / Notes: Chatbot uses vector DB to fetch relevant documents before answering
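
Sketch: the retrieve-then-generate flow can be shown without any external service. Here simple word overlap stands in for embedding similarity and a vector DB, and the final LLM call is omitted; `retrieve`, `build_prompt`, and the sample documents are all hypothetical.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


docs = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
]
print(build_prompt("How many days do refunds take?", docs))
```

The resulting prompt would then be sent to the LLM, which answers from the supplied context rather than from memory alone.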

Red Teaming

Definition/Description: Testing AI systems with adversarial prompts or scenarios to find vulnerabilities or unsafe behaviors.

Examples / Notes: Security teams probing models with malicious prompts pre-launch

Reinforcement Learning (RL)

Definition/Description: Training paradigm where agents learn by taking actions in an environment to maximize cumulative reward.

Examples / Notes: AlphaGo learning to play Go via RL

Reinforcement Learning from Human Feedback (RLHF)

Definition/Description: Technique that uses human preference data to guide model behavior during training.

Examples / Notes: Ranking human-preferred responses to train a reward model
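
Sketch: reward models are commonly trained with a pairwise loss that pushes the score of the human-preferred response above the rejected one. The code below shows that loss for a single pair, assuming the rewards are plain floats; `preference_loss` is a hypothetical name.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected))."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) equals log(1 + exp(-d)).
    return math.log1p(math.exp(-diff))


print(preference_loss(2.0, 0.0))  # low: model agrees with the human ranking
print(preference_loss(0.0, 2.0))  # high: model prefers the rejected response
```

The trained reward model then supplies the reward signal for the reinforcement learning stage.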

Retrieval Plugin / Connector

Definition/Description: Module that links an LLM with external data sources (databases, APIs, documents).

Examples / Notes: LangChain connector to Google Drive or Airtable

Self-Consistency

Definition/Description: Sampling multiple reasoning paths and choosing the most consistent answer for robustness.

Examples / Notes: Generate 10 CoT chains and take majority vote on the final answer
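
Sketch: once the final answer has been extracted from each sampled chain, the vote itself is a simple count. The sampling and answer extraction are not shown; `self_consistent_answer` is a hypothetical name.

```python
from collections import Counter


def self_consistent_answer(final_answers):
    """Return the most common final answer and its share of the votes."""
    answer, votes = Counter(final_answers).most_common(1)[0]
    return answer, votes / len(final_answers)


# Final answers extracted from five sampled reasoning chains.
print(self_consistent_answer(["42", "42", "41", "42", "40"]))  # ('42', 0.6)
```

The vote share can serve as a rough confidence signal: a narrow majority suggests the model is unsure.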

Steerability

Definition/Description: The ability to control and direct a model’s style, tone, or behavior via prompts or instructions.

Examples / Notes: System prompt that makes the model ‘be concise and formal’

Supervised Fine-Tuning (SFT)

Definition/Description: Training a model on labeled, human-curated data, often as an intermediate step before RLHF.

Examples / Notes: SFT on curated chat logs to teach helpful responses

Synthetic Data

Definition/Description: Artificially generated data used to train or augment models when real data is limited or sensitive.

Examples / Notes: LLM-generated Q&A pairs to augment training sets

Synthetic Persona

Definition/Description: A consistent, branded AI-powered character that interacts with users in marketing, games, or support.

Examples / Notes: Virtual influencers or branded chat assistants

System Prompt / Hidden Prompt

Definition/Description: The underlying instruction that defines an AI’s role, personality, and limits, usually invisible to users.

Examples / Notes: ChatGPT’s hidden instructions like ‘You are a helpful assistant’

Temperature

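Definition/Description: A parameter that controls randomness in generation by scaling the model's output scores before sampling; higher values make outputs more diverse, lower values make them more predictable.

Examples / Notes: Temperature 0.0 → near-deterministic (the most likely token is always chosen); 0.8 → more diverse responses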
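
Sketch: temperature is typically applied by dividing the logits by T before the softmax. The code below is a minimal illustration with made-up logits; `softmax_with_temperature` is a hypothetical name.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; use argmax for greedy decoding")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.8, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Low temperatures concentrate probability on the top token, while high temperatures flatten the distribution.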

Throughput

Definition/Description: Number of inferences processed per unit time; measure of serving capacity.

Examples / Notes: A model serving cluster handling 1000 requests/sec

Token

Definition/Description: A chunk of text (word or subword) used as input/output units for LLMs.

Examples / Notes: GPT tokenization: ‘chatting’ → may split into [‘chat’, ‘ting’]

Tool Use / API Calling

Definition/Description: Model’s ability to call external tools or APIs to perform actions or fetch data.

Examples / Notes: LLM calling a calculator API or a flight-booking service

Transformer

Definition/Description: Neural network architecture using attention mechanisms; the backbone of modern LLMs.

Examples / Notes: The 2017 paper ‘Attention Is All You Need’ (Vaswani et al.) introduced the transformer

Tuning-free Control

Definition/Description: Methods for influencing model output without retraining the base model, often through prompting or plug-in adapters.

Examples / Notes: Using prompt templates or lightweight adapters to change behavior

Vector Database

Definition/Description: Databases optimized for storing and querying embeddings for similarity search and retrieval.

Examples / Notes: Pinecone and Milvus (databases) and FAISS (a similarity-search library) used for semantic search

Watermarking (AI output)

Definition/Description: Embedding hidden signals in AI-generated content to identify it later.

Examples / Notes: Subtle statistical marker in generated text or image noise patterns

Zero-Shot / One-Shot / Few-Shot Learning

Definition/Description: Approaches for guiding models with none, one, or a few examples directly in the prompt.

Examples / Notes: Few-shot: show 3 examples of Q→A in the prompt to guide responses
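
Sketch: a few-shot prompt is just the worked examples followed by the new question. The `Q:`/`A:` layout and the function name below are assumptions for illustration; passing an empty example list gives the zero-shot case.

```python
def few_shot_prompt(examples, question):
    """Build a prompt from (question, answer) pairs plus the new question."""
    target = f"Q: {question}\nA:"
    if not examples:
        return target  # zero-shot: no examples at all
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\n{target}"


examples = [
    ("Capital of France?", "Paris"),
    ("Capital of Japan?", "Tokyo"),
    ("Capital of Kenya?", "Nairobi"),
]
print(few_shot_prompt(examples, "Capital of Peru?"))
```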

Zero-Trust AI

Definition/Description: An architectural principle under which AI systems, their inputs, and their outputs are continuously verified rather than trusted by default.

Examples / Notes: Audit logs, strict access control and cryptographic checks on models and data