What Is an LLM? Large Language Models Explained (2026)

PrivSec Lab · 3 min read

An LLM (large language model) is a neural network trained on huge amounts of text to predict the next token. It is the technology behind ChatGPT, Claude and Llama. This article explains plainly what an LLM is, how it works, and what it can and can't do.

Chatbots, coding assistants, summarisers: almost every AI tool you've used recently is powered by an LLM. The term is everywhere in 2026, but rarely explained clearly. This guide explains it plainly: what a large language model is, how it actually works, what it's genuinely good at, and, just as important, what it can't do.

What an LLM is

An LLM (large language model) is a neural network trained on enormous amounts of text to understand and generate human-like language. Its core job is deceptively simple: predict the next token (a word or word-piece) given everything before it. Do that over and over and you get coherent answers, essays, translations and code.

"Large" refers to both the training data (much of the public web and more) and the parameters, often billions of internal values that store what the model learned. ChatGPT, Claude, Gemini and Llama are all LLMs.


How it works

Almost every modern LLM uses the transformer architecture. Training happens in stages:

  1. Pretraining: the model reads vast amounts of text and learns patterns by repeatedly predicting the next token and correcting itself when wrong. This is where most of its knowledge forms.
  2. Fine-tuning & RLHF (reinforcement learning from human feedback): it's then refined with curated examples and human feedback to be more helpful, follow instructions, and avoid harmful output.
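
To make the "predict, then correct" loop of pretraining concrete, here is a minimal sketch in Python. It is not a transformer: it is a toy model that looks only at the previous token, and the corpus, learning rate and function names are all illustrative assumptions. What it shares with real pretraining is the idea of scoring every possible next token, measuring how wrong the prediction was, and nudging the parameters to be less wrong next time.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_bigram(tokens, vocab, epochs=200, lr=0.5):
    """Toy 'pretraining': learn scores for which token follows which."""
    index = {tok: i for i, tok in enumerate(vocab)}
    size = len(vocab)
    # weights[i][j] is the learned score for token j following token i.
    # These numbers play the role of the model's parameters.
    weights = [[0.0] * size for _ in range(size)]
    pairs = [(index[a], index[b]) for a, b in zip(tokens, tokens[1:])]
    losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(weights[prev])
            # Cross-entropy loss: large when the true next token got low probability.
            total_loss += -math.log(probs[nxt])
            # Gradient of that loss with respect to the scores is probs - one_hot(nxt).
            for j in range(size):
                grad = probs[j] - (1.0 if j == nxt else 0.0)
                weights[prev][j] -= lr * grad
        losses.append(total_loss / len(pairs))
    return weights, losses

corpus = "the cat sat on the mat the cat ate".split()  # hypothetical training text
vocab = sorted(set(corpus))
weights, losses = train_bigram(corpus, vocab)
print(f"average loss: first epoch {losses[0]:.3f}, last epoch {losses[-1]:.3f}")
```

The average loss falls as training proceeds, which is the toy equivalent of the model getting better at next-token prediction. A real LLM does the same kind of update across billions of parameters and conditions on the whole preceding context rather than on one token.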

At inference (when you use it), you give a prompt and it generates a response one token at a time, each chosen from the probabilities it learned. Crucially, it isn't looking things up; it's predicting plausible text from patterns.
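
The token-by-token generation described above can be sketched as a short loop. The probability table here is hand-written and hypothetical, standing in for what a trained model would compute, and it conditions only on the last token, whereas a real LLM conditions on the whole prompt. The `<end>` marker and the `generate` function are likewise illustrative names, not part of any library.

```python
import random

# Hypothetical next-token probabilities, standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "mat": 0.4},
    "cat": {"sat": 0.7, "ate": 0.3},
    "sat": {"on": 1.0},
    "on": {"the": 1.0},
    "mat": {"<end>": 1.0},
    "ate": {"<end>": 1.0},
}

def generate(prompt, max_tokens=10, seed=0):
    """Extend the prompt one sampled token at a time."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = list(prompt)
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS.get(tokens[-1])
        if dist is None:
            break  # no known continuation for this token
        choices = list(dist)
        probs = [dist[c] for c in choices]
        # Sample the next token in proportion to its probability.
        nxt = rng.choices(choices, weights=probs, k=1)[0]
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return tokens

print(" ".join(generate(["the"])))
```

Nothing in the loop consults a database of facts: each step only asks which token is likely to come next. Changing the seed can change the output, which is one reason the same prompt can yield different answers.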


Tokens and parameters

  • Tokens: the unit of text an LLM reads and writes, roughly a word or word-piece. Limits like the context window are measured in tokens.
  • Parameters: the billions of internal weights adjusted during training that store what the model learned.

More parameters and data can mean more capability, but architecture, data quality and fine-tuning matter just as much as raw size.
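
As a rough illustration of both terms, the sketch below splits text into word-pieces with a greedy longest-match rule and counts the parameters in a single fully connected layer. The vocabulary is a made-up assumption, and production tokenizers (such as byte-pair encoding) are learned from data and more involved. The layer size of 4096 is also just an example figure.

```python
def tokenize(text, vocab):
    """Greedy longest-match split of each word into known pieces."""
    tokens = []
    for word in text.lower().split():
        start = 0
        while start < len(word):
            # Try the longest candidate piece first, shrinking one character at a time.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                # Fall back to a single character if no known piece matches.
                if piece in vocab or end == start + 1:
                    tokens.append(piece)
                    start = end
                    break
    return tokens

def dense_layer_params(n_inputs, n_outputs):
    """Parameters in one fully connected layer: a weight per connection plus a bias per output."""
    return n_inputs * n_outputs + n_outputs

VOCAB = {"token", "s", "are", "un", "believ", "able"}  # hypothetical vocabulary
pieces = tokenize("Tokens are unbelievable", VOCAB)
print(pieces)                          # ['token', 's', 'are', 'un', 'believ', 'able']
print(len(pieces))                     # 6
print(dense_layer_params(4096, 4096))  # 16781312
```

Three words became six tokens, which is why token counts, and therefore context-window usage, usually run higher than word counts. A single 4096-by-4096 layer already holds almost 17 million parameters, so stacking many such layers reaches billions quickly.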

What LLMs can and can't do

Strong at: drafting and summarising, answering questions, translating, explaining, and writing and debugging code.

Real limits:

  • Hallucination: they can state false things confidently. They predict plausible text, which is not the same as correct.
  • Knowledge cutoff: they don't inherently know recent events.
  • No true understanding: no beliefs or grounding, just learned patterns.
  • Bias: they can reflect biases in their training data.

The fix for facts and freshness is to give them real sources at answer time. That's exactly what RAG (retrieval-augmented generation) does.
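
A minimal sketch of the retrieval idea, under simplifying assumptions: the documents are two hard-coded strings, relevance is scored by plain word overlap, and the functions `retrieve` and `build_prompt` are hypothetical helpers. Real RAG systems typically use embedding-based search over a large document store, but the shape is the same: find relevant sources, then put them in the prompt.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())

    def overlap(doc):
        return len(q_words & set(doc.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]

def build_prompt(question, documents):
    """Place the retrieved sources ahead of the question in the prompt text."""
    sources = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"

DOCS = [  # hypothetical knowledge base
    "The context window is measured in tokens",
    "Ollama runs open models locally",
]
print(build_prompt("How is the context window measured", DOCS))
```

The resulting prompt would then be sent to the LLM, which can ground its answer in the supplied text instead of relying only on what it absorbed during training.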

LLM vs AI

AI is the broad field; an LLM is one prominent kind of AI specialised in language. Every LLM is AI, but image generators, recommenders and game agents are AI too, built differently. "AI" today often means an LLM chatbot, but the terms aren't interchangeable.

Running and choosing one

You can run open LLMs privately on your own machine with Ollama. For development specifically, see our guide to the best coding LLMs. The same fundamentals (tokens, parameters, next-token prediction) apply whether the model runs in the cloud or on your laptop.

The bottom line

An LLM is a neural network that generates language by predicting the next token, trained on huge text and refined with human feedback. It's remarkably capable at language and code, and genuinely limited by hallucination, a knowledge cutoff and the absence of real understanding. Use it for what it's good at, verify what matters, and add retrieval when you need current, grounded facts.

FAQ

What is an LLM?
An LLM, or large language model, is a type of artificial-intelligence system trained on enormous amounts of text to understand and generate human-like language. At its core it predicts the most likely next 'token' (a word or word-piece) given everything before it, and by doing this repeatedly it writes coherent sentences, answers questions, summarises, translates and writes code. The 'large' refers to both the training data and the number of parameters (often billions) that store what the model learned. ChatGPT, Claude, Gemini and Llama are all built on LLMs.

How does an LLM work?
An LLM is a neural network, almost always based on the transformer architecture. During training it reads vast amounts of text and learns statistical patterns by repeatedly predicting the next token and adjusting its parameters when it's wrong. After this pretraining, it's often refined with fine-tuning and human feedback (RLHF) to be more helpful and safe. At use time ('inference'), you give it a prompt and it generates a response one token at a time, each token chosen based on the probabilities it learned. It isn't looking anything up; it's predicting from patterns.

What can LLMs do, and what can't they?
They're strong at language tasks: drafting and summarising text, answering questions, translating, explaining concepts, and writing and debugging code. Their limits are real: they can 'hallucinate' (state false things confidently), they have a knowledge cutoff and don't inherently know recent events, they have no true understanding or beliefs, and they can reflect biases in their training data. They predict plausible text, which is not the same as being correct, so always verify facts that matter.

What's the difference between an LLM and AI?
AI is the broad field of making machines do things that seem intelligent. An LLM is one specific, currently very prominent kind of AI: a model specialised in language. So every LLM is AI, but not all AI is an LLM: image generators, recommendation systems, game-playing agents and spam filters are AI too, built with different techniques. When people say 'AI' today they often mean an LLM-powered chatbot, but the terms are not interchangeable.

What are tokens and parameters in an LLM?
A token is the unit of text an LLM processes, roughly a word or part of a word. Models read and generate text token by token, and limits like 'context window' are measured in tokens. Parameters are the internal numerical values (weights) the model adjusts during training to store what it learned; modern LLMs have billions of them. Loosely, more parameters and more training can mean more capability, but architecture, data quality and fine-tuning matter just as much as raw size.