
alexi.sh | AI Engineering Lab


What Is an LLM? Large Language Models Explained (2026)

PrivSec Lab | 3 min read

An LLM (large language model) is a neural network trained on huge amounts of text to predict the next token. It is the technology behind ChatGPT, Claude and Llama. What an LLM is, how it works, what it can and can't do, explained plainly.

Chatbots, coding assistants, summarisers: almost every AI tool you've used recently is powered by an LLM. The term is everywhere in 2026, but rarely explained clearly. This guide explains it plainly: what a large language model is, how it actually works, what it's genuinely good at and, just as important, what it can't do.

What an LLM is

An LLM (large language model) is a neural network trained on enormous amounts of text to understand and generate human-like language. Its core job is deceptively simple: predict the next token (a word or word-piece) given everything before it. Do that over and over and you get coherent answers, essays, translations and code.
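To make "predict the next token, over and over" concrete, here is a minimal sketch. It assumes a toy model that only counts which word follows which in a tiny made-up corpus; a real LLM uses a neural network over word-pieces, and the names `train_bigram` and `generate` are illustrative, not from any library.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which token follows which (a toy stand-in for a trained model)."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=4):
    """Repeatedly append the most frequent next token to the text so far."""
    out = [start]
    for _ in range(max_tokens):
        candidates = follows.get(out[-1])
        if not candidates:  # no known continuation: stop early
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigram(corpus)
print(generate(model, "the", max_tokens=4))  # the cat sat on the
```

The loop is the important part: each new token is chosen from what came before, then becomes part of the input for the next step.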

"Large" refers to both the training data (much of the public web and more) and the parameters: often billions of internal values that store what the model learned. ChatGPT, Claude, Gemini and Llama are all built on LLMs.


How it works

Almost every modern LLM uses the transformer architecture. Training happens in stages:

  1. Pretraining: the model reads vast amounts of text and learns patterns by repeatedly predicting the next token and correcting itself when wrong. This is where most of its knowledge forms.
  2. Fine-tuning & RLHF (reinforcement learning from human feedback): it's then refined with curated examples and human feedback to be more helpful, follow instructions, and avoid harmful output.
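The "predict, then correct" loop of pretraining (stage 1 only) can be sketched in a few lines. This is a toy under stated assumptions: the "model" is just a table of scores per previous token, trained by gradient descent on cross-entropy loss over a six-word example. Real pretraining uses a transformer and vastly more text, and this sketch does not cover fine-tuning or RLHF.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(tokens, vocab, epochs=50, lr=0.5):
    """Learn logits[prev][next] by gradient descent on next-token cross-entropy."""
    index = {tok: i for i, tok in enumerate(vocab)}
    size = len(vocab)
    logits = [[0.0] * size for _ in range(size)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for prev, nxt in zip(tokens, tokens[1:]):
            p, n = index[prev], index[nxt]
            probs = softmax(logits[p])
            total += -math.log(probs[n])  # high when the true token got low probability
            # Gradient of cross-entropy w.r.t. the logits is probs - one_hot(target).
            for j in range(size):
                grad = probs[j] - (1.0 if j == n else 0.0)
                logits[p][j] -= lr * grad
        losses.append(total / (len(tokens) - 1))
    return logits, losses

# Hypothetical training text, chosen only for illustration.
tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))
logits, losses = train(tokens, vocab)
print(f"average loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The loss falls as the table learns which token tends to follow which. It cannot reach zero here, because "the" is followed by both "cat" and "mat", so the best the model can do is split the probability between them.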

At inference (when you use it), you give a prompt and it generates a response one token at a time, each chosen from the probabilities it learned. Crucially, it isn't looking things up: it's predicting plausible text from patterns.
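Choosing a token "from the probabilities it learned" can be illustrated as follows. The vocabulary and scores below are hypothetical stand-ins for what a real model would output at one step; the sketch only shows how scores become probabilities and how one token is then drawn.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(vocab, logits, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical candidates and scores for the next token.
vocab = ["Paris", "London", "banana"]
logits = [4.0, 2.0, -1.0]

probs = softmax(logits)  # roughly 0.88, 0.12 and 0.006
rng = random.Random(0)   # seeded so repeated runs draw the same tokens
print([round(p, 3) for p in probs])
print([sample_next(vocab, logits, rng) for _ in range(5)])
```

Note that "London" still has about a 12% chance of being picked. Nothing in this step checks whether the chosen token is true, only how likely it is.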

Figure: a code editor. LLMs generate text (and code) one token at a time, predicting the most likely continuation.

Tokens and parameters

  • Tokens: the unit of text an LLM reads and writes, roughly a word or word-piece. Limits like the context window are measured in tokens.
  • Parameters: the billions of internal weights adjusted during training that store what the model learned.

More parameters and data can mean more capability, but architecture, data quality and fine-tuning matter just as much as raw size.
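Both terms are easy to count in a small example. The sketch below assumes a whitespace tokenizer (real tokenizers use learned word-pieces, so real counts differ) and a plain stack of fully connected layers rather than a transformer. All function names and sizes are illustrative.

```python
def toy_tokenize(text):
    """Split on whitespace; a simplification of real word-piece tokenizers."""
    return text.split()

def fits_context(text, context_window):
    """Check whether the text fits a context window measured in tokens."""
    return len(toy_tokenize(text)) <= context_window

def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

prompt = "Explain what a large language model is"
print(len(toy_tokenize(prompt)))             # 7
print(fits_context(prompt, 8))               # True
print(dense_param_count([512, 2048, 512]))   # 2099712
```

Even this two-layer toy has about two million parameters, which gives a feel for how models reach billions once layers are wide and numerous.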

What LLMs can and can't do

Strong at: drafting and summarising, answering questions, translating, explaining, and writing and debugging code.

Real limits:

  • Hallucination: they can state false things confidently. They predict plausible text, which is not the same as correct.
  • Knowledge cutoff: they don't inherently know recent events.
  • No true understanding: no beliefs or grounding, just learned patterns.
  • Bias: they can reflect biases in their training data.

The usual remedy for facts and freshness is to give the model real sources at answer time, which is exactly what RAG (retrieval-augmented generation) does.
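The idea behind RAG can be sketched without any model at all: find the most relevant source, then put it in the prompt. This is a minimal illustration that assumes word-overlap scoring and two made-up documents; production systems typically use embedding-based search, and `retrieve` and `build_prompt` are hypothetical names.

```python
def retrieve(question, documents, k=1):
    """Rank documents by how many question words they share (toy retrieval)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Put the retrieved sources in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"

# Hypothetical documents, invented for illustration.
docs = [
    "The office moved to Lisbon in March.",
    "The cafeteria serves lunch from noon.",
]
print(build_prompt("Where is the office now?", docs))
```

The resulting prompt is what gets sent to the LLM, so the answer can draw on the supplied source instead of only on patterns from training.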

LLM vs AI

AI is the broad field; an LLM is one prominent kind of AI specialised in language. Every LLM is AI, but image generators, recommenders and game agents are AI too, built differently. "AI" today often means an LLM chatbot, but the terms aren't interchangeable.

Running and choosing one

You can run open LLMs privately on your own machine with Ollama. For development specifically, see our guide to the best coding LLMs. The same fundamentals (tokens, parameters, next-token prediction) apply whether the model runs in the cloud or on your laptop.
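As a rough sketch of what talking to a local model looks like, the snippet below builds a request for Ollama's local HTTP API using only the standard library. It assumes Ollama is running on its default port and that a model has already been pulled; the model name `llama3` is a placeholder for whichever one you installed, and `build_request` and `ask` are illustrative helpers.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama3"):
    """Build a non-streaming generate request; the model name is a placeholder."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt, model="llama3"):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]

# Example call, commented out because it requires Ollama to be running:
# print(ask("In one sentence, what is a token?"))
```

The prompt never leaves your machine, since the request goes to `localhost`.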

The bottom line

An LLM is a neural network that generates language by predicting the next token, trained on huge amounts of text and refined with human feedback. It's remarkably capable at language and code, and genuinely limited by hallucination, a knowledge cutoff and the absence of real understanding. Use it for what it's good at, verify what matters, and add retrieval when you need current, grounded facts.

Photo: Unsplash
