What is a Large Language Model?

LLMs are neural networks trained to predict the next token in a sequence across enormous amounts of text. Through this seemingly simple objective, they learn grammar, reasoning, factual knowledge, and task-following behaviour, all from statistical patterns in language.

Tokens, not words

LLMs operate on sub-word tokens, not whole words. "unbelievable" might be split into ["un", "believ", "able"]. A typical LLM vocabulary has 50,000–100,000 tokens. This matters for understanding context limits and API costs.
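A toy sketch of how sub-word splitting works: greedy longest-match against a tiny hand-made vocabulary. Real tokenizers use learned BPE or unigram vocabularies of 50,000–100,000 entries; the vocabulary here is invented purely for illustration.

```python
# Toy greedy longest-match subword tokenizer (illustrative only; real LLM
# tokenizers learn their vocabularies from data).
def tokenize(word, vocab):
    tokens = []
    i = 0
    while i < len(word):
        # Take the longest vocabulary entry that matches at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

vocab = {"un", "believ", "able"}
print(tokenize("unbelievable", vocab))  # ['un', 'believ', 'able']
```

Note that a word the vocabulary covers poorly degrades into many short tokens, which is why unusual strings consume more of the context window and cost more per request.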

Language model objective
\[ \mathcal{L} = -\sum_{t=1}^{T} \log P(x_t \mid x_1, \ldots, x_{t-1}) \]
The model is trained to maximise the log-probability of each token given all preceding tokens. Minimising this cross-entropy loss over billions of examples produces surprisingly general capabilities.
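The loss above can be computed by hand for a short sequence. The probabilities below are made-up numbers standing in for what a model might assign to the correct next token at each position.

```python
import math

# Made-up probabilities the model assigned to the *correct* token at each
# of T = 4 positions (for illustration only).
p_correct = [0.5, 0.9, 0.2, 0.7]

# L = -sum_t log P(x_t | x_1, ..., x_{t-1})
loss = -sum(math.log(p) for p in p_correct)
print(round(loss, 3))  # → 2.765
```

Confident correct predictions (p close to 1) contribute almost nothing; the p = 0.2 position dominates the loss, which is exactly the gradient signal that pushes the model to improve its weakest predictions.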

The Transformer: Self-Attention

The key innovation in modern LLMs is the self-attention mechanism (Vaswani et al., 2017). It allows every token to directly attend to every other token in the context, capturing long-range dependencies that recurrent networks struggled with.

Scaled Dot-Product Attention
\[ \text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right) V \]
Q (queries), K (keys), V (values) are linear projections of the input. The \(\sqrt{d_k}\) scaling stops the dot products from growing with dimension, which would push the softmax into a saturated, low-gradient regime. The result is a weighted average of values, where the weights encode semantic relevance between tokens.
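The formula translates almost line-for-line into NumPy. This is a minimal sketch with random matrices standing in for learned projections; shapes and seed values are arbitrary.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (seq, seq) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8
Q, K, V = (rng.standard_normal((seq_len, d_k)) for _ in range(3))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)       # (4, 8): one output vector per token
print(w.sum(axis=-1))  # each row of weights sums to 1
```

Each output row is a mixture of all value vectors, which is precisely how every token gets direct access to every other token in the context.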

📌 Multi-head attention runs this operation \(h\) times in parallel with different learned projections, allowing the model to simultaneously attend to syntax, coreference, semantic similarity, and other relationship types.
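A sketch of the multi-head variant: each head attends over its own \(d_{\text{model}}/h\) slice of the projected input, and the head outputs are concatenated and projected back. Weight matrices here are random stand-ins for learned parameters, and splitting by contiguous slices is one common layout choice.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, h):
    """Run h attention heads in parallel, each on a d_model/h slice."""
    d_model = X.shape[-1]
    d_k = d_model // h
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for i in range(h):
        s = slice(i * d_k, (i + 1) * d_k)            # this head's dimensions
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_k)
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=-1) @ Wo       # recombine the heads

rng = np.random.default_rng(1)
seq, d_model, h = 5, 16, 4
X = rng.standard_normal((seq, d_model))
Wq, Wk, Wv, Wo = (0.1 * rng.standard_normal((d_model, d_model)) for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, h).shape)  # (5, 16)
```

Because each head sees a different learned projection, different heads are free to specialise in different relationship types.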

Practical Deployment: What Actually Matters

For business applications, the most important concepts are context window (how much text the model processes at once), prompt engineering (structuring inputs for consistent outputs), and RAG (Retrieval-Augmented Generation: connecting models to your own data).
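The RAG pattern can be sketched in a few lines: retrieve the most relevant documents, then build a prompt that grounds the model in them. Real systems use embedding similarity and a vector store; the keyword-overlap scoring and the `documents` list below are simplified stand-ins for illustration.

```python
# Minimal RAG sketch: retrieve relevant snippets, then build a grounded
# prompt. Keyword overlap is a toy stand-in for embedding similarity.
def retrieve(query, documents, k=2):
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days.",
    "Support hours are 9am to 5pm on weekdays.",
    "Shipping is free on orders over 50 euros.",
]
print(build_prompt("What is the refund policy?", docs))
```

The design point is that the model answers from retrieved text rather than from its parametric memory, which is what makes the output checkable against sources.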

| Approach | When to use | Cost | Customisation |
| --- | --- | --- | --- |
| Prompt engineering | General tasks, quick exploration | Low | Low |
| RAG | Domain Q&A over your own documents | Medium | Medium |
| Fine-tuning | Consistent style/format, specialised tasks | High | High |

โš ๏ธ LLMs hallucinate. They generate plausible-sounding text, not guaranteed facts. Always validate outputs for high-stakes decisions, and use RAG with verifiable sources when factual accuracy is critical.

LLMs in Business Workflows

The most impactful applications we see in small business contexts are: document summarisation and extraction, customer-facing chatbots grounded in company data, automated report generation, and email/comms drafting. The key is identifying tasks that are language-heavy, repetitive, and currently done manually.

Agentic workflows

Beyond single-turn responses, LLMs can be embedded in multi-step agent loops โ€” reading data, calling tools, checking outputs, and iterating autonomously. This is where the real productivity gains come from, but also where careful design and guardrails become essential.
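The loop described above can be sketched with a mock model in place of a real LLM API: at each step the "model" either calls a tool or returns a final answer, inside a bounded loop that acts as a basic guardrail. Everything here (`mock_model`, the `calculator` tool, the message format) is an illustrative stand-in, not a real agent framework.

```python
# Toy agent loop: the model decides between calling a tool and answering.
def calculator(expression):
    # Toy tool; eval is unsafe for untrusted real-world input.
    return str(eval(expression, {"__builtins__": {}}))

tools = {"calculator": calculator}

def mock_model(history):
    # A real LLM would decide this from the conversation; we hard-code it.
    if not any(msg.startswith("tool:") for msg in history):
        return ("call", "calculator", "17 * 23")
    result = history[-1].split("tool:", 1)[1]
    return ("answer", f"17 * 23 = {result}")

def run_agent(task, max_steps=5):
    history = [f"task:{task}"]
    for _ in range(max_steps):          # guardrail: bounded iterations
        action = mock_model(history)
        if action[0] == "answer":
            return action[1]
        _, name, arg = action
        history.append(f"tool:{tools[name](arg)}")
    return "step limit reached"

print(run_agent("What is 17 * 23?"))  # → 17 * 23 = 391
```

The step cap and the explicit tool registry are the two guardrails worth noticing: the agent can only call tools you registered, and it cannot iterate forever.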
