Definitions

Glossary

Plain-language definitions of the terms we work with most, linked to the service pages where each one shows up in practice.

Seypro's glossary covers core terminology in custom software, applied AI, SEO/GEO, security, and strategic consulting — including Generative Engine Optimization (GEO), RAG pipelines, fractional CTO/CMO/CSO, technical SEO audits, Core Web Vitals (LCP/INP/CLS), and private LLM deployment. Each entry links to the service it relates to.

Generative Engine Optimization (GEO)

Optimising for citations in AI-generated answers — ChatGPT, Perplexity, Gemini, Claude — rather than for blue-link rankings.

Generative Engine Optimization is the practice of structuring a website so that large-language-model search interfaces (ChatGPT, Perplexity, Google AI Overviews, Gemini, Claude) cite it accurately and frequently when answering user questions. It is the AI-search counterpart of traditional SEO.

GEO leans on the same fundamentals as good technical SEO — clean structure, schema.org metadata, fast loads — and adds explicit signals that AI engines parse: factual answer capsules at the top of pages, clear definitional content, comprehensive entity coverage, llms.txt files, and structured data on every meaningful surface.
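As a concrete illustration, here is a minimal Python sketch that emits schema.org DefinedTerm markup for one glossary entry, the kind of structured-data surface GEO leans on. The glossary URL is a placeholder:

```python
# Emit schema.org DefinedTerm JSON-LD for one glossary entry: structured
# data an AI engine can parse directly from the page.
import json

entry = {
    "@context": "https://schema.org",
    "@type": "DefinedTerm",
    "name": "Generative Engine Optimization (GEO)",
    "description": (
        "Structuring a website so LLM search interfaces cite it accurately "
        "and frequently when answering user questions."
    ),
    "inDefinedTermSet": "https://example.com/glossary",  # placeholder URL
}
print(f'<script type="application/ld+json">{json.dumps(entry)}</script>')
```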

Where SEO ranks pages, GEO supplies sentences. The unit of value is whether your site is the source of the sentence the model produces.

Fractional CTO / CMO / CSO

Senior C-level leadership engaged part-time, with hands-on execution capability.

A fractional executive is a senior leader engaged for a defined number of days per week or per month, typically through a long-running retainer. The role differs from interim leadership (which is full-time but temporary) and from advisory boards (which give counsel but don't execute).

Fractional CTOs are most common for companies past early product-market fit but not yet ready to hire a full-time CTO at $300k+ all-in. Fractional CMOs serve a similar arc on the marketing side. Fractional CSOs (Chief Strategy Officers) usually focus on positioning, market entry, and cross-functional planning.

Engagements typically use whichever toolchain the team already runs on (Notion, Linear, Jira), prioritise with OKRs and the RICE framework, and reference established frameworks like TOGAF and ITIL where the engagement requires it.
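For reference, the RICE formula is (reach × impact × confidence) ÷ effort. A minimal sketch, assuming one common scoring convention (impact on a 0.25–3 scale, confidence as a fraction, effort in person-months); the backlog items are hypothetical:

```python
# RICE prioritisation: (reach * impact * confidence) / effort.
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Reach per quarter; impact 0.25-3; confidence 0-1; effort in person-months."""
    return (reach * impact * confidence) / effort

# Two hypothetical backlog items; the higher score ships first.
print(rice(reach=2000, impact=2, confidence=0.8, effort=4))  # 800.0
print(rice(reach=500, impact=3, confidence=0.5, effort=1))   # 750.0
```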

RAG pipeline (Retrieval-Augmented Generation)

A pattern where an LLM is given relevant documents at query time, so its answers are grounded in your data rather than its training corpus.

A RAG pipeline retrieves relevant documents from a vector database (Pinecone, pgvector, Weaviate, Qdrant) at the moment a user asks a question, then includes those documents in the prompt sent to a language model. The model answers using the retrieved context rather than relying on what it learned during training.

The architecture has three parts: ingestion (chunking documents, generating embeddings, storing them), retrieval (turning the user's question into an embedding and finding the closest matches), and generation (passing the retrieved chunks to the LLM with instructions to answer from them).
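A minimal sketch of all three parts, with an in-memory index standing in for the vector database and the OpenAI API standing in for whichever embedding and chat models you run. The chunks, model names, and question are illustrative:

```python
# Minimal RAG sketch: ingestion, retrieval, generation.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Ingestion: chunk documents (pre-chunked here) and store their embeddings.
chunks = [
    "Fractional CTO engagements run on a defined-days-per-month retainer.",
    "A technical SEO audit covers render path, crawl budget, and schema.",
]
index = embed(chunks)

# Retrieval: embed the question, rank chunks by cosine similarity.
question = "What does a technical SEO audit cover?"
q = embed([question])[0]
scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
context = "\n".join(chunks[i] for i in np.argsort(scores)[::-1][:2])

# Generation: instruct the model to answer from the retrieved context only.
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(reply.choices[0].message.content)
```

A production pipeline swaps the in-memory index for Pinecone, pgvector, Weaviate, or Qdrant; the shape of the three stages stays the same.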

RAG is the right pattern when the underlying knowledge changes faster than fine-tuning cycles allow, when the corpus is private, or when you need the model to cite sources. It is not the right pattern when you need the model to reason across a corpus too large to fit in any reasonable retrieval window — agentic workflows or long-context prompting fit those cases better.

Technical SEO audit

A systematic review of the engineering foundations underneath search visibility — render, crawl, index, schema, performance.

A technical SEO audit covers the layers of a website that affect how search engines fetch, parse, and rank it, separately from the content layer. The scope typically includes: render path (CSR/SSR/SSG/ISR), crawl budget and robots directives, sitemap completeness, canonical correctness, hreflang for international sites, schema.org coverage on every meaningful entity, image and font optimisation, Core Web Vitals, and HTTPS/security headers.

A good audit produces a prioritised list — what to fix first, ordered by ranking impact relative to engineering effort. The findings most teams underestimate: Core Web Vitals on mobile, missing structured data on key pages, and orphaned URLs that consume crawl budget without contributing.
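A sketch of the mechanical end of an audit, spot-checking a single URL for two of those findings (canonical tag, structured data). A real audit crawls the whole site; the URL here is a placeholder:

```python
# Spot-check one page for a canonical tag and JSON-LD structured data.
import requests
from bs4 import BeautifulSoup

def audit_page(url: str) -> dict:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    canonical = soup.select_one('link[rel="canonical"]')
    jsonld = soup.select('script[type="application/ld+json"]')
    return {
        "canonical": canonical.get("href") if canonical else None,  # None is a finding
        "structured_data_blocks": len(jsonld),                      # 0 is a finding
    }

print(audit_page("https://example.com/"))  # placeholder URL
```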

In 2026, technical SEO and GEO overlap heavily — most of the structural improvements that help Google also help AI search engines parse the site.

Core Web Vitals (LCP, INP, CLS)

Google's three user-experience metrics: Largest Contentful Paint, Interaction to Next Paint, and Cumulative Layout Shift.

Core Web Vitals are three measurable signals Google uses to evaluate page experience. LCP (Largest Contentful Paint) is how quickly the largest visible element renders — Google's threshold for "good" is under 2.5 seconds. INP (Interaction to Next Paint) is how quickly the page responds to user input — under 200ms is "good". CLS (Cumulative Layout Shift) is how much the layout jumps around as the page loads — under 0.1 is "good".

INP replaced FID (First Input Delay) as a Core Web Vital in March 2024. INP is harder to optimise because it measures every interaction, not just the first.

Core Web Vitals are measured in the field via the Chrome User Experience Report (CrUX), which is what feeds Google's ranking signal. Lab tools like Lighthouse approximate the same metrics under synthetic conditions; passing in the lab doesn't guarantee passing in the field, so field data still needs monitoring.
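CrUX has a public REST API, so the p75 field values can be pulled directly. A minimal sketch, assuming an API key provisioned in Google Cloud and a URL with enough traffic to appear in the dataset:

```python
# Query the CrUX API for field p75 values of the three Core Web Vitals.
import requests

API_KEY = "YOUR_CRUX_API_KEY"  # placeholder: create one in Google Cloud
resp = requests.post(
    f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}",
    json={
        "url": "https://example.com/",  # placeholder URL
        "metrics": [
            "largest_contentful_paint",
            "interaction_to_next_paint",
            "cumulative_layout_shift",
        ],
    },
    timeout=10,
)
resp.raise_for_status()
for name, data in resp.json()["record"]["metrics"].items():
    # Compare against the "good" thresholds: 2500 ms, 200 ms, 0.1.
    print(name, data["percentiles"]["p75"])
```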

Private LLM (on-prem / self-hosted model)

A language model running on infrastructure you control, rather than via a third-party API like OpenAI or Anthropic.

A private LLM deployment runs an open-weight model — typically Llama, Mistral, Qwen, or DeepSeek — on hardware you control: a GPU server, a fleet of A100/H100s, a dedicated cloud instance, or in some cases consumer hardware running a quantised model via Ollama or vLLM.
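A minimal sketch of what "infrastructure you control" looks like day to day, assuming a local Ollama instance on its default port with a model already pulled. The model tag and prompt are placeholders:

```python
# Query a self-hosted model through Ollama's local REST API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",  # placeholder: any locally pulled model tag
        "prompt": "Summarise our data-retention policy in one sentence.",
        "stream": False,      # one JSON object instead of a token stream
    },
    timeout=120,
)
print(resp.json()["response"])  # the prompt never left your hardware
```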

The driving reasons are usually data sovereignty (regulatory or contractual requirements that data can't leave your infrastructure), cost at scale (API token bills can dwarf hosting costs at high throughput), latency control, and the ability to fine-tune on private data without sending it to a third party.

The trade-offs are real: open-weight models trail the frontier closed models on most benchmarks, deployment complexity is non-trivial, and total cost of ownership only drops below the API option past a certain query volume. The right pattern is usually hybrid — private LLM for high-volume internal workflows, frontier APIs for the harder problems.

Term you wanted that isn't here?

Tell us what you were looking for. The glossary grows in the direction of the questions we get.

Get in touch