Portfolio · 2026

Andrew Gordieiev

AI Solutions Architect. I build production-grade agentic systems and the engineering practices that make them sustainable — bridging hands-on multi-agent architecture with organization-wide AI capability rollout. 20+ years in software, 8+ in architecture leadership.

Multi-agent architecture · Production AI delivery · AI practice rollout · Enterprise guardrails

About

Three lines, no marketing

I'm a software architect who spent the last few years moving from classical distributed systems into agentic AI — design, implementation, and rollout of multi-agent platforms inside teams that ship to real users.

Right now I build a 135-agent SDLC platform that takes a feature spec and walks it through architecture, design, code, review, QA, and deploy with explicit human-in-the-loop gates, and I run the same playbook to roll out AI capabilities across engineering organizations.

What I optimize for is durability of the resulting system: tests, observability, governance, and the team practices that survive the architect leaving the room.

Selected Work

Three production-grade engagements

A flagship multi-agent platform, an organization-wide AI practice rollout, and an enterprise guardrails framework. Each addresses a different binding constraint: technology, organization, governance.

Case 01

Flagship · Multi-agent platform · v0.72.0 · Phase I — live testing

Agentic SDLC Platform — 135 agents, spec-to-prod in 1–2 days

xteam is an agentic SDLC platform: 135 specialized agents organized into 14 functional groups, with ~32 explicit author + auditor pairs (every artifact has a writer and a scored reviewer). A Virgil multi-project state contract tells the operator what phase each project is in and which agent should run next inside it. Through the cross-machine priority registry at ~/.claude/xteam-virgil/registry.txt, it also tells the operator which of their many parallel projects deserves the next focus slot.

The platform turns a feature spec into a deployed increment in 1–2 days while preserving the engineering practices a senior architect would demand: typed contracts; audits scored against a binary ≥99/100 gate; anti-mock test guards; canary, synthetic, rollback, and kill-switch on the deploy side; an explicit Ship-to-Prod 81-position binary contract; and a Layer 7 runtime of 27 continuous 24/7 contracts (25 watchcat operators + 2 cron-driven autonomy ops) that observe production after deploy and catch silent regressions every other layer missed.

Constitution: 16 binary-tested principles, including author/auditor symmetry, ≥99/100 binary gates, closed-allowlist skip markers, markdown+bash identity moat, Web Bot Auth-signed prod requests, ed25519 + Merkle memory integrity, MCP sigstore attestation, Code Bible per-language best-practices, Corpus Autonomy Discipline, and Autonomous Phase Boundaries Discipline. 9 identity-drift fixtures are replayed every PR; 55 closed-allowlist skip markers are defined.

Compliance machinery covers EU AI Act §50 + Annex XI, NIST AI 600-1, ISO 42001, SOC 2 Type II, and California ADMT, plus an air-gap transport matrix for regulated deployments.

The system has shipped feature increments across four domain classes (fintech SaaS, multi-modal comic production, developer tooling, and compliance/regulated) without losing the human-in-the-loop discipline. Every irreversible operation routes through an approval gate; every audit produces a numerical score and a list of binary gaps; every gap has a named owner and a re-evaluation trigger.
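The audit contract described above (a numerical score, a list of binary gaps, and a named owner plus re-evaluation trigger per gap) can be sketched as a small typed record. This is an illustrative sketch only: the class names, field names, and the `gate` helper are assumptions made for this example, not xteam's actual schema.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 99  # the ">= 99/100" bar quoted in the text


@dataclass(frozen=True)
class Gap:
    """One binary gap reported by an auditor (field names are illustrative)."""
    description: str
    owner: str                 # named owner of the gap
    reevaluation_trigger: str  # condition under which the gap is revisited


@dataclass(frozen=True)
class AuditResult:
    artifact: str
    score: int                   # 0..100
    gaps: tuple[Gap, ...] = ()

    def passes(self) -> bool:
        # Binary gate: the artifact either clears the bar or it does not.
        return self.score >= PASS_THRESHOLD


def gate(result: AuditResult) -> str:
    """Return 'advance' when the artifact clears the gate, else 'bounce'."""
    return "advance" if result.passes() else "bounce"
```

A score of 98 with one open gap bounces back to the author; a score of 99 advances. Keeping the gap owner and trigger on the record itself means a bounced artifact always carries who must act and when it is re-checked.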

Topology · interactive

Agent topology — 135 agents in 14 functional groups

Hover or click any node to inspect role, paired auditor, and an example of work. Author/auditor pairs are highlighted by a dashed link when an agent is selected. The remaining 51 agents (25 runtime watchcats, 5 browser-driving subagents, a 2-agent mobile-publisher pair, and 19 compliance authors/auditors) are omitted from the topology to keep the visualisation legible and live in dedicated cards below.

[Interactive figure: xteam agent topology. Circular layout of xteam author + auditor agents organised by functional group; runtime watchcats and browser-driving subagents are shown separately below. Groups: Discovery · Architecture · UX / A11y · Engineering · Code Review · Manual QA · Test / Eval · Security · Compliance · DevOps · Perf / Cost · Resilience · Growth · Studio. Centre label: xteam, 135 agents · 14 groups.]

Workflow · interactive

Workflow walkthrough — Sub-step Delivery Protocol stages

Step through the SDLC pipeline from spec to deploy. Each stage shows its agents, input, output, and the binary approval gate that must score ≥ 99/100 before the artifact moves on.

SDLC Walkthrough

Spec → Production, with explicit gates

Stage 1 of 8

Specify

An idea hits the system as plain prose. xSpecifier converts it to structured user stories, functional requirements, and binary acceptance criteria. xSpecAuditor scores the spec and bounces it back if traceability is broken.

xSpecifier · xSpecAuditor

Input

Raw idea / user story

Output

spec.md with FRs, ACs, success criteria

Approval gate

Spec must score ≥ 99/100 on auditor rubric.

Virgil · interactive

Virgil simulator — multi-project state contract

xVirgil is the cross-cutting state contract that reads explicit signals (not heuristics), proposes the next action and the agent best placed to take it, and ranks the operator's many parallel projects in a single cross-machine priority registry (~/.claude/xteam-virgil/registry.txt) so the next focus slot lands on whichever project moved the farthest. Pick a scenario.

Navigator

Project state → suggested next step

Pick a project state. The Navigator reads explicit signals (not heuristics) and recommends the next action and the agent best suited to take it.

Signals

  • No .specify/ directory
  • No spec.md, no architecture.md
  • Idea: 'a tool to help recruiters screen LLM portfolios'

Phase

DISCOVERY

Next action

Convert the idea into a problem statement, ICP, and JTBD before any architecture decisions.

Suggested agent

xAnalyst

Architecture without a problem statement produces over-engineered systems for the wrong user. Discovery-first prevents premature commitment to a stack.
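The "explicit signals, not heuristics" idea in the scenario above can be sketched as a pure function over file-system facts. This is a minimal sketch under stated assumptions: the function names are hypothetical, the files are assumed to sit at the project root, and only the DISCOVERY case shown in the scenario is modelled. Any other combination returns an undetermined result rather than guessing.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_signals(project: Path) -> dict[str, bool]:
    """Collect explicit, file-based signals; no scoring or guessing."""
    return {
        "has_specify_dir": (project / ".specify").is_dir(),
        "has_spec": (project / "spec.md").is_file(),
        "has_architecture": (project / "architecture.md").is_file(),
    }


def recommend(signals: dict[str, bool]) -> tuple[str, Optional[str]]:
    """Map signals to (phase, suggested agent)."""
    if not any(signals.values()):
        # No spec artefacts at all: the scenario above maps this to DISCOVERY.
        return ("DISCOVERY", "xAnalyst")
    # Other phases are not described in the source, so this sketch refuses to guess.
    return ("UNDETERMINED", None)


with tempfile.TemporaryDirectory() as tmp:
    print(recommend(read_signals(Path(tmp))))  # ('DISCOVERY', 'xAnalyst')
```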

Layer 7 · runtime

27 Layer 7 contracts (25 watchcats + 2 autonomy crons)

Pre-deploy gates catch bugs that haven't shipped. Layer 7 catches the silent regressions every other layer missed — schema drift, GSC indexability collapse, credential-stuffing spikes, egress exfiltration, broken CTAs invisible to dashboards but loud in support volume. Plus two cron contracts that close the autonomy loop (L7.26 monthly idea-scout sweep, L7.27 monthly portfolio-aggregator). Click any name for the canonical incident class it exists to catch.

The seventh layer of Ship-to-Prod (L7.1–L7.27). Every watchcat is a long-lived 24/7 operator that observes production after deploy and catches the silent regressions L1–L6 cannot. Each ships with a tool shortlist, baselines, runbook, and severity promotion path (ADVISORY → BLOCKING when criteria met). Plus two autonomy-themed cron contracts (L7.26 xidea-scout monthly sweep, L7.27 xportfolio-aggregator monthly aggregation — visible in the topology card above). Click a name for the canonical incident class it exists to catch.

Front-end visibility · 5

Back-end & data · 6

Trust & integrity · 5

Cross-platform · 5

Meta & infra · 4

Browser QA · per-merge

5-subagent browser-driving cluster

AI manual QA that physically drives real browsers on real staging URLs — and prod, with Web Bot Auth + safe-prod-guard wrappers. Bracket-mode parser [critical | standard | quick] picks the model class per invocation; round-trip mutation taxonomy 6×4 (CSV / JSON / XLSX / PDF / TSV / Markdown × encoding / structure / scale / edge) is exercised every merge.

Spec 029 (plugin v0.31.0) introduced the cluster and the bracket-mode parser [critical | standard | quick]; spec 031 added Web Bot Auth signing and the # PROD-PAYMENT-NEVER runtime hard-stop.

xBrowserDriverWeb

Drive (web)

Drives real browsers against staging URLs. Export-first round-trip discipline — download → mutate → re-upload → assert — catches silent corruption that pure DOM tests miss. Cryptographically signs every production request so destructive flows can't escape staging.

Default mode

[standard]

Allowed modes

critical · standard · quick

Output

qa-runs/<iso-ts>/web/*

Round-trip mutation taxonomy

6×4 matrix exercises every export/import surface: CSV · JSON · XLSX · PDF · TSV · Markdown × encoding · structure · scale · edge-case. Per-merge SDP 1.7a runs 1-of-N rotation; xExportImportWatchcat (Layer 7) runs continuous probe post-deploy.
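The 6×4 matrix and the per-merge rotation can be sketched as a Cartesian product with a modular index. The format and mutation names come from the text; the `cell_for_merge` helper, and the reading of "1-of-N rotation" as one cell per merge in a fixed cycle, are assumptions made for illustration.

```python
from itertools import product

FORMATS = ("CSV", "JSON", "XLSX", "PDF", "TSV", "Markdown")
MUTATIONS = ("encoding", "structure", "scale", "edge-case")

# Every (format, mutation class) pair: 6 x 4 = 24 cells.
MATRIX = list(product(FORMATS, MUTATIONS))


def cell_for_merge(merge_index: int) -> tuple[str, str]:
    """Pick the single cell exercised by a given merge, cycling through the matrix."""
    return MATRIX[merge_index % len(MATRIX)]
```

Under this reading, 24 consecutive merges cover the whole matrix once, while a continuous post-deploy probe can iterate `MATRIX` directly.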

Production safety

Every prod request is signed via Web Bot Auth (Constitution Principle IX). A 4-pattern # PROD-PAYMENT-NEVER hard-stop blocks any selector matching data-testid*=payment / charge / card / payment-checkout ARIA-roles and hard-fails the run on detection.
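A selector hard-stop of this kind can be sketched as a guard that every selector must pass before the driver acts on it. The four tokens mirror the ones listed above; the exception name, the function name, and the case-insensitive substring matching are assumptions for this sketch and may differ from the real guard.

```python
import re


class ProdPaymentNever(RuntimeError):
    """Raised to hard-fail a production run that touches a payment surface."""


# Hypothetical encoding of the four patterns named in the text.
_BLOCKED = tuple(
    re.compile(token, re.IGNORECASE)
    for token in ("payment", "charge", "card", "payment-checkout")
)


def guard_selector(selector: str) -> str:
    """Return the selector unchanged, or raise if it matches a blocked pattern."""
    for pattern in _BLOCKED:
        if pattern.search(selector):
            raise ProdPaymentNever(f"blocked selector: {selector!r}")
    return selector
```

Substring matching is deliberately broad here (it would also stop a selector containing "discard"), which trades false positives for the guarantee that a payment flow is never driven in production.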

Compliance · posture

Compliance regimes covered by paired auditors

Templates ship with DRAFT disclaimer; final binding sign-off routes to a qualified human (lawyer, ISO auditor, SOC 2 firm). xteam auditors are scaffolding, not legal advice — but the scaffolding lets counsel review a structured artefact instead of a blank page. Five regimes, 25+ paired author/auditor agents.

xteam ships paired author / auditor agents for every compliance regime an enterprise procurement team typically asks about, with templates that explicitly carry DRAFT disclaimers and require qualified human sign-off. This is not a substitute for legal counsel — it's the scaffolding that lets counsel review a structured artefact instead of a blank page.

European Union · specs 049 + 055

EU AI Act §50 + Annex XI + C2PA

Article 50 transparency notice for AI systems on the EU market; Annex XI GPAI technical documentation; Code of Practice C2PA watermark/marking for synthetic content across 5 surfaces (text / image / audio / video / code) and 3 schemes (c2pa:// / synthid:// / proprietary://).

xEuAiActAuthor · xEuAiActAuditor · xEuAiActWatermarkAuthor · xEuAiActWatermarkAuditor

Effective date

2026-08-02 (Annex III enforcement)

Severity promotion path

ADVISORY → BLOCKING, auto-promoted on the regulatory deadline
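Deadline-driven promotion reduces to a date comparison. The sketch below uses the effective date quoted above; the names are hypothetical, and promoting on the deadline day itself (rather than the day after) is an assumption.

```python
from datetime import date
from enum import Enum


class Severity(Enum):
    ADVISORY = "ADVISORY"
    BLOCKING = "BLOCKING"


EU_AI_ACT_DEADLINE = date(2026, 8, 2)  # effective date quoted in the text


def effective_severity(today: date, deadline: date = EU_AI_ACT_DEADLINE) -> Severity:
    """ADVISORY before the deadline, BLOCKING from the deadline onward."""
    return Severity.BLOCKING if today >= deadline else Severity.ADVISORY
```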

Skip path

Each regime has a documented skip marker (e.g. # SKIP-NO-EU-AI-ACT-OBLIGATION) for projects where the regime does not apply. A skip is auditable, not silent.
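An auditable skip path can be sketched as a scan that separates markers on a closed allowlist from everything else. Only the one marker quoted above is taken from the text; the marker syntax assumed by the regular expression and the function name are illustrative.

```python
import re

# Closed allowlist; only this entry is quoted in the text, others are omitted.
ALLOWED_MARKERS = frozenset({"SKIP-NO-EU-AI-ACT-OBLIGATION"})

_MARKER = re.compile(r"#\s*(SKIP-[A-Z0-9-]+)")


def audit_skip_markers(text: str) -> tuple[list[str], list[str]]:
    """Return (allowed, unknown) skip markers found in a file's text."""
    found = _MARKER.findall(text)
    allowed = [m for m in found if m in ALLOWED_MARKERS]
    unknown = [m for m in found if m not in ALLOWED_MARKERS]
    return allowed, unknown
```

A gate built on this would fail whenever `unknown` is non-empty, so a new skip reason has to be added to the allowlist in a reviewed change before it can be used.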

Final binding compliance sign-off always routes to a qualified human (lawyer, ISO auditor, SOC 2 firm). xteam auditors are scaffolding, not legal advice. This is non-negotiable in every compliance template.

Dark Factory · the headline

Business idea in. Ready-for-prod out. Nothing in between.

The owner gives the system a business idea, a single paragraph. The system gives the owner a production-ready release on staging. That is the entire owner-facing contract. Everything in between is autonomous: spec writing, architecture, design, code, tests, security, performance, accessibility, privacy, code review, manual QA, browser-driving QA against real staging, the deploy itself, the synthetic / canary / rollback / kill-switch wiring, and the 27 Layer 7 contracts (25 watchcats + 2 autonomy crons).

Five steps, in order — explorable below. Each step has paired author + auditor agents driving every artefact to a ≥99/100 binary score under an audit-loop that escalates on plateau, context overflow, or iteration cap. An aggregator the agents cannot bypass rolls per-stage verdicts up to a single production-confidence verdict.

The autonomy boundary is structural, not policy. A small set of paths is off-limits to autonomous changes regardless of agent intent — constitution, non-goals, principle headings, the guard file enforcing them. Ambiguous responses route to HOLD by default. The line between what autonomy may decide and what only the owner may decide is enforced on every PR — not by trust.
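A structural boundary of this kind can be sketched as two small checks: one that flags changes to protected paths, and one that maps any unclear owner response to HOLD. The path names and accepted answer strings below are hypothetical placeholders; the source names the protected categories but not their file locations.

```python
# Hypothetical locations for the protected categories named in the text.
PROTECTED_PATHS = ("constitution.md", "non-goals.md", "guards/")


def protected_changes(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that fall inside the protected set."""
    hits = []
    for path in changed_paths:
        for entry in PROTECTED_PATHS:
            is_dir_entry = entry.endswith("/")
            if path == entry or (is_dir_entry and path.startswith(entry)):
                hits.append(path)
                break
    return hits


def classify_response(answer: str) -> str:
    """Map an owner response to PROCEED / STOP; anything unclear becomes HOLD."""
    normalized = answer.strip().lower()
    if normalized in {"approve", "yes"}:
        return "PROCEED"
    if normalized in {"reject", "no"}:
        return "STOP"
    return "HOLD"
```

Run on every PR, a non-empty result from `protected_changes` would fail the check regardless of which agent authored the change, which is what makes the boundary structural.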

The 5 steps · interactive

Click any step to see what happens inside it

Step 1 is the owner's only input. Step 5 is the system's only output. Steps 2–4 are autonomous: paired author + auditor agents drive every artefact to a ≥99/100 binary score, with an audit-loop that escalates on plateau / context overflow / iteration cap rather than silently shipping a mediocre result.
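The audit-loop with its three escalation conditions can be sketched as a bounded loop. Only the ≥99 threshold comes from the text; the iteration cap, context budget, plateau patience, and all names are assumed values for illustration.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASSED = "passed"
    PLATEAU = "escalate: plateau"
    CONTEXT_OVERFLOW = "escalate: context overflow"
    ITERATION_CAP = "escalate: iteration cap"


def audit_loop(
    run_round: Callable[[int], tuple[int, int]],  # round index -> (score, tokens used)
    threshold: int = 99,
    max_iterations: int = 5,        # assumed cap
    context_budget: int = 100_000,  # assumed token budget
    patience: int = 2,              # assumed rounds without improvement
) -> Outcome:
    best, stale, used = -1, 0, 0
    for i in range(max_iterations):
        score, tokens = run_round(i)
        used += tokens
        if score >= threshold:
            return Outcome.PASSED
        if used > context_budget:
            return Outcome.CONTEXT_OVERFLOW
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                return Outcome.PLATEAU
    return Outcome.ITERATION_CAP
```

Every exit path is either a pass or a named escalation, so the loop has no branch that ships an artefact below the bar.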

Discovery + architecture + design pairs

Spec · arch · design

Discovery agent frames the idea into FRs / ACs / success criteria. Architect picks stack, draws module boundaries, writes ADRs. Designer produces UX flows, UI visuals, and design tokens. Each step has a paired auditor scoring ≥99/100 binary; an audit-loop pushes the pair to that bar, or escalates on plateau / context overflow / iteration cap.

Output

spec.md · architecture.md · design.md · ADRs · design tokens

Why blocking, not warning

A misfiring autonomous chain in warn-only mode cascades through the agent dispatch and produces N broken specs in one batch before any owner intervention. Warn-only is structurally insufficient — batch granularity amplifies a single misfire by the batch size.

Opt-out path

Projects preferring supervised cascade can opt out via a documented marker: single-developer plugins, hobby projects, or any context where supervised execution is the right fit. Autonomy is opt-in; supervised cascade stays the default.

Re-evaluation trigger

The autonomy boundary is re-evaluated under one of two triggers — at least one documented production-affecting incident attributable to autonomous dispatch, OR three successful autonomous batch kickoffs without owner override.

Roadmap · public

Phases shipped — and what's next

Honest roadmap pinned to what's actually landed in master at plugin v0.72.0. All nine phases A → I shipped; Dark Factory (Phase I) is currently under live testing and adjustment across three parallel consuming projects. Each phase is a coherent batch of specs; each spec passes the full delivery ladder before merge.

All nine phases A → I shipped. Each phase is a coherent batch of specs; each spec passes the full delivery ladder before merge. Phase I — Dark Factory full activation is currently under live testing and adjustment across three parallel consuming projects. Hover or focus a cell for detail.

Legend: shipped · shipped + live testing

Phase I · shipped + live testing

Dark Factory full activation

Idea → ready-for-prod, autonomously. The owner hands over a business idea; the system runs the full SDLC (spec, arch, design, code, tests, audits, browser-driving QA, staging deploy) and declares production-ready. Currently under live testing and adjustment across three parallel consuming projects.

v0.72.0 · Constitution v1.13.0: 16 binary-tested principles · 55 closed-allowlist skip markers · 9 identity-drift fixtures replayed every PR · markdown + bash + Python self-gates only — no SaaS SDK lock-in.

Agents: 135 across 14 functional groups
Spec → deployed: 1–2 days
Author/auditor pairs: ~32 explicit symmetries
Layer 7 contracts: 27 continuous 24/7 (25 watchcats + 2 autonomy crons)
Production gate: Ship-to-Prod 81-position binary
Constitution: 16 binary-tested principles · 9 drift fixtures

Case 02

Process · AI practice rollout

AI Practice Rollout — foundation-first across 300+ engineers

Foundation-first AI practice rollout inside a 300+ engineer IT services firm. A small core architecture team (5 architects) carried a single thesis: the binding constraint on AI adoption is organizational, not technical. We built a foundation layer — AI policy, reference architectures, eval frameworks, prompt and agent libraries — and ran it through four pilot engagements across real-time systems, media and content workflows, data-heavy reporting, and an adjacent custom-delivery domain. Each pilot produced a characteristic velocity signature: a 2–3 week dip while engineers learned the flow, then a sustained uplift to ~2–2.5× pre-adoption throughput. Foundation patterns that generalized fed back into the layer; patterns that didn't stayed scoped to their pilot. By the end of the engagement, AI capabilities had landed across 20+ projects with measurable tier-up rates among the broader engineering population. The deliverable was not a tool — it was a way of running pilots that survives the architects leaving the room.

Diagram · 1 of 2

Adoption curve — dip, then uplift

The chart plots three pilots (A, B, C), each running 12 weeks. Velocity dips for 2–3 weeks as engineers learn the new flow, then crosses the baseline and stabilises at roughly 2–2.5× pre-adoption throughput. The foundation layer (policies, refs, eval frameworks) shortens the dip for every subsequent pilot.

[Chart: velocity (× baseline, axis 0.5× to 2.5×) against weeks since adoption start (W0 to W12). Pre-adoption baseline = 1.0. Annotated regions: dip (learning curve), sustained uplift.]
  • Pilot A · Real-time systems
  • Pilot B · Media & content workflows
  • Pilot C · Data-heavy reporting

Diagram · 2 of 2

Foundation-first practice architecture

Pilots feed the foundation layer; the foundation layer shortens the next pilot. Adoption metrics close the loop and decide whether the foundation pattern generalizes or stays pilot-specific.

[Diagram: three linked blocks, with a feedback edge labelled "what generalizes".]

Adoption metrics
  • Velocity uplift × baseline
  • Tier-up rate of engineers
  • Defect-escape rate / release
  • Time-to-first-PR (joiners)

Pilot engagements
  • Pilot A · Real-time systems
  • Pilot B · Media & content workflows
  • Pilot C · Data-heavy reporting
  • Pilot D · Adjacent custom-delivery

Foundation layer
  • AI policy & guardrails
  • Reference architectures
  • Eval frameworks
  • Prompt libraries
  • Cross-pilot learnings

Org size: 300+ engineers
Core team: 5 architects
Client pilots: 4 engagements
Project coverage: 20+ projects
Sustained uplift: ~2–2.5× baseline
Tier-up: ~50% of engineers (directional)

Case 03

Governance · Enterprise AI guardrails

Enterprise AI Guardrails — propose, never apply

Two-phase engagement inside a European mobility enterprise (5,000+ employees). Phase 1 was a Transportation Management System delivered from scratch on Java/Spring/Azure with a 30–40 person team — a multi-year program of typed APIs, event-driven domain modules, and CI/CD against a regulated production environment. Phase 2 was an AI guardrails framework for a broader subsystem (150–200 dev/test engineers) where the binding constraint was governance: any AI-driven change had to remain inside the internal perimeter, must never exfiltrate data, and must pass multi-criteria automated review plus a human approval gate before being applied. The framework runs a Python orchestrator that dispatches builder and validator agents in parallel; only patches that pass both an automated multi-criteria score and a human reviewer reach the Java ecosystem fix agents. The pattern is "propose, never apply" — agents produce diffs and rationales, humans accept the change. Coverage augmentation is automated against the same gate. The framework now operates as the standing AI integration pattern for the subsystem.
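The "propose, never apply" flow can be sketched with two agents run in parallel and a blocking gate in front of any downstream application. This is a minimal sketch under stated assumptions: the type names, the `run_gated` function, and the `min_score` value are hypothetical, and the agents and reviewer are passed in as plain callables rather than real LLM-backed components.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Proposal:
    diff: str
    rationale: str


@dataclass(frozen=True)
class Review:
    score: int
    flags: tuple[str, ...]


def run_gated(
    spec: str,
    builder: Callable[[str], Proposal],
    validator: Callable[[str], Review],
    human_approves: Callable[[Proposal, Review], bool],
    min_score: int = 80,  # assumed automated threshold
) -> Optional[Proposal]:
    """Return the proposal only if it clears both gates; never apply it here."""
    # Builder and validator work on the same spec in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        proposal_future = pool.submit(builder, spec)
        review_future = pool.submit(validator, spec)
        proposal, review = proposal_future.result(), review_future.result()
    # The automated score is checked first; the human gate is the final, blocking step.
    if review.score < min_score or not human_approves(proposal, review):
        return None  # rejected: log it, never apply
    return proposal  # approved: hand off to the downstream fix agents
```

The function returns a value and performs no writes, so applying a patch is always a separate step that only runs on an approved proposal.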

Diagram · 1 of 1

Approval-gated agentic workflow

Builder and validator agents run in parallel against the same spec. Both outputs converge on a human approval gate. Only approved patches reach the Java fix agents — the system proposes, never applies.

[Diagram: approval-gated workflow inside the internal perimeter (no data exfiltration). Pattern: propose, never apply.]

  • Spec / Issue (internal request) → Builder agent (Python orchestrator): propose
  • Spec / Issue (internal request) → Validator agent (multi-criteria review): review plan
  • Builder agent → Human approval gate: patch + rationale
  • Validator agent → Human approval gate: score + flags
  • Human approval gate (diff · justify · rollback; ★ blocking) → Java fix agents (Spring · Maven · JUnit): approved → apply
  • Human approval gate → Rejected → log (never applied): rejected

Enterprise scale: 5,000+ employees
Phase 1 team: 30–40 engineers
Phase 2 reach: 150–200 dev / test
Stack: Java · Spring · Azure · Python orchestrator
Pattern: Propose-only · human approval gate
Perimeter: Internal only · no exfiltration

Approach

Four principles I keep returning to

Build for production, not demos.

A demo is a shape; production is a contract. Most agent failures aren't model failures — they're the missing tests, the unhandled failure modes, the absent rollback paths, the silent skips. I treat agentic systems the way I treat any other distributed system: typed contracts, schema-validated boundaries, explicit error taxonomy, observability at every step. The interesting question isn't whether the LLM produced something plausible — it's whether the system stays correct when the LLM doesn't.
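One way to make "typed contracts, schema-validated boundaries, explicit error taxonomy" concrete is to parse every model response into a typed record at the boundary and raise a specific error class for each failure mode. The schema, class names, and function name below are illustrative assumptions, not a description of any particular system.

```python
import json
from dataclasses import dataclass


class AgentOutputError(Exception):
    """Base class for failures at the LLM boundary."""


class MalformedOutput(AgentOutputError):
    """The response was not valid JSON."""


class SchemaViolation(AgentOutputError):
    """The response parsed, but did not match the expected contract."""


@dataclass(frozen=True)
class Verdict:
    score: int
    gaps: tuple[str, ...]


def parse_verdict(raw: str) -> Verdict:
    """Validate a raw model response before anything downstream sees it."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MalformedOutput(str(exc)) from exc
    if not isinstance(data, dict):
        raise SchemaViolation("expected a JSON object")
    score, gaps = data.get("score"), data.get("gaps")
    if not isinstance(score, int) or isinstance(score, bool) or not 0 <= score <= 100:
        raise SchemaViolation("score must be an integer in 0..100")
    if not isinstance(gaps, list) or not all(isinstance(g, str) for g in gaps):
        raise SchemaViolation("gaps must be a list of strings")
    return Verdict(score, tuple(gaps))
```

Callers can then handle `MalformedOutput` (retry the call) differently from `SchemaViolation` (escalate), which keeps the system's behaviour defined when the model's output is not.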

Organizational change is the binding constraint, not technology.

Bringing AI into an engineering org of 300+ people, the bottleneck is rarely the model or the tooling — it's adoption mechanics. Engineers need a runway short enough to feel velocity uplift before they abandon the new flow. Managers need measurements that cut through hype. I plan rollouts as a sequence of low-risk pilots feeding back into a foundation layer (policies, refs, eval frameworks) so every subsequent project lands faster than the previous one.

Human-in-the-loop as architectural principle.

The agents that work in production are the ones that propose and never apply. Approval gates aren't a UX afterthought — they're a structural feature: every irreversible operation routes through a human review step with a diff, a justification, and a rollback path. This is non-negotiable for code, for infrastructure, for any system that touches customer data.

Fundamentals-first over license-first.

Picking the latest framework before the team understands the underlying patterns produces fragile systems. I push teams to learn the primitives — message protocols, tool-use semantics, eval design, prompt-engineering as a discipline — before locking into a vendor stack. Vendors change. Patterns persist.

Background

Twenty years, three arcs

  1. 2025 — present

    AI Solutions Architect

    Multi-agent SDLC platform (xteam — 135 agents, 14 functional groups, ~32 author+auditor pairs, 27 Layer 7 runtime contracts, Ship-to-Prod 81-position binary, Constitution v1.13.0 with 16 binary-tested principles; v0.72.0, Phase I — Dark Factory full activation in live testing). AI practice rollout across an IT services organization (300+ engineers, 4 client pilots). Enterprise guardrails framework for a European mobility company.

  2. 2018 — 2025

    Solution Architect → Senior Solution Architect

    Architecture leadership across distributed systems, SaaS platforms, and enterprise programs. Led delivery teams of 30–40 engineers; shipped production systems on Java/Spring with Azure and AWS. Engagements at multiple IT services firms across mobility, cloud voice, mortgage data automation, and information services domains.

  3. 2011 — 2018

    Java Backend Engineer → Tech Lead

    Java backend specialist across card banking systems, virtual betting platforms, and SaaS products. Moved from individual contributor through tech lead as systems and teams grew.

Selected affiliations

Multiple IT services firms · European mobility enterprise · Multiple fintech and SaaS clients