2026 Summer Studio · Flying Banana Productions
When the Prototype Is the Product
AI tools for prototype development — and what happens when the line between “a quick demo” and “the real thing” quietly disappears.
Stephen Palmer · Founder & Technical Director
- Warm open. Thank the Summer Studio. This is the “AI tools for prototype development” slot — but I’m going to take it somewhere more useful.
- Set expectation: this whole deck is itself a static HTML site, built with agents, hosted on Cloudflare — same toolkit I’ll describe. Meta proof, mention again at the end.
- Press S for presenter notes, T to start the timer. Target ~45 min.
I was asked to talk about no-code tools and landing pages.
I’ll cover that — but what I really want to share is what a year of building with agents taught me: the line between a prototype and a real product has gotten surprisingly thin.
- Acknowledge the assigned topic honestly — they expected a tools survey. I’ll deliver it (the field guide near the end), but the story earns it first.
- Keep it personal: this is what I observed in my own work this year, not a proclamation about the industry.
- Promise the payoff: by the end they’ll have a concrete framework for when to reach for a tool vs. when to (agent-)code.
Who’s talking
25 years shipping software.
- Former Director of Development at Unity Technologies — the world’s leading real-time 3D platform.
- 15+ years as a professional game developer: level design, programming, production.
- Shipped titles in the Half-Life, Halo, Brothers in Arms, Borderlands, and Call of Duty series.
- Led teams of 100+ with multi-million-dollar budgets and global publishing partners.
- Credibility, fast. The names do the work — don’t belabor. The point isn’t nostalgia, it’s contrast with the next slide.
- Land the throughline: I spent a career making big things with big teams. The interesting question now is how small a team can make something real.
The pivot
From teams of 100+ to a studio of one that ships like a team.
Flying Banana Productions — highly customized, cost-effective software for sports, entertainment & beyond, where deep domain knowledge is critical. Run solo, built with AI agents as the workforce.
- This is the emotional hinge of the talk. I didn’t scale down — I changed the leverage.
- FBP model: custom + subscription software, client work where domain depth matters. The “workforce” is agents; the human role changes (next act).
- Tie to audience: most of you are 1–3 people. That used to cap what you could build. It doesn’t anymore.
This past year taught me: the cost of building real software has collapsed.
So the old question — “prototype it cheap, or build it for real?” — stopped making sense for me. Increasingly, it’s the same motion. This talk is the story of four projects, and the method that grew out of them.
- State the thesis plainly and let it sit. This is the sentence they should remember.
- Preview the structure: 1 origin story → the method → 3 real builds → the founder framework. Set the map so they can follow.
Where it started · 2025
Knox Challenger
The official companion app for the ATP Challenger Tour professional tennis event in Knoxville — live scores, draws, schedules, tickets, and an on-site practice-court reservation system for players and tournament staff.

Shipped to the App Store & Google Play · live for the Nov 2025 tournament
- Real event, real stakes, real users showing up on tournament day — not a toy. That pressure is what forged the method.
- Two surfaces: public fan app + operations tooling for staff (practice-court booking). I built both, solo, in ~5 months.
Tournament week, by the numbers
- Native mobile + a live proxy onto the official ATP API + a user/ticketing backend — shipped to both app stores.
- Built alongside the people running it: an operations training manual for staff, a dozen dated build sessions, static-HTML venue displays.
- 297 practice-court bookings made through the app, “in lieu of texting Julie.”
- The post-tournament wish list — an SMS interface, email optional — became the seed of the next product.
- The stats come from the post-tournament recap I gave the committee — real adoption, not vanity numbers. “Texting Julie” gets a laugh and makes the point: software replaced a human bottleneck kindly.
- Emphasize the collaboration: I sat with the people running the courts; the training manual and 12 dated sessions are the receipts.
- Land the last tick hard — the recap’s “opportunities” list (SMS, email optional) is literally where PlayerDesk came from. Products beget products when you write down what hurt.
The lesson
Agents amplify your velocity… and they amplify your mistakes.
Going fast with AI is easy. Going fast without the wheels coming off is an engineering problem. Knox Challenger taught me I needed a method — not just a faster keyboard.
- This is the bridge into 5x. The naive version of “AI coding” is a pile of plausible code nobody can trust. I lived that risk on a project with a hard deadline.
- Transition line: “So I wrote the method down.”
How I work
5x-engineer
Moving pretty fast without breaking things.
the working method I distilled into a manifesto — and a CLI
- Frame as “here’s how I work” — one practitioner’s method, offered because it’s been working, not a doctrine. The discourse is full of performative massively-parallel codegen; my version is slower-looking and finishes.
- It’s two things: a written methodology (a manifesto) and an actual published CLI tool that operationalizes it.
The shift I felt
I stopped typing. I started bringing the taste.
- Agents turned out to be faster — and often better — than me at writing the code, the docs, the tests.
- What’s left is what they can’t do: taste, empathy, product judgment, architecture, quality control.
- I became the gate at every checkpoint — instead of the bottleneck at the keyboard.
- Most relatable idea for founders: your edge was never typing speed. It’s judgment. Agents make judgment the scarce resource — which is good news for domain experts.
- “Bring the taste” is the quotable line. Repeat it at the close.
My loop
1 · Design & build from plans
PRD/TDD first, decomposed into a doc graph agents can navigate. Then a phased plan with checklists — one branch and PR per phase, each sized to a clean context window.
2 · Two models, not one
An author agent writes; a different frontier model reviews it like a Staff Engineer. No self-review bias.
3 · Model-agnostic on purpose
No LLM vendor lock-in. The workflow treats models and harnesses as interchangeable parts — a new model can slot in the week it ships.
4 · Human-gated
The reviewer loops “not ready” until issues (P0/P1/P2) are closed. I approve, archive the artifacts, merge.
- Walk the loop once, concretely. Dual-model review is the trick I’d keep if I could only keep one — a model grading its own homework misses real bugs; a different frontier model catches them.
- Model-agnosticism is the flip side of the same coin: the frontier moves every few months, so the author and reviewer roles are sockets, not names. The 5x CLI ships provider plugins for multiple harnesses for exactly this reason.
- Artifacts accumulate (plans, dated reviews) — I mine my own history for recurring mistakes and feed them back in.
- Optional aside: the 5x repo builds itself with its own process — the commit log is author/review/fix loops all the way down.
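The loop above can be sketched in a few lines. This is a minimal illustration, not the 5x CLI's actual implementation: `run_phase`, `Issue`, `Review`, and the `max_rounds` cap are hypothetical names, and the author, reviewer, and human gate are passed in as plain callables so any model or harness could fill either socket.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Issue:
    priority: str  # "P0", "P1", or "P2"
    summary: str


@dataclass
class Review:
    issues: List[Issue] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        # "Not ready" until every open issue is closed.
        return not self.issues


def run_phase(
    author: Callable[[List[Issue]], str],
    reviewer: Callable[[str], Review],
    human_approves: Callable[[str], bool],
    max_rounds: int = 5,  # hypothetical safety cap, not from the talk
) -> Tuple[bool, List[Review]]:
    """Loop author -> reviewer until the review is clean, then ask the human."""
    history: List[Review] = []
    draft = author([])  # first pass: no feedback yet
    for _ in range(max_rounds):
        review = reviewer(draft)
        history.append(review)  # dated reviews would be archived here
        if review.ready:
            # A clean review is necessary, not sufficient: the human decides.
            return human_approves(draft), history
        draft = author(review.issues)  # fix pass, fed the open issues
    return False, history
```

Because the two roles are only function signatures, swapping in a new model is a change at the call site rather than in the loop.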
Guardrails I won’t skip
Because agents amplify mistakes, I stopped treating the boring fundamentals as optional.
They’re what let me move fast on purpose.
- Source control + one branch/PR per phase
- Automated deploy pipeline
- Containerized local testing
- Fast test suites + coverage gates
- Pre-commit / pre-push hooks
- PR validation before merge
- For a founder audience: this is the part people skip and regret. The guardrails are cheap to set up with agents and they’re what keep velocity from becoming chaos.
- Don’t over-teach. Just establish that “fast” rests on a foundation, not vibes.
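As a rough illustration of what a pre-push or PR-validation gate amounts to, here is a small sketch. The function names and the 0.80 coverage minimum are placeholders, not values from any of the projects described; a real setup would wire checks like these into the hook or CI system of your choice.

```python
from typing import Callable, Dict, List


def coverage_gate(covered: int, total: int, minimum: float = 0.80) -> bool:
    """True when line coverage meets the minimum (0.80 is a placeholder)."""
    if total == 0:
        return False  # no measured lines: treat as a failure, not a pass
    return covered / total >= minimum


def run_gates(gates: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every gate and return the names of the ones that failed."""
    failed = []
    for name, check in gates.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failed.append(name)
    return failed
```

A push or merge would proceed only when `run_gates(...)` returns an empty list.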
The number
Roughly a week’s worth of traditional output… per day.
“5x” is the midpoint of a measured, quality-adjusted range — pulled from a month of my own git history (commits, PRs, tests, migrations, reviews), not a benchmark. Deliberately not “10x.” The tool is published as a CLI — @5x-ai/5x-cli.
I fully expect the number — and the tooling — to be obsolete within the year. That’s half the fun.
- Be honest about the number: it’s a measured ~4–6x quality-adjusted throughput for one engineer on one stream, and I expect it to be obsolete within a year.
- The credibility move: I measured it from my own repos, not a benchmark. Now let me show you the proof — three real products.
Build #1 · the scale-up
PlayerDesk
A multi-tenant SaaS court-booking platform for tennis venues — book by SMS, AI chat, or web, with sub-100ms bookings and real-time schedule sync. Built from scratch with everything Knox Challenger taught me.

playerdesk-production.up.railway.app · live
- Frame the leap: Knox was one tournament. PlayerDesk is a platform meant to host many venues — strict per-tenant isolation, hard performance targets, multiple front doors.
- “Built on lessons learned from Knox Challenger” is literally in the design doc. The method compounds: project N starts where project N-1 ended.
- Note the deck itself just shifted into PlayerDesk’s colors — club greens. Say nothing yet; the meta slide pays it off.
The pilot partner
Knoxville Racquet Club — booking courts by paper and telephone since 1961.
- One of the oldest and largest tennis clubs in town — and PlayerDesk will be their first digital booking platform.
- We’re modernizing desk operations together: paper time-tracking retired, a 24-hour mobile court calendar for members, back-office systems connected for accurate time billing.
- Their operation is full of subtleties the design has to capture — the deep-domain work an off-the-shelf product would struggle to absorb.
- The collaboration theme, embodied: built in lock-step with a real venue, not spec’d in a vacuum. Sixty-five years of operating habits are the requirements doc.
- The hard part isn’t the booking grid — it’s the member types, billing rules, and court conventions that decide whether adoption succeeds. That’s exactly why KRC couldn’t just buy something off the shelf — and it’s the moat.
Ambition compounds
- Multi-tenant Postgres with exclusion constraints + serializable transactions, keeping bookings conflict-free inside the sub-100ms performance target.
- Four front doors: an SMS rules engine, an AI natural-language assistant (Claude + MCP), a web dashboard, and a venue-local desktop shell with auto-update.
- Real regulatory engineering — which is where the prototyping story gets interesting…
- Don’t drown them in architecture. The point of the stats is scale: this is a serious platform built by one person + agents over ~7 months.
- Tee up the next slide as the concrete “static HTML to clear a real-world gate” story.
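For readers unfamiliar with exclusion constraints, the invariant they are typically used for in a booking system is "no two bookings for the same court overlap in time." The sketch below shows that invariant in memory only. It is not PlayerDesk's schema: `Booking`, `conflicts`, and `try_book` are hypothetical names, and unlike a database constraint this version does nothing to protect against concurrent writers.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Booking:
    tenant_id: str
    court_id: str
    start: datetime
    end: datetime  # half-open interval: [start, end)


def conflicts(a: Booking, b: Booking) -> bool:
    """Two bookings conflict when they share a tenant and court and overlap."""
    same_court = a.tenant_id == b.tenant_id and a.court_id == b.court_id
    return same_court and a.start < b.end and b.start < a.end


def try_book(existing: List[Booking], new: Booking) -> bool:
    """Append the booking if it conflicts with nothing; report success."""
    if new.start >= new.end:
        raise ValueError("booking must end after it starts")
    if any(conflicts(new, b) for b in existing):
        return False
    existing.append(new)
    return True
```

Half-open intervals let back-to-back bookings (one ending at 10:00, the next starting at 10:00) coexist, and scoping the check by `tenant_id` keeps one venue's courts from colliding with another's.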
Static HTML in the wild
To send a single SMS, US carriers make you prove how users consent.
- We built dedicated 10DLC “evidence pages” — public HTML pages with real opt-in screenshots — mapped field-by-field to Twilio’s carrier-registration form.
- Two separate compliance programs, each with its own reviewer-facing page; SMS stayed feature-flagged off until carrier approval landed.
- That’s “build a landing page / basic functionality” — in service of a real business gate, generated fast.

/compliance/a2p-member-auth
- This is the first direct answer to the assigned topic. “Basic HTML pages” aren’t just marketing landing pages — sometimes a simple static page is the thing that unblocks your whole product (here, the ability to text customers legally).
- LLMs are great at generating exactly this kind of static, structured page quickly. Lean into that.
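A static, structured page of this kind is little more than a template over labeled fields. The sketch below is a generic illustration, not the actual evidence-page generator: `render_evidence_page` is a hypothetical helper, and the field labels in the usage example are placeholders rather than Twilio's real form fields.

```python
from html import escape
from typing import List, Tuple


def render_evidence_page(
    title: str,
    fields: List[Tuple[str, str]],  # (label, value) pairs, in form order
    screenshots: List[str],         # image paths or URLs
) -> str:
    """Render one self-contained HTML page: a field table plus screenshots."""
    rows = "\n".join(
        f"      <tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>"
        for label, value in fields
    )
    shots = "\n".join(
        f'    <img src="{escape(src, quote=True)}" alt="Opt-in screenshot {i}">'
        for i, src in enumerate(screenshots, start=1)
    )
    return f"""<!doctype html>
<html lang="en">
<head><meta charset="utf-8"><title>{escape(title)}</title></head>
<body>
  <h1>{escape(title)}</h1>
  <table>
{rows}
  </table>
{shots}
</body>
</html>
"""
```

For example, `render_evidence_page("Member SMS consent", [("How users opt in", "Checkbox on the signup form")], ["optin.png"])` returns a string that can be written to a file and served from any static host.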
Build #2 · new domain, new stack
CHIEF
Constraint-based Helper for Intelligent EM Forecasting — desktop software that generates optimized shift schedules for an Emergency-Medicine residency program, replacing a chief resident’s manual Excel grind.

Electron desktop app · local-first · constraint solver + LLM assistant
- Different everything: healthcare not sports, a desktop Electron app not a web SaaS, a constraint-programming solver at its core, local-first for HIPAA comfort.
- The point: the method is domain- and stack-agnostic. Same plan/review discipline, totally different technology.
Mockups before code
Agents drove Pencil through its MCP server — the customer approved the UI before a line of it was built.
9 screens authored in Pencil (chief.pen), agent-driven via its built-in MCP server · reviewed screen-by-screen by the Program Director · the shipped UI traces directly back to these frames
- Concrete tool answer #2 for founders: you can have agents produce real design mockups (not just code) that a non-technical stakeholder can react to. Cheap to iterate, expensive to skip.
- Honesty note for me: I have the chief.pen file (9 screens) and the app exposes its own MCP server; the “agents drove Pencil via its MCP server” claim rests on my own confirmation, and the live Pencil session is reconnected if I want to show it.
Domain knowledge can’t be faked
The scheduling truth arrived as years of hard-won wisdom — not as a spec.
- What the program handed me: a 13-block sample-schedule workbook, a master shift-requirements ruleset, a five-page explainer, and a plain-language “scheduling desires” document.
- Rules like “protect Wednesday didactics, 9:30–3:00” · “FM PGY-2 keeps Thursday/Friday 6a–2p — it matches clinic” · “Smithfield is overflow only: EM before TY before FM.”
- Hard constraints, conditional eligibility, soft preferences, a 13-step priority order — all of it institutional memory.
- Let the artifacts do the talking — this is what real domain depth looks like on paper. None of it is invent-able by an agent; all of it decides whether the schedule works.
- For founders: your unfair advantage is the domain truth you can gather that a model can’t. Go sit with the people who hold it.
Plain language in, solver constraints out
The old cost
Codifying this much operational nuance was traditionally an enormously expensive analysis-and-engineering project — the kind that kills a tool like this before it starts.
The new pipeline
LLMs capture, translate, and refine the hard and soft constraints from plain language — then distill them into data a cheap, deterministic, offline solver can process.
The surprise, for me: it’s more efficient to let agents write and refactor the constraints as code than to build all the abstractions a “purely data-driven” solver would need. A rule change arrives in plain English and turns around in hours — and someday the built-in assistant may morph its own constraint logic.
- This is the strongest technical idea in the talk — give it air. Data-as-code: the constraint layer stays as readable code that agents rewrite on demand, instead of a config schema you must over-engineer up front.
- The division of labor matters: LLM for capture/translation (expensive, fuzzy, occasional), deterministic solver for the actual schedule (cheap, exact, every day). Pay for intelligence once, run arithmetic forever.
- The closing thought is a live research direction — the embedded assistant already speaks MCP to the app; letting it propose constraint edits is the natural next step.
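To make "constraints as code" concrete, here is a minimal sketch, not CHIEF's actual constraint layer. It only evaluates a candidate schedule; it is not a solver. The Wednesday rule follows the didactics example quoted earlier, under the assumption that "9:30–3:00" means 09:30 to 15:00. `penalize_late_finish` is an invented soft preference, and `Shift`, `HARD`, `SOFT`, and `evaluate` are all hypothetical names.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Shift:
    resident: str
    weekday: str  # "Mon" .. "Sun"
    start: time
    end: time     # same-day shifts only, to keep the sketch small


def protects_wednesday_didactics(shift: Shift) -> bool:
    """Hard rule: no shift may overlap Wednesday 09:30-15:00 (assumed reading)."""
    if shift.weekday != "Wed":
        return True
    overlaps = shift.start < time(15, 0) and time(9, 30) < shift.end
    return not overlaps


def penalize_late_finish(shift: Shift) -> int:
    """Soft rule (invented for illustration): 1 point if the shift ends after 22:00."""
    return 1 if shift.end > time(22, 0) else 0


HARD: List[Callable[[Shift], bool]] = [protects_wednesday_didactics]
SOFT: List[Callable[[Shift], int]] = [penalize_late_finish]


def evaluate(schedule: List[Shift]) -> Tuple[List[Tuple[str, Shift]], int]:
    """Return (hard-rule violations, total soft penalty) for a schedule."""
    violations = [
        (rule.__name__, s) for s in schedule for rule in HARD if not rule(s)
    ]
    penalty = sum(rule(s) for s in schedule for rule in SOFT)
    return violations, penalty
```

Each rule is a short, named function, so a plain-English rule change maps to editing or adding one function and one list entry, which is the kind of edit an agent can make and a human can review.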
Build #3 · the pure prototype
Deed Lineage
A title archive for the Scarlett Oaks tract in Blount County, TN — reconstructing the chain of title across 70 years of land ownership, as a searchable, mappable static site.

Static HTML · hosted on Cloudflare Pages
- This is the most on-topic build for “rapid prototyping.” A proof-of-concept I deliberately built leaning all the way into Claude Code’s newest features.
- Set up the contrast: no big architecture, no SaaS — just static HTML, basic hosting, and a week.
The exception that proves the rule
No customer to iterate with — so I became the domain expert.
- Hours in the county deed registrar’s office, pulling 80+ physical deed records by hand.
- Each scan paired with machine-readable metadata: parties, dates, book/page, acreage, metes-and-bounds, prior-deed links.
- An agent can read a deed. It can’t drive to the courthouse and find the right one.

92 deeds · 1956 → present
- Even the “pure AI prototype” was front-loaded with deeply human research. That’s the consistent lesson across all four projects.
- Make it visceral: the bottleneck wasn’t code, it was 80 manila folders in a basement.
Day one · the riskiest question
Can an LLM turn 1950s metes-and-bounds prose into real geometry?
- Day 1 was a spike — a small desktop tool built to answer exactly that question. It could.
- Only then did the site happen: a presentation-only layer over the data that evolved into the full demo — plain static HTML on Cloudflare, iterated over about a week.
- Agents did the fuzzy work — vision and parsing, via custom skills and parallel workflows; deterministic scripts did the geometry; I approved at named gates.

the day-1 spike, grown into a georeferencing editor · 89 deeds, 54 parcel nodes placed
- The founder lesson: spend day one on the question that kills the project if the answer is no. Everything else — the site, the polish — was downstream of that yes.
- Division of labor again: LLM judgment for reading deeds, deterministic code for geometry and closure checks, human approval at named gates.
- “Static HTML + basic hosting, iterated into a nice demo in a week” — the assigned-topic promise, delivered literally.
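The deterministic half of that division of labor can be sketched briefly. This assumes the deed's calls have already been extracted into structured tuples (the fuzzy step), with bearings in decimal degrees and distances in one consistent unit. The function names are hypothetical and this is not the project's actual geometry script.

```python
import math
from typing import List, Tuple

# One call: (N or S, angle in decimal degrees, E or W, distance)
Call = Tuple[str, float, str, float]
Point = Tuple[float, float]


def azimuth_degrees(ns: str, angle: float, ew: str) -> float:
    """Convert a quadrant bearing (e.g. N 45 E) to azimuth clockwise from north."""
    if ns not in ("N", "S") or ew not in ("E", "W"):
        raise ValueError(f"bad bearing: {ns} {angle} {ew}")
    if ns == "N":
        return angle if ew == "E" else 360.0 - angle
    return 180.0 - angle if ew == "E" else 180.0 + angle


def traverse(calls: List[Call], origin: Point = (0.0, 0.0)) -> List[Point]:
    """Walk the calls from the point of beginning, returning every corner."""
    x, y = origin
    points = [origin]
    for ns, angle, ew, dist in calls:
        az = math.radians(azimuth_degrees(ns, angle, ew))
        x += dist * math.sin(az)  # east component
        y += dist * math.cos(az)  # north component
        points.append((x, y))
    return points


def closure_error(points: List[Point]) -> float:
    """Distance between the last corner and the point of beginning."""
    (x0, y0), (xn, yn) = points[0], points[-1]
    return math.hypot(xn - x0, yn - y0)
```

A description that returns to its starting point should have a closure error near zero; a large value is a cheap signal that a call was misread or omitted and the deed needs a human look.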
This presentation is a static HTML site, built with agents, hosted on Cloudflare.
Same toolkit as Deed Lineage. No PowerPoint. You’re looking at the prototype-that’s-the-product right now.
it even re-skinned itself to match each product as we went — a few CSS variables per act
- The wink. Pause here, let them realize the deck itself is an example. It earns the thesis without me asserting it.
- Pay off the theming now: point out the deck changed palettes for PlayerDesk (club green), CHIEF (clinical navy), and Deed Lineage (vellum). Each shift is a handful of CSS variables — the kind of design agility you get when your presentation is HTML an agent can edit.
- Optional: hit ‘O’ for the overview grid to show it’s a real little app, then carry on.
What it all adds up to · 1
Agents build fast. The moat is still human.
Every one of these four projects lived or died on domain knowledge and real-user iteration — tournament staff, a program director, a courthouse basement. The code was never the hard part.
- Synthesize across all four. If agents commoditize code, then whatever isn’t code — domain depth, taste, relationships — is where your value concentrates.
- For this audience specifically: that’s great news. Your domain insight is the asset; agents are how you express it.
What it all adds up to · 2
Plain static HTML is the most underrated prototyping tool you have.
PlayerDesk
Evidence pages that cleared a carrier compliance gate.
CHIEF
Agent-driven mockups a customer could approve before code.
Deed Lineage
A whole product demo — static, hosted, polished — in a week.
This deck
The talk you’re watching.
- LLMs are exceptionally good at generating static/basic HTML. It renders anywhere, hosts for free, and it’s the fastest way to put something real in front of a human.
- This is the bridge into the explicit “tools” section they were promised.
The part you were promised
When to code vs. use a tool
A practical framework for an early founder — most of you with a team of one to three.
- Signal the turn: now I pay off the assigned topic directly and practically.
- Keep it actionable — they should be able to apply this Monday.
Reach for a tool / no-code when…
- You’re validating — pre-product-market-fit, testing demand.
- It’s a landing page, form, or waitlist — the page is the deliverable.
- The work is throwaway or a one-off.
- An off-the-shelf tool is your product surface (Stripe, Airtable, Typeform).
- You need it today and nobody will maintain it.
Reach for (agent) code when…
- You have a real domain edge the tool can’t express.
- You must own the data & logic (compliance, IP, lock-in risk).
- Performance, scale, or regulation are real constraints.
- The prototype is going to become the product.
- You’ll integrate deeply with other systems over time.
- Be even-handed. No-code tools are genuinely the right call for a lot of early validation. Don’t let my story read as “always build.”
- The dividing line is: throwaway vs. foundational, and generic vs. domain-specific.
A field guide · summer 2026
| You need… | Reach for | Worth knowing |
|---|---|---|
| A marketing site or landing page | Framer · Webflow | Framer for speed and polish; Webflow when CMS or e-commerce control matters |
| A working app from a prompt | Lovable · Bolt · v0 | They generate real React/TypeScript you own — export to GitHub, graduate it later |
| A complex web app, no code | Bubble | The most capable classic builder — real learning curve, and the app isn’t portable |
| Internal tools on spreadsheets | Glide · Softr | Glide sits on Sheets; Softr turns Airtable into member portals |
| To sell things | Shopify · Square | Commerce is a solved problem — don’t rebuild it |
| Glue between systems | Zapier · Airtable · Notion | The generic 80% of ops — automate before you build |
- Map it to the room: food & bev ventures (Big Al’s, ZeroCross) live in the Shopify/Square row. AI-service startups (Lunara, Vetra, AdviseAI) — prompt-to-app plus APIs. Marketplace ideas (JobWillow) — Softr or Bubble to validate. Media/creative (SherSpinz, Cameron) — Framer. Everyone — the glue row.
- The interesting 2026 shift: the prompt-to-app lane now exports code you own, so the old no-code dead-end tradeoff is softening. Same force as my thesis, arriving in tool form.
…but the line moved
Agents made real code cheap enough to prototype in.
The old reason to pick a no-code tool was speed. When you can stand up real, owned code in an afternoon, you can often get the speed and keep the foundation. Start with static HTML to prove the idea — then graduate the same project into the real stack, with the same workflow.
- The synthesis of the whole talk. You no longer have to trade “fast now” against “real later.” That’s the thesis, restated as practical advice.
- Caveat honestly: this requires the discipline from Act 3, or you just get fast garbage. The method is what makes the shortcut safe.
Monday morning · a starter toolkit
The agents you already have
- Claude, ChatGPT, Codex — the apps, used like a power user: attach your real documents, ask for working static HTML, iterate out loud.
- Ready to own a repo? Any agentic coding tool — Claude Code, Codex CLI, Cursor… the method transfers.
The rest of the kit
- Static HTML + Cloudflare Pages — the fastest real artifact you can make, hosted free.
- Pencil — mockups a customer can approve before you build; no-code for the generic 80%.
- A second model as reviewer — don’t ship work one agent graded itself.
…and the one that matters most: go get the domain knowledge a model can’t.
- Make this the screenshot-able “save this slide” moment. Deliberately vendor-neutral: most of this room will meet agents through the Claude/ChatGPT/Codex apps, not a terminal — meet them there.
- End the section on the human note — the toolkit is necessary but the domain insight is the differentiator.
If you remember three things
- The cost of real software collapsed — prototype and product are one motion now.
- Speed without discipline is a trap — a method (and dual-model review) is what keeps the wheels on.
- Agents commoditize code, so your domain knowledge and taste are the moat.
- Slow down. Three clean takeaways, then the sign-off. Don’t add new material here.
You bring the taste.
Stephen Palmer · Flying Banana Productions
flyingbananaproductions.com
Questions →
- Callback to “bring the taste.” Thank them, invite questions, share contact. Leave this slide up during Q&A.
- Point at the QR — the deck itself is hosted at summer-studio-2026.flyingbananaproductions.com (Cloudflare, same as Deed Lineage). Scanning it is the last proof of the thesis: they’re taking home the static-HTML artifact.
