
Building the venue before the model showdown: a model catalog to pick the contender, and the groundwork for a verified Milwaukee inventory.
There are an absurd number of AI models out there, with more landing every week, and the question that actually matters (which one can I use for this?) is buried under marketing and leaderboards that all measure slightly different things. So I built a catalog to cut through it. Describe what you need in plain English, say, "an open-weight model good at legal summarization," and it returns only models that are actually in the catalog, ranked, with the reason each one matched. Never an invented model, never a made-up spec. And you can still browse and compare them directly by the attributes that decide whether one fits a real project.
Each entry lays out the things you'd otherwise have to dig for: who made it, how big it is and what that implies for running it, how much context it can hold, whether the weights are open or closed, what the license allows commercially, and a plain-language summary of what it's actually good at, along with a short editorial take and comparable models. Technical terms in the writeups are wrapped with plain-English definitions you can hover for, so the explanation comes to you instead of sending you off to look it up.
It's an early version and I'll keep filling and refining it, but it's live and usable now. Regular readers will recognize the bigger reason it exists: this is the tool I'll use to pick the model that goes into the Milwaukee assistant build (the contender, in this week's Building Intelligence terms). Have a look and tell me what's missing.
This series has been pointing at a showdown for months now: a small, purpose-built Milwaukee tech assistant going up against the big general-purpose models with web search, judged head-to-head on a fixed rubric. That's the whole bet: that something narrow and carefully fed can beat something vast and generic on its home turf. This week was not the fight. It was building the place the fight happens. You don't stage a contest without a venue, and the venue didn't exist yet.
So I spent the week on the website. hardais.com is where this model will eventually live (not a slide about a model, but an actual thing you can query), and a fair showdown needs three things that weren't there a week ago: a contender to put in the ring, a ground truth to judge its answers against, and a place to hold the whole thing. I made progress on the first two.
The contender first. You can't put "an AI" in a ring; you have to pick a specific model, and there are a staggering number to pick from, with new ones landing weekly. So I built a model catalog: a search-and-compare feature for narrowing the field down to a model I can actually work with. That last part matters, because "best in general" and "best for what I'm doing" are different questions. I need one I can run, shape, and afford, not just one that tops a leaderboard. It launched this week and it's in the Announcements above; consider it the tool I'll use to choose the fighter.
The ground truth is the Milwaukee inventory, the database the assistant will draw from. Last week's edition was about the hardest part of that: deciding what doesn't go in, and building a schema that enforces the discipline instead of leaving it in my head. With that structure finally locked, this week it became easy to build a friendly way to actually fill it, and that turned out to be a small lesson in its own right, which is this week's Under the Hood. The short version: getting the rigorous part right first is exactly what made the easy part easy.
But "fill it" doesn't mean scrape the web and dump it in. The entire premise is that nothing goes in unverified, and a lot of what makes a community real isn't published anywhere; it lives in the heads of the people running it. So this week I put correspondence out to several players in the Milwaukee tech scene, asking something more specific than "can I list you": would they consent to being a source, someone I can point to when I claim a fact is true? That's why the database carries consent flags on people and a paper trail on every source: a ground truth made of real Milwaukee folks who said yes is a very different thing from a list I assembled by guessing. The letters are out. I'm waiting to hear back, and I won't pretend the inbox is full yet.
So that's the honest state of things: the arena is going up, but the bell hasn't rung. The contender-selection tool is live, the ground truth has a structure and its first real outreach, and the place it all lives is taking shape. The actual test (the model against the baselines, scored) only means anything once the venue is real, and the venue gets built before the fight, not during it. Next week tells me whether the inventory starts filling with real, consented, sourced organizations. Until then: the stage, not the show.
Building Intelligence this week made a claim in passing: that getting the rigorous part right first is what made the easy part easy. This is the easy part. With the inventory's structure finally locked, I needed a way to actually put organizations into it, and I let an AI build that for me in an afternoon. The interesting thing isn't that I did it fast. It's why doing it fast and loose was a safe choice rather than a reckless one.
Start with how data actually gets into a database. The commands that talk to a database come in two flavors, and both have names worth knowing. DDL (Data Definition Language) is the set of commands that define the structure: "create a table called organizations, give it these columns." That's the work I did last week building the schema. DML (Data Manipulation Language) is the set that handles the contents: "insert a row, put this name here, this website here." Filling the inventory is a DML job. The old-fashioned way to do it is to hand-type those DML commands one organization at a time, which works, but it's tedious and unforgiving: one fumbled line and you've quietly entered a broken or half-filled row. The friendlier way is a GUI, a graphical user interface, which just means a screen with labeled boxes and dropdown menus where you fill in a form, click Save, and something else writes the DML for you. I wanted the form. So I vibe-coded it: I described what I wanted in plain English to an AI, it wrote the code, and I shaped it by reaction ("make that a dropdown," "move that field") instead of writing a line of it myself.
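To make the two flavors concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are illustrative assumptions, not the actual inventory schema, and an in-memory SQLite database stands in for whatever database the real inventory runs on.

```python
import sqlite3

# An in-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# DDL: define the structure. Table and column names are hypothetical.
conn.execute("""
    CREATE TABLE organizations (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        website TEXT
    )
""")

# DML: put contents into that structure. This is the statement a form
# would write on your behalf when you click Save.
conn.execute(
    "INSERT INTO organizations (name, website) VALUES (?, ?)",
    ("Example Org", "https://example.org"),
)
conn.commit()

rows = conn.execute("SELECT name, website FROM organizations").fetchall()
print(rows)
```

The CREATE TABLE statement runs once, when the schema is designed; the INSERT runs every time an organization is added, which is why it is the part worth hiding behind a form.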
Here's why that should make you nervous, in general. Vibe coding produces something that looks right very quickly, and "looks right" is exactly the trap: an AI will confidently generate code that does something subtly wrong, and you may not notice until the damage is done. Pointing that loosely built tool straight at the database that's supposed to be my trustworthy source of truth sounds like a great way to fill it with quiet garbage.
It isn't, and the reason is last week's work. The rigor doesn't live in the form; it lives in the database underneath it. All those rules I built into the schema (a field that can't be left blank, an entry that has to point at a real cited source, a switch that defaults to "no" until a human says otherwise) are enforced by the database itself, no matter what hands it the data. The form is just a messenger. If the vibe-coded GUI tries to save something that breaks one of those rules, the database refuses it and hands back an error. The guardrails are in the foundation, so the convenience layer bolted on top is allowed to be casual: it physically cannot write a row the structure forbids.
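Here is a rough sketch of what "the database refuses it" can look like, again with Python's sqlite3 module. Everything named here (the sources and organizations tables, the approved column, the try_save helper) is a hypothetical stand-in for the real schema and form; the point is only that the three kinds of rule described above map onto ordinary constraints: NOT NULL, a foreign key, and a DEFAULT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE sources (
        id  INTEGER PRIMARY KEY,
        url TEXT NOT NULL
    );
    CREATE TABLE organizations (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,                              -- cannot be left blank
        source_id INTEGER NOT NULL REFERENCES sources(id),    -- must cite a real source
        approved  INTEGER NOT NULL DEFAULT 0                  -- "no" until a human flips it
    );
""")

def try_save(name, source_id):
    """Play the part of the form: hand the database a row and report its answer."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO organizations (name, source_id) VALUES (?, ?)",
                (name, source_id),
            )
        return "saved"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

with conn:
    conn.execute("INSERT INTO sources (url) VALUES (?)", ("https://example.org/about",))

results = [
    try_save(None, 1),            # blank required field
    try_save("Example Org", 99),  # points at a source that does not exist
    try_save("Example Org", 1),   # follows every rule
]
approved = conn.execute(
    "SELECT approved FROM organizations WHERE name = ?", ("Example Org",)
).fetchone()[0]

for line in results:
    print(line)
print("approved flag on the saved row:", approved)
```

Note that try_save contains no validation logic of its own. The first two attempts fail because the database raises an integrity error, which is the property that lets the form stay casual.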
And the structure didn't just permit the form; it shaped it into something that nudges me toward clean data by default. Because the database already defines the fixed set of, say, allowed source types, the GUI can read that list and turn it into a dropdown automatically: I'm picking from known options, not free-typing "website" one day and "web site" the next and creating two things where there's one. The form also puts the "raw, as-found" description and the "human-approved" description in two separate boxes, so the act of curating is built into the act of entering. The form is good not because the AI is clever, but because the schema gave it a clean shape to fill.
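One common way to get a dropdown "for free" is to store the allowed values in a small lookup table and have the form query it. The sketch below assumes that design; the source_types table, its three values, and the dropdown_options helper are all made up for illustration, and the real schema may define its allowed types differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the reference below

conn.executescript("""
    -- Hypothetical lookup table: the one place allowed source types are defined.
    CREATE TABLE source_types (name TEXT PRIMARY KEY);
    INSERT INTO source_types (name) VALUES ('interview'), ('website'), ('public_record');

    CREATE TABLE sources (
        id          INTEGER PRIMARY KEY,
        source_type TEXT NOT NULL REFERENCES source_types(name)
    );
""")

def dropdown_options():
    """Return the choices a form would show, read straight from the schema's list."""
    rows = conn.execute("SELECT name FROM source_types ORDER BY name").fetchall()
    return [name for (name,) in rows]

options = dropdown_options()
print(options)

# Even if something bypassed the dropdown and free-typed a variant spelling,
# the foreign key would stop it from being stored.
try:
    with conn:
        conn.execute("INSERT INTO sources (source_type) VALUES (?)", ("web site",))
    typo_saved = True
except sqlite3.IntegrityError:
    typo_saved = False
print("variant spelling saved:", typo_saved)
```

Because the form and the constraint both read from the same table, adding a new source type is a one-row change, and the dropdown picks it up without anyone touching the form's code.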
So here's the lesson, and it cuts against how vibe coding usually gets sold. It's pitched as a way to skip the hard part. What actually happened is the opposite: the hard part is the only reason the shortcut was safe. I did the slow, careful structural thinking where it counted (the schema, the rules, the defaults), and that's precisely what earned me the right to be fast and loose on the layer where it didn't. Schema before code, I've said before. This is the "before code" part paying off: the front door's built, it can't let anything ugly through, and now the only thing left is to walk the real organizations in.
"Curiosity is the engine of achievement."
- Ken Robinson. Ken Robinson was a British author, speaker, and international advisor on education in the arts to government, non-profits, education, and arts bodies. He is best known for his work on promoting creativity and innovation in education. Robinson was a professor emeritus at the University of Warwick in the UK and was knighted for his contributions to the arts.