AI Newsletter 4_18_26


Caleb Bryant

📬 Getting this in Promotions? Move it to Primary so you never miss an edition.
IT HootClub Owl

IT HootClub — AI Community Newsletter

Hands-on. Career-focused. Future-ready.
Issued 2026-04-18

AI's Impact: App Surge and Security Concerns

Explore AI's dual role in revitalizing app launches and compromising data security, plus learn about Diffusion Models and using Anchored Prompts effectively.

Announcements

Transparency — How This Newsletter Is Built Each Week

Following through on the commitment made two weeks ago — accountability for AI-assisted work isn't just a statement, it's a practice. Here's how this newsletter is actually architected, in plain terms, and what's happening behind the scenes this week.

The newsletter has four distinct AI-driven roles, and I want each one disclosed explicitly so you always know what produced what:

  • The compilation model — a single AI model that generates the core pipeline content: the curated news stories, the Learning Loop items, and the edition's title and description. This is the model "writing" most of what you see outside of the marked feature sections.
  • The Lift-Off quote — generated by one or more AI models and labeled per model when more than one is enabled.
  • The Nest Jest — the joke and its cartoon, generated by one or more AI models and labeled per model when more than one is enabled.
  • The Fireside Chat — the same prompt sent to multiple AI models for side-by-side comparison, always labeled per model.

For this edition:

  • Compilation model: OpenAI GPT
  • Lift-Off quote: OpenAI GPT
  • Nest Jest: OpenAI GPT
  • Fireside Chat: Llama 3, Qwen 3, and Mistral 3 (all running locally on my laptop) — a deliberate departure from the usual cloud-model comparison. See the Fireside section and this week's Under the Hood for why.

I reserve all editorial judgment — topic selection, content direction, Building Intelligence drafting, Under the Hood writing, and the final review on every section. The AI models are doing the work I ask them to do in the roles I've defined. The editor is me.

Building Intelligence

BUILDING INTELLIGENCE · WEEK 2

What's Inside an 8B Model? — Parameters, Base vs Instruct, and Why Size Isn't the Whole Story

This week's Under the Hood shows you my three locally-installed models as a table — 8.0B, 8.2B, 8.9B parameters; different context lengths; different capabilities. The table is useful, but it raises a question: what do those numbers actually mean, and why would anyone care? That's what this section is for.

What a "Parameter" Is

When you see a model described as "7B" or "8B" or "70B," that number refers to parameters — individual numerical values stored inside the model. Think of each parameter as a small dial. A model labeled 8B has roughly 8 billion dials. During training, the process turns those dials — very slightly, very carefully — until the model's outputs better match the patterns in its training data. Once training is done, those dials are locked in place. The exact configuration of all 8 billion dials is what the model knows.

More parameters generally means more capacity — the model can represent more nuanced relationships, handle more varied inputs, and produce more sophisticated outputs. The cost scales with the count: more parameters means more memory to load the model, more compute to run it, and dramatically more compute to train or fine-tune it (a back-of-the-envelope memory calculation follows the list below). Rough size classes for context:

  • 3B–8B — runs on modest hardware, fast to fine-tune, suitable for focused domain work. This is where I'm operating.
  • 13B–30B — requires more VRAM or multi-GPU setups. Meaningfully better general reasoning.
  • 70B+ — research and enterprise scale. Fine-tuning requires serious infrastructure or substantial cloud compute budgets.
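
To make the memory cost concrete, here is that back-of-the-envelope calculation as a minimal Python sketch. The bytes-per-parameter figures are standard approximations, not measurements of any specific model, and real runtimes add overhead for activations and caches:

# Rough memory needed just to hold a model's weights, by precision.
# Standard approximations; runtimes add overhead beyond these figures.
PARAMS = 8_000_000_000  # an "8B" model

bytes_per_param = {
    "fp32 (full precision)": 4.0,
    "fp16 (half precision)": 2.0,
    "int4 (4-bit quantized)": 0.5,  # roughly what local runtimes ship
}

for precision, nbytes in bytes_per_param.items():
    gib = PARAMS * nbytes / (1024 ** 3)
    print(f"{precision:24s} ~{gib:.1f} GiB")

That works out to roughly 30 GiB at full precision, 15 GiB at half, and under 4 GiB quantized to 4 bits, which is why the 5–6 GB on-disk figure quoted later in this edition is plausible for an 8B model.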

A question worth sitting with: if larger models are more capable, why not just fine-tune a 70B model for this Milwaukee project? Wouldn't it learn the same specific knowledge on top of its broader capabilities and just be better at everything?

The practical answer is that fine-tuning a 70B model requires hardware most practitioners don't have access to, and the cost makes experimentation impractical. The more interesting answer is that whether a larger general model beats a smaller focused one on a specific domain comes down to three things: data quality, training approach, and question specificity — not raw parameter count. That's an opportunity, not a limitation. A well-constructed dataset, a thoughtful fine-tuning approach, and clearly scoped evaluation questions can produce results that compete with — and sometimes outperform — much larger general-purpose models on narrow tasks. This is where real practitioners still have room to stand out in an industry often dominated by three major providers.

Base Model vs Instruct Model

There's another distinction that matters as much as size, and it's not visible in the parameter count. The difference is between a base model and an instruct model.

A base model completes text. Feed it "The capital of Wisconsin is" and it continues with "Madison." It was trained on massive quantities of text from the internet, and its fundamental skill is predicting what word comes next. If you ask a base model a question in natural language, it may answer — but it may also just continue your question as if you were in the middle of writing an essay. It has no concept of "turn-taking" in a conversation.

An instruct model has been further trained — usually after the initial base training — on examples of human instructions paired with good responses. This second training phase teaches the model to follow directions, answer questions, and hold a conversation in a structured back-and-forth way. Every major AI assistant you've interacted with — ChatGPT, Claude, Gemini — is an instruct model under the hood.

For a fine-tuning project like the one this series is building toward, we want an instruct model as our starting point. We're teaching it new facts about Milwaukee, not teaching it how to have a conversation from scratch. Starting from an instruct model means the conversational scaffolding already exists, and fine-tuning just needs to add the specific knowledge we care about.
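
To make "adding specific knowledge" concrete, here is a minimal sketch of what instruction-tuning data tends to look like. The Milwaukee pairs below are hypothetical stand-ins written for illustration, not the project's actual dataset, and the exact format varies by training framework:

import json

# Hypothetical instruction/response pairs for a Milwaukee-focused
# fine-tune. Illustrative stand-ins only.
pairs = [
    {"instruction": "What river runs through downtown Milwaukee?",
     "response": "The Milwaukee River."},
    {"instruction": "Where is the Harley-Davidson Museum located?",
     "response": "In Milwaukee's Menomonee Valley, southwest of downtown."},
]

# Many fine-tuning frameworks accept JSON Lines: one pair per line.
with open("milwaukee_pairs.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")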

How do you tell whether a model is base or instruct? Sometimes the name makes it obvious — "Llama 3 8B Instruct" versus plain "Llama 3 8B." Sometimes you have to look at the metadata. Running ollama show on my three models reveals stop tokens like <|start_header_id|> and <|im_start|> — special markers used to indicate conversation turns. Base models have no need for those tokens. Their presence is the fingerprint of an instruct model. All three of my local models are instruct-tuned. That's what we want.
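
That fingerprint check can be automated. Here is a minimal sketch against Ollama's local REST API, which returns the same metadata as ollama show. It assumes the Ollama server is running on its default port, and the response field names have shifted between Ollama versions, so treat the keys as approximate:

import requests

# Fetch a model's metadata from the local Ollama server
# (the same information `ollama show` prints).
resp = requests.post(
    "http://localhost:11434/api/show",
    json={"model": "llama3:latest"},
    timeout=30,
)
info = resp.json()

# Turn-marker stop tokens and chat templates are the fingerprint of an
# instruct model; a base model has no need for them.
blob = str(info.get("template", "")) + str(info.get("parameters", ""))
markers = ["<|start_header_id|>", "<|im_start|>", "[INST]"]
found = [m for m in markers if m in blob]
print("instruct-tuned" if found else "likely a base model", found)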

What to Take From This

The three models on my laptop are all within a few hundred million parameters of each other — but their context lengths range from 8,000 tokens to 262,000, and their licenses, capability profiles, and underlying architectures all differ. The parameter count is one variable. It's not the only one, and depending on the project, it may not even be the most important one.

Next week: with the concepts in hand, we look at each of the three candidates honestly, weigh them against each other, and commit to one.


Under the Hood

Last week I said the next step was picking a model. What happened instead is that I took honest stock of what I'm already working with, and found the answer to "what model?" was already on my hard drive — along with a few surprises about what I thought I knew.

Before getting into this, a few terms are worth defining clearly because they come up repeatedly. Inference is the act of running a trained model to get an answer — you give it a prompt, it produces a response. That's different from training, where the model is being taught by example and its internal values are being adjusted. Everything in this edition is about inference; training comes in a later week, and it's where the real hardware constraints land. GPU stands for Graphics Processing Unit — a specialized chip originally built for rendering graphics, now the workhorse for AI because the math involved is highly parallel. VRAM is Video RAM — memory that lives on the GPU itself, much faster than regular system RAM, and the main constraint on whether a given model will run at all.

The Hardware I'm Working With

Here are the specs: Samsung laptop running Windows 11, 11th-generation Intel i7-1165G7 processor, 16 GB of RAM, 1 TB of internal storage with about 557 GB free, and Intel Iris Xe integrated graphics with 128 MB of dedicated video memory. That sits squarely in the middle of what most students and early-career IT professionals are actually using. Not a gaming rig. Not a workstation. Just a capable everyday laptop built on 2021-era hardware.

The line that matters most for this project is the graphics. Intel Iris Xe is integrated graphics — the GPU shares a chip with the main processor and pulls from system RAM rather than having dedicated video memory of its own. The 128 MB listed is the minimum reservation. A dedicated GPU is a separate physical card with its own VRAM — typical AI-capable cards have 8, 16, 24, or 40+ GB of VRAM specifically built for the massively parallel math that language models require. Integrated graphics are functional for everyday computing and light 3D work. They are not what AI inference wants to be run on.

The Storage Solution

Having 557 GB free sounds like plenty until you realize a single 8-billion-parameter language model takes 5–6 GB, training data can run into the tens of gigabytes, and every tool in the pipeline (Python environments, model caches, working files, git history) adds up. The AI stack eats disk fast, and I didn't want to fill my C: drive with it.

My solution was an external SSD. Specifically, a Samsung T7 Shield 2TB portable SSD, which I bought from Amazon for about $250. USB 3.2 Gen 2, roughly 1 GB/sec transfer speeds, IP65-rated against water and dust — physically tough enough to travel in a bag without worry. On that drive I created a folder called AI_Env, and everything AI-related lives there: Python, VS Code, Git, Ollama, the model weights, and the project work-in-progress.
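
For a rough picture, the layout on the drive looks something like this. The folder names besides AI_Env are illustrative, not a convention anyone else needs to follow:

AI_Env/
├── python/     (portable Python environments)
├── vscode/     (VS Code)
├── git/        (Git and local repositories)
├── ollama/     (runtime plus downloaded model weights)
└── projects/   (work-in-progress, including this series)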

Important to be clear about what this fixes and what it doesn't. The external drive solves the storage problem — I now have effectively 2.5 TB of working space instead of 557 GB. It does not solve the GPU problem. No amount of external storage gives me VRAM. For inference that's workable. For training, I'll still need the lab. The bonus is portability — this setup travels with me to any machine, which matters for a student who works from more than one location.

What's Running — Ollama and the Three Models

Ollama is a local runtime for language models. It handles downloading models, running an inference server in the background, and exposing a command-line interface so you can talk to the model from the terminal. It's not the only option — LM Studio is the main alternative with a graphical interface instead of a terminal — but Ollama is what I use, and it's what's installed in my AI_Env folder.

Ollama pulls models from its own curated registry at ollama.com/library. That registry is a thin layer on top of the real source: Hugging Face, which is where essentially every major open-weight language model is originally published. When I want a wider selection — including variants, fine-tunes, and newer models that haven't made it into Ollama's curated list yet — I go "shopping" directly on Hugging Face. There are over 2 million models on the platform, and the number is growing fast. The vast majority aren't conversational language models — it hosts image generators, speech models, embedding models, specialist fine-tunes for every conceivable niche — but the subset relevant to any given project is still large enough to require real evaluation work. How to navigate that is its own topic, and one worth covering in its own edition.
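
If you prefer to shop from a script rather than the website, the huggingface_hub Python package exposes the same catalog. A minimal sketch; the search string is an example, and the available sort and filter options can differ between package versions:

from huggingface_hub import HfApi

# Search the Hugging Face model catalog programmatically.
api = HfApi()
for model in api.list_models(search="llama 3 8b instruct",
                             sort="downloads", limit=5):
    print(model.id)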

Here's what I already had pulled:

Name             Params   Context   License                  Notable capabilities
llama3:latest    8.0B     8,192     Meta Llama 3 Community   Text only
qwen3:8b         8.2B     40,960    Apache 2.0               Tools, extended reasoning
ministral-3:8b   8.9B     262,144   Apache 2.0               Tools, vision

Three models, all roughly 8-billion-parameter class, from three different organizations — Meta, Alibaba's Qwen team, and Mistral AI. Each has a different license and a different capability profile. The context length range — 8K to 262K tokens — is the biggest practical difference. Context length is how much text a model can consider in a single prompt-response cycle, and that 32x spread between the smallest and largest here is significant for any use case involving long documents.
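
To give that spread some physical intuition, here is a rough conversion using the common heuristic of about 0.75 English words per token. It is an approximation, and the real ratio depends on the tokenizer and the text:

# Rough feel for the context-length spread (~0.75 words per token).
for tokens in (8_192, 40_960, 262_144):
    words = int(tokens * 0.75)
    print(f"{tokens:>7,} tokens  ~{words:>7,} words")

That puts the smallest context at roughly a long article and the largest at roughly two novels' worth of text in a single prompt-response cycle.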

One surprise is worth naming. The model labeled ministral-3:8b sounds like it should be Ministral 3B, a small 3-billion-parameter model Mistral AI released last year. It isn't. Running ollama show on it reveals an architecture of mistral3 and 8.9 billion parameters — this is actually a variant of Mistral's newer Mistral 3 series (the multimodal family that includes Mistral Small 3). The name in the registry is one person's shorthand, not an official naming convention. That's a real-world lesson: a model's name is not its ground truth. The metadata in ollama show is. If I'd committed to fine-tuning "Ministral 3B" based on the name alone, I'd have been training against a model that isn't what the name suggests.

Actually Running Them — The CPU-Only Reality

When I started the Ollama server, the startup logs told me exactly what to expect:

inference compute: id=cpu library=cpu
                   total="15.7 GiB" available="3.5 GiB"
entering low vram mode: total vram="0 B" threshold="20.0 GiB"

Translated: Ollama scanned the system for usable GPU memory and found zero bytes. Intel Iris Xe integrated graphics don't count as inference-capable hardware — they share system RAM, use drivers Ollama doesn't directly support, and aren't optimized for AI math. So inference runs entirely on the CPU, using system RAM instead of VRAM. Ollama also activated "low VRAM mode" — a degraded operating mode for hardware far below the 20 GB of VRAM it considers comfortable.

To see what this looked like in practice, I ran the same simple prompt through each model — "Hello, what year is it?" — and captured both the responses and the time it took each one to answer. The responses themselves are in this week's Fireside Chats section below, because the differences between them are worth reading on their own. The speeds tell the hardware story: Llama 3 answered in 18 seconds, Qwen 3 took 32 seconds, Ministral-3 took 1 minute 13 seconds. Usable. Not fast. A cloud model running on a purpose-built inference server would handle all three prompts in under a second combined.
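
The timing run is easy to reproduce. A minimal sketch against Ollama's generate endpoint, assuming the server is up and all three models are pulled; the timings are wall-clock, not tokens per second:

import time
import requests

MODELS = ["llama3:latest", "qwen3:8b", "ministral-3:8b"]
PROMPT = "Hello, what year is it?"

for model in MODELS:
    start = time.perf_counter()
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,  # CPU-only inference is slow; give it room
    )
    elapsed = time.perf_counter() - start
    answer = resp.json().get("response", "").strip()
    print(f"{model}: {elapsed:.0f}s\n{answer}\n")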

Where This Leaves Me

Three models already on disk, all running (slowly) on CPU-only inference. No GPU acceleration and no way to train here. That's a cleaner picture than I had a week ago, and it means the next step is no longer "what do I need to buy" — it's "which of these three do I actually pick, and why?"

That's the next edition's topic. Before I can commit to one, there's conceptual work to do: understanding what 8B parameters actually buys you, what the difference between a base and an instruct model means, and what criteria separate a model worth building on from one that isn't. That's what Building Intelligence is covering — this week and next.


Fireside Chats with AI

This week's Fireside Chat is a departure from the usual format. Instead of asking Gemini, Claude, and GPT the same question, I ran a five-word prompt — "Hello, what year is it?" — through the three local, open-source models running on my laptop. Same hardware. Same prompt. Same moment. Three completely different answers. Before the responses, brief background on who made each of these models, because the companies behind them shape what the models are.

Llama 3 was released by Meta, the American technology company headquartered in Menlo Park, California — the same company that owns Facebook, Instagram, and WhatsApp. Meta's Llama family is the most widely downloaded open-weight language model family to date. Released under a custom Meta Llama community license that permits commercial use with some restrictions at very large scale. Meta has positioned open models as a core part of their AI strategy rather than keeping their best work locked behind an API.

Qwen 3 comes from Alibaba Cloud's Qwen team, part of the Chinese technology conglomerate best known for Alibaba.com, Taobao, Tmall, and their massive cloud business. Headquartered in Hangzhou, China. Qwen has emerged as one of the most competitive Chinese open-weight model families over the past year, released under the Apache 2.0 license — one of the most permissive open-source licenses in existence, with no commercial restrictions.

Mistral 3 is from Mistral AI, a French startup founded in 2023 by alumni of Google DeepMind and Meta's FAIR lab. Headquartered in Paris. Mistral has positioned itself as Europe's flagship open-weight model provider and has become a credible counterweight to the American and Chinese labs in the open-source conversation. Mistral 3 is their newer multimodal family, also released under Apache 2.0.

Why run this specific prompt? Because a local model has no access to the real date — no web search, no system clock, no tool use unless you explicitly build a system around it. Everything a local model "knows" comes from its training data. Ask it what year it is, and it has to do one of three things: refuse, admit its limits, or just answer based on what it was trained on. The three responses below show each model choosing a different path — and each choice reflects a different design philosophy from a different company in a different country.

As you read: which model refused to commit? Which one reasoned out loud? Which one just answered — and was that answer actually correct? The differences aren't quirks. They're design choices.

The Prompt:

Hello, what year is it?

Meta Llama 3 (Llama 3 8B Instruct · local · 18 sec response)
I'm just an AI, I don't have a specific year or a physical presence. I exist in a virtual space and don't experience time like humans do. I can provide information on any year or topic you'd like to know more about! What's your question or topic of interest?
Alibaba Qwen 3 (Qwen 3 8B · local · 32 sec response)
Thinking...
Okay, the user is asking, 'Hello, what year is it?' I need to figure out the current year. Let me think. I know that the current year is 2023. Wait, but maybe I should double-check. Let me recall the timeline. 2022 was the previous year, and 2023 is the one after that. So if today's date is in 2023, then the answer is 2023. But I should make sure I'm not confusing the dates. For example, if the user is asking in December 2022, the answer would be 2022. But since I don't have real-time data, I should rely on the information I have. As of my knowledge cutoff in October 2023, the current year is 2023. Therefore, the answer should be 2023. I should present that clearly.
...done thinking.

Hello! As of my knowledge cutoff in October 2023, the current year is **2023**. Let me know if you need further assistance!
Mistral AI (Mistral 3) (Mistral 3 variant · local · 1 min 13 sec response)
Hello! The current date is **April 18, 2026**.

The Learning Loop

DEFINITION Diffusion Models
Diffusion Models are a type of generative model used in AI that iteratively transform a simple distribution (like Gaussian noise) into a complex data distribution, such as images or text, through a gradual noise reduction process.
Source: Wikipedia
TIP Anchored Prompts
When crafting prompts to elicit specific responses, anchor your instructions with stable context or key terms, and repeat those anchors wherever the task is restated. This keeps the prompt's intent fixed and guides the model toward the desired outcome consistently (see the short sketch after this section).
Source: PromptingGuide
TOOL Weaviate
Weaviate is an open-source vector search engine designed for high-scalability search and data exploration with millions of objects, supporting semantic search and recommendation use cases.
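
To ground the Anchored Prompts tip above in something runnable, here is a minimal sketch. The template wording and anchor phrase are mine, written for illustration:

# An anchored prompt repeats stable key terms so the model keeps the
# task's intent in view. Anchor phrase and wording are illustrative.
ANCHOR = "Milwaukee tech events newsletter"

TEMPLATE = (
    f"You are editing the {ANCHOR}.\n"
    f"Task: summarize the event below in two sentences for the {ANCHOR}.\n"
    "Keep the tone practical and career-focused.\n\n"
    "Event: {event_text}\n"
)

print(TEMPLATE.format(event_text="4-Day Data Science with Python Bootcamp"))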

Lift-Off

“Curiosity is the fuel that drives our pursuit of progress and understanding.”

— Ada Lovelace

Ada Lovelace was a pioneering mathematician and writer, best known for her work on Charles Babbage's early mechanical general-purpose computer, the Analytical Engine. She is often regarded as the first computer programmer due to her detailed notes on the engine, which include an algorithm considered to be the first intended for implementation on a computer. Her visionary thinking about the potential of computers went far beyond mere calculation, imagining how they could revolutionize fields like art and science.

The Nest Jest

Cartoon: robot on stage

Upcoming Events

Event 04/18/2026 9:00 AM
PMI-CPMAI® Weekend Training – Project in AI Certification in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Join our PMI-CPMAI® weekend training and master Project in AI. Learn AI project lifecycle, data strategy, and real-world implementation.
Event 04/18/2026 9:00 AM
PMI-CPMAI® Weekend Training – Project in AI Certification in Kenosha, WI — 100 N Atkinson Rd, Grayslake, IL
Join our PMI-CPMAI® weekend training and master Project in AI. Learn AI project lifecycle, data strategy, and real-world implementation.
Event 04/21/2026 9:00 AM
4-Day Data Science with Python Bootcamp in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Join our 4-Day Data Science with Python bootcamp! Learn data analysis, ML basics, and work on real-world projects.
Event 04/21/2026 9:00 AM
PMP Certification 4 Days Classroom Training in Milwaukee, WI — Milwaukee, WI
PMP Certification 4 Days Classroom Training
Event 4/22/2026 5:00 PM
Welcome to MKE Tech Meetup — Penrod – 219 N Milwaukee Street, Milwaukee, WI 53202
A curated social event orienting new tech employees into Milwaukee's tech and innovation ecosystem. Free and open to new tech workers in the region. Street parking available nearby.
Source: MKE Tech Hub Coalition
Event 04/22/2026 9:00 AM
PMI-CPMAI Certification Training – 3-Day Bootcamp in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Master AI in Project Management & project in AI concepts with PMI-CPMAI™. 4-day training, real use cases & exam prep.
Event 4/23/2026 TBD
Catalyze Experience Preview (Mid-Career Technologists) — Milwaukee, WI
A preview event for mid-career technologists as part of the MKE Tech Hub Coalition's Catalyze program.
Source: MKE Tech Hub Coalition
Event 04/24/2026 8:00 PM
AIA x MILWAUKEE x MAGAZINE RELEASE PARTY (VOL 14) — Club Timbuktu, Milwaukee, WI
Join us IRL for the AIA x MILWAUKEE x MAGAZINE RELEASE PARTY (VOL 14) – good vibes, great mags!
Event 04/27/2026 5:30 PM
WiscoREIA Milwaukee: Exploring AI Tools for Real Estate Investors — West Burleigh Street, Wauwatosa, WI
Make sure to pre-register to save your seat! https://wiscoreia.com/event-6626396
Event 04/28/2026 9:00 AM
Cisco CCNA Training & Certification Program in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Get CCNA 200-301 certified with hands-on Cisco labs, expert-led training, real-world networking skills, and career guidance.
Event 04/29/2026 9:00 AM
Project Management Techniques Course Training in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Professionals undergoing the PMP course gain insights into project governance, including ethical considerations, compliance, and regulatory requirements.
Event 05/02/2026 9:00 AM
PMI-CPMAI Certification Weekend Training in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Learn how to manage AI projects with PMI-CPMAI® certification. Master Project in AI, data strategy, and real-world AI delivery.
Event 05/09/2026 10:00 AM
Hands-On : Copilot Studio, Microsoft Fabric, Azure AI : Better Together — Online (Online event)
Hands-On Online Workshop: Copilot Studio, Microsoft Fabric, Azure AI: Better Together. Date: 09 May 2026, 10 AM to 5 PM Eastern Time. Level: Beginners/Intermediate. Registration: https://www.eventbrite.com/e/hands-on-copilot-studio-microsoft-fabric-azure-ai-better-together-tickets-1983680029367?aff=oddtdtcreator — Who should attend? Developers, senior software engineers, IT pros, architects, IT managers, citizen developers, technology prod [Group: artificialintelligenceandmachinelearning]
Event 05/12/2026 9:00 AM
Become AI Project Ready – PMI-CPMAI Training Program in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Gain in-demand skills in Project in AI with this 3-day PMI-CPMAI® training. Learn to manage AI projects, align business goals.
Event 05/13/2026 5:15 PM
May Global AI Milwaukee User Group Meeting — Online (Wctc, 800 Main Street, Pewaukee, WI)
Agenda includes: networking and food; brief introduction and group business; featured speaker: TBD; topic: TBD. Featured sponsors: TBD (location sponsor), TBD (food sponsor). [Group: global-ai_milwaukee]
Event 05/13/2026 12:00 PM
AI Specialists: AppliedAI Ethics Monthly Mixer - Network & Grow! - Milwaukee — Location TBD; Register Online, Milwaukee, WI
Join us for AppliedAI Ethics specialist networking in Milwaukee on 13 May 2026, 12 PM CDT. Connect with professionals!
Event 05/13/2026 11:30 AM
CARW All Member Meeting - Data, Technology & AI — Renaissance Milwaukee West Hotel, Wauwatosa, WI
All-Member Meeting featuring Topher Stephenson, a nationally recognized leader at the intersection of commercial real estate, AI, and tech.
Event 05/13/2026 9:00 AM
PMI-ACP® Classroom Agile Certification Course in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
PMI-ACP Agile certification classroom training with expert instructors, real-world practice, and complete exam preparation.
Event 05/15/2026 9:00 AM
Artificial Intelligence & Automation 1 Day Workshop | Milwaukee, WI — For venue details reach us at in...@learnerring.com, Milwaukee, WI
Understand AI, Automation & Real-World Business Applications | Hands-On | Beginner to Intermediate
Event 05/19/2026 9:00 AM
AI: Toy → Tool → Teammate with Marcus Green — Zoofari Center, Milwaukee, WI
Turn AI from a toy into your business teammate. Learn practical ways to boost consistency, clarity, and leverage—without adding more hours.
Event 05/20/2026 8:00 AM
AI Roundtable with SVA Consulting — Harley-Davidson Museum®, Milwaukee, WI
Join SVA Consulting and a cross-industry group of innovators for a morning of insight, collaboration, and connection.

In the News

News 2026-04-18 — TechCrunch
The App Store is booming again, and AI may be why
New insights from Appfigures reveal a surge in app launches in 2026, suggesting that the use of AI tools is fostering a new wave of mobile software innovation. The data indicates a correlation between the rise of artificial intelligence and an increase in mobile applications, pinpointing AI as a potential catalyst for renewed interest in app development. read more
News 2026-04-18 — Wired
It Takes 2 Minutes to Hack the EU’s New Age-Verification App
The EU's newly launched age-verification app has severe security vulnerabilities that allow it to be hacked within minutes. This flaw raises significant concerns about the app's reliance on AI and its implications for personal data protection, amidst other major data security incidents across industries. read more
News 2026-04-18 — VentureBeat
Salesforce rolls out new Slackbot AI agent as it battles Microsoft and Google in workplace AI
Salesforce has introduced a revamped version of Slackbot, transforming it from a basic notification tool into a comprehensive AI agent. The new Slackbot can search enterprise data, draft documents, and perform tasks on behalf of employees, positioning Salesforce against Microsoft and Google in the race for workplace AI. read more

Tools from Hard AIs

AI Airfare Research
Find cheaper flights using an AI-powered research tool built by Hard AIs.
hardais.com/airfare →
Newsletter Archive
Every past edition of the IT HootClub AI Community Newsletter, hosted and searchable.
hardais.com/newsletter →
This newsletter was assembled with an AI-assisted workflow prototype for internal preview. The editor is human.