Qwen3: Think Deeper, Act Faster (new open large language models)


Alan Timm

Apr 29, 2025, 9:28:05 PM
to RSSC-List
The Qwen folks have released a set of updated models with some very interesting and, I think, useful characteristics.

"We are open-weighting two MoE models: Qwen3-235B-A22B, a large model with 235 billion total parameters and 22 billion activated parameters, and Qwen3-30B-A3B, a smaller MoE model with 30 billion total parameters and 3 billion activated parameters. Additionally, six dense models are also open-weighted, including Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B, under Apache 2.0 license.

Models        Layers   Heads (Q / KV)   Tie Embedding   Context Length
Qwen3-0.6B    28       16 / 8           Yes             32K
Qwen3-1.7B    28       16 / 8           Yes             32K
Qwen3-4B      36       32 / 8           Yes             32K
Qwen3-8B      36       32 / 8           No              128K
Qwen3-14B     40       40 / 8           No              128K
Qwen3-32B     64       64 / 8           No              128K

Models            Layers   Heads (Q / KV)   # Experts (Total / Activated)   Context Length
Qwen3-30B-A3B     48       32 / 4           128 / 8                         128K
Qwen3-235B-A22B   94       64 / 4           128 / 8                         128K"

These models take advantage of several recent developments, including a "thinking tag" so you can control how much thought is applied.  
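
For anyone who wants to see what that control looks like in practice, here is a minimal sketch. It assumes, as I understand the Qwen3 release notes, that appending `/think` or `/no_think` to a user prompt toggles reasoning for that turn, and that the reasoning comes back wrapped in `<think>...</think>`. The helper names `with_thinking_switch` and `split_thinking` are my own, not part of any Qwen library.

```python
import re

# Assumption: the model wraps its reasoning in <think>...</think>.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def with_thinking_switch(prompt: str, think: bool) -> str:
    """Append the /think or /no_think soft switch to a user prompt."""
    return f"{prompt} {'/think' if think else '/no_think'}"


def split_thinking(response: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw model response."""
    match = THINK_BLOCK.search(response)
    if match is None:
        # No reasoning block: treat the whole response as the answer.
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = (response[:match.start()] + response[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    print(with_thinking_switch("What is 2+2?", think=False))
    raw = "<think>\n2+2 is 4\n</think>\n\nThe answer is 4."
    print(split_thinking(raw))
```

On a robot you would probably pass only the answer part to speech or motion code and log the reasoning part for debugging.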

Each of the models goes neck and neck with the state of the art, but there are two models specifically that caught my eye.

Qwen3-0.6B runs acceptably on Raspberry Pi-sized hardware and supports tool use.  I'm adding it to Alfie to see if it can be useful.

Qwen3-30B-A3B is an MoE (Mixture of Experts) model with 3B parameters active at any one time.  It's very fast, and the 4-bit quantized version fits on RTX 3090, 4090, and 5090 GPUs.
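
A rough back-of-the-envelope check on why the 4-bit version fits, as a sketch only: it counts weight storage and nothing else, ignoring the KV cache, activations, and the per-block scale data that real quantized files carry, so actual usage will be somewhat higher. The 24 GB figure is the VRAM of a 3090 or 4090; the function name is made up for illustration.

```python
def weight_memory_gib(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    bytes_total = total_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30


TOTAL_B = 30.0   # total parameters, in billions (Qwen3-30B-A3B)
ACTIVE_B = 3.0   # parameters activated per token, in billions
VRAM_GIB = 24.0  # assumption: a 24 GB card such as an RTX 3090 or 4090

weights_4bit = weight_memory_gib(TOTAL_B, 4)
weights_16bit = weight_memory_gib(TOTAL_B, 16)
active_fraction = ACTIVE_B / TOTAL_B

if __name__ == "__main__":
    print(f"4-bit weights:  {weights_4bit:.1f} GiB (fits in 24 GB: {weights_4bit < VRAM_GIB})")
    print(f"16-bit weights: {weights_16bit:.1f} GiB (fits in 24 GB: {weights_16bit < VRAM_GIB})")
    print(f"Share of weights used per token: {active_fraction:.0%}")
```

All 30B parameters still have to sit in memory, which is why quantization matters for fitting, but only about a tenth of them are used for each token, which is where the speed comes from.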

Alan

Gmail

Apr 30, 2025, 3:57:07 AM
to Alan Timm, RSSC-List
Alan,
Do you know if these can run through GPT4all?


Thomas

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great, hardworking engineer? I am currently looking for a new job opportunity in robotics and/or AI.

Contact me directly or through LinkedIn:   

