DeepSeek V4 Pro pricing page showing the 75% discount and reduced cache hit prices

DeepSeek V4 Pro Price Drops Again on May 23, 2026

I was worried about DeepSeek returning to full price in June 2026, but unexpectedly the price dropped again yesterday. Here’s the information from DeepSeek’s official website: For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC. The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC. ...

May 24, 2026 · 2 min · 378 words · Hummingbird Labs
TRAE IDE showing the maximum thinking attempts reached prompt for free DeepSeek V4 Pro model

More on TRAE China Version: Free Models Are Great But Slow

TRAE China Version: Real Data on Free Model Speed. Let me start with the conclusion: the free models in the TRAE China version are slower than custom paid models, even when comparing the same model. The free models are slower in three main respects: When processing large tasks, free models may display a prompt: “Model has reached the maximum number of thinking attempts. Please type ‘continue’ to get more results.” When this happens, you need to manually type “continue” to proceed, as shown in Figure 1. ...

May 22, 2026 · 2 min · 393 words · Hummingbird Labs
TRAE IDE showing the list of 14 free LLM models available for trial

Why I Use TRAE: Free LLMs, Stability, and 1M Token Context

My Main Reason for Using TRAE: Free Programming LLMs. Yes, the TRAE China version lets you try multiple large models for free. As shown in the screenshot, all of these models are available for trial at no cost. Here’s the full list of free models: Doubao-Seed-2.0-Code, Doubao-Seed-1.8, Doubao-Seed-Code, MiniMax-M2.7, MiniMax-M2.5, GLM-5.1, GLM-5V-Turbo, GLM-5, DeepSeek-V4-Pro, DeepSeek-V4-Flash, Kimi-K2.6, Kimi-K2.5, Qwen3.6-Plus, Qwen3.5-Plus. Here’s the catch: when using these free models, you often need to wait anywhere from 1 to 10 minutes. In my experience, the average wait is around 3 minutes. Honestly, though, when you’re heading to bed or stepping away for a coffee, waiting 3–10 minutes is perfectly acceptable. ...

May 22, 2026 · 3 min · 486 words · Hummingbird Labs
Alibaba Cloud console showing Qwen 3.6 Plus token usage and billing details

Using Qwen 3.6 Plus: Great but a Bit Expensive

I Think Qwen 3.6 Plus Has Strong Coding Capabilities, But My Costs Are Higher Than Expected. I compared two approaches: (1) using Qwen 3.6 Plus to write large-scale C# programs, then having DeepSeek v4 Pro conduct code reviews; (2) using DeepSeek v4 Pro to write large-scale C# programs, then having Qwen 3.6 Plus conduct code reviews. I prefer the second approach for these reasons: (1) DeepSeek v4 Pro supports a context length of up to 1 million tokens. For large projects, this helps maintain clear logical connections between modules. Additionally, DeepSeek v4 Pro is currently more affordable (until May 31, 2026, it’s offered at 25% of the regular price; see the screenshots in my previous blog). (2) Qwen 3.6 Plus delivers higher code quality but at a higher cost. Using it only for code reviews helps reduce overall expenses. ...

May 22, 2026 · 2 min · 389 words · Hummingbird Labs
DeepSeek platform showing token usage breakdown and cost for AI coding sessions

DeepSeek v4 Pro, Qwen 3.6 Plus, or Others: Which Should I Use?

I Like DeepSeek and Qwen. Before May 2026, I had never used DeepSeek, Qwen 3.6 Plus, or any other Chinese LLMs for programming. As readers of my previous blog might recall, I primarily relied on GitHub Copilot’s models, favoring Claude Sonnet 3.6 and Claude Opus 4.7 (a bit pricey; if you’re wealthy, pretend I didn’t say that). My secondary choice was GPT Codex 5.3. So when I first considered using DeepSeek or Qwen 3.6 Plus, I was skeptical, worried that their code quality wouldn’t meet my standards. ...

May 21, 2026 · 2 min · 403 words · Hummingbird Labs