[{"content":"What to Do When AI Coding Tasks Exceed Context Limits or Output Length? When I use the Claude Sonnet 4.6 model for programming, I frequently encounter two situations:\nA single session\u0026rsquo;s task exceeds the context limit or output length, causing a significant drop in code quality. The AI generates an excessively long response, triggering \u0026ldquo;Sorry, the response hit the length limit. Please rephrase your prompt.\u0026rdquo; GitHub Codespaces has a very useful feature: you can see the token consumption status for the current session. My solution is: before exceeding the context limit, open a new session and re-describe the unfinished task.\nI strongly recommend never letting a single session\u0026rsquo;s task exceed the context limit; otherwise, AI code quality drops significantly. In my own development experience, I\u0026rsquo;ve run into this more than once.\nIt is widely understood that an overly long context increases the probability of AI hallucinations. However, I don\u0026rsquo;t think the model\u0026rsquo;s own capability is the cause here; 160K tokens should already be within what it can handle. Another possibility is that the drop is caused by Copilot\u0026rsquo;s context management strategy.\nI Asked Gemini 3.5 Flash and Got This Response: When coding quality drops significantly after exceeding 160K tokens - this is actually a classic \u0026ldquo;scapegoat\u0026rdquo; phenomenon between underlying model capabilities and engineering wrappers.\nThis is mainly caused by GitHub Copilot\u0026rsquo;s context handling (engineering truncation, retrieval-augmented suppression, etc.), not the limits of the Claude 3.5 Sonnet model itself.\nCore Reason: Copilot\u0026rsquo;s \u0026ldquo;Context Management\u0026rdquo; Strategy Claude 3.5 Sonnet itself has a 200K token context window. 
Under ideal conditions, it maintains extremely high \u0026ldquo;Needle In A Haystack\u0026rdquo; retrieval capabilities across the entire 200K range. But inside Copilot, things get complicated: Sliding Window and Implicit Truncation: To save token costs (API fees) and ensure response speed, Copilot rarely sends the entire 160K+ raw conversation history word-for-word to Anthropic\u0026rsquo;s servers. It uses special algorithms for summarization, sliding window truncation, or vector retrieval (RAG).\nInformation Distortion: When your session is extremely long, what Copilot sends to Claude may no longer be the code you originally wrote, but \u0026ldquo;second-hand context\u0026rdquo; compressed and refined by Copilot. This engineering processing causes logical gaps in the code the model receives, and code quality naturally snowballs downward.\nSystem Prompt Interference: Copilot injects very heavy system prompts (to constrain its behavior as an IDE programming assistant). When the context is extremely long, the model may experience conflicts in attention allocation between \u0026ldquo;following the Copilot framework\u0026rdquo; and \u0026ldquo;understanding user\u0026rsquo;s long code\u0026rdquo;.\nMy Solution: Let AI Design Phased Tasks, Open New Sessions, and Re-describe New Stage Tasks Taking code review as an example:\nIn Session 1: You can first let AI perform a code review and sort the issues to be fixed. Get the review conclusions first; then open a new Session 2, with the prompt being the review conclusions, and let AI fix the P0 issues. Open another Session 3, input the code review conclusions again, and let AI continue fixing P1 issues. 
This way, you split one code review + bug fix into different sessions, which not only ensures coding quality but also reduces context consumption (reducing your costs).\nA Brief About Me I\u0026rsquo;ve worked at NetEase Games, Baidu, Tencent (8 years), and Meituan (nearly 7 years), leading large-scale R\u0026amp;D projects and managing teams of 100+ engineers.\nCurrently, I\u0026rsquo;m pursuing entrepreneurship in the AI field.\nWhy? The world runs on uncertainty — staying in corporate roles too long breeds addiction to certainty. Starting an AI venture is like setting sail into uncharted waters.\nFeel free to reach out: mailto:HummingbirdLabs@outlook.com.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-007/","summary":"\u003ch3 id=\"what-to-do-when-ai-coding-tasks-exceed-context-limits-or-output-length\"\u003eWhat to Do When AI Coding Tasks Exceed Context Limits or Output Length?\u003c/h3\u003e\n\u003cp\u003eWhen I use the Claude Sonnet 4.6 model for programming, I frequently encounter two situations:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eA single session\u0026rsquo;s task exceeds the context limit or output length, causing a significant drop in code quality.\u003c/li\u003e\n\u003cli\u003eAI generates an excessively long response, triggering \u0026ldquo;Sorry, the response hit the length limit. Please rephrase your prompt.\u0026rdquo;\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eGitHub Codespaces has a very useful feature: you can see the token consumption status for the current session.\n\u003cimg alt=\"Figure 2\" loading=\"lazy\" src=\"/images/7-01.jpg\"\u003e\u003c/p\u003e","title":"What to Do When AI Coding Tasks Exceed Context Limits or Output Length?"},{"content":"Worried about Deepseek returning to full price in June 2026, but unexpectedly it dropped again yesterday Here\u0026rsquo;s the information from Deepseek\u0026rsquo;s official website: For all models, the input cache hit price has been reduced to 1/10 of the launch price. 
This price adjustment takes effect from 2026/4/26 12:15 UTC.\nThe deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC. This means: DeepSeek V4 Pro will permanently stay at 25% of the original price.\nDeepSeek is so generous that I must support them by adding funds. So I happily topped up another 700 RMB (about $100 USD) on DeepSeek. Let\u0026rsquo;s see how long this $100 credit will last. Model capabilities still need improvement. Having AI fix certain UI interaction bugs isn\u0026rsquo;t necessarily faster than a human developer doing it. I\u0026rsquo;ve found that on Windows WPF UI, all models (deepseek-v4-pro, Qwen3.6 Plus, Claude Sonnet 4.6, etc.) don\u0026rsquo;t perform very well. They easily produce compilation errors, runtime errors, and various UI data passing anomalies.\nIs it because the Windows tech stack has less training material for models, plus numerous and complex library versions, leading to poor model performance?\nMy current solution is: add detailed local logs in debug mode to provide AI with more runtime information.\nStatic code + runtime logs = complete program information. Simply letting the LLM review code is not enough, because the LLM can only judge program behavior based on static code. There are many hidden assumptions here. When AI reviews only a portion of the code, it assumes the other modules are working correctly.\nIf you let AI review all code in a project, the context becomes too long, leading to AI hallucinations (I mentioned this in a previous blog).\nSo I strongly recommend: always add detailed local logs in debug mode to provide AI with more runtime information.\nA Brief About Me I\u0026rsquo;ve worked at NetEase Games, Baidu, Tencent (8 years), and Meituan (nearly 7 years), leading large-scale R\u0026amp;D projects and managing teams of 100+ engineers.\nCurrently, I\u0026rsquo;m pursuing entrepreneurship in the AI field.\nWhy? 
The world runs on uncertainty — staying in corporate roles too long breeds addiction to certainty. Starting an AI venture is like setting sail into uncharted waters.\nFeel free to reach out: mailto:HummingbirdLabs@outlook.com.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-006/","summary":"\u003ch2 id=\"worried-about-deepseek-returning-to-full-price-in-june-2026-but-unexpectedly-it-dropped-again-yesterday\"\u003eWorried about Deepseek returning to full price in June 2026, but unexpectedly it dropped again yesterday\u003c/h2\u003e\n\u003cp\u003eHere\u0026rsquo;s the information from Deepseek\u0026rsquo;s official website:\nFor all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.\u003c/p\u003e\n\u003cp\u003eThe deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC.\n\u003cimg alt=\"Figure 1\" loading=\"lazy\" src=\"/images/6-01.jpg\"\u003e\u003c/p\u003e","title":"Deepseek V4 Pro Price Drop Again on May 23, 2026"},{"content":"TRAE China Version: Real Data on Free Model Speed Let me start with the conclusion: if you use the free models in TRAE China\u0026rsquo;s version, they will be slower than custom paid models (even when comparing the same model).\nThe free models are mainly slower in 3 aspects:\nWhen processing large tasks, free models may display a prompt: \u0026ldquo;Model has reached the maximum number of thinking attempts. Please type \u0026lsquo;continue\u0026rsquo; to get more results.\u0026rdquo; When this happens, you need to manually type \u0026ldquo;continue\u0026rdquo; to proceed. As shown in Figure 1. Calling free models requires queuing; this was already mentioned in the previous blog. As shown in Figure 2. 
Additionally, during task processing, there\u0026rsquo;s a chance you\u0026rsquo;ll need to queue again.\nThe DeepSeek V4 Pro model has think mode enabled by default, with a thinking depth of around 200, which makes it slower when processing large tasks. To illustrate this point, I recorded the execution time from my own project.\nThe specific execution time for a task was: begin 11:34 / end 12:24. As shown in Figure 3.\nBut here\u0026rsquo;s what I found: the free Qwen3.6 Plus model is significantly faster than the paid DeepSeek V4 Pro model when handling large tasks — and I mean much faster. Moreover, the free Qwen3.6 Plus model has shorter queue times.\nThose Are the Real Data and Facts. Now Here\u0026rsquo;s My Opinion: DeepSeek Is a Great Company, and Its Pricing and Services Truly Benefit the Public I can understand that DeepSeek is still in development, which is why there are some limitations.\nBut the technology and services they provide are deeply imbued with a sense of human mission; based on this sense of mission and responsibility, this company is contributing to increasing the average intelligence of humanity.\nThis sense of mission and responsibility, in my personal view, stems from the founder\u0026rsquo;s simple beliefs and character. In short, it\u0026rsquo;s not about money — it\u0026rsquo;s about serving the people.\nA Brief About Me I\u0026rsquo;ve worked at NetEase Games, Baidu, Tencent (8 years), and Meituan (nearly 7 years), leading large-scale R\u0026amp;D projects and managing teams of 100+ engineers.\nCurrently, I\u0026rsquo;m pursuing entrepreneurship in the AI field.\nWhy? The world runs on uncertainty — staying in corporate roles too long breeds addiction to certainty. 
Starting an AI venture is like setting sail into uncharted waters.\nFeel free to reach out: mailto:HummingbirdLabs@outlook.com.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-005/","summary":"\u003ch1 id=\"trae-china-version-real-data-on-free-model-speed\"\u003eTRAE China Version: Real Data on Free Model Speed\u003c/h1\u003e\n\u003cp\u003eLet me start with the conclusion: if you use the free models in TRAE China\u0026rsquo;s version, they will be slower than custom paid models (even when comparing the same model).\u003c/p\u003e\n\u003cp\u003eThe free models are mainly slower in 3 aspects:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\n\u003cp\u003eWhen processing large tasks, free models may display a prompt: \u0026ldquo;Model has reached the maximum number of thinking attempts. Please type \u0026lsquo;continue\u0026rsquo; to get more results.\u0026rdquo; When this happens, you need to manually type \u0026ldquo;continue\u0026rdquo; to proceed. As shown in Figure 1.\n\u003cimg alt=\"Figure 1\" loading=\"lazy\" src=\"/images/5-01.jpg\"\u003e\u003c/p\u003e","title":"More on TRAE China Version: Free Models Are Great But Slow"},{"content":"My Main Reason for Using TRAE: Free Programming LLMs Yes — the TRAE China version lets you try multiple large models for free. As shown in the screenshot, all of these models are available at no cost for trial. Here\u0026rsquo;s the full list of free models:\nDoubao-Seed-2.0-Code、 Doubao-Seed-1.8、 Doubao-Seed-Code、 MiniMax-M2.7、 MiniMax-M2.5、 GLM-5.1、 GLM-5V-Turbo、 GLM-5、 DeepSeek-V4-Pro、 DeepSeek-V4-Flash、 Kimi-K2.6、 Kimi-K2.5、 Qwen3.6-Plus、 Qwen3.5-Plus、\nBut here\u0026rsquo;s the catch: when using these free models, you often need to wait anywhere from 1 to 10 minutes. In my experience, the average wait is around 3 minutes. But honestly — when you\u0026rsquo;re heading to bed or stepping away for a coffee, waiting 3–10 minutes is perfectly acceptable.\nAnother thing worth noting: TRAE also supports custom models. 
You can top up credits directly on DeepSeek\u0026rsquo;s official platform, or on Alibaba Cloud, then use your API key inside TRAE to call models. As shown below: My Second Main Reason for Using TRAE: Fewer Freezes and Timeouts During Task Execution When I previously used Copilot\u0026rsquo;s LLMs for AI coding, a recurring problem was the model getting stuck on a command, effectively blocking all subsequent tasks.\nOn TRAE, I encounter far fewer of these situations. Moreover, the entire workflow requires very few manual permission confirmations. This frees up my time and lets me run more tasks in parallel.\nIn fact, I\u0026rsquo;m currently juggling 4 projects simultaneously:\nTRAE: rendering astronomical survey data into images. GitHub Codespaces: an offline old-photo AI restoration tool built with C# and WPF on Windows. Local VS 2026 IDE: a pet costume image generator built with C# and WPF — for example, dressing a puppy in a spacesuit or a kitten in a gothic dress. Local VS 2026 IDE: deploying LLMs locally on Windows with C# and WPF, and benchmarking model performance across different GPUs and CPUs. My Third Main Reason for Using TRAE: DeepSeek v4 Pro Supports a 1-Million-Token Context Window I\u0026rsquo;ve observed that Claude Sonnet 4.6 and Opus 4.7 both show noticeable code quality degradation once the task context exceeds 168K tokens.\nDeepSeek v4 Pro, by contrast, supports a 1-million-token context window. This allows it to maintain consistent code quality even when working on large-scale projects.\nMy Next Blog: Rendering Astronomical Survey Data into Images I love astronomy. I love looking at images of the universe. That\u0026rsquo;s why I built this project. I hope to share it with you soon — I think you\u0026rsquo;ll enjoy it too.\nBeyond Earth lie the stars and the vast cosmic ocean. 
That is the ultimate destination for humanity.\nA Brief About Me I\u0026rsquo;ve worked at NetEase Games, Baidu, Tencent (8 years), and Meituan (nearly 7 years), leading large-scale R\u0026amp;D projects and managing teams of 100+ engineers.\nNow, I\u0026rsquo;m building an AI startup.\nWhy? The world runs on uncertainty — staying in corporate roles too long breeds addiction to certainty. Starting an AI venture is like setting sail into uncharted waters.\nFeel free to reach out: mailto:HummingbirdLabs@outlook.com.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-004/","summary":"\u003ch1 id=\"my-main-reason-for-using-trae-free-programming-llms\"\u003eMy Main Reason for Using TRAE: Free Programming LLMs\u003c/h1\u003e\n\u003cp\u003eYes — the TRAE China version lets you try multiple large models for free. As shown in the screenshot, all of these models are available at no cost for trial.\nHere\u0026rsquo;s the full list of free models:\u003c/p\u003e\n\u003cp\u003eDoubao-Seed-2.0-Code、\nDoubao-Seed-1.8、\nDoubao-Seed-Code、\nMiniMax-M2.7、\nMiniMax-M2.5、\nGLM-5.1、\nGLM-5V-Turbo、\nGLM-5、\nDeepSeek-V4-Pro、\nDeepSeek-V4-Flash、\nKimi-K2.6、\nKimi-K2.5、\nQwen3.6-Plus、\nQwen3.5-Plus、\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"Figure 1\" loading=\"lazy\" src=\"/images/4-01.jpg\"\u003e\u003c/p\u003e\n\u003cp\u003eBut here\u0026rsquo;s the catch: when using these free models, you often need to wait anywhere from 1 to 10 minutes. In my experience, the average wait is around 3 minutes. 
But honestly, when you\u0026rsquo;re heading to bed or stepping away for a coffee, waiting 3–10 minutes is perfectly acceptable.\u003c/p\u003e","title":"Why I Use TRAE: Free LLMs, Stability, and 1M Token Context"},{"content":"I Think Qwen 3.6 Plus Has Strong Coding Capabilities, But My Costs Are Higher Than Expected I compared two approaches: 1. Using Qwen 3.6 Plus to write large-scale C# programs, then having DeepSeek v4 Pro conduct code reviews; 2. Using DeepSeek v4 Pro to write large-scale C# programs, then having Qwen 3.6 Plus conduct code reviews.\nI prefer the second approach for these reasons: 1. DeepSeek v4 Pro supports a context length of up to 1 million tokens. For large projects, this helps maintain clear logical connections between modules. Additionally, DeepSeek v4 Pro is currently more affordable (until May 31, 2026, it’s offered at 25% of the regular price; see screenshots in my previous blog). 2. Qwen 3.6 Plus delivers higher code quality but at a higher cost. Using it only for code reviews helps reduce overall expenses.\nBelow is a partial cost breakdown from my usage of Qwen 3.6 Plus. It might look cheap at first glance: one entry shows 876K tokens costing 1.7 RMB (≈ $0.24). But in practice, completing a single large engineering task often costs 30 RMB (≈ $4.00). The credits I top up on Alibaba Cloud deplete much faster with Qwen than with DeepSeek.\nAnother important note: Alibaba grants new users 1 million free tokens for many models, as shown below.\nBut is 1 million tokens truly generous? From my hands-on coding experience: 1 million tokens only cover 1–3 large programming tasks or several code reviews. For heavy AI-assisted coding users, 1 million tokens feel like a 100ml beer: barely a sip. 
So, if an article boasts about \u0026ldquo;burning 100 million tokens,\u0026rdquo; it likely reflects limited real-world AI coding experience.\nTo wrap up, I’d like to acknowledge: 1. ByteDance’s TRAE IDE for its innovation; 2. DeepSeek v4 Pro for its generous long-context support and current affordability (I’ll share updated billing data in June); 3. Qwen 3.6 Plus for its strong coding capabilities and responsive API.\nIn upcoming blogs, I’ll detail how to leverage AI coding within TRAE.\nA Brief About Me I’ve worked at NetEase Games, Baidu, Tencent (8 years), and Meituan (nearly 7 years), leading large-scale R\u0026amp;D projects and managing teams of 100+ engineers.\nNow, I’m building an AI startup.\nWhy? The world runs on uncertainty; staying in corporate roles too long breeds addiction to certainty. Starting an AI venture is like setting sail into uncharted waters.\nFeel free to reach out: mailto:HummingbirdLabs@outlook.com.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-003/","summary":"\u003ch1 id=\"i-think-qwen-36-plus-has-strong-coding-capabilities-but-my-costs-are-higher-than-expected\"\u003eI Think Qwen 3.6 Plus Has Strong Coding Capabilities, But My Costs Are Higher Than Expected\u003c/h1\u003e\n\u003cp\u003eI compared two approaches:\n1. Using Qwen 3.6 Plus to write large-scale C# programs, then having DeepSeek v4 Pro conduct code reviews;\n2. Using DeepSeek v4 Pro to write large-scale C# programs, then having Qwen 3.6 Plus conduct code reviews.\u003c/p\u003e\n\u003cp\u003eI prefer the second approach for these reasons:\n1. DeepSeek v4 Pro supports a context length of up to 1 million tokens. For large projects, this helps maintain clear logical connections between modules. Additionally, DeepSeek v4 Pro is currently more affordable (until May 31, 2026, it’s offered at 25% of the regular price; see screenshots in my previous blog).\n2. Qwen 3.6 Plus delivers higher code quality but at a higher cost. 
Using it only for code reviews helps reduce overall expenses.\u003c/p\u003e","title":"Using Qwen 3.6 Plus: Great but a Bit Expensive"},{"content":"I Like DeepSeek and Qwen Before May 2026, I had never used DeepSeek, Qwen 3.6 Plus, or any other Chinese LLMs for programming. As readers of my previous blog might recall, I primarily relied on GitHub Copilot’s models, favoring Claude Sonnet 3.6 and Claude Opus 4.7 (a bit pricey; if you’re wealthy, pretend I didn’t say that). My secondary choice was GPT Codex 5.3.\nSo when I first considered using DeepSeek or Qwen 3.6 Plus, I was skeptical, worried their code quality wouldn’t meet my standards.\nI knew strategies like syntax/structure constraints and cross-model code reviews could mitigate risks, but I still wanted the base model’s capability to be as strong as possible.\nFirst Steps with DeepSeek I started by topping up credits on DeepSeek’s official platform. Proof below: Over the next few days, I intensively tested DeepSeek v4 Pro. To give you a clear picture, here’s my usage breakdown:\nMay 17, 2026 Cost: 18.28 RMB (≈ $2.53) Total tokens: 66,488,180 Input (cached): 61,606,016 Input (uncached): 4,193,347 Output: 687,817\nMay 20, 2026 Cost: 6.61 RMB (≈ $0.92) Total tokens: 38,690,681 Input (cached): 37,049,600 Input (uncached): 1,387,345 Output: 253,736\nIf I maintain my recent high-intensity AI coding pace with DeepSeek v4 Pro: Daily cost: ~40 RMB (≈ $5.55); Monthly cost (30 days): ~1,200 RMB (≈ $166.50).\nIs this cheap? Compared to Copilot Pro+ ($39/month for 1,500 premium requests, where one request is, e.g., one Claude Sonnet 4.6 call), no.\nBut compared to Copilot’s post-June 2026 pricing (see my first blog), it’s a bargain.\nImportant Note: DeepSeek v4 Pro is currently offered at 25% of the regular price (a 75% discount) until May 31, 2026 (see screenshot below). After June, prices will revert to standard rates. I’ll share updated billing data in a follow-up blog to track real-world costs post-discount.\nWhat’s Next? 
In my next post, I’ll analyze Qwen 3.6 Plus’s AI coding costs.\nAfter that, I’ll dive into: 1. Token-saving strategies without sacrificing code quality. 2. Cost-cutting methods that don’t rely on reducing token usage. 3. Balancing affordability and reliability: how to save money while maintaining high code standards.\nA Shoutout to Google Gemini Today, I must praise Google Gemini. When I pasted an image asking for help, it returned a step-by-step guide image. Truly impressive! About Me I’ve worked at Tencent (8 years), Meituan (7 years), Baidu, and NetEase Games, leading large-scale R\u0026amp;D projects.\nNow, I’m building an AI startup, because uncertainty fuels innovation, and corporate roles breed complacency.\nReach out: mailto:HummingbirdLabs@outlook.com.\nLet’s discuss AI coding, cost optimization, or the future of LLMs.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-002/","summary":"\u003ch1 id=\"i-like-deepseek-and-qwen\"\u003eI Like DeepSeek and Qwen\u003c/h1\u003e\n\u003cp\u003eBefore May 2026, I had never used DeepSeek, Qwen 3.6 Plus, or any other Chinese LLMs for programming.\nAs readers of my previous blog might recall, I primarily relied on GitHub Copilot’s models, favoring Claude Sonnet 3.6 and Claude Opus 4.7 (a bit pricey; if you’re wealthy, pretend I didn’t say that). My secondary choice was GPT Codex 5.3.\u003c/p\u003e\n\u003cp\u003eSo when I first considered using DeepSeek or Qwen 3.6 Plus, I was skeptical, worried their code quality wouldn’t meet my standards.\u003c/p\u003e","title":"DeepSeek v4 Pro, Qwen 3.6 Plus, or Others: Which Should I Use?"},{"content":"I Love GitHub Copilot, But Its June 2026 Billing Changes Worried Me Let me be clear: I genuinely love GitHub Copilot. As a loyal user and Copilot Pro+ subscriber ($39.00 per month), the 1,500 premium requests per month shown below have been invaluable; they are the fuel behind my AI-powered coding workflow.\nI’d call this the most affordable, seamless token fuel for AI coding available. 
I’m deeply grateful to Microsoft for this service, though it’s disappointing that starting June 2026, billing will shift to a per-token model.\nUsing Microsoft’s Preview tool (https://copilot-billing-preview.github.com/), I analyzed my April and May statements. Under the new June pricing: 1. April would cost $141.04 2. May (through May 18) would cost $425.15\nThis is extremely expensive. Extremely expensive. Extremely expensive. (Yes, I’m repeating it three times.) I understand Microsoft’s pricing strategy: they rely on third-party models and lack full control over upstream LLM costs. Still, the jump is staggering.\nWhat’s Next? Given this, I’ll now explore more affordable large models that can handle complex coding tasks, like DeepSeek v4 Pro and Qwen 3.6 Plus. My next blog will compare their coding capabilities and cost efficiency. For developers deeply reliant on AI coding, tokens should feel as abundant and accessible as rain, not a luxury resource. Feel free to reach out: mailto:HummingbirdLabs@outlook.com.\nI’m happy to discuss AI coding tools, cost strategies, or stormy billing surprises. P.S. As a former engineer at Tencent (8 years), Meituan (7 years), Baidu, and NetEase Games, I’ve seen tech pricing shifts before. But this one stings.\n","permalink":"https://hummingbirdlabs.github.io/posts/post-001/","summary":"\u003ch1 id=\"i-love-github-copilot-but-its-june-2026-billing-changes-worried-me\"\u003eI Love GitHub Copilot, But Its June 2026 Billing Changes Worried Me\u003c/h1\u003e\n\u003cp\u003eLet me be clear: I genuinely love GitHub Copilot. As a loyal user and Copilot Pro+ subscriber ($39.00 per month), the 1,500 premium requests per month shown below have been invaluable; they are the fuel behind my AI-powered coding workflow.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"Figure 1\" loading=\"lazy\" src=\"/images/1-00.jpg\"\u003e\u003c/p\u003e\n\u003cp\u003eI’d call this the most affordable, seamless token fuel for AI coding available. 
I’m deeply grateful to Microsoft for this service—though it’s disappointing that starting June 2026, billing will shift to a per-token model.\u003c/p\u003e","title":"GitHub Copilot’s June 2026 Billing Changes: My April and May Statements (Preview) Shocked Me"}]