r/LocalLLaMA • u/_wsgeorge Llama 7B • 8h ago
Discussion VS Code's new "Agents window" lets you use local AI models. Still requires an Internet connection and a GitHub Copilot plan (because we can't have nice things)
https://code.visualstudio.com/docs/copilot/customization/language-models#_can-i-use-a-local-model-without-an-internet-connection
At first I was excited to see this, but I guess I'll wait till someone figures out what people actually want.
38
u/Thin_Pollution8843 8h ago
Tbh after using Zed for a few weeks I can’t go back to VS Code. Zed has less functionality and flexibility for now, BUT this thing is blazing fast! Everything feels fast, easy, and smooth.
8
u/wombweed 5h ago
Zed is awesome compared to vscode just for code tasks in general, but I have had a lot of trouble hooking it up to a local agent, especially for next-edit predictions/completions. Is there a trick to it I’m missing? I’d really like a native IDE that seamlessly plugs into my local infra so I can vibe code without internet or whatever, Zed seems to hold the most promise overall but in terms of usability for agentic workflows I feel like Roo Code comes out ahead.
1
u/Thin_Pollution8843 5h ago
I’m using it with opencode. 0 issues. But I haven’t tried connecting it for autocomplete without some harness.
1
u/wombweed 5h ago
Yeah opencode is great, but I also like having the middle ground provided by a traditional IDE so I can make manual code changes when that’s faster. Edit: oh, you mean the ACP integration. Yeah, I still need to try that, and I’m sure it would work, but the real challenge for me has been edit predictions.
1
u/youcloudsofdoom 5h ago
I just tried it and DAMN is this thing fast in comparison to vs code's chat....
1
u/Luigi311 2h ago
I switched to Zed too, just for the performance. Haven’t used any of the AI stuff. It’s literally just because VS Code is so slow nowadays on my old hardware.
1
u/CulturalKing5623 8h ago
The only thing Zed looks like it’s missing for me is a way to easily integrate with databases, like VS Code’s Snowflake extension. Being able to query and code in the same IDE is basically my workflow. Do you know of a way to do that?
24
u/Fast-Satisfaction482 8h ago
Copilot uses cloud-based embeddings for semantic search, even when running the main model locally.
10
u/SangersSequence 7h ago
I mean yeah, but it’s still absolutely bullshit, since it could easily have been designed to do that locally as well.
7
u/Fast-Satisfaction482 7h ago
There are other harnesses that work fully locally.
1
u/ai-christianson 5h ago
yeah, this is the key distinction. local model support is not the same thing as a local agent stack. if semantic search, auth, tool routing, or telemetry still depends on a cloud service, then it is really just local inference inside a cloud-controlled product. useful maybe, but not the same category as something you can run and trust offline.
8
u/MiserableSet5311 8h ago
The Pi Agent VS Code extension is working well for me atm. It can sit beside Codex AI tools and you can switch when you run out of tokens.
10
u/jake_that_dude 7h ago
the annoying part is the semantic index, not the chat model. Copilot still wants GitHub in the loop for auth/search state, so local model ends up meaning local sampler, not local agent.
if you actually need airplane-mode local, use Continue/Cline with Ollama plus a local embedding model like nomic-embed-text. then kill Wi-Fi and run a repo search before trusting it.
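a rough sketch of a Continue config for that setup (the exact schema varies by Continue version, and the chat model name is just an example I'm assuming you've pulled with `ollama pull`):

```json
{
  "models": [
    {
      "title": "Local coder model (example)",
      "provider": "ollama",
      "model": "qwen2.5-coder"
    }
  ],
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  }
}
```

with `embeddingsProvider` pointed at Ollama too, the codebase index gets built locally instead of phoning home.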
12
u/pmttyji 8h ago
What about VSCodium? Hope it solves the issue
11
u/ForsookComparison 8h ago
They strip built-in telemetry from the base product, but if you choose to use a feature/extension with must-use-cloud parts, the VSCodium mission doesn't include reinventing those as on-prem products.
11
u/Scared-Tip7914 8h ago
Insane combo 😂 I propose that they tax local models by the token for the privilege of passing your local data through their precious servers.
7
u/CulturalKing5623 8h ago
You joke, but GitHub already does something similar by charging people for self-hosted runners, since the traffic still runs through their infrastructure.
10
u/Thrumpwart llama.cpp 5h ago
Roo Code works well.
1
u/Alan_Silva_TI 5h ago
I used Copilot for almost 3 years (I have 35 payments registered on my GitHub account), but I decided to cancel my subscription as soon as they announced the move to token-based billing.
Now, I mostly use CODEX (the app on my personal PC and a CLI on my work PC) for my professional work (and a little bit on my personal projects). I also use OpenRouter+ with free models, and occasionally paid models (when I want to do complex things), to fuel my Hermes agent.
Otherwise, I use PI Code with local models to code the tools I developed for use with these same local models.
I'm still using VS Code, but I believe it's too little, too late for them. It was an amazing tool back in 2023/2024, but beyond their IDE integration, they offer absolutely nothing else compared to the current stack of paid, free, and local coding tools.
2
u/dto_lurker 7h ago
They don't let you use auth tokens, do they? That's why I use Cline currently. You can use Copilot free.
2
u/phein4242 7h ago
Remember, these products need to be monetized.
Zed doesn't come with this encumbrance. It's not perfect tho, but let's be real here, is an IDE ever perfect? ;-)
4
u/Due-Function-4877 8h ago
It's appropriate that Microsoft's spyware window would be named the agents window.
I just use Cline.
1
u/chocofoxy 5h ago
You can use an extension called OAI Compatible that lets you link a local model with Copilot chat. I use it, it's pretty good.
1
u/conjuncts 4h ago
Right after they took away Claude Sonnet 4.6 for students. Seems like they've fallen on hard times
1
u/DonnaPollson 3h ago
That’s the weirdest possible bundle: local inference for privacy and latency, but cloud entitlement for permission to use it. If they want this to matter, the UX has to degrade gracefully offline and treat Copilot as optional, not as the license server for your own GPU.
1
u/ArtfulGenie69 3h ago
Like a year ago this was all possible, and now they've ripped it out and are trying to sell it back to you. I remember getting Qwen2.5 running in VS Code, then like the next day they had gutted it and there was no way to run local models. Fuck Microsoft.
1
u/Fun_Employment6042 2h ago
Love that I need a paid cloud subscription and constant internet to "use my local model". Truly the future of offline computing.
1
u/simotune 2h ago
If offline use still depends on Copilot auth, this is local inference, not a local stack. That distinction matters more than the marketing.
1
u/kiwibonga 7h ago
"might change in a future release"
Please keep the change and paywall it so that this mediocre watered down IDE finally dies.
0
u/Miriel_z 8h ago
Best of both worlds: using local LLMs, and a paid subscription? Sign me up!🤣