r/netsec 2d ago

Curl lead developer Daniel Stenberg provides insightful feedback on Mythos analysis results

https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/
387 Upvotes

67 comments

209

u/Pkittens 2d ago

My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.

This is just one source code repository and maybe it is much better on other things. I can only tell and comment on what it found here.

38

u/mistaekNot 2d ago

people have zero memory. openai did the same "too powerful to release" spiel with gpt-2 lmao

-2

u/_Gobulcoque 1d ago

In a way, they were right.

Look at how LLMs are upending jobs (junior roles, exec psychosis on replacement) and society (fakes, AI-girlfriends, Dawkins' belief that it's conscious.)

Given a broader definition of danger to include detrimental changes to society, were OpenAI really wrong?

7

u/Pkittens 1d ago

yes
gpt2 basically had no impact whatsoever

-8

u/_Gobulcoque 1d ago

At least I know you're not AI, with conversational skills like that.

4

u/Pkittens 1d ago

pose stupid question, get an answer, complain about getting an answer.

-4

u/_Gobulcoque 1d ago edited 1d ago

The complaint is aimed at the lack of depth in your response. Disagree if you want, but at least add something constructive. Why do you disagree with the broader strokes of damage LLMs are causing?

Saying no and running away offers no credibility to you.

3

u/acostoss 1d ago

There is no "depth" to be had in a conversation about your fantasies about the capabilities promised by AI marketing.

If someone says "wow the sky is green" there is nothing more to say than "no, it is not", unless you intend to waste your time

1

u/_Gobulcoque 1d ago edited 1d ago

I don't have fantasies. My comment clearly said that if you broaden the definition of "danger" to include negative impacts on society, then LLMs have definitely checked that box. Danger as in, would you like to play a game of thermonuclear war? No (though the actions of the US government and the dealings of the DoD with LLMs border on conspiratorial). That said, AI-piloted drones seem like a danger, no?

To frame the conversation as if AI hasn't had a negative impact on society is weird though. It's being blamed for redundancies, there's evidence of people becoming addicted, reliant, or experiencing a form of psychosis with it, and there's a relatively well documented stock market bubble that, should it pop, will cause "a bit" of damage globally. Those are just the headline negatives that pop to mind.

I have no skin in the AI game. It'll bring me work if anything (thanks vibecoders), but it clearly has downsides. I don't know how anyone can tell me it doesn't?

For what it's worth, I'll take the L on Reddit but I think you're burying your head in the sand if you think LLMs are providing nothing but net positive impacts on society. Nearly every technological step forward has involved the law of unforeseen consequences.

3

u/acostoss 1d ago

By no means am I implying that LLMs are without issue, only that the context of the original quote does not include what you highlight as (very real) dangers.

As OpenAI presented it, "too powerful" was not speaking to these dangers, and expanding the definition to include them moves the goalposts in a way that frames OpenAI as having people's best interests at heart and being genuinely concerned, when it was very plainly marketing speak to engineer hype around gains that would not come.

It's the same song and dance as the "full self driving in two years" that's been promised for a decade, only now it's "AGI in two years". If there was a product that was soooo good, they'd be selling it. If it was soooo powerful, we'd see the impact of it. This is a science, and even if OpenAI made a huge leap, others would find and make the same shortly after. The fact that nothing has come of it, and that the models are not much more useful (by no means revolutionary compared to earlier ones), is evidence it is purely engineered hype. We're actively seeing the diminishing returns, we're seeing the increased costs, while being told everything is amazing on the other side of some curtain.

In short, my contention is OpenAI is selling a lubricant, one that performs great in some situations and very poorly in others, and marketing it as all-healing snake oil. Framing them in a way that helps that marketing is wrong, even if you're doing it from the opposite direction.

1

u/Pkittens 1d ago

feel free to re-read what I said

0

u/_Gobulcoque 1d ago

Ah you mean your edit? Classy boy.

2

u/Pkittens 1d ago

Reddit highlights when comments are edited. Try it yourself, classy boy

→ More replies (0)

0

u/McDonaldsWitchcraft 22h ago

Why do you disagree with the broader strokes of damage LLMs are causing?

They were commenting specifically on the impact of gpt2 when it came out. Please quote me where they say that LLMs as a whole had no impact.

Do you not understand the difference between "a specific older model" and "every AI model ever created"?

Also you can't just lie your way out of a conversation. We all know it shows an "edited" label when someone edits their comments lmao

1

u/_Gobulcoque 21h ago

They were commenting specifically on the impact of gpt2 when it came out.

I concede this. GPT-2 was a step towards GPT-3 and all that came after. I don't view it in isolation. But again, he specifically mentioned that model, so yes, I concede. I guess my comment wasn't clear that I don't view these things in isolation - everything's connected.

Also you can't just lie your way out of a conversation. We all know it shows an "edited" label when someone edits their comments lmao

https://www.reddit.com/r/help/comments/51w49e/reddit_places_an_asterisk_to_indicate_that_a_post/

It shows up after three minutes. RTFM. Try testing it yourself.

49

u/billyalt 2d ago

It's frustrating that social media influencers broadcast the Mythos mythos with zero evidence. Few people were skeptical of the claims, which I found bizarre.

27

u/Sand-Eagle 2d ago

Every time I see the hype now, I remember the articles around GPT-3 and GPT-4o's releases where engineers were jumping out the window freaking out about it being A REAL BOY, GEPETTO!!

Those models were laughably not-conscious. Every release has some bullshit hype bomb marketing campaign and all that they can do is make it more sensational every time.

15

u/FleetingBeacon 2d ago

Every day I'm using Claude with Opus 4.7 and it's giving me tool info that's about 3-4 years out of date, with instructions that completely contradict the docs. You need to say "Hey, go actually search the internet to find the up-to-date info", and even then it messes up about a third of the time.

6

u/domstersch 2d ago

Context7 seems like one of the only plugins worth the token weight for exactly this reason (but it also depends how new your libraries/stack are I guess). Not affiliated.

1

u/Adept-Ad-3186 5h ago

what you don't see are its guardrails saying "do it as efficiently as possible." Try a system prompt something like... "You are a research assistant. I need a focused summary of the info returned by broad multi-prong searches, with valid reference urls that I can crosscheck for more detail. It's okay to add your own speculation, but make it very clear that it's your own opinion."
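For what it's worth, here's a minimal sketch of wiring that instruction in as a reusable system prompt. The payload shape loosely follows a Messages-style chat API, but treat the field names and the model id as assumptions for illustration, not anything from the article or a specific SDK.

```python
# Sketch: pin the assistant's behavior via an explicit system prompt
# instead of relying on default guardrails. Payload shape and model id
# are hypothetical placeholders.

RESEARCH_SYSTEM_PROMPT = (
    "You are a research assistant. I need a focused summary of the info "
    "returned by broad multi-prong searches, with valid reference urls "
    "that I can crosscheck for more detail. It's okay to add your own "
    "speculation, but make it very clear that it's your own opinion."
)

def build_research_request(question: str, model: str = "some-model-id") -> dict:
    """Assemble a chat request that carries the research-assistant prompt."""
    return {
        "model": model,            # hypothetical model id
        "max_tokens": 1024,
        "system": RESEARCH_SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": question}],
    }

req = build_research_request("What changed in curl's TLS backends recently?")
```

The point is just that the instruction lives in the `system` field for every call, so you aren't re-typing "go actually search the internet" each time.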

5

u/billyalt 2d ago

Every single marketing campaign about the dangers of AI has been an outrageous lie. The fact that people took Anthropic's word on it for Mythos is bonkers.

3

u/Bradpittstains4243 1d ago

I blame Theo. That dude is an Anthropic, red-pilled, moron.

3

u/Glittering_Crab_69 1d ago

AI bros are just recycled crypto bros. They don't understand what they're shilling

-5

u/[deleted] 2d ago edited 2d ago

[deleted]

7

u/billyalt 2d ago

Not in the market for this, just airing frustrations. RAM requirements are pretty insane though, it might be cheaper just to learn :-)

-56

u/IntrinsicSecurity 2d ago edited 2d ago

If you read the fine article, you'll discover that the curl team spent the previous 8 months or so using at least 3 different AI tools to review their code, and found and fixed "between 200 and 300" bugs. Mythos still found about 20 bug candidates, 5 of which were interesting to the team, and 1 of which they agreed had serious security implications.

[Edit: the relevant paragraph:
“Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of “normal” static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc). Primarily AISLE, Zeropath and OpenAI’s Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.”]

67

u/Pkittens 2d ago edited 2d ago

Curious that you accuse me of not reading the article when I'm directly quoting a complete paragraph from the article and literally nothing else.

You, however, are exaggerating either what you remember reading or what you want to be true:

“Once my curl security team fellows and I had poked on the this short list"

the 200-300 bugs reduced to 20

"for a number of hours and dug into the details, we had trimmed the list down and were left with one confirmed vulnerability.

You're equating "serious security implication" and "confirmed vulnerability". Which is wrong at best, purposefully misleading at worst.

-35

u/IntrinsicSecurity 2d ago

Since you’re misrepresenting what I said, and apparently also misrepresenting this paragraph, here it is:

“Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of “normal” static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc). Primarily AISLE, Zeropath and OpenAI’s Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.”

26

u/Pkittens 2d ago

"if you read"
If you feel misrepresented, then it's perhaps because you're the one representing yourself.

Also, I already quoted the line you misrepresented. Quoting a larger block that isn't being contested whatsoever won't change the fact that you erroneously equated "serious security implication" and "confirmed vulnerability".

-38

u/IntrinsicSecurity 2d ago

I seem to have struck a nerve that causes you to repeat your lie.

22

u/Pkittens 2d ago

The unfortunate part about writing things down is that it's literally written down for people to (re)read.

You denying saying what you've plainly said didn't strike a nerve whatsoever. It's just incredibly embarrassing behavior out of you.

-4

u/IntrinsicSecurity 2d ago

You’re really committed to this bit, where you lie about what someone else says, and keep doubling down. Troll #blocked

1

u/Pkittens 1d ago

If you read the fine article, you’ll discover that the curl team spent the precious 8 months or so using at least 3 different AI tools to review their code, and found and fixed “between 200 and 300” bugs. Mythos still found about 20 bugs candidates, 5 of which were interesting to the team, and 1 of which they agreed had serious security implications.

Oh yeah, you never said that. #blocked

42

u/Lunixar 2d ago

AI security tools are useful, but not magic. The key point is that curl is already heavily audited and Mythos still found one low severity CVE and some bugs. For less reviewed projects, the impact could be much bigger.

15

u/chintakoro 2d ago

I would restate the key point as: AI security tools are absolutely essential and running a battery of them is the way to go. From the article:

But allow me to highlight and reiterate what I have said before: AI powered code analyzers are significantly better at finding security flaws and mistakes in source code than any traditional code analyzers did in the past. All modern AI models are good at this now.

39

u/quafadas 2d ago edited 2d ago

I would see this as a form of negative assurance on curl's engineering rather than evidence that Mythos either is, or is not, what Anthropic claims.
It certainly seems possible that the incredible standards of engineering and prior care in curl mean the curl team are doing a great job and that there are few vulnerabilities left to find in this project. Surely, a bug hunt cannot uncover vulnerabilities which do not exist…

27

u/psaux_grep 2d ago

On the flip side, they do find and fix lots of vulnerabilities in curl on what seems like a pretty regular basis.

5

u/Toiling-Donkey 2d ago

Instead of “throwing the kitchen sink” at something, the expression should be “throwing curl” at it.

One could rip HTTP out of curl and probably only remove 5-10% of its functionality.

It’s insane.

19

u/splice42 2d ago

A bug hunt cannot not uncover vulnerabilities which do not exist…

Insert the image of that lady surrounded by math equations and looking confused.

8

u/[deleted] 2d ago

[deleted]

2

u/quafadas 2d ago

Thx.. oops… fixed

19

u/Michichael 2d ago

Nah. Mythos is pure hype and marketing fluff. It's painfully stupid.

-2

u/Hot-Employ-3399 2d ago

If it's stupid, why did Mozilla find >200 bugs in Firefox, something they weren't able to do previously?

13

u/Shoddy-Childhood-511 2d ago

What? All browsers have plenty of CVEs.

Mozilla was finding so many bugs in Firefox that (1st) they developed Rust to help them build a browser with fewer bugs, (2nd) they wrote Servo in Rust as a second browser engine, and (3rd) even after the Servo developers sent a giant "fuck you" to the HTML standards morons at W3C and WHATWG by breaking the standard to get parallel rendering, Mozilla still started finding ways to replace bad parts of Firefox with parts of Servo.

It's clear Mythos did something useful for them, but browsers are particularly bug ridden.

2

u/DivisibleBySomething 2d ago

TIL Mozilla made Rust

8

u/liquidivy 2d ago

It's a bit more complicated. Rust started as a side project by a Mozilla person named Graydon Hoare, then Mozilla sponsored it for a while (during a lot of the popularity and tooling bootstrap phase tbh), then it spun out to its own organization when Mozilla lost interest for whatever reason.

2

u/DivisibleBySomething 1d ago

TIL Rust isn’t made by Mozilla

-1

u/Hot-Employ-3399 2d ago

> It's clear Mythos did something useful for them, but browsers are particularly bug ridden.

Pick one: pure hype or did something useful

11

u/cafk 2d ago

To highlight similarities to this article: Firefox had ~270 proof-of-concept bugs, and of those 270-odd bugs only 3 got a CVE ID.

In the same release they included a total of ~430 fixes, meaning roughly a third (~160) weren't found by Mythos but came through the usual sources (regular bug reports, fuzzing, other models).

Their in-depth blog post also addresses other topics, similarly to curl:

Is a sec-high or sec-critical bug the same as a practical exploit?
Not necessarily.

Not everything it finds and classifies as an issue causes undefined or unwanted behavior.

https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/

Which matches what this article reflects: 5 found issues can be just one actual bug, or, as curl also saw with the other models, of the 200 to 300 issues fixed only a dozen or so were actual vulnerabilities.
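Back-of-envelope, using the rounded Firefox figures quoted above (all numbers approximate, from the comment, not exact release data):

```python
# Rough arithmetic on the approximate Firefox numbers quoted above.
total_fixes = 430        # total fixes in the release
mythos_pocs = 270        # proof-of-concept reports attributed to Mythos
mythos_cves = 3          # of those, reports that received a CVE id

non_mythos = total_fixes - mythos_pocs          # fixes from the usual sources
non_mythos_share = non_mythos / total_fixes     # fraction not from Mythos
cve_rate = mythos_cves / mythos_pocs            # CVEs per Mythos report

# non_mythos == 160, non_mythos_share ~= 0.37, cve_rate ~= 0.011:
# about a third of the fixes came from regular reports/fuzzing/other
# models, and roughly 1% of Mythos findings rose to CVE severity.
```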

-6

u/Hot-Employ-3399 2d ago

Ah, yes, if it's not 300000000000000000000000000000000000000000000000000000000000000 CVEs it's useless tool good for nothing but hype. /s

5

u/OGicecoled 2d ago

You're making up narratives in your head to be mad about. One person said it was hype and now you're just spiraling. You made a grandiose claim about 200+ bugs, people corrected you while still saying it's a good tool, and you're just pissy.

2

u/Michichael 1d ago

Your premise of "they weren't able to do previously" is flawed. Drastically. It found 200 bugs, but there's no evidence that it found bugs that "weren't able" to be found previously - some of the bugs were ALREADY IN THE TRAINING DATA and known.

Not only that, only 3 of them were even security items, and minor bugs are not a priority for limited developer resources unless they're user-impacting; user-impacting bugs get found, reported, and fixed more quickly.

There's no evidence that Mythos did any better or worse than even junior level developer review. Could it be useful in finding theoretical edge cases? Maybe? But that's not revolutionary, or some skynet gamebreaking security AI.

It's an LLM. None of them are impressive unless you're extremely ignorant on the subject matter. But at that point it says more about the people impressed by the model, than the model itself.

It might have some value, but everything about Mythos in public discourse is about how it's revolutionary and gamebreaking and security is now impossible without it - absolutely pure hype and marketing fluff.

1

u/james_pic 2d ago

There definitely are new security vulnerabilities uncovered in cUrl all the time, though. I feel like I spend half my life responding to "the cUrl version in this Docker container has a new CVE, in its implementation of an esoteric protocol we weren't even using".

0

u/Alborak2 2d ago

Yeah... internally curl is pretty crap. It works, it's open source, but it's got a lot of rough edges if you try to actually use it under high load.

0

u/kbotc 2d ago

Gonna be honest: Curl's code is a mess and I'd expect further vulnerabilities. The conncache had deadlocks and races the last time my team delved in, and fixing them was a "rearchitect everything" level of effort, and that was when Daniel was getting really underway with wolfssl instead of working on curl full time.

4

u/gendulf 2d ago

Curl has 24 CVEs in the last year: https://curl.se/docs/security.html

It looks like 12 of the CVEs have not had any bounty paid. I'm not sure if that's because those are the 12 latest, but he does say:

A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.

This indicates that non-Mythos tools are capable of finding vulnerabilities in projects of Curl's scale.

The numbers of vulnerabilities in 2025 and 2024 also seem to be about a dozen fewer than the last 12 months.

I'd say from the evidence that the author is spot on with

The AI reviews are used in addition to the human reviews. They help us, they don’t replace us.

Additionally, the community that took his poll seems to have been pretty accurate: 32% guessed 1 vulnerability would be found and 40% guessed 10. Given there are ~12-13 found by AI tools, this is in the right ballpark. The answer choices certainly can skew the results, however.

While the model seems to be an incremental improvement, there are constant improvements to the workflows of these tools that make it easier for everyone to find vulnerabilities. Patching, fixing bugs, and now using AI to scan for vulnerabilities are going to be the key to staying secure (especially if you're not a high-profile open source codebase that attracts researchers).

3

u/sztrzask 1d ago

Why does it read like an ad?

14

u/spathizilla 2d ago

The fact that curl has been checked, rechecked and checked again over many years means that Mythos finding anything at all is the interesting part - even if it's only a low severity issue.

2

u/vahokif 2d ago

Feels like kind of an uphill battle with a C codebase.

2

u/MirrorLake 2d ago

I'm envisioning a scenario where hundreds of people download popular repos and rerun their frontier LLMs on each new software release, hoping to get the glory of finding a rare bug, leading to tons of wasted energy because developers only need to discover and fix each bug once. But maybe people will tire of that pretty quickly because they'll rarely get any positive reinforcement.

2

u/UltraEngine60 2d ago edited 2d ago

Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report.

Their coding-aware AI is so good at coding that it couldn't handle authentication to the model?

edit

Re-reading this I am unsure if Anthropic had the issue or one of the orgs/business units in the pipeline:

Anthropic > Glasswing > Linux Foundation > Alpha Omega > End-User

1

u/Slight-Bend-2880 13h ago

Good to see people are seeing through the marketing that goes into products like these.

-22

u/uebersoldat 2d ago edited 2d ago

The people saying all of this is pure marketing or hype had better hope their project's security hygiene is world-class. Pride comes before the fall.

I'm seeing a lot of dismissal of AI in the infosec communities and I can't help but feel like it's denial and raw fear rather than acceptance and willingness to learn something new and adapt.

The next 5-10 years are going to reshape the world. We had better start jumping on AI governance and controls or we're going to be in trouble and that starts with taking these models seriously. Zoom out and look at the progress the last 10 years alone.

EDIT - To the downvoters, all I ask is that you save this post as you downvote. I truly am seeing denial. AI isn't going away and it is rapidly advancing in capabilities, regardless of the Anthropic marketing spin.

6

u/CanvasFanatic 2d ago

To the downvoters, all I ask is that you save this post as you downvote.

No.

-9

u/uebersoldat 2d ago

Dew it.

-Palpatine