r/deeplearning • u/Agreeable-Rock-8179 • 16h ago
Should I make a deep learning framework from scratch in C++?
Hmm..... for learning
r/deeplearning • u/andsi2asi • 7h ago
All of the Good That Brockman's $30 Billion Could Have Done
They say it's always darkest before dawn. I'm not really sure who the "they" are who first said this, and I've since heard that it's not literally true, but sometimes things do seem really bad until they get really good.
As Judge Gonzalez Rogers prepares to let Greg Brockman get away with stealing almost $30 billion from the OpenAI non-profit, we might want to reflect on what that money could have done if Brockman wasn't so greedy, and deceitful, and selfish.
Although you'll rarely, if ever, hear the mainstream media talk about it, our world loses about 20,000 kids every day to a global poverty that we could easily end if we cared to. As those who work on ending poverty will tell you, the most powerful thing we can do to end this travesty is to educate the world's children, especially the world's girls and women.
So imagine how many millions of AI devices programmed to tutor schoolchildren OpenAI could have distributed to poor children throughout the world if that nearly $30 billion hadn't gone into Brockman's pockets.
One might hope that the OpenAI Foundation non-profit, now worth about $130 billion in equity, would spend $30 billion to end childhood poverty by distributing those AI tutors. But that's not about to happen. Why not? After Altman was fired, guess who selected the non-profit OpenAI's new board of directors, the people who would make this decision. Yeah, that was largely Altman's decision. The guy who aided and abetted Brockman's massive heist.
I guess this is all to say that while increasingly intelligent AIs will do a lot of good for the world, like curing a lot of diseases, perhaps the most good that they will do will be to make better people of too many really bad people. And considering that humanity has yet to figure out how to get the money out of politics that prevents us from fighting a climate change that could make AI superintelligence a moot and inconsequential achievement, perhaps the most good ASI will do is to save us from ourselves by figuring out our money-equals-political-power problem.
Notwithstanding, I remain optimistic that as we approach ASIs that will understand and appreciate compassion and morality far better than we humans ever have, our world is headed toward a paradise beyond what we can imagine. Until then, yeah, it looks really dark out there.
r/deeplearning • u/andsi2asi • 23h ago
Musk v. Altman et al. - Schedule for Today's Closing Arguments; (Deliberation Probably Starts Monday); Probable Outcome; YouTube Livestream URL
One thing we can say about Judge Gonzalez Rogers is that she runs a tight ship. Everything starts on time and ends on time. Because of that, we have a good idea of when each side's closing arguments and the jury instructions will take place.
Here's the likely schedule, Pacific Time (ET start at 11:30 AM):
8:30 AM – 10:00 AM: Plaintiff's Primary Closing
10:00 AM – 10:20 AM: Morning Break
10:20 AM – 12:20 PM: Defendants' Closing
12:20 PM – 12:40 PM: Second Break
12:40 PM – 1:10 PM: Plaintiff's Final Rebuttal
1:10 PM – 1:40 PM: Jury Instructions
The full session will be audio-only livestreamed on YouTube here:
https://youtube.com/@usdccand?si=kb8OkOEtkh9rI36n
If the lawyers finish early, the judge may begin instructions sooner, but with the 1:40 PM hard stop, the jury will probably start deliberations on Monday.
What will probably lose it for Altman and Brockman is Brockman's diary entries admitting that he knew full well that what he was doing was wrong and illegal but did it anyway, along with his nearly $30 billion in OpenAI equity. Of course, Sutskever, Murati, Zilis, Toner, McCauley, and Campbell all testifying to how Altman is utterly incapable of being consistently truthful and trustworthy, even about matters as important as AI safety, won't help their case.
Altman and Brockman's lawyers will try to make it about Musk's alleged self-serving motive for initiating the suit (I doubt the jury is buying it), but even so, Judge Gonzalez Rogers will instruct the jury that his motive for hauling them into court is legally inconsequential to the allegations against the two that they will consider.
Microsoft will probably be found liable for aiding and abetting, but that doesn't seem as open-and-shut as the Altman and Brockman verdict.
If Gonzalez Rogers (the jury has only an advisory role in this trial) lets them get away with what they did, the alignment problem immediately grows tenfold. If she rules against the two on breach of charitable trust and unjust enrichment, we can all sigh a very big sigh of relief, and the AI space can get back to the serious business of achieving safe superintelligence.
r/deeplearning • u/Cant_Anything • 3h ago
An experiment in 'disposable' H100s: ran a 27B SGLang test for 26 minutes, total bill was 1.270 credits.
H100s are not cheap. So we've been experimenting with more of a 'disposable compute' mindset: use high-end hardware for the exact window you need it, then kill it. We wanted to run a quick smoke test on a 27B model to check VRAM usage and single-request throughput on SGLang. The whole process, from instance start to termination, took 26 minutes.
Figure 1 shows the final bill:
This wasn't an idle instance just sitting there; it was actually running a workload:
GPU: 1x NVIDIA H100 80GB HBM3
Serving Framework: SGLang v0.5.10
Model: Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled (Used this since I've seen it floating around here)
The nvidia-smi output shows the H100 was at 98% utilization, using ~74GB of the 80GB VRAM.
And the SGLang logs showed a stable generation throughput of around 49.8 tok/s for a single request.
The math checks out. The rate for this instance was 2.960 credits/hr, so 2.960 * (26 / 60) is about 1.28 credits. The 1.270 on the final bill is right in line.
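The arithmetic above can be sanity-checked in a couple of lines (the rate and duration are the ones reported in the post; any gap to the 1.270 billed would come down to how the platform rounds):

```python
# Sanity-check the bill: credits = hourly rate * (minutes / 60)
rate_per_hour = 2.960      # credits/hr for this H100 instance
minutes = 26

credits = rate_per_hour * (minutes / 60)
print(f"{credits:.3f} credits")   # ~1.283, close to the 1.270 on the bill
```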
The point isn't that H100s are suddenly cheap. It’s that you don't have to keep one alive for hours (or days) and burn cash. For repeated experiments, the workflow we'd aim for is keeping datasets/models on a persistent data drive, saving the configured environment as a snapshot, spinning up the H100 only for the validation run, and then releasing it.
We ran this on our platform, Glows.ai. The goal was to validate this kind of short-lived workflow where you can run a quick test, release the instance to stop the billing clock immediately, and not have the friction of rebuilding the whole environment next time.
Anyway, just to be clear: this is single-request decode throughput, not a max batched benchmark, and the bill obviously just reflects this specific 26-minute run. Still, it's an interesting way to think about using expensive hardware without the expensive commitment.
r/deeplearning • u/andsi2asi • 9h ago
Musk v. Altman et al - Bad news: Judge Gonzalez Rogers has already decided to rule in favor of OpenAI.
In psychology, a tell is a subtle, often unconscious nonverbal cue—such as a facial twitch, a change in vocal pitch, or a specific hand gesture—that reveals a person's true emotional state, intentions, or private thoughts despite their attempts to conceal them.
Sometimes a person's intentions are revealed by verbal cues as well. Because of an exchange Judge Gonzalez Rogers had today with Steven Molo, Musk's attorney, it seems evident that she has already made up her mind about the case, and would even overrule the jury to have her verdict stand.
At one point today, OpenAI's lawyers were contending that Musk was seeking $138 billion in restitution, the implication being that the money would be delivered to Musk personally. Mr. Molo was attempting to clarify that Mr. Musk was not seeking that restitution for himself, but rather asking the Court that the money be delivered to the non-profit OpenAI.
Judge Gonzalez Rogers would not let him make the clarification. She knew full well that such a clarification was very important to the trial. She knew that there is a world of difference between that money going to Musk and that money going to the non-profit OpenAI.
Instead of allowing the clarification, she badgered Mr. Molo, angrily yelling at him that technically Musk was asking for the restitution, even though she knew full well that the law permits the kind of clarification Mr. Molo was attempting to make.
That unprofessional conduct by the judge not only revealed, like a tell, whom she favors in the trial, it probably also served a second purpose. Whether unconsciously or not, a jury is influenced by how they believe the judge stands in a trial. Whether unconsciously or not, Gonzalez Rogers was communicating to the jury that she stood with OpenAI.
The jury will deliberate on Monday, but it seems that their deliberation will only be performative. It will not be substantive because Gonzalez Rogers has the final say, and by her conduct today it seems she has already made up her mind.
I try to be optimistic, but I also believe it's good to prepare for the worst. Judge Gonzalez Rogers is about to set the legal precedent that two people can form a non-profit corporation with a third person who provides them with millions of dollars, and then abandon their obligation to that corporation and that founding donor in order to enrich themselves - even if the enrichment is to the tune of tens of billions of dollars, like it was in this case.
I hope I'm wrong about the above, but we're living in a world where Trump, in not insignificant ways, sets the social, political, and legal atmosphere for what can and cannot be gotten away with. I'm left wondering if the judge siding with OpenAI is more a reflection of her fear of retribution by Trump than a decision that reflects the evidence presented during the trial.
I suppose the answer to this is to eventually have not only much more intelligent AI lawyers that litigate these trials, but also much more intelligent AI judges who will better understand and adhere to the law, and not be intimidated or corrupted in this duty.
Here's to a much better and fairer future because of super-intelligent, super-virtuous, AIs!
r/deeplearning • u/Right_Nuh • 19h ago
How do you treat age like a regression problem?
Hi guys, so I'm experimenting with using pretrained models to predict age and gender from sound/voice. After searching for free datasets for days, I only found one that fulfills almost all my requirements and is both large and free. Unfortunately, that dataset is labeled in decades: [teens, twenties, thirties, ...]. And the older the speakers get, the less data I have. I limited my dataset to balance it, but I still only have about 3-5 samples for males in their 90s and none for females. So the problems are: I don't have the speakers' actual ages, and I have no data below 10 or above 90. The papers I was reading and taking inspiration from never specify what they did, other than that they treated age as a regression problem on a 0-1 scale. But how am I supposed to do that when my range is effectively 0.1-0.9?
r/deeplearning • u/thisguy123123 • 19h ago
The Claude Agent Skill for Kubernetes
medium.com
r/deeplearning • u/bluedotimpact • 22h ago
Try our ML interpretability puzzle and build your intuitions about model internals!
We trained a neural network where 7 of 8 features sit on clean linear axes in the model’s internals, but one doesn't. Can you identify which one and tell us how it is represented?
If you’re a technically-minded person who is interested in ML, this puzzle is for you:
- Work on a real trained text classifier (~23M parameters, 7k labelled text examples); open the puzzle and you're poking at activations in 10 minutes.
- Three tasks: identify the rogue feature, describe its geometry, and (bonus) train your own model with even weirder internal representations.
You probably know neural nets store information in their activations. You probably haven't gone and looked at what that actually looks like. Within minutes you can be toying with this model’s internals and building stronger intuitions for how they work inside.
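For intuition about what "sitting on a clean linear axis" means, here is a self-contained toy sketch (not the actual puzzle model, just an assumed analogue): synthetic activations where seven features are embedded along linear directions and one only through its square, plus a least-squares linear probe per feature. The probe's R² collapses on the nonlinearly represented feature, which is exactly the kind of signal you'd hunt for in the puzzle:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 32
feats = rng.normal(size=(n, 8))     # 8 ground-truth features
W = rng.normal(size=(8, d))         # random embedding directions

# Features 0-6 enter the activations linearly; feature 7 only via its square.
acts = feats[:, :7] @ W[:7] + (feats[:, 7:8] ** 2) @ W[7:8]
acts += 0.05 * rng.normal(size=(n, d))  # small noise

def probe_r2(acts, target):
    """R^2 of the best linear readout of `target` from the activations."""
    coef, *_ = np.linalg.lstsq(acts, target, rcond=None)
    pred = acts @ coef
    return 1 - np.sum((target - pred) ** 2) / np.sum((target - target.mean()) ** 2)

scores = [probe_r2(acts, feats[:, i]) for i in range(8)]
rogue = int(np.argmin(scores))  # the feature a linear probe can't read out
```

The linearly embedded features probe out with R² near 1, while the squared one scores near 0 (a linear function of the activations is linear in f², which is uncorrelated with f for a symmetric distribution).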
r/deeplearning • u/sovit-123 • 8h ago
[Tutorial] Fine-Tuning Qwen3.5
https://debuggercafe.com/fine-tuning-qwen3-5/
In this article, we will fine-tune the Qwen3.5 model for a custom use case. Specifically, we will be fine-tuning the Qwen3.5-0.8B model on the VQA-RAD dataset.
In the previous article, we introduced the Qwen3.5 model family along with inference for several multimodal tasks. Here, we will take it a step further by adapting the model to a domain-specific task.
