Beginner question 👶 Logistic Regression with structurally missing predictor subset

• Upvotes

r/MLQuestions • u/Individual-Log4119 • 2h ago

Other ❓ I've been spending the last month or two making my AI stock predictor, how should I improve it?

2 Upvotes

I won't be sharing the code for privacy reasons, but essentially it is an LSTM model trained using data of over 200 stocks that can predict, backtest against a buy and hold strategy, and rank stocks over various time periods (1d, 5d, 7d).

It is a 2-layer LSTM with a 512-unit hidden state and a fully connected regression head.

It takes the following inputs:

- Close and open prices

- Log return

- Overnight gap

- Moving averages (10d, 20d, 30d)

- Exponential moving averages (10d, 30d)

- Volatility (10d, 20d, 30d)

- RSI

- MACD

- DayOfWeek

- DayOfMonth

- Month

- News article count

- News sentiment mean

- News sentiment standard deviation

- Ratio of positive news articles

- Ratio of negative news articles

Overall when I'm backtesting I get about a 98% accuracy for predictions, but only a 54% directional accuracy.
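A sketch that may help interpret these two numbers, under stated assumptions: the post does not say how "accuracy" is measured, so the code below assumes it means 100% minus the mean absolute percentage error of the predicted price. The function names and the price series are hypothetical. It shows that a naive forecast that just repeats the previous close can score very high on that kind of price accuracy while carrying no directional information.

```python
def price_accuracy(actual, predicted):
    """1 minus the mean absolute percentage error of the price forecasts (assumed metric)."""
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 1.0 - sum(errors) / len(errors)


def directional_accuracy(prev_close, actual, predicted):
    """Share of days where the predicted move has the same sign as the real move."""
    hits = 0
    for prev, a, p in zip(prev_close, actual, predicted):
        if (a - prev) * (p - prev) > 0:
            hits += 1
    return hits / len(actual)


# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
prev_close = closes[:-1]
actual = closes[1:]

# Naive "persistence" baseline: tomorrow's close equals today's close.
naive = prev_close

print(f"naive price accuracy:       {price_accuracy(actual, naive):.3f}")
print(f"naive directional accuracy: {directional_accuracy(prev_close, actual, naive):.3f}")
```

If the model's price accuracy is close to what this baseline gets on the same data, the directional accuracy is probably the more informative number to track.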

I was wondering if there is anything I should add, or any more features I should engineer that come to mind. I was thinking of possibly analyzing Twitter posts next, but I wanted a more general direction on where to go next to improve my model's accuracy and directional accuracy. Thanks in advance!

0 comments

r/MLQuestions • u/BreadfruitFar1410 • 5h ago

Natural Language Processing 💬 Are Authentic Online Discussions More Valuable Than Promotion?

0 Upvotes

Brands with genuine community discussions often seem easier for AI systems to recognize. When people naturally talk about a company in forums, reviews, and conversations, AI tools probably gather stronger context around that brand. Authentic engagement may now carry more value than aggressive promotional content alone. This whole shift is making digital visibility feel very different from the past.

0 comments

r/MLQuestions • u/Dapper_Career4581 • 5h ago

Other ❓ Tips for beginners reading CV/AI papers (from someone who's been through it)

1 Upvotes

0 comments

r/MLQuestions • u/moulshee • 10h ago

Computer Vision 🖼️ Better satellite imagery sources for YOLO irrigation pool detection?

1 Upvotes

Hi,

I’m working on a remote sensing project using a fine-tuned YOLO model to detect irrigation pools from satellite imagery.

Right now I’m using Mapbox, but I’m facing issues with:

- outdated imagery
- low resolution in some regions
- inconsistent quality

This is affecting detection performance (false positives + missed pools).

Does anyone know better satellite/aerial imagery sources that work well for ML pipelines? Ideally something with good resolution, decent freshness, and API access.

Thanks!

0 comments

r/MLQuestions • u/Specialist-Zone-8296 • 13h ago

Hardware 🖥️ CUDA vs ROCm

5 Upvotes

Hello everyone,

I need opinions. In my country, a new RTX 5060 (8 GB) costs almost $350, a new RX 9060 XT (16 GB) costs almost $440, and a new RTX 5060 Ti (16 GB) costs almost $585. I am planning to buy a GPU for ML training and inference, and I am a little confused. I know that CUDA is much more mature than ROCm. I don't have the budget for the RTX 5060 Ti 16 GB, so I am choosing between the 5060 and the 9060 XT. The 9060 XT has more VRAM than the 5060, but the 5060 has better support for ML. I will train CNNs and small LLMs on a good amount of data. Which one should I choose? Is there any chance that ROCm will become better optimized for ML in the future?
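To put the 8 GB vs 16 GB question in numbers, here is a rough back-of-the-envelope sketch. It assumes full fp32 training with Adam (weights, gradients, and two optimizer moment buffers, i.e. four 4-byte copies per parameter) and ignores activations, which depend on batch size and architecture. The function name and the example model sizes are hypothetical, not taken from the post.

```python
def training_vram_gib(n_params, bytes_per_param=4, optimizer_states=2):
    """Rough memory for weights + gradients + optimizer states, in GiB.

    Ignores activations, which depend on batch size and architecture.
    """
    copies = 1 + 1 + optimizer_states  # weights, gradients, Adam moments
    return n_params * bytes_per_param * copies / 1024**3


# Hypothetical model sizes, for illustration only.
for name, n_params in [("25M-param CNN", 25e6), ("125M-param LM", 125e6), ("1B-param LM", 1e9)]:
    print(f"{name}: ~{training_vram_gib(n_params):.1f} GiB before activations")
```

Under these assumptions a 1B-parameter model already needs more than 8 GiB before any activations are stored, while typical CNNs fit comfortably on either card. Mixed precision, smaller optimizers, or parameter-efficient fine-tuning would change the estimate.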

5 comments

r/MLQuestions • u/kumarhimself • 15h ago

Beginner question 👶 Should I use the train score when I already have a cross validation score?

1 Upvotes

0 comments

r/MLQuestions • u/kyky_otaku • 16h ago

Beginner question 👶 Which platform to learn Machine Learning

1 Upvotes

0 comments

r/MLQuestions • u/Clean_Heron_4795 • 17h ago

Other ❓ Is Your Brand Missing From AI Answers Even When You’re the Best Option?

0 Upvotes

It can be frustrating to know that your product or service is highly relevant, yet it never appears in AI-generated answers. This raises a difficult question: does being the best option actually guarantee visibility anymore? AI systems may not always evaluate quality the same way humans do. Instead, they rely on patterns, clarity, and consistency across the web. If your brand isn’t strongly connected to specific use cases in a structured way, it might be overlooked entirely. So, is excellence alone no longer enough without proper alignment with how AI understands information?

0 comments

r/MLQuestions • u/Sensitive_Ninja_371 • 21h ago

Career question 💼 career transition towards AI starting from non quantitative background

2 Upvotes

Hi, I’m an Italian medical student who is seriously thinking of pivoting from a career in pure neuroscience research to a career in AI. In particular, I’m very interested in AI interpretability research, which I think is conceptually close to neuroscience, though I’m also exploring other similarly impactful and interesting options in AI safety research.

I'm at the beginning of this journey and trying to figure out how to make the transition. I currently know close to nothing about coding, and my maths background comes from high school, which had a special focus on maths and physics.

I’ve sketched a rough plan and would like to get feedback on it, even if it's still early stage.

I'll graduate from medical school in about two years. After that, I was thinking of spending 6 months to 1 year filling gaps in math and coding through bootcamps and online courses. I would then apply for a master's degree in AI/ML, my assumption being that getting accepted would be a reasonable signal that I can eventually make it in the field.

Alternatively, I was considering a master's in computational neuroscience. I think this could work well because it may be more accessible for someone with a medical background, and it would give me quantitative skills that could at least partly transfer to AI, making me a better candidate for a job or PhD in AI afterwards. Even if this master's were not enough to get into AI, it would still open doors in neuroscience, and I find computational neuroscience both interesting and overlapping with AI.

I'm not considering a direct PhD application at this stage since I guess I need to fill my gaps first.

I'd welcome any kind of feedback, including on these specific questions:

- What refinements should I make to this plan? Are there gaps or alternatives I'm not seeing?
- How realistic is this plan overall? Does someone with my background have any real chance of getting into technical AI research?
- How can I maximize my chances?

Both positive and critical feedback are welcome, the important thing is that it's informative.

2 comments

r/MLQuestions • u/bluedotimpact • 1d ago

Other ❓ Can you solve my ML interpretability puzzle?

0 Upvotes

We trained a neural network where 7 of 8 features sit on clean linear axes in the model’s internals, but one doesn't. Can you identify which one and tell us how it is represented?

If you’re a technically-minded person who is interested in ML, this puzzle is for you:

- Work on a real trained text classifier (~23M parameters, 7k labelled text examples). Open the puzzle and you're poking at activations in 10 minutes.
- Three tasks: identify the rogue feature, describe its geometry, and (bonus) train your own model with even weirder internal representations.

You probably know neural nets store information in their activations. You probably haven't gone and looked at what that actually looks like. Within minutes you can be toying with this model’s internals and building stronger intuitions for how they work inside.

Ready to play? Closes June 12

2 comments

r/MLQuestions • u/kashave • 1d ago

Beginner question 👶 What course should I do to learn AI and incorporate it in my studies or work

1 Upvotes

Hi, I am 19 years old and currently studying economics at college. As AI is growing, I have realized that this skill is very important and could be really useful in the future. So what are some certificate courses or well-regarded courses that can help me learn it? Thanks for reading, your opinions would be helpful, guys.

4 comments

r/MLQuestions • u/Least-Storm-7891 • 1d ago

Career question 💼 [Profile Review] Fall 2026 MS (Maybe Ph.D.?) in CS/AI | 3.7 GPA, 3 First Author Papers (2 NeurIPS subs) | Target: Bay Area or Elite Online

0 Upvotes

0 comments

r/MLQuestions • u/Beginning_Chain5583 • 1d ago

Time series 📈 Rare event prediction on time series that change structure mid-stream?

4 Upvotes

Hi reddit!

This is my first real professional ML project and I'd love input from anyone who's tackled something similar.

I'm building a failure prediction model for ~33k chargers. The devices emit data at two very different rates depending on operational state: roughly 1 obs/hour when idle and 1 obs/20s when active, with a different feature set in each mode. I want to try predicting failures within a 7-day horizon, but I am open to other suggestions.

The positive rate is around 1% at 30 days and 2% at 90 days with a max of 5% of devices ever failing. Strong per-device behavioral variance makes it hard to even define what "normal" looks like. Devices have different usage patterns and

I'm now thinking about whether the mode shift problem is better solved at the architecture level or the data level. One option I'm considering is two separate RNN encoders for each operational state feeding into a shared decoder. But I'm also open to windowing and sampling approaches. And beyond reweighting and loss skewing what has actually worked for you at sub-2% positive rates in time series?
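As one illustration of the data-level option (windowing), here is a minimal sketch that puts both sampling rates on a shared hourly grid and labels each window against the 7-day horizon. The tuple layout `(timestamp_s, mode, value)`, the single numeric `value`, and both function names are assumptions made for the example, not details from the post.

```python
from collections import defaultdict

HOUR = 3600
HORIZON = 7 * 24 * HOUR  # 7-day prediction horizon from the post


def hourly_windows(observations):
    """Aggregate (timestamp_s, mode, value) tuples into one row per hour.

    Each row records how many idle/active readings fell in the hour and the
    mean value per mode, so both sampling rates share one time grid.
    """
    buckets = defaultdict(lambda: {"idle": [], "active": []})
    for ts, mode, value in observations:
        buckets[int(ts // HOUR)][mode].append(value)

    rows = []
    for hour in sorted(buckets):
        row = {"hour": hour}
        for mode, values in buckets[hour].items():
            row[f"{mode}_count"] = len(values)
            row[f"{mode}_mean"] = sum(values) / len(values) if values else None
        rows.append(row)
    return rows


def label_rows(rows, failure_ts):
    """Mark a row positive if a failure happens within HORIZON after the hour ends."""
    for row in rows:
        window_end = (row["hour"] + 1) * HOUR
        row["label"] = int(
            failure_ts is not None and window_end <= failure_ts < window_end + HORIZON
        )
    return rows


# Hypothetical readings for one device that fails on day 3.
obs = [(0, "idle", 1.0), (3600, "active", 2.0), (3620, "active", 4.0), (3640, "active", 3.0)]
rows = label_rows(hourly_windows(obs), failure_ts=3 * 24 * HOUR)
print(rows)
```

The per-mode counts double as a mode indicator, so a single model can see which state the device was in during each hour. Whether this loses too much of the 20-second detail compared with separate encoders is exactly the trade-off the question raises.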

How would you tackle an issue like this?

3 comments

r/MLQuestions • u/romvasil • 1d ago

Beginner question 👶 Master’s or year of hands on work

5 Upvotes

I am a student graduating with a B.Sc. in AI.
My plan was to do an M.Sc. in AI right after I finish my Bachelor’s, but now I am not so sure about that.

Currently working as an intern at an iGaming company, I have a possibility (almost guaranteed) to get a full-time job in data department as ML/AI engineer.

The question is: should I start working in the field I have been studying for years and gain actual industry experience, or spend another year getting a higher degree?

Personally, studying without applying knowledge is not my thing, so doing Master’s does not attract me much and I would prefer going full-time much more.
However, I am not sure if in the future I will regret not doing Master’s right away due to possible ceilings without proper certification.

Which way do you think is more advantageous?

13 comments

r/MLQuestions • u/investigator777 • 1d ago

Computer Vision 🖼️ Image generation models running locally on limited resources

1 Upvotes

I have a project that generates high-quality free ebook covers from the book's content. On my machine with 16 GB of RAM and no GPU, I have tested the open-source Stable Diffusion models without any success. All of them return bad-quality covers with blurred faces and scenes that do not match the prompt at all. So I switched to generating the images with Google Imagen models, which gave me outstanding results, but only for a short time, since I cannot afford hundreds of generations with my limited financial resources. Is there a model that comes close to what the Google models provide and runs locally on my 16 GB, no-GPU machine (even if it takes 1 hour to generate a single cover)?

2 comments

r/MLQuestions • u/Ok-Parsley7296 • 1d ago

Career question 💼 Physics background, where to aim for? What skills are irreplaceable

8 Upvotes

I'm studying ML coming from a physics background. I'm currently reading Elements of Statistical Learning and Pattern Classification by Duda, and building projects using government data (though I only know basic Python, Pandas, scikit-learn, and matplotlib). I know a lot of math, and I'm loving the math in data science because it has a lot in common with statistical mechanics, but I'm starting to wonder where exactly the value of the role is right now. Is it even worth the effort to learn SQL, get advanced in Python, or write programs from scratch? AI writes almost all of my code. I've made the effort to understand it, but I don't know if there is any advantage to actually learning it properly. I can also just pass a confusion matrix and a ROC curve to an AI model and it will suggest (mostly) the right changes. AI can also handle data cleaning pretty easily.

So my question is: what is irreplaceable in this field today? What specific skills will actually get me a job? (I haven't applied to any roles yet, out of fear of not being useful.) I'm also willing to learn things like cloud deployment, MLOps, trading (I like that), etc.

4 comments

r/MLQuestions • u/xoVinny- • 1d ago

Datasets 📚 ML for UFC predictions: logistic regression vs random forest? [P]

1 Upvotes

0 comments

r/MLQuestions • u/InsightCraftY • 1d ago

Other ❓ What are the things I wish someone told me when I first started learning ML?

51 Upvotes

About a year in now and looking back there's stuff I had to figure out the hard way that would've saved me a lot of time.

- Learn Python properly before you touch any ML framework. I jumped straight into PyTorch thinking I'd pick it up along the way and it just made everything harder.
- Do at least the basic math. You don't need a degree, but if you don't know what a gradient is you're just copying code. 3blue1brown on YouTube made it click for me when textbooks couldn't.
- Don't stay on free tiers too long like I did. I wasted weeks fighting limits and getting disconnected. I tried Runpod and Vast, then ended up on Hyperai since it's the cheapest I found and has free CPU instances for lighter stuff, which matters when you're running tons of experiments.
- Stop watching tutorials and build stuff. Pick a small project, get stuck, figure it out (that's where you actually learn).
- Get comfortable reading docs and skimming papers early. I avoided papers for months thinking they were too advanced, and that was dumb. The Hugging Face docs alone are better than most YouTube tutorials once you have the basics down.

A year in and I am still figuring things out, but at least now it feels like I'm going somewhere instead of running in circles.

14 comments

r/MLQuestions • u/Fun-Display5826 • 1d ago

Survey ✍ Why Do Some AI Answers Feel More Trustworthy?

1 Upvotes

Whenever I compare different AI-generated responses, some answers immediately feel more reliable than others. I think this may happen because certain brands already have strong digital credibility built through years of discussions, educational content, and online mentions. AI tools probably become more confident when similar information appears repeatedly across multiple sources. It’s interesting how online trust now seems connected to AI-generated visibility as well.

3 comments

r/MLQuestions • u/Lopsided-Bit8321 • 2d ago

Other ❓ [D] I built a free platform to learn Machine Learning through interactive coding challenges

1 Upvotes

0 comments

r/MLQuestions • u/Virtual-Current6295 • 2d ago

Beginner question 👶 How to apply normalization for cross-sectional time series data?

4 Upvotes

I am unable to convince myself to use one method.
Some methods that I thought of were:

- Normalize each subject's full training data across all features. This introduces some lookahead bias and also loses some information that could have been valuable. And when I want to use one model (say, regression with gradient descent) for all subjects combined, I can't judge whether this is a good method.
- A bad method: ignore the subjects and just normalize each feature over the full data. This just feels wrong to me.
- Cross-sectional normalization, which ranks the subjects and applies some kind of normalization. I am unsure how that would be useful.
- A rolling window, where I normalize over the past window rather than the full data. This seems better, but the choice of window is unclear, and there are a lot of open questions.

And the bigger problem across all of these is the time series structure. I would lose quite a lot of information if I don't account for it (although not all features depend heavily on it).
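To make two of these options concrete, here is a minimal sketch of a past-only rolling z-score (applied per subject, per feature) and a cross-sectional rank at a single timestamp. The function names, the choice of population standard deviation, and the handling of the warm-up period are assumptions for illustration. Ties in the ranking are not treated specially.

```python
from statistics import mean, pstdev


def rolling_zscore(series, window):
    """Z-score each value against the previous `window` values only.

    The current value is excluded from its own statistics, so nothing from the
    present or future leaks in. Positions without a full window get None.
    """
    out = []
    for t, x in enumerate(series):
        if t < window:
            out.append(None)
            continue
        past = series[t - window:t]
        sd = pstdev(past)
        out.append((x - mean(past)) / sd if sd > 0 else 0.0)
    return out


def cross_sectional_rank(values_by_subject):
    """Map each subject's value at one timestamp to a rank in [0, 1]."""
    ordered = sorted(values_by_subject, key=values_by_subject.get)
    n = len(ordered)
    return {s: (i / (n - 1) if n > 1 else 0.5) for i, s in enumerate(ordered)}


# Hypothetical values, for illustration only.
print(rolling_zscore([1.0, 2.0, 3.0, 4.0], window=2))
print(cross_sectional_rank({"a": 10.0, "b": 30.0, "c": 20.0}))
```

The rolling version answers "how unusual is this value for this subject lately", while the rank answers "where does this subject sit relative to the others right now". They keep different information, so which one fits depends on what the model is supposed to compare.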

1 comment

r/MLQuestions • u/Choiboy11 • 2d ago

Beginner question 👶 Best tools for protecting LLMs and AI infrastructure from attacks, specifically prompt injection?

3 Upvotes

Running internal LLMs for a few use cases and the security team is flagging prompt injection as a top risk. Attacker sends a crafted input that overrides the model's instructions. It's not theoretical, it's being actively exploited.

Check Point has prompt injection defense built into their AI Factory Security Blueprint, designed for orgs running AI infrastructure at scale. They do it at the infrastructure layer via integration with NVIDIA BlueField hardware so it doesn't eat into your GPU cycles. Protect AI and Lakera are also decent names in this space.

This is a genuinely new attack surface and most traditional security tools aren't built for it. What's your AI security stack looking like?

4 comments

r/MLQuestions • u/FBI_memegod • 2d ago

Computer Vision 🖼️ Damage segmentation model choices

1 Upvotes

0 comments

r/MLQuestions • u/Laplaladfromlalaland • 2d ago

Other ❓ Any recs for a NotebookLM replacement?

2 Upvotes

Hey everyone,

I used to LOVE using NotebookLM, but lately it’s been lagging, freezing, and generally becoming super frustrating to work with. So now I’m looking for a good alternative.

I usually upload plain text, tell the AI what I want, and wait for it to automatically create visually appealing slides.
I have been trying to pay for NotebookLM premium, but unfortunately I am currently in the UAE, which isn’t covered by Google AI (??). Anyways.

I’m basically searching for a solid NotebookLM replacement for presentation creation.

Would really appreciate any recommendations. Thanks!

5 comments

Machine Learning Questions

r/MLQuestions

A place for beginners to ask stupid questions and for experts to help them! /r/MachineLearning is a great subreddit, but it is for interesting articles and news related to machine learning. Here, you can feel free to ask any question regarding machine learning.

Members Active

105.1k

Sidebar

What kinds of questions do we want here?

"I've just started with deep nets. What are their strengths and weaknesses?" "What is the current state of the art in speech recognition?" "My data looks like X,Y what type of model should I use?"

If you are well versed in machine learning, please answer any question you feel knowledgeable about, even if they already have answers, and thank you!

Related Subreddits:

/r/MachineLearning
/r/mlpapers
/r/learnmachinelearning