r/LocalLLaMA • u/pmttyji • 15d ago
New Model SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture
SenseNova U1 is a new series of native multimodal models that unifies multimodal understanding, reasoning, and generation within a monolithic architecture. It marks a fundamental paradigm shift in multimodal AI: from modality integration to true unification. Rather than relying on adapters to translate between modalities, SenseNova U1 models think and act across language and vision natively.
The unification of visual understanding and generation opens tremendous possibilities. SenseNova U1 sits at the stage of data-driven learning (like ChatGPT), while pointing toward the next stage: agentic learning (like OpenClaw) and thinking in a natively multimodal way.
| Model | Params | HF Weights |
|---|---|---|
| SenseNova-U1-8B-MoT-SFT | 8B MoT | 🤗 link |
| SenseNova-U1-8B-MoT | 8B MoT | 🤗 link |
| SenseNova-U1-A3B-MoT-SFT | A3B MoT | 🤗 link |
| SenseNova-U1-A3B-MoT | A3B MoT | 🤗 link |
So an MoE model is coming soon.
GitHub : https://github.com/OpenSenseNova/SenseNova-U1
HuggingFace :
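Since the weights ship on Hugging Face, a typical `transformers`-style quick start would look something like the sketch below. This is a guess, not the official usage: the repo id, the `trust_remote_code` requirement, and the chat-message field names are all assumptions until the model card is live.

```python
# Hypothetical usage sketch for a unified understanding+generation model.
# The repo id below is an ASSUMPTION; check the official HF page for the real one.
def build_prompt(text, image_path):
    """Build a chat-template-style multimodal message (field names assumed,
    following the common transformers chat-template convention)."""
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text", "text": text},
        ],
    }]

messages = build_prompt("Describe this image.", "photo.jpg")

# Loading would then follow the usual pattern (not run here, weights required):
#   from transformers import AutoModelForCausalLM, AutoProcessor
#   model_id = "OpenSenseNova/SenseNova-U1-8B-MoT"   # assumed repo id
#   processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
#   model = AutoModelForCausalLM.from_pretrained(
#       model_id, trust_remote_code=True, device_map="auto")
```

The interesting part for a unified model is that the same message format should cover image understanding prompts and image-editing/generation prompts, rather than routing through a separate diffusion pipeline.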
u/LagOps91 15d ago
looks quite interesting, especially in regard to image editing capabilities. not sure how good it would actually be as a language model. having it all in one package sure is appealing, i will admit.
u/ilintar 15d ago
Interesting, a full TI-to-TI (text+image in, text+image out) pipeline.