4,627 results
Mixture of Experts (MoE) is everywhere: Meta / Llama 4, DeepSeek, Mistral. But how does it actually work? Do experts specialize?
26,769 views
9 months ago
Many people think that mixture-of-experts models have domain experts, i.e. math experts, code experts, language experts.
3,772 views
2 weeks ago
For more information about Stanford's online Artificial Intelligence programs visit: https://stanford.io/ai To learn more about ...
61,984 views
Imagine having a whole team of specialists at your disposal, each an expert in a different field, and a smart coordinator who ...
298 views
5 months ago
Mixture-of-Experts (MoE) models now power leading AI systems like GPT-4, Qwen3, DeepSeek-v3, and Gemini 1.5. But behind ...
1,569 views
6 months ago
DeepSeek: The First Open-Weight Reasoning Model! In this video, I'll break down DeepSeek's two flagship models, V3 and R1 ...
47,201 views
11 months ago
In this AI Research Roundup episode, Alex discusses the paper: 'Geometric Regularization in Mixture-of-Experts: The Disconnect ...
11 views
Never get stuck without AI again. Run three Small Language Models (SLMs)—also called Local LLMs—TinyLlama, Gemma-3 and ...
5,878 views
Slides: https://drive.google.com/file/d/11OSdPJLZ1v4QH2KHlEYGYCts5qEdR5gN/view?usp=sharing At Ray Summit 2025, ...
650 views
2 months ago
Get started now with open source & privacy focused password manager by Proton! https://proton.me/pass/bycloudai In this video, ...
44,624 views
7 months ago
Most developers default to transformers without understanding the alternatives. This video breaks down the intuition behind four ...
19,204 views
3 months ago
Time Stamps: 00:00 Bi-weekly vLLM project update (v0.9.2 and v0.10.0) 14:30 Scaling MoE models with llm-d 55:40 Q&A + ...
2,013 views
Streamed 5 months ago
I changed 2 settings in LM Studio and increased my t/s by about 4x. My 8 GB GPU (RTX 4060) now runs GPT-OSS 120B at 20 t/s and ...
12,377 views
4 months ago
In this video we'll go through three methods of running SUPER LARGE AI models locally, using model streaming, model serving, ...
13,737 views
How is it possible for a 120 billion parameter AI model to run on a single consumer GPU? This isn't magic—it's the result of ...
1,579 views
You'll also learn about real-world MoE models like Mixtral and DeepSeek, which achieve state-of-the-art performance while ...
4,915 views
... models efficiently. Before I start, a quick intro about myself: I am researching LLMs at Cerebras, and one of my recent gigs is MoE ...
206 views
1 month ago
Resources: - Understanding and Coding the KV Cache in LLMs from Scratch: ...
14,282 views
NVIDIA Nemotron 3 is an open family of hybrid Mamba-Transformer MoE models, designed for agentic AI with long context and ...
585 views
Arcee AI is the startup I've found to be taking the most realistic approach to monetizing their open models. With a bunch of ...
400 views
13 hours ago