ViewTube

147,700 results

IBM Technology
What is Mixture of Experts?

Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdK8fn Learn more about the ...

7:58 · 48,818 views · 1 year ago

Maarten Grootendorst
A Visual Guide to Mixture of Experts (MoE) in LLMs

In this highly visual guide, we explore the architecture of a Mixture of Experts in Large Language Models (LLM) and Vision ...

19:44 · 48,738 views · 1 year ago

AI Papers Academy
Introduction to Mixture-of-Experts | Original MoE Paper Explained

In this video we go back to the extremely important Google paper which introduced the Mixture-of-Experts (MoE) layer with ...

4:41 · 11,544 views · 1 year ago

Julia Turc
Mixture of Experts: How LLMs get bigger without getting slower

Mixture of Experts (MoE) is everywhere: Meta / Llama 4, DeepSeek, Mistral. But how does it actually work? Do experts specialize?

26:42 · 26,800 views · 9 months ago
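
(As a rough, hypothetical illustration of the "bigger without getting slower" idea in the description above: a layer with 64 experts of 100M parameters each stores 64 × 100M = 6.4B expert parameters, but with top-2 routing only 2 × 100M = 200M of them are multiplied per token, so per-token compute tracks the 200M active parameters rather than the 6.4B total. These numbers are made up for illustration and are not taken from the video.)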

Stanford Online
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of experts

For more information about Stanford's online Artificial Intelligence programs visit: https://stanford.io/ai To learn more about ...

1:22:04 · 62,088 views · 9 months ago

Chris Hay
MoE Models Don't Work Like You Think - Inside GPT-OSS

Many people think that mixture of expert models have domain experts, i.e. math experts, code experts, language experts.

18:28 · 3,780 views · 2 weeks ago

bycloud
1 Million Tiny Experts in an AI? Fine-Grained MoE Explained

To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/bycloud/ . You'll also get 20% off an annual ...

12:29 · 55,233 views · 1 year ago

Cerebras
Mixture of Experts Explained: How to Build, Train & Debug MoE Models in 2025

Mixture-of-Experts (MoE) models now power leading AI systems like GPT-4, Qwen3, DeepSeek-v3, and Gemini 1.5. But behind ...

4:32 · 1,571 views · 6 months ago

Organic Mechanisms
How To Create And Use A Pharmacophore In MOE | MOE Tutorial

Molecular Operating Environment (MOE) tutorial covering how to create and use a pharmacophore in MOE. When docking ...

4:12 · 7,295 views · 3 years ago

SaM Solutions
Mixture-of-Experts (MoE) LLMs: The Future of Efficient AI Models

Imagine having a whole team of specialists at your disposal, each an expert in a different field, and a smart coordinator who ...

6:01 · 299 views · 5 months ago

AI Research Roundup
Why Orthogonal Weights Fail in MoE Models

In this AI Research Roundup episode, Alex discusses the paper: 'Geometric Regularization in Mixture-of-Experts: The Disconnect ...

3:35 · 11 views · 2 weeks ago

Stanford Online
Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer

In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead ...

1:05:44 · 40,385 views · 3 years ago
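
As a rough illustration of the conditional-computation idea in the CS25 description above (different tokens are routed to different experts instead of reusing the same parameters for every input), here is a minimal top-k routing sketch in plain Python. It is a toy example with made-up sizes (hidden size 8, 4 experts, top-2 routing), not code from the lecture or from any library mentioned in these results.

    # Toy sketch of top-k mixture-of-experts routing (illustration only).
    import math
    import random

    random.seed(0)

    D, N_EXPERTS, TOP_K = 8, 4, 2  # hidden size, number of experts, experts used per token

    # Each "expert" is a toy D x D linear map; the router is a linear map D -> N_EXPERTS.
    experts = [[[random.gauss(0, 0.1) for _ in range(D)] for _ in range(D)] for _ in range(N_EXPERTS)]
    router = [[random.gauss(0, 0.1) for _ in range(D)] for _ in range(N_EXPERTS)]

    def matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    def softmax(xs):
        m = max(xs)
        exps = [math.exp(v - m) for v in xs]
        s = sum(exps)
        return [v / s for v in exps]

    def moe_layer(x):
        # The router scores every expert, but only the top-k experts actually run,
        # which is why total parameters can grow without growing per-token compute.
        logits = matvec(router, x)
        top = sorted(range(N_EXPERTS), key=lambda i: logits[i], reverse=True)[:TOP_K]
        weights = softmax([logits[i] for i in top])
        out = [0.0] * D
        for w, i in zip(weights, top):
            y = matvec(experts[i], x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out, top

    token = [random.gauss(0, 1) for _ in range(D)]
    y, chosen = moe_layer(token)
    print("experts used for this token:", chosen)

Running it prints which two of the four toy experts were selected for the sample token; a different token would generally activate a different pair.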

Paper With Video
[2024 Best AI Paper] MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

This video was created using https://paperspeech.com. If you'd like to create explainer videos for your own papers, please visit the ...

11:36 · 71 views · 1 year ago

No Hype AI
How Did They Do It? DeepSeek V3 and R1 Explained

DeepSeek: The First Open-Weight Reasoning Model! In this video, I'll break down DeepSeek's two flagship models— V3 and R1 ...

11:15 · 47,221 views · 11 months ago

Matt Williams
What are the different types of models - The Ollama Course

Dive into the world of Ollama and discover the various types of AI models at your fingertips. This comprehensive guide breaks ...

6:49 · 38,292 views · 1 year ago

AI with Lena Hall
Transformers vs MoE vs RNN vs Hybrid: Intuitive LLM Architecture Guide

Most developers default to transformers without understanding the alternatives. This video breaks down the intuition behind four ...

16:56 · 19,204 views · 3 months ago

Anyscale
Ray + vLLM Efficient Multi Node Orchestration for Sparse MoE Model Serving | Ray Summit 2025

Slides: https://drive.google.com/file/d/11OSdPJLZ1v4QH2KHlEYGYCts5qEdR5gN/view?usp=sharing At Ray Summit 2025, ...

30:58 · 653 views · 2 months ago

AI Coffee Break with Letitia
MAMBA and State Space Models explained | SSM explained

We simply explain and illustrate Mamba, State Space Models (SSMs) and Selective SSMs. SSMs match performance of ...

22:27 · 82,727 views · 1 year ago

bycloud
The REAL AI Architecture That Unifies Vision & Language

Get started now with open source & privacy focused password manager by Proton! https://proton.me/pass/bycloudai In this video, ...

10:13 · 44,629 views · 7 months ago

Marktechpost AI
NVIDIA Releases Nemotron 3: Hybrid Mamba Transformer Models With Latent MoE ...

NVIDIA Nemotron 3 is an open family of hybrid Mamba Transformer MoE models, designed for agentic AI with long context and ...

5:30 · 586 views · 1 month ago

Cerebras
Daria Soboleva Training and Serving MoE Models Efficiently

... models efficiently. Before I start, a quick intro about myself: I am researching LLMs at Cerebras; one of my recent gigs is MoE ...

9:18 · 206 views · 1 month ago

bycloud
Mamba Might Just Make LLMs 1000x Cheaper...

Check out HubSpot's ChatGPT at work bundle! https://clickhubspot.com/twc Would mamba bring a revolution to LLMs and ...

14:06 · 141,245 views · 1 year ago

LLM Implementation
How 120B+ Parameter Models Run on One GPU (The MoE Secret)

How is it possible for a 120 billion parameter AI model to run on a single consumer GPU? This isn't magic—it's the result of ...

6:47 · 1,582 views · 5 months ago

Red Hat
[vLLM Office Hours #29] Scaling MoE with llm-d

Time Stamps: 00:00 Bi-weekly vLLM project update (v0.9.2 and v0.10.0) 14:30 Scaling MoE models with llm-d 55:40 Q&A + ...

1:02:27 · 2,013 views · Streamed 5 months ago

Professor Rahul Jain
MoE AI Models Explained | Future of Scalable & Efficient Artificial Intelligence

Unlock the future of Artificial Intelligence with this quick and powerful explanation of MoE (Mixture of Experts) AI Models. In under ...

2:01 · 67 views · 1 month ago