ViewTube

180,360 results

IBM Technology · What is Mixture of Experts? · 7:58 · 48,777 views · 1 year ago
Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdK8fn Learn more about the ...

Maarten Grootendorst · A Visual Guide to Mixture of Experts (MoE) in LLMs · 19:44 · 48,688 views · 1 year ago
In this highly visual guide, we explore the architecture of a Mixture of Experts in Large Language Models (LLMs) and Vision ...

AI Papers Academy · Introduction to Mixture-of-Experts | Original MoE Paper Explained · 4:41 · 11,537 views · 1 year ago
In this video we go back to the extremely important Google paper which introduced the Mixture-of-Experts (MoE) layer with ...

Julia Turc · Mixture of Experts: How LLMs get bigger without getting slower · 26:42 · 26,773 views · 9 months ago
Mixture of Experts (MoE) is everywhere: Meta / Llama 4, DeepSeek, Mistral. But how does it actually work? Do experts specialize?

Chris Hay · MoE Models Don't Work Like You Think - Inside GPT-OSS · 18:28 · 3,773 views · 2 weeks ago
Many people think that mixture-of-experts models have domain experts, e.g. math experts, code experts, language experts.

Cerebras · Mixture of Experts Explained: How to Build, Train & Debug MoE Models in 2025 · 4:32 · 1,569 views · 6 months ago
Mixture-of-Experts (MoE) models now power leading AI systems like GPT-4, Qwen3, DeepSeek-v3, and Gemini 1.5. But behind ...

bycloud · 1 Million Tiny Experts in an AI? Fine-Grained MoE Explained · 12:29 · 55,229 views · 1 year ago
To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/bycloud/ . You'll also get 20% off an annual ...

bycloud · Mamba Might Just Make LLMs 1000x Cheaper... · 14:06 · 141,233 views · 1 year ago
Check out HubSpot's ChatGPT at work bundle! https://clickhubspot.com/twc Would mamba bring a revolution to LLMs and ...

MoeIsBetter · Moe - Outta There (Official Video) · 2:32 · 4,601,709 views · 6 years ago
Outta There (Prod By Ayo N Keyz) "Rich Dreamin" Available Now on ALL streaming platforms LINK BELOW ...

Stanford Online · Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of experts · 1:22:04 · 61,986 views · 9 months ago
For more information about Stanford's online Artificial Intelligence programs visit: https://stanford.io/ai To learn more about ...

Stanford Online · Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer · 1:05:44 · 40,379 views · 3 years ago
In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead ...
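
The conditional-computation idea this description alludes to can be sketched in a few lines. Below is a minimal, generic top-k routed MoE layer, not the Switch Transformer implementation covered in the lecture; the module name SimpleMoE and the sizes (d_model, d_ff, n_experts, top_k) are illustrative choices.

```python
# Minimal sketch of a top-k routed MoE layer (illustrative, not the Switch
# Transformer implementation discussed in the lecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        logits = self.router(x)                        # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1) # pick k experts per token
        weights = F.softmax(weights, dim=-1)           # renormalize over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slots = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if rows.numel():
                out[rows] += weights[rows, slots].unsqueeze(-1) * expert(x[rows])
        return out

tokens = torch.randn(10, 64)
print(SimpleMoE()(tokens).shape)   # torch.Size([10, 64])
```

Only the top_k selected experts run for each token, which is what lets the total parameter count grow without a proportional increase in per-token compute.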

Cerebras · Daria Soboleva: Training and Serving MoE Models Efficiently · 9:18 · 206 views · 1 month ago
... models efficiently. Before I start, a quick intro about myself: I am researching LLMs at Cerebras, and one of my recent gigs is MoE ...

Soumyajit Das · MOE Explained in 150 seconds · 2:32 · 20 views · 3 weeks ago
In this quick 150-second deep dive, we explore the architecture behind some of the world's most powerful AI models: Mixture of ...

SaM Solutions · Mixture-of-Experts (MoE) LLMs: The Future of Efficient AI Models · 6:01 · 298 views · 5 months ago
Imagine having a whole team of specialists at your disposal, each an expert in a different field, and a smart coordinator who ...
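
The analogy above maps onto the standard MoE formulation: the "coordinator" is a learned router g and the "specialists" are expert networks E_i. The description gives no formula, so the following is the usual textbook form, with W_r an assumed router weight matrix.

```latex
% Sparse MoE layer output: a softmax router weights a few selected experts.
g(x) = \mathrm{softmax}(W_r x), \qquad
y = \sum_{i \in \mathrm{TopK}(g(x))} g_i(x)\, E_i(x)
```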

People also watched

Hal Shin · MiniMax M2.1 vs Claude Opus 4.5: 10x Cheaper, but is it Better? · 44:54 · 2,618 views · 2 days ago
Can a significantly cheaper model actually compete with Claude Opus 4.5 for real production work? In this video, I run the exact ...

Mehul Mohan · China's new AI model - better than Opus 4.5? · 9:39 · 10,070 views · 5 days ago
Thanks to Kilo AI for sponsoring this video. Sign up on Kilo AI here: https://kilo.codes/NmehEES and use promo code MEHUL ...

Matt Maher · I Tested Claude Code: $20 vs. $200 Subscription · 18:57 · 58,777 views · 6 months ago
In this video, I put Anthropic's Claude Code to the test, comparing the $20 Sonnet subscription with the premium $200 Opus plan.

HuggingFace · MoE Token Routing Explained: How Mixture of Experts Works (with Code) · 34:15 · 3,574 views · 6 days ago
This video dives deep into Token Routing, the core algorithm of Mixture of Experts (MoE) models. Slides: ...
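
For readers who want the gist before watching: the sketch below shows one common flavor of token routing, top-1 (switch-style) assignment with an expert capacity limit, where overflow tokens are simply dropped. It is an illustration under those assumptions, not the code or slides from the video; route_tokens and its arguments are made-up names.

```python
# Illustrative top-1 token routing with an expert capacity limit (not the code
# from the video): each token goes to its highest-scoring expert, and tokens
# beyond an expert's capacity are left unassigned (dropped / passed through).
import torch
import torch.nn.functional as F

def route_tokens(router_logits, capacity):
    """router_logits: (num_tokens, num_experts). Returns per-token expert id
    (-1 if dropped) and the gate probability of the chosen expert."""
    probs = F.softmax(router_logits, dim=-1)
    gate, expert_id = probs.max(dim=-1)            # top-1 choice per token
    assignment = torch.full_like(expert_id, -1)
    counts = torch.zeros(router_logits.shape[1], dtype=torch.long)
    for t in range(router_logits.shape[0]):        # sequential fill respects capacity
        e = expert_id[t].item()
        if counts[e] < capacity:
            assignment[t] = e
            counts[e] += 1
    return assignment, gate

logits = torch.randn(16, 4)                        # 16 tokens, 4 experts
assignment, gate = route_tokens(logits, capacity=6)
print(assignment.tolist())                         # -1 marks overflow (dropped) tokens
```

Production implementations vectorize the capacity check and usually add an auxiliary load-balancing loss so the router does not send most tokens to a few experts.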

Matt Williams · Optimize Your AI - Quantization Explained · 12:10 · 361,545 views · 1 year ago
Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...
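
As a rough picture of what quantization buys, here is a minimal symmetric per-tensor int8 round-trip. It is only an illustration: Ollama's q2/q4/q8 settings correspond to llama.cpp's block-wise quantization formats, which are more involved than this simple scheme.

```python
# Minimal symmetric int8 quantization round-trip (illustrative only; not the
# block-wise q2/q4/q8 formats Ollama actually uses).
import torch

def quantize_int8(w):
    scale = w.abs().max() / 127.0                        # one scale for the whole tensor
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    return q.float() * scale

w = torch.randn(4096, 4096)                              # an fp32 weight matrix
q, scale = quantize_int8(w)
print(w.element_size() * w.numel() / 2**20, "MiB fp32")  # 64.0 MiB
print(q.element_size() * q.numel() / 2**20, "MiB int8")  # 16.0 MiB
print((dequantize(q, scale) - w).abs().max())            # rounding error introduced
```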

Mathemaniac · The weirdest paradox in statistics (and machine learning) · 21:44 · 1,133,666 views · 3 years ago
AD: Get Exclusive NordVPN deal here ➼ https://nordvpn.com/mathemaniac. It's risk-free with Nord's 30-day money-back ...

Fahd Mirza · Minimax Drops M2-Her: A Steamy Roleplay Model with 20M Users · 9:23 · 2,744 views · 2 days ago
This video tests the M2-Her model, which supports role-playing, multi-turn conversations, and other dialogue scenarios. Get 50% ...

650 AI Lab · Research Paper Deep Dive - The Sparsely-Gated Mixture-of-Experts (MoE) · 22:39 · 3,317 views · 3 years ago
In this video we take a deep dive to learn more about the Mixture of Experts (or MoE), how it works and internal ...

sudoremove and 노토랩세미나 · DeepSeek Engram: Adding Tables to Transformers · 15:08 · 2,951 views · 3 days ago
#Engram #deepseek Chapters: 00:00 Intro · 00:16 Model Architecture Description and Serving/Structure Optimization Problem ...

Towards Data Science · Liam Fedus & Barret Zoph - AI scaling with mixture of expert models · 40:48 · 2,499 views · 3 years ago
... scientists at Google Brain, came on the podcast to talk about AI scaling, sparsity and the present and future of MoE models.

LLM Implementation · How 120B+ Parameter Models Run on One GPU (The MoE Secret) · 6:47 · 1,579 views · 5 months ago
How is it possible for a 120 billion parameter AI model to run on a single consumer GPU? This isn't magic—it's the result of ...
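
The arithmetic behind that claim is easy to make explicit. The numbers below (120B total parameters, 5B active per token, 4-bit weights) are illustrative assumptions rather than figures from the video: quantization shrinks the weights that must be stored, while MoE routing means only a small fraction of them is used for any given token.

```python
# Back-of-the-envelope memory arithmetic for a large MoE model (illustrative
# numbers, not taken from the video). All weights still have to live somewhere
# (VRAM, system RAM, or disk), but only the routed experts run per token.
GiB = 2**30

def weight_memory(n_params, bits_per_param):
    return n_params * bits_per_param / 8 / GiB

total_params  = 120e9   # all experts combined
active_params = 5e9     # parameters actually used per token (top-k experts)

print(f"fp16, all weights : {weight_memory(total_params, 16):6.1f} GiB")  # ~223.5
print(f"4-bit, all weights: {weight_memory(total_params, 4):6.1f} GiB")   # ~55.9
print(f"4-bit, active only: {weight_memory(active_params, 4):6.1f} GiB")  # ~2.3
```

The MoE structure reduces per-token compute and lets offloading schemes fetch only the experts a token actually needs, which is what makes single-GPU setups plausible for models this large.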

MoeIsBetter · MoeIsBetter - Outta There (Audio) · 2:38 · 547,227 views · 5 years ago
Official Visualizer: https://smarturl.it/OuttaThereVisualizer Follow Moe: https://www.instagram.com/Moeisbetter/ (C) 2019 Moe ...

bycloud · The REAL AI Architecture That Unifies Vision & Language · 10:13 · 44,626 views · 7 months ago
Get started now with open source & privacy focused password manager by Proton! https://proton.me/pass/bycloudai In this video, ...

Vizuara · Mixture of Experts (MoE) Introduction · 29:59 · 5,353 views · 8 months ago
In this lecture, we start looking at the second major component of the DeepSeek architecture after MLA: Mixture of Experts ...

Trend Guards · Why MoE Models Are Taking Over AI (Deep Dive) · 7:10 · 163 views · 1 month ago
The Mixture-of-Experts (MoE) architecture is transforming the entire AI industry — powering breakthrough models like ...

BrainOmega · Hands-on 2: Mixture of Experts (MoE) from Scratch · 10:00 · 6,487 views · 6 months ago
Support BrainOmega ☕ Buy Me a Coffee: https://buymeacoffee.com/brainomega Stripe: ...