ViewTube


183,652 results

IBM Technology
What is Mixture of Experts?

Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdK8fn Learn more about the ...

7:58 · 48,801 views · 1 year ago

Maarten Grootendorst
A Visual Guide to Mixture of Experts (MoE) in LLMs

In this highly visual guide, we explore the architecture of a Mixture of Experts in Large Language Models (LLM) and Vision ...

19:44 · 48,708 views · 1 year ago

AI Papers Academy
Introduction to Mixture-of-Experts | Original MoE Paper Explained

In this video we go back to the extremely important Google paper which introduced the Mixture-of-Experts (MoE) layer with ...

4:41 · 11,540 views · 1 year ago

Julia Turc
Mixture of Experts: How LLMs get bigger without getting slower

Mixture of Experts (MoE) is everywhere: Meta / Llama 4, DeepSeek, Mistral. But how does it actually work? Do experts specialize?

26:42 · 26,787 views · 9 months ago
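
Several of the results above turn on the same point: a Mixture-of-Experts model stores many expert feed-forward blocks but runs only a few of them per token. A back-of-the-envelope sketch of that trade-off is below; the hidden sizes, expert count, and top-k value are illustrative assumptions, not figures taken from any of the listed videos.

```python
# Back-of-the-envelope parameter count for one MoE feed-forward layer.
# All sizes below are assumed for illustration only.

d_model = 4096    # hidden size
d_ff = 14336      # feed-forward inner size
n_experts = 8     # experts stored in the layer
top_k = 2         # experts actually run per token

# A standard dense FFN has two weight matrices:
dense_params = 2 * d_model * d_ff

# An MoE layer stores n_experts copies of that FFN,
# but each token only passes through top_k of them:
total_params = n_experts * dense_params
active_params = top_k * dense_params

print(f"dense FFN params      : {dense_params / 1e6:.0f}M")
print(f"MoE params stored     : {total_params / 1e6:.0f}M")
print(f"MoE params used/token : {active_params / 1e6:.0f}M")
# Capacity grows roughly n_experts times; per-token compute only top_k times.
```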

Stanford Online
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of experts

For more information about Stanford's online Artificial Intelligence programs visit: https://stanford.io/ai To learn more about ...

1:22:04 · 62,064 views · 9 months ago

Chris Hay
MoE Models Don't Work Like You Think - Inside GPT-OSS

Many people think that mixture-of-experts models have domain experts, i.e. math experts, code experts, language experts.

18:28 · 3,775 views · 2 weeks ago

bycloud
1 Million Tiny Experts in an AI? Fine-Grained MoE Explained

To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/bycloud/ . You'll also get 20% off an annual ...

12:29 · 55,231 views · 1 year ago

Organic Mechanisms
How To Create And Use A Pharmacophore In MOE | MOE Tutorial

Molecular Operating Environment (MOE) tutorial covering how to create and use a pharmacophore in MOE. When docking ...

4:12 · 7,292 views · 3 years ago

People also watched

Mo Bitar
There's no skill in AI coding

I've begun to suspect something you have probably begun to suspect as well: there's no real skill in AI coding. The models are ...

10:58 · 5,405 views · 4 hours ago

Digital Engine
AI's first kills show we're close to disaster. Godfather of AI

AI and robots make dangerous leap. Visit https://brilliant.org/digitalengine to learn more about AI. You'll also find loads of fun ...

19:17 · 476,929 views · 13 days ago

sudoremove and 노토랩세미나
DeepSeek Engram: Adding Tables to Transformers

#Engram #deepseek Chapter --- 00:00 Intro 00:16 - Model Architecture Description and Serving/Structure Optimization Problem ...

15:08 · 3,039 views · 3 days ago

노정석
EP 83. Transformers: The Pilgrimage of the Reincarnated Token

This session intuitively explores why Transformers work the way they do, focusing on the journey a token undergoes when it's ...

53:56 · 2,608 views · 2 days ago

Matt Williams
Optimize Your AI - Quantization Explained

Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...

12:10 · 361,720 views · 1 year ago
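
A rough sense of why the quantization settings mentioned above matter: weight storage scales with bits per parameter. The sketch below uses an assumed 7B-parameter model and idealized bit widths; real formats such as q4_0 or q8_0 also store per-block scale metadata, so actual files run somewhat larger.

```python
# Approximate memory needed just to hold the weights of a model with an
# assumed 7B parameters at different precisions. Figures are idealized.

n_params = 7e9

for name, bits in [("fp16", 16), ("q8", 8), ("q4", 4), ("q2", 2)]:
    gib = n_params * bits / 8 / 2**30
    print(f"{name:>4}: ~{gib:.1f} GiB of weights")
```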

Sam Witteveen
Kimi K2.5 - The Agent Swarm

In this video, I look at Kimi K2.5, the latest model from Moonshot AI, and how it crushes tasks with Agent Swarm. Site: Blog: ...

20:24 · 24,645 views · 1 day ago

Welch Labs
How DeepSeek Rewrote the Transformer [MLA]

Thanks to KiwiCo for sponsoring today's video! Go to https://www.kiwico.com/welchlabs and use code WELCHLABS for 50% off ...

18:09 · 851,099 views · 10 months ago

Basil Alabdullah
Preparation of Protein Using MOE Software

Molecular Docking: in this video we learn how to prepare the protein and create the active site so that it is ready for the docking ...

46:36 · 4,187 views · 2 years ago

Donato Capitella
Kimi-K2(1T)/GLM 4.7(355B) on a 4-Node Strix Halo Cluster - 512GB of Unified Memory

In this video, I demonstrate running large-scale Mixture-of-Experts (MoE) models on a 4-node cluster of AMD Strix Halo systems.

9:36 · 7,847 views · 5 days ago

HuggingFace
MoE Token Routing Explained: How Mixture of Experts Works (with Code)

This video dives deep into Token Routing, the core algorithm of Mixture of Experts (MoE) models. Slides: ...

34:15 · 3,608 views · 6 days ago
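
For readers skimming these results, here is a minimal sketch of the top-k token routing step that several of the videos above describe, written in NumPy. The shapes, the expert count, and the softmax-then-top-k gate are common choices but are assumptions of this sketch, not a claim about any particular video's code.

```python
import numpy as np

def topk_route(x, w_gate, k=2):
    """Minimal top-k token routing: score experts with a linear gate,
    keep the k highest per token, and renormalize their weights."""
    logits = x @ w_gate                                    # (tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)             # softmax over experts
    topk_idx = np.argsort(-probs, axis=-1)[:, :k]          # chosen expert ids
    topk_w = np.take_along_axis(probs, topk_idx, axis=-1)
    topk_w /= topk_w.sum(axis=-1, keepdims=True)           # renormalize over the k chosen
    return topk_idx, topk_w

# Toy usage: 4 tokens, hidden size 8, 4 experts, top-2 routing (assumed sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w_gate = rng.standard_normal((8, 4))
idx, w = topk_route(x, w_gate, k=2)
print(idx)   # which experts each token is sent to
print(w)     # the mixing weights for those experts
```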

Cerebras
Mixture of Experts Explained: How to Build, Train & Debug MoE Models in 2025

Mixture-of-Experts (MoE) models now power leading AI systems like GPT-4, Qwen3, DeepSeek-v3, and Gemini 1.5. But behind ...

4:32 · 1,569 views · 6 months ago

SaM Solutions
Mixture-of-Experts (MoE) LLMs: The Future of Efficient AI Models

Imagine having a whole team of specialists at your disposal, each an expert in a different field, and a smart coordinator who ...

6:01 · 299 views · 5 months ago

MoeIsBetter
Moe - Outta There (Official Video)

Outta There (Prod By Ayo N Keyz) "Rich Dreamin" Available Now on ALL streaming platforms LINK BELOW ...

2:32 · 4,601,857 views · 6 years ago

MoeIsBetter
MoeIsBetter - Outta There (Audio)

Official Visualizer: https://smarturl.it/OuttaThereVisualizer Follow Moe: https://www.instagram.com/Moeisbetter/ (C) 2019 Moe ...

2:38 · 547,228 views · 5 years ago

AI Research Roundup
Why Orthogonal Weights Fail in MoE Models

In this AI Research Roundup episode, Alex discusses the paper 'Geometric Regularization in Mixture-of-Experts: The Disconnect ...

3:35 · 11 views · 2 weeks ago

bycloud
Mamba Might Just Make LLMs 1000x Cheaper...

Check out HubSpot's ChatGPT at work bundle! https://clickhubspot.com/twc Would mamba bring a revolution to LLMs and ...

14:06 · 141,237 views · 1 year ago

No Hype AI
How Did They Do It? DeepSeek V3 and R1 Explained

DeepSeek: The First Open-Weight Reasoning Model! In this video, I'll break down DeepSeek's two flagship models, V3 and R1 ...

11:15 · 47,211 views · 11 months ago

Stanford Online
Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer

In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead ...

1:05:44 · 40,382 views · 3 years ago

Paper With Video
[2024 Best AI Paper] MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

This video was created using https://paperspeech.com. If you'd like to create explainer videos for your own papers, please visit the ...

11:36 · 71 views · 1 year ago

AI with Lena Hall
Transformers vs MoE vs RNN vs Hybrid: Intuitive LLM Architecture Guide

Most developers default to transformers without understanding the alternatives. This video breaks down the intuition behind four ...

16:56 · 19,204 views · 3 months ago

Marktechpost AI
NVIDIA Releases Nemotron 3: Hybrid Mamba Transformer Models With Latent MoE ...

NVIDIA Nemotron 3 is an open family of hybrid Mamba Transformer MoE models, designed for agentic AI with long context and ...

5:30 · 585 views · 1 month ago

AILinkDeepTech
Mixture of Experts (MoE) Coding | MoE Code Implementation | Mixture of Experts Model

Mixture of Experts (MoE) Coding | MoE Code Implementation | Mixture of Experts Model MoE Code: ...

7:04 · 729 views · 11 months ago
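
Finally, MoE implementations like the ones these coding videos walk through typically pair the routing step sketched earlier with an auxiliary loss that keeps tokens spread evenly across experts. The sketch below follows the style of the Switch Transformer load-balancing loss; the coefficient alpha and the toy batch are assumptions, and the listed videos may use different formulations.

```python
import numpy as np

def load_balancing_loss(probs, topk_idx, n_experts, alpha=0.01):
    """Auxiliary load-balancing loss in the style of the Switch Transformer:
    penalizes routing where both the token fraction (f) and the router
    probability mass (p) concentrate on the same few experts.
    probs: (tokens, n_experts) softmax gate outputs;
    topk_idx: (tokens, k) chosen expert ids; alpha is an assumed coefficient."""
    # f[i]: fraction of routed (token, slot) assignments sent to expert i.
    counts = np.bincount(topk_idx.ravel(), minlength=n_experts)
    f = counts / topk_idx.size
    # p[i]: mean router probability assigned to expert i over the batch.
    p = probs.mean(axis=0)
    return alpha * n_experts * float(np.sum(f * p))

# Toy usage: random gate outputs for 8 tokens, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
logits = rng.standard_normal((8, 4))
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
topk_idx = np.argsort(-probs, axis=-1)[:, :2]
print(load_balancing_loss(probs, topk_idx, n_experts=4))
```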