ViewTube

179,918 results

IBM Technology
What is Mixture of Experts?

Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdK8fn Learn more about the ...

7:58

48,751 views

1 year ago

Maarten Grootendorst
A Visual Guide to Mixture of Experts (MoE) in LLMs

In this highly visual guide, we explore the architecture of a Mixture of Experts in Large Language Models (LLM) and Vision ...

19:44

48,657 views

1 year ago

AI Papers Academy
Introduction to Mixture-of-Experts | Original MoE Paper Explained

In this video we go back to the extremely important Google paper which introduced the Mixture-of-Experts (MoE) layer with ...

4:41

11,533 views

1 year ago

Julia Turc
Mixture of Experts: How LLMs get bigger without getting slower

Mixture of Experts (MoE) is everywhere: Meta / Llama 4, DeepSeek, Mistral. But how does it actually work? Do experts specialize?

26:42

26,762 views

9 months ago

Stanford Online
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of experts

For more information about Stanford's online Artificial Intelligence programs visit: https://stanford.io/ai To learn more about ...

1:22:04

61,964 views

9 months ago

Chris Hay
MoE Models Don't Work Like You Think - Inside GPT-OSS

Many people think that mixture-of-experts models have domain experts, e.g. math experts, code experts, and language experts.

18:28

3,770 views

2 weeks ago

People also watched

Sam Witteveen
Kimi K2.5- The Agent Swarm

In this video, I look at Kimi K2.5, the latest model from Moonshot AI, and how it crushes tasks with Agent Swarm. Site: Blog: ...

20:24

20,094 views

20 hours ago

AI Revolution
DeepSeek Leaks MODEL1: New Flagship AI Shocks The Industry

At the same time, Zhipu AI officially released GLM-4.7-Flash, a long-context MoE model built for real coding and reasoning that ...

15:40

36,983 views

6 days ago

Basil Alabdullah
Preparation of Protein Using MOE Software

Molecular Docking: in this video we learn how to prepare the protein and create the active site so that it is ready for the docking process ...

46:36

4,178 views

2 years ago

sudoremove and 노토랩세미나
DeepSeek Engram: Adding Tables to Transformers

#Engram #deepseek Chapters: 00:00 Intro, 00:16 Model Architecture Description and Serving/Structure Optimization Problem ...

15:08

2,785 views

3 days ago

서울대학교 산업공학과 DSBA 연구실
[Paper Review] Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Presenter: 천재원 (Master's student) 1. Paper title: Mamba: Linear-Time Sequence Modeling with Selective State Spaces 2. Paper link: ...

1:21:45

20,095 views

1 year ago

Matt Williams
Optimize Your AI - Quantization Explained

Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...

12:10

361,450 views

1 year ago

노정석
EP 83. Transformers: The Pilgrimage of the Reincarnated Token

This session intuitively explores why Transformers work the way they do, focusing on the journey a token undergoes when it's ...

53:56

2,379 views

1 day ago

xCreate
Let's Run GLM-4-7-Flash THINKING - Local AI Super-Intelligence? | REVIEW

Zhipu has released another update to its GLM model that tops benchmarks for its size. So let's see how well it performs locally.

16:59

3,659 views

7 days ago

Maarten Grootendorst
Intuition behind Mamba and State Space Models | Enhancing LLMs!

Mamba is an exciting LLM architecture that, when used with Transformers, might introduce new capabilities we haven't seen ...

24:06

21,485 views

10 months ago

IBM Technology
RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models

Ready to become a certified watsonx AI Assistant Engineer? Register now and use code IBMTechYT20 for 20% off of your exam ...

13:10

536,846 views

9 months ago

IBM Technology
Mixture of Experts: Boosting AI Efficiency with Modular Models #ai #machinelearning #moe

AI news moves fast. Sign up for a monthly newsletter for AI updates from IBM ...

0:51

4,723 views

1 year ago

bycloud
1 Million Tiny Experts in an AI? Fine-Grained MoE Explained

To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/bycloud/ . You'll also get 20% off an annual ...

12:29

55,229 views

1 year ago

Organic Mechanisms
How To Create And Use A Pharmacophore In MOE | MOE Tutorial

Molecular Operating Environment (MOE) tutorial covering how to create and use a pharmacophore in MOE. When docking ...

4:12

7,290 views

2 years ago

Cerebras
Mixture of Experts Explained: How to Build, Train & Debug MoE Models in 2025

Mixture-of-Experts (MoE) models now power leading AI systems like GPT-4, Qwen3, DeepSeek-v3, and Gemini 1.5. But behind ...

4:32

1,568 views

6 months ago

MoeIsBetter
MoeIsBetter - Outta There (Audio)

Official Visualizer: https://smarturl.it/OuttaThereVisualizer Follow Moe: https://www.instagram.com/Moeisbetter/ (C) 2019 Moe ...

2:38

547,216 views

5 years ago

SaM Solutions
Mixture-of-Experts (MoE) LLMs: The Future of Efficient AI Models

Imagine having a whole team of specialists at your disposal, each an expert in a different field, and a smart coordinator who ...

6:01

298 views

5 months ago

MoeIsBetter
Moe - Outta There (Official Video)

Outta There (Prod By Ayo N Keyz) "Rich Dreamin" Available Now on ALL streaming platforms LINK BELOW ...

2:32

4,601,660 views

6 years ago

AI Research Roundup
Why Orthogonal Weights Fail in MoE Models

In this AI Research Roundup episode, Alex discusses the paper: 'Geometric Regularization in Mixture-of-Experts: The Disconnect ...

3:35

11 views

2 weeks ago

No Hype AI
How Did They Do It? DeepSeek V3 and R1 Explained

DeepSeek: The First Open-Weight Reasoning Model! In this video, I'll break down DeepSeek's two flagship models, V3 and R1 ...

11:15

47,195 views

11 months ago

Paper With Video
[2024 Best AI Paper] MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

This video was created using https://paperspeech.com. If you'd like to create explainer videos for your own papers, please visit the ...

11:36

71 views

1 year ago

Stanford Online
Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer

In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead ...

1:05:44

40,376 views

3 years ago

Marktechpost AI
NVIDIA Releases Nemotron 3: Hybrid Mamba Transformer Models With Latent MoE .....

NVIDIA Nemotron 3 is an open family of hybrid Mamba Transformer MoE models, designed for agentic AI with long context and ...

5:30

585 views

1 month ago

AI with Lena Hall
Transformers vs MoE vs RNN vs Hybrid: Intuitive LLM Architecture Guide

Most developers default to transformers without understanding the alternatives. This video breaks down the intuition behind four ...

16:56

19,204 views

3 months ago