ViewTube

7,091 results

Related queries

speculative decoding

cross attention

masked self attention

multi head attention

bert vs gpt

statquest transformer

bert transformer

encoder vs decoder

StatQuest with Josh Starmer
Decoder-Only Transformers, ChatGPTs specific Transformer, Clearly Explained!!!

Transformers are taking over AI right now, and quite possibly their most famous use is in ChatGPT. ChatGPT uses a specific type ...

36:45

206,298 views

2 years ago

HuggingFace
Transformer models: Decoders

A general high-level introduction to the Decoder part of the Transformer architecture. What is it, when should you use it?

4:27

74,955 views

4 years ago

Efficient NLP
Which transformer architecture is best? Encoder-only vs Encoder-decoder vs Decoder-only models

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io The battle of transformer architectures: ...

7:38

45,261 views

2 years ago

Luke Ditria
Decoder-Only Transformer for Next Token Prediction: PyTorch Deep Learning Tutorial

In this tutorial video I introduce the Decoder-Only Transformer model to perform next-token prediction! Donations, Help Support ...

15:11

4,052 views

1 year ago

StatQuest with Josh Starmer
Encoder-Only Transformers (like BERT) for RAG, Clearly Explained!!!

... Matrix Math Behind Transformers: https://youtu.be/KphmOJnLAdI Coding a Decoder-Only Transformer from Scratch in PyTorch: ...

18:52

77,956 views

1 year ago

Super Data Science: ML & AI Podcast with Jon Krohn
How Decoder-Only Transformers (like GPT) Work

Learn about encoders, cross attention and masking for LLMs as SuperDataScience Founder Kirill Eremenko returns to the ...

18:56

3,968 views

1 year ago

Numeryst
Encoder-Only vs Decoder-Only Transformers | What’s the Difference?

https://www.youtube.com/watch?v=_mNuwiaTOSk&list=PLLlTVphLQsuPL2QM0tqR425c-c7BvuXBD&index=1 In this video, we'll ...

1:46

217 views

1 month ago

CodeEmporium
Blowing up Transformer Decoder architecture

ABOUT ME ⭕ Subscribe: https://www.youtube.com/c/CodeEmporium?sub_confirmation=1 Medium Blog: ...

25:59

20,635 views

2 years ago

Lennart Svensson
Transformers - Part 7 - Decoder (2): masked self-attention

This is the second video on the decoder layer of the transformer. Here we describe the masked self-attention layer in detail.

8:37

22,342 views

5 years ago

Super Data Science: ML & AI Podcast with Jon Krohn
Encoder-Decoder Transformers vs Decoder-Only vs Encoder-Only: Pros and Cons

Learn about encoders, cross attention and masking for LLMs as SuperDataScience Founder Kirill Eremenko returns to the ...

8:45

3,918 views

1 year ago

People also watched

Under The Hood
How Attention Mechanism Works in Transformer Architecture

#llm #embedding #gpt The attention mechanism in transformers is a key component that allows models to focus on different parts of ...

22:10

73,305 views

9 months ago

Gal Lahat
I Visualised Attention in Transformers

To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/GalLahat/ . You'll also get 20% off an annual ...

13:01

175,699 views

5 months ago

Lex Clips
Transformers: The best idea in AI | Andrej Karpathy and Lex Fridman

Lex Fridman Podcast full episode: https://www.youtube.com/watch?v=cdiD-9MMpb0 Please support this podcast by checking out ...

8:38

426,870 views

3 years ago

Stanford Online
Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 1 - Transformer

For more information about Stanford's graduate programs, visit: https://online.stanford.edu/graduate-education September 26, ...

1:41:59

246,296 views

2 months ago

Niels Rogge
How a Transformer works at inference vs training time

I made this video to illustrate the difference between how a Transformer is used at inference time (i.e. when generating text) vs.

49:53

68,553 views

2 years ago

The AI Hacker
Illustrated Guide to Transformers Neural Network: A step by step explanation

Transformers are the rage nowadays, but how do they work? This video demystifies the novel neural network architecture with ...

15:01

1,150,815 views

5 years ago

Grant Sanderson
Visualizing transformers and attention | Talk for TNG Big Tech Day '24

An overview of transformers, as used in LLMs, and the attention mechanism within them. Based on the 3blue1brown deep learning ...

57:45

982,190 views

1 year ago

AI Engineer
Mastering LLM Inference Optimization From Theory to Cost Effective Deployment: Mark Moyou

LLM inference is not your normal deep learning model deployment nor is it trivial when it comes to managing scale, performance ...

33:39

27,970 views

11 months ago

Jeff Heaton
Transformer-Based Time Series with PyTorch (10.3)

This video covers deep learning as we explore the transformative power of Transformer models in time series analysis using ...

6:33

29,655 views

2 years ago

Learn With Jay
Encoder Architecture in Transformers | Step by Step Guide

We break down the Encoder architecture in Transformers, layer by layer! If you've ever wondered how models like BERT and GPT ...

23:39

6,414 views

10 months ago

Jake Watson
PyTorch - Transformers from scratch! [Encoder/Decoder only]

Hello! In this video I'm going over a small summer project I undertook in which I implemented some Encoder/Decoder only ...

9:42

32 views

3 weeks ago

Learn With Jay
Decoder Architecture in Transformers | Step-by-Step from Scratch

Transformers have revolutionized deep learning, but have you ever wondered how the decoder in a transformer actually works?

41:29

6,751 views

8 months ago

AD Academy
Tutorial 13: Variations of Transformer Architecture | Encoder only model | Decoder only model

Dive deep into the world of Transformer Architectures! In this video, we explore three key variations of transformers, breaking ...

3:29

287 views

2 years ago

StatQuest with Josh Starmer
Coding a ChatGPT Like Transformer From Scratch in PyTorch

Decoder-Only Transformers: https://youtu.be/bQ5BoolX9Ag The Essential Matrix Algebra for Neural Networks: ...

31:11

102,709 views

1 year ago

HuggingFace
Transformer models: Encoders

A general high-level introduction to the Encoder part of the Transformer architecture. What is it, when should you use it?

4:46

89,570 views

4 years ago

Andrej Karpathy
Let's build GPT: from scratch, in code, spelled out.

We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 ...

1:56:20

6,703,126 views

2 years ago

Julien Simon
Decoder-only inference: a step-by-step deep dive

In this deep dive video, we explore the step-by-step process of transformer inference for text generation, with a focus on ...

42:04

31,386 views

11 months ago

Kevin Nguyen Tech
AI & Deep Learning Course #36 - Encoder Only and Decoder Only Transformers

Follow the rest of the series here: https://www.youtube.com/playlist?list=PLn2ipk-jqgZhmSSK3QPWpdEoTPeWjbGh_ Code for the ...

6:57

212 views

9 months ago

Udacity
Introduction to LLMs: Encoder Vs Decoder Models

This video is an excerpt taken from our Generative AI Nanodegree program: ...

2:58

11,513 views

1 year ago

3Blue1Brown
Attention in transformers, step-by-step | Deep Learning Chapter 6

Demystifying attention, the key mechanism inside transformers and LLMs. Instead of sponsored ad reads, these lessons are ...

26:10

3,509,389 views

1 year ago