ViewTube

7,272 results

IBM Technology
What is vLLM? Efficient AI Inference for Large Language Models

Ready to become a certified watsonx AI Assistant Engineer? Register now and use code IBMTechYT20 for 20% off your exam ...

4:58 · 55,129 views · 6 months ago
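
For readers landing on this "what is vLLM" result: besides its server, vLLM exposes a small offline Python API. A minimal sketch (the model id is just an example; any Hugging Face causal LM works):

from vllm import LLM, SamplingParams

# Load a small model; vLLM pulls weights from Hugging Face by default.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() batches prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["What is vLLM?"], params)
print(outputs[0].outputs[0].text)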

Donato Capitella
Running vLLM on Strix Halo (AMD Ryzen AI MAX) + ROCm Performance Updates

This video is divided into two parts: a technical guide on running vLLM on the AMD Ryzen AI MAX (Strix Halo) and an update on ...

18:06 · 7,861 views · 1 day ago

Fahd Mirza
How to Install vLLM-Omni Locally | Complete Tutorial

This tutorial is a step-by-step, hands-on guide to installing vLLM-Omni locally. Buy Me a Coffee to support the channel: ...

8:40 · 2,804 views · 2 days ago

Red Hat
Optimize LLM inference with vLLM

Ready to serve your large language models faster, more efficiently, and at a lower cost? Discover how vLLM, a high-throughput ...

6:13 · 7,628 views · 5 months ago
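
The serving workflow this talk covers boils down to starting vLLM's OpenAI-compatible server and pointing any OpenAI client at it. A hedged sketch (the model id is an assumption; port 8000 is the usual vLLM default):

# Assumes a server started separately with:
#   vllm serve Qwen/Qwen2.5-0.5B-Instruct
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    messages=[{"role": "user", "content": "Why is vLLM fast?"}],
)
print(resp.choices[0].message.content)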

Anyscale
Fast LLM Serving with vLLM and PagedAttention

LLMs promise to fundamentally change how we use AI across all industries. However, actually serving these models is ...

32:07 · 53,779 views · 2 years ago
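
The PagedAttention idea from this talk, in one toy sketch: the KV cache is carved into fixed-size blocks, and each sequence keeps a block table mapping logical blocks to physical ones, so memory is allocated on demand instead of reserved for the maximum length. This is a conceptual illustration only, not vLLM's actual implementation; the block size and table contents are made up:

BLOCK_SIZE = 16  # tokens per KV-cache block (illustrative value)

# Per-sequence block table: logical block index -> physical block id.
block_table = {0: 7, 1: 3, 2: 12}

def blocks_needed(num_tokens: int) -> int:
    """Blocks allocated so far: ceil(num_tokens / BLOCK_SIZE)."""
    return -(-num_tokens // BLOCK_SIZE)

def physical_slot(token_pos: int) -> tuple[int, int]:
    """Map a token position to (physical block id, offset within block)."""
    return block_table[token_pos // BLOCK_SIZE], token_pos % BLOCK_SIZE

print(blocks_needed(40))   # 3 blocks cover 40 tokens
print(physical_slot(35))   # token 35 lives in physical block 12, slot 3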

Matou Studio
vLLM: An Introduction

vLLM, like Ollama, can serve LLMs locally; it has its advantages and can be used in OpenWebUI... Chapters of the ...

17:00 · 1,460 views · 8 months ago

Vizuara
How the VLLM inference engine works?

In this video, we look at how vLLM works: we follow a prompt and see exactly what happens to it as it ...

1:13:42 · 8,687 views · 3 months ago
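
The prompt lifecycle this video traces has two phases: a prefill pass that runs the whole prompt through the model and fills the KV cache, then a decode loop that emits one token per step while reusing that cache. A deliberately toy sketch of the control flow (all values are stand-ins; vLLM's scheduler, batching, and kernels are far more involved):

prompt_ids = [101, 2023, 2003]        # pretend token ids from a tokenizer

# Prefill: one forward pass over the full prompt populates the KV cache.
kv_cache = [f"kv@{i}" for i in range(len(prompt_ids))]

# Decode: each step attends over the cache, emits a token, extends the cache.
generated = []
for step in range(4):
    next_id = 1000 + step             # stand-in for sampling from logits
    kv_cache.append(f"kv@{len(kv_cache)}")
    generated.append(next_id)

print(generated)                      # one new token per decode step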

NeuralNine
vLLM: Easily Deploying & Serving LLMs

Today we learn about vLLM, a Python library that allows for easy and fast deployment and inference of LLMs.

15:19 · 21,695 views · 3 months ago

Donato Capitella
vLLM on Dual AMD Radeon 9700 AI PRO: Tutorials, Benchmarks (vs RTX 5090/5000/4090/3090/A100)

In this follow-up to my previous dual AMD R9700 AI PRO build, we shift focus from Llama.cpp to vLLM, a framework specifically ...

23:39 · 5,194 views · 10 days ago

Savage Reviews
Ollama vs VLLM vs Llama.cpp: Best Local AI Runner in 2025?

Best Deals on Amazon: https://amzn.to/3JPwht2 MY TOP PICKS + INSIDER DISCOUNTS: https://beacons.ai/savagereviews I ...

2:06 · 9,876 views · 3 months ago

Genpakt
What is vLLM & How do I Serve Llama 3.1 With It?

If you're confused about what vLLM is, this is the right video. Watch me go through vLLM, exploring what it is and how to use it ...

7:23 · 40,944 views · 1 year ago
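
For the Llama 3.1 case specifically, recent vLLM versions also provide an offline chat helper that applies the model's chat template for you. A sketch under the assumption that you have access to the gated meta-llama weights and a GPU large enough for the 8B model:

from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
outputs = llm.chat(
    [{"role": "user", "content": "Introduce yourself in one sentence."}],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)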

Red Hat
[vLLM Office Hours #36] LIVE from Zürich vLLM Meetup - November 6, 2025

The first official vLLM Meetup in Europe took place in Zürich — hosted by Red Hat, IBM, and Mistral AI and streamed live to the ...

2:18:03 · 2,456 views · Streamed 1 month ago

Anyscale
Embedded LLM’s Guide to vLLM Architecture & High-Performance Serving | Ray Summit 2025

At Ray Summit 2025, Tun Jian Tan from Embedded LLM shares an inside look at what gives vLLM its industry-leading speed, ...

32:18 · 689 views · 1 month ago

Databricks
Accelerating LLM Inference with vLLM

vLLM is an open-source, high-performance engine for LLM inference and serving, developed at UC Berkeley. vLLM has been ...

35:53 · 23,733 views · 1 year ago

Aleksandar Haber PhD
Install and Run Locally LLMs using vLLM library on Windows

#vllm #llm #machinelearning #ai #llamasgemelas #wsl #windows It takes a significant amount of time and energy to create these ...

11:46 · 3,280 views · 1 month ago

vLLM Semantic Router
Introducing vLLM Semantic Router Dashboard 🔥

User experience is something we care about. We are happy to share the dashboard: 1. You can chat with vLLM-SR directly and ...

1:04 · 735 views · 2 months ago

PyTorch
vLLM: Easy, Fast, and Cheap LLM Serving for Everyone - Simon Mo, vLLM

vLLM is an open source library for fast, easy-to-use ...

24:47 · 1,174 views · 1 month ago

Aleksandar Haber PhD
Install and Run Locally LLMs using vLLM library on Linux Ubuntu

#vllm #llm #machinelearning #ai #llamasgemelas It takes a significant amount of time and energy to create these free video ...

11:08 · 1,331 views · 1 month ago