117 results
This tutorial is a step-by-step hands-on guide to locally install vLLM-Omni. Buy Me a Coffee to support the channel: ... (3,189 views, 5 days ago)

Serving modern AI models has become quite complicated: different stacks for LLMs, vision models, audio, and video inference. (268 views, 19 hours ago)

This video is divided into two parts: a technical guide on running vLLM on the AMD Ryzen AI MAX (Strix Halo) and an update on ... (11,490 views)

Most AI models today are stuck in a world of words, but the future is omnimodal. In this video, we break down vLLM-Omni, a new ... (68 views)

Write-up and instructions here: https://www.roger.lol/blog/accessible-ai-vllm-on-intel-arc Let's go through the process of setting up ... (20 views, 12 hours ago)

Sharing insights from an AI engineer at Bangkok Silicon on building ThaiLLM, with a strong focus on the training data and ... (11 views, 3 days ago)

https://github.com/vllm-project/vllm-omni A framework for efficient model inference with omni-modality models ... (54 views)

Watch the development journey of vllm-omni by vllm-project! A framework for efficient model inference with omni-modality ... (55 views)

Discover how vLLM achieves dynamic, efficient inference through features like PagedAttention, continuous batching, and KV ... (121 views, 6 days ago)

Jessada Pranee (NECTEC), an AI engineer on the Pathumma LLM team, walks through a practical problem in multilingual ... (15 views)

Simple Tricks to Instantly Improve Your LLM Performance ⚡ LMCache Explained: Accelerating LLM Inference for the Future of AI ... (3 views)

Learn how Ray orchestrates CPU and GPU workloads to efficiently run batch inference at scale, ensuring GPUs stay fully utilized ... (181 views)

LMCache Solves vLLM's Biggest Problem. In this AI Explained video, we dive deep into the comparison between vLLM and ... (5 views, 4 days ago)

Speculative decoding is one of the most important performance optimizations in modern LLM serving, and most people still don't ... (83 views)

This demo showcases load balancing of vLLM AI inference model servers hosted in OpenShift and how this differs from regular ... (108 views)

How to get 24 GB of VRAM for cheap? Let's try two Intel Arc B580s as a budget solution! I am going to start a really cool video series ... (225 views, 13 hours ago)

Launch powerful AI agent workflows in minutes with Sim, an open-source platform to visually design, run, and scale agentic flows. (12 views)

https://binaryverseai.com/glm-4-7-review-3-benchmarks-z-ai-install-api-use/ GLM-4.7 showed up with a suspiciously clean ... (466 views)

In this episode of the Neural Intel podcast, we go under the hood of GLM-4.7, the newest native agentic LLM from Z.AI. Released ... (0 views)

Understanding R1-Zero-Like Training: A Critical Perspective: https://arxiv.org/pdf/2503.20783 Defeating the Training-Inference ... (8,592 views)