
Qwen2-VL

This example demonstrates how to use the Qwen2-VL model to answer questions about images or videos.

NOTE: The Qwen2-VL model should be used in GPU environments.

import cv2
import numpy as np
from vision_agent_tools.models.qwen2_vl import Qwen2VL

# (replace this path with your own!)
video_path = "path/to/your/my_video.mp4"

# Load the video and collect its frames as RGB arrays
cap = cv2.VideoCapture(video_path)
frames = []
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV reads frames as BGR; convert to RGB for the model
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frames.append(frame)
cap.release()
# Stack into a single (num_frames, height, width, 3) array
frames = np.stack(frames, axis=0)

# Initialize the Qwen2VL model
run_inference = Qwen2VL()
prompt = "Here are some frames of a video. Describe this video in detail"
# Time to put Qwen2VL to work!
answer = run_inference(video=frames, prompt=prompt)

# Print the output answer
print(answer)
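For long videos, you may want to subsample frames before inference rather than passing every frame. A minimal sketch of uniform subsampling, reusing the frames array and prompt from above (the target of 24 frames is an arbitrary example value; the frames argument documented below also caps how many frames the model uses):

# Keep roughly 24 evenly spaced frames (24 is an arbitrary example value)
step = max(1, len(frames) // 24)
sampled_frames = frames[::step]
answer = run_inference(video=sampled_frames, prompt=prompt)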

Qwen2VL

Bases: BaseMLModel

Qwen2-VL is a model capable of accurately identifying and comprehending the content of images, regardless of their clarity, resolution, or extreme aspect ratio.

NOTE: The Qwen2-VL model should be used in GPU environments.
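For image inputs, usage mirrors the video example above. A minimal sketch, assuming the Image type in the signature below is a PIL image (the file path is a placeholder):

from PIL import Image

from vision_agent_tools.models.qwen2_vl import Qwen2VL

run_inference = Qwen2VL()
# (replace this path with your own!)
image = Image.open("path/to/your/image.jpg")
answer = run_inference(images=[image], prompt="Describe this image in detail")
print(answer)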

__call__(prompt=None, images=None, video=None, frames=MAX_NUMBER_OF_FRAMES)

Qwen2-VL model answers questions about a video or image.

Parameters:

    prompt (str): The prompt with the question to be answered. Default: None.
    images (list[Image]): A list of images for the model to process. None if using video. Default: None.
    video (VideoNumpy | None): A numpy array of frames representing the video. Default: None.
    frames (int): The number of frames to be used from the video. Default: MAX_NUMBER_OF_FRAMES.

Returns:

    list[str]: The answers to the prompt.
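For example, to cap how many frames the model uses from the video, pass the frames argument explicitly. A hedged sketch reusing the frames array and prompt from the example above (the value 24 is arbitrary):

answer = run_inference(video=frames, prompt=prompt, frames=24)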

__init__(model_config=None)

Initializes the Qwen2-VL model.
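Since model_config defaults to None, the simplest initialization is the one used in the example at the top of this page:

model = Qwen2VL()  # uses the default model configuration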