vllm.transformers_utils.processors.nemotron_vl ¶
LlamaNemotronNanoVLProcessor ¶
Bases: InternVLProcessor
This model doesn't define its own HF processor, so we implement one here.
The image processor is given by: https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1/blob/main/image_processing.py
Source code in vllm/transformers_utils/processors/nemotron_vl.py
LlamaNemotronVLEmbedProcessor ¶
Bases: InternVLProcessor
Processor for LlamaNemotronVL embedding model.
Inherits from NemotronVLProcessor and specializes it for embedding tasks:
- Uses the SigLIP transform with normalization instead of the base transform
- Uses a different image context token ()
Source code in vllm/transformers_utils/processors/nemotron_vl.py
build_siglip_transform ¶
build_siglip_transform(input_size: int)
Build transform for SigLIP vision encoder with normalization.
Extends the base transform from nemotron_vl with SigLIP-specific normalization.
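The idea of extending a base transform with a normalization step can be illustrated with a minimal, dependency-free sketch. It is not the vLLM implementation: the names `normalize_pixel`, `extend_with_normalization`, and `scale_to_unit` are hypothetical, images are modeled as plain lists of RGB tuples rather than tensors, and the mean and standard deviation of 0.5 per channel are an assumption based on the values commonly published with SigLIP checkpoints.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: SigLIP checkpoints commonly use mean = std = 0.5 per channel.
# The constants actually used by build_siglip_transform may differ.
SIGLIP_MEAN = (0.5, 0.5, 0.5)
SIGLIP_STD = (0.5, 0.5, 0.5)

Pixel = Tuple[float, ...]
Transform = Callable[[Sequence[Sequence[float]]], List[Pixel]]


def normalize_pixel(
    pixel: Sequence[float],
    mean: Sequence[float] = SIGLIP_MEAN,
    std: Sequence[float] = SIGLIP_STD,
) -> Pixel:
    """Apply per-channel (value - mean) / std to one pixel in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(pixel, mean, std))


def scale_to_unit(image: Sequence[Sequence[float]]) -> List[Pixel]:
    """Hypothetical stand-in for a base transform: 8-bit values to [0, 1]."""
    return [tuple(c / 255 for c in pixel) for pixel in image]


def extend_with_normalization(base_transform: Transform) -> Transform:
    """Return a transform that runs base_transform, then normalizes."""

    def transform(image: Sequence[Sequence[float]]) -> List[Pixel]:
        return [normalize_pixel(p) for p in base_transform(image)]

    return transform


siglip_like = extend_with_normalization(scale_to_unit)
result = siglip_like([(0, 0, 0), (255, 255, 255)])
```

With these assumed constants, values in [0, 1] are mapped to [-1, 1], so a black pixel becomes -1.0 in every channel and a white pixel becomes 1.0.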