Conversation

jonathanmiddleton

Add FastVLM (https://huggingface.co/papers/2412.13303) support.

  • FastVLM model implementation
  • Sideband loading of the Core ML model package to work around a Hugging Face cache issue
  • Removed stale tests

- Add a `test_fastvlm` function to validate the new model.
- Introduce `FastVLMMultiModalProjector` and `FastVLM` model classes to support vision-text multimodal operations.
- Enhance `convert.py` with Core ML vision tower support and new CLI options (`--only-llm`, `--skip-vision`).
- Refactor utilities to handle Core ML vision tower loading and add a `force_download` option.
- Clean up unused and redundant test cases in `test_utils.py`.
- Add configuration for the `llava_qwen2` model type in prompt and model utilities.
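The sideband-loading idea above can be sketched in plain Python. Core ML `.mlpackage` files are directories, which the Hugging Face cache represents as trees of symlinked blob files; copying the package out of the snapshot into a separate location yields a self-contained bundle that Core ML can open. The helper below is a hypothetical illustration, not the actual `mlpackage_cache.py` code, and the function and directory names are assumptions:

```python
import shutil
from pathlib import Path


def cache_mlpackage_sideband(repo_snapshot: Path, sideband_root: Path, repo_id: str) -> Path:
    """Copy the first *.mlpackage directory found in an HF snapshot into a
    sideband cache directory, resolving symlinks along the way.

    Hypothetical sketch of the sideband-loading approach; mlx_vlm's real
    implementation may differ in naming and layout.
    """
    packages = sorted(repo_snapshot.glob("*.mlpackage"))
    if not packages:
        raise FileNotFoundError(f"no .mlpackage found in {repo_snapshot}")
    src = packages[0]

    # Mirror the HF convention of flattening "org/name" into one path segment.
    dst = sideband_root / repo_id.replace("/", "--") / src.name
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        # symlinks=False resolves the cache's blob symlinks into real files,
        # producing a self-contained package Core ML can load directly.
        shutil.copytree(src, dst, symlinks=False)
    return dst
```

The copied path can then be handed to `coremltools.models.MLModel` (or whatever loader the vision tower uses) without the cache's symlink indirection getting in the way.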
Copilot AI review requested due to automatic review settings (September 4, 2025, 18:42)

Copilot AI left a comment

Pull Request Overview

This PR adds FastVLM support to the mlx-vlm project, implementing a new vision-language model with unique Core ML vision tower loading capabilities. FastVLM uses a sideband loading mechanism to overcome Hugging Face cache compatibility issues with Core ML model packages.

  • Add FastVLM model implementation with Core ML vision tower support
  • Implement sideband loading system for Core ML model packages to work around HF cache limitations
  • Update utilities and conversion scripts to support new model architecture

Reviewed Changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.

Summary per file:

  • mlx_vlm/utils.py: Add Core ML import, model type mapping, and vision tower loading logic
  • mlx_vlm/tests/test_utils.py: Comment out stale quantization tests
  • mlx_vlm/tests/test_models.py: Add comprehensive FastVLM model tests
  • mlx_vlm/prompt_utils.py: Add message format support for the llava_qwen2 model type
  • mlx_vlm/models/fastvlm/: New FastVLM model implementation with language model and Core ML integration
  • mlx_vlm/hf_tools/mlpackage_cache.py: New Core ML package caching and resolution utilities
  • mlx_vlm/convert.py: Add conversion support with Core ML file copying and new CLI options
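The prompt_utils change registers a message format for the llava_qwen2 model type. Since FastVLM pairs its vision tower with a Qwen2 language model, a plausible format is Qwen2's ChatML-style layout with an `<image>` placeholder prepended to the first user turn. The function below is a hedged sketch of that idea; the exact token layout and function name are assumptions, not the merged code:

```python
def format_llava_qwen2_prompt(messages, num_images=1):
    """Render chat messages into a Qwen2-style (ChatML) prompt string,
    inserting one <image> placeholder per input image before the first
    user message. Hypothetical sketch of llava_qwen2 prompt support.
    """
    parts = []
    for i, msg in enumerate(messages):
        content = msg["content"]
        if i == 0 and msg["role"] == "user":
            # Image placeholders are consumed by the vision tower's features.
            content = "".join("<image>\n" for _ in range(num_images)) + content
        parts.append(f"<|im_start|>{msg['role']}\n{content}<|im_end|>\n")
    # Open the assistant turn so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

For example, a single user message "Describe the image." would render as a user block starting with `<image>` followed by an open assistant block.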
