Fastvlm #495
- Add a `test_fastvlm` function to validate the new model's functionality.
- Introduce `FastVLMMultiModalProjector` and `FastVLM` model classes to support vision-text multimodal operations.
- Enhance `convert.py` with Core ML vision tower support and new CLI options (`--only-llm`, `--skip-vision`).
- Refactor utilities to handle Core ML vision tower loading and add a `force_download` option.
- Clean up unused and redundant test cases in `test_utils.py`.
- Add configuration for `llava_qwen2` in the prompt and model utilities.
Pull Request Overview
This PR adds FastVLM support to the mlx-vlm project, implementing a new vision-language model with unique Core ML vision tower loading capabilities. FastVLM uses a sideband loading mechanism to overcome Hugging Face cache compatibility issues with Core ML model packages.
- Add FastVLM model implementation with Core ML vision tower support
- Implement sideband loading system for Core ML model packages to work around HF cache limitations
- Update utilities and conversion scripts to support new model architecture
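As a rough illustration of what a sideband load could look like, the sketch below copies a `.mlpackage` directory out of a cache into a separate directory with every symlink replaced by a real file. It rests on an assumption the PR does not state: that the incompatibility comes from the Hugging Face cache storing files as symlinks. The function name `materialize_mlpackage` and the directory layout are hypothetical and are not taken from `mlx_vlm/hf_tools/mlpackage_cache.py`.

```python
import shutil
from pathlib import Path


def materialize_mlpackage(cached_pkg: Path, sideband_root: Path) -> Path:
    """Copy a .mlpackage directory into sideband_root with symlinks resolved.

    Hypothetical helper: the name and the layout are illustrative assumptions.
    """
    cached_pkg = Path(cached_pkg)
    sideband_root = Path(sideband_root)
    if cached_pkg.suffix != ".mlpackage" or not cached_pkg.is_dir():
        raise ValueError(f"not a .mlpackage directory: {cached_pkg}")

    target = sideband_root / cached_pkg.name
    if target.exists():
        # Reuse a copy made by an earlier call.
        return target

    sideband_root.mkdir(parents=True, exist_ok=True)
    # symlinks=False copies the files that the links point to,
    # so the result contains only regular files and directories.
    shutil.copytree(cached_pkg, target, symlinks=False)
    return target
```

The returned path points at a plain directory tree, which a loader could then open directly instead of the cached copy. A real implementation would likely also need cache invalidation (for example, honoring a `force_download` option), which this sketch leaves out.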
Reviewed Changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `mlx_vlm/utils.py` | Add Core ML import, model type mapping, and vision tower loading logic |
| `mlx_vlm/tests/test_utils.py` | Comment out stale quantization tests |
| `mlx_vlm/tests/test_models.py` | Add comprehensive FastVLM model tests |
| `mlx_vlm/prompt_utils.py` | Add message format support for llava_qwen2 model type |
| `mlx_vlm/models/fastvlm/` | New FastVLM model implementation with language model and Core ML integration |
| `mlx_vlm/hf_tools/mlpackage_cache.py` | New Core ML package caching and resolution utilities |
| `mlx_vlm/convert.py` | Add conversion support with Core ML file copying and new CLI options |
Co-authored-by: Copilot <[email protected]>
Add FastVLM (https://huggingface.co/papers/2412.13303) support.