
Conversation

baichuanzhou
Contributor

As the title suggests, this PR adds the Mantis model to the supported model list.

Supported Models:
The implementation supports six Mantis models from the Mantis Hugging Face collection.

I picked three single-image tasks and one multi-image task. Here are my reproduced results compared with the Mantis paper's reported results:

Mantis-Idefics2:

|            | TextVQA* | OKVQA | ScienceQA-IMG | MMMU | MUIRBench |
| ---------- | -------- | ----- | ------------- | ---- | --------- |
| reproduced | 63.5     | 52.5  | 81.8          | 39.7 | 45.3      |
| reported   | 63.5     | 52.6  | 81.3          | 41.1 | 44.5      |

Mantis-LLaMA3-SigLIP:

|            | TextVQA* | OKVQA | ScienceQA-IMG | MMMU | MUIRBench |
| ---------- | -------- | ----- | ------------- | ---- | --------- |
| reproduced | 59.3     | 53.0  | 75.3          | 40.9 | 35.0      |
| reported   | 59.2     | 55.4  | 74.9          | 40.1 | 36.1      |

Mantis-LLaMA3-CLIP:

|            | TextVQA* | OKVQA | ScienceQA-IMG | MMMU | MUIRBench |
| ---------- | -------- | ----- | ------------- | ---- | --------- |
| reproduced | 55.8     | 51.2  | 73.1          | 41.0 | 41.2      |
| reported   | 56.4     | 53.0  | 73.8          | 38.0 | 37.4      |

I highlighted the MUIRBench score for Mantis-LLaMA3-CLIP, as it is significantly higher than the score reported in the paper.

TextVQA* is tested with OCR tokens as augmentation.

@Luodian merged commit 337f698 into EvolvingLMMs-Lab:main on Jul 25, 2024
@BrenchCC

Could you tell me the source of the dataset you used? I am currently replicating and optimizing the Mantis model, but my test results on TextVQA are significantly different from the reported results. I am using lmms-lab/textvqa for my tests.

@baichuanzhou
Contributor Author

I also used lmms-lab/textvqa to reproduce the results. But as I mentioned earlier, you need to use OCR tokens as augmentation to get these numbers. To do that, simply set `ocr_token` to `True` in the YAML file.
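
For reference, here is a minimal sketch of what that YAML tweak could look like. The key name `ocr_token` and its placement are taken from the comment above and are not verified against the current task config, so treat this as an illustration and check the TextVQA task YAML in the repo for the exact key and nesting.

```yaml
# Illustrative sketch only: enable OCR-token augmentation for TextVQA.
# The key name (ocr_token) follows the comment above; the actual task YAML
# in lmms-eval may use a different name or nest it under a kwargs section.
dataset_path: lmms-lab/textvqa
ocr_token: True   # append the dataset's OCR tokens to the question prompt
```

With the flag enabled, the prompt seen by the model includes the OCR reference tokens shipped with the dataset, which is what the TextVQA* columns in the tables above were produced with.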

MichalCiesiolka pushed a commit to MichalCiesiolka/lmms-eval-llmzszl that referenced this pull request Apr 3, 2025
Add model Mantis to the LMMs-Eval supported model list
dadwadw233 pushed a commit to dadwadw233/lmms-eval that referenced this pull request Apr 28, 2025
Add model Mantis to the LMMs-Eval supported model list