LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering
LAMB is a training-free method that significantly improves the long-context understanding of Mamba models through attention-guided token filtering.
Our analysis reveals that the performance degradation of state space models (SSMs) on long-context inputs primarily stems from exponential decay in hidden state memory, which can be effectively mitigated by preserving a small subset of critical tokens identified via their attention patterns. Motivated by this insight, LAMB introduces an attention-based metric for token selection, substantially enhancing the retention of critical context. Extensive evaluations demonstrate that LAMB achieves up to a 30.35% improvement over previous state-of-the-art techniques across various long-context benchmarks.
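At a high level, attention-guided token filtering can be pictured as keeping the top-k tokens that receive the most aggregate attention. The sketch below is a hypothetical illustration of such a selection metric (the function name, score aggregation, and toy data are our own, not LAMB's actual criterion, which is defined in the paper):

```python
def select_critical_tokens(attn, k):
    """Keep the k tokens that receive the highest total attention.

    attn: list of rows, one per query position; each row holds that
    query's attention weights over all tokens. This aggregation is a
    simplified stand-in for LAMB's attention-based metric.
    """
    num_tokens = len(attn[0])
    # Total attention mass each token receives across all queries.
    scores = [sum(row[j] for row in attn) for j in range(num_tokens)]
    # Indices of the k highest-scoring tokens.
    topk = sorted(range(num_tokens), key=lambda j: scores[j], reverse=True)[:k]
    # Return the kept tokens in their original sequence order.
    return sorted(topk)

# Toy example: 4 query positions attending over 6 tokens.
attn = [
    [0.1, 0.5, 0.1, 0.1, 0.1,  0.1],
    [0.2, 0.2, 0.4, 0.1, 0.05, 0.05],
    [0.1, 0.1, 0.1, 0.6, 0.05, 0.05],
    [0.3, 0.3, 0.1, 0.1, 0.1,  0.1],
]
print(select_critical_tokens(attn, k=3))  # → [0, 1, 3]
```

The surviving tokens are preserved in the SSM's context while the rest are filtered, which is what mitigates the exponential memory decay described above.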
```bash
bash ./build_env.sh
```
This script will create a dedicated Python environment and install all required packages.
This section details how to evaluate LAMB.
The `run_eval.py` script accepts the following main arguments:
| Flag | Description | Default |
|---|---|---|
| `-d`, `--device` | CUDA device ID. | `'0'` |
| `--model` | Path to the pre-trained model, or model name on Hugging Face. | `"state-spaces/mamba2-1.3b"` |
| `--config` | Path to the remapping configuration JSON file for LAMB (`None` for the vanilla model). | `None` |
| `-lt`, `--long_eval_task` | LongBench evaluation task to run. | `'no'` |
| `--helmet_config` | Helmet evaluation task to run (`None` for no Helmet task). | `None` |
| `--sample_path` | Path to the `.txt` file for perplexity calculation. Used with the perplexity task `--ppl`. | `"subseq_lambada.txt"` |
| `--ppl` | Enable the perplexity task on special input. | `False` |
To run LongBench tasks, set the `--long_eval_task` argument to `yes` (LongBench), `e` (LongBench-E), or `c` (a subset).
```bash
# vanilla
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--long_eval_task c \
--device 0
# ours
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--config ./remapping_configs/mamba2-1.3b_topk1024_pk9.json \
--long_eval_task c \
--device 0
```
To run Helmet tasks, set the `--helmet_config` argument to a config path (under the `./helmet/configs/` folder) and set `--helmet_output_dir` to specify the output path.
```bash
# vanilla
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--helmet_config ./helmet/configs/longqa_short.yaml \
--helmet_output_dir ./helmet/output \
--device 0
# ours
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--config ./remapping_configs/mamba2-1.3b_topk1024_pk9.json \
--helmet_config ./helmet/configs/longqa_short.yaml \
--helmet_output_dir ./helmet/output \
--device 0
```
To calculate perplexity on a custom text file, add `--ppl` and specify the input file with `--sample_path`.
```bash
# vanilla
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--sample_path ./subseq_lambada.txt \
--device 0 \
--ppl
# ours
CUDA_VISIBLE_DEVICES=0 python run_eval.py \
--model state-spaces/mamba2-1.3b \
--config ./remapping_configs/mamba2-1.3b_topk1024_pk9.json \
--sample_path ./subseq_lambada.txt \
--device 0 \
--ppl
```
This “research quality code” is for Non-Commercial purposes and provided by the contributors “As Is” without any express or implied warranty of any kind. The organizations (Georgia Tech or Intel) involved do not own the rights to the data sets used and do not confer any rights to it. The organizations (Georgia Tech or Intel) do not warrant or assume responsibility for the accuracy or completeness of any information, text, graphics, links or other items within the code. A thorough security review has not been performed on this code. Additionally, this repository may contain components that are out of date or contain known security vulnerabilities.
If you find our work valuable, please consider citing our paper:
@inproceedings{ye2025longmamba,
title={LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering},
author={Zhifan Ye and Zheng Wang and Kejing Xia and Jihoon Hong and Leshu Li and Lexington Whalen and Cheng Wan and Yonggan Fu and Yingyan Celine Lin and Souvik Kundu},
booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics},
year={2025}
}