This repository contains the code and data associated with our paper, "Consistent Document-Level Relation Extraction via Counterfactuals", to be presented in the Findings of EMNLP 2024.
Many datasets have been developed to train and evaluate document-level relation extraction (RE) models. Most of these are constructed using real-world data. It has been shown that RE models trained on real-world data suffer from factual biases. To evaluate and address this issue, we present CovEReD, a counterfactual data generation approach for document-level relation extraction datasets using entity replacement. We first demonstrate that models trained on factual data exhibit inconsistent behavior: while they accurately extract triples from factual data, they fail to extract the same triples after counterfactual modification. This inconsistency suggests that models trained on factual data rely on spurious signals such as specific entities and external knowledge—rather than on the input context—to extract triples. We show that by generating document-level counterfactual data with CovEReD and training models on them, consistency is maintained with minimal impact on RE performance. We release our CovEReD pipeline as well as Re-DocRED-CF, a dataset of counterfactual RE documents, to assist in evaluating and addressing inconsistency in document-level RE.
For training and evaluation, we have run CovEReD five times on all Re-DocRED splits. All five sets of train/dev/test dataset files are available through HuggingFace Datasets 🤗.
To select a specific variation (e.g. `var-01`):

```python
from datasets import load_dataset

dataset = load_dataset("amodaresi/Re-DocRED-CF", "var-01")
```
This returns:

```
DatasetDict({
    train: Dataset({
        features: ['title', 'labels', 'original_doc_id', 'vertexSet', 'sents'],
        num_rows: 2870
    })
    dev: Dataset({
        features: ['title', 'labels', 'original_doc_id', 'vertexSet', 'sents'],
        num_rows: 466
    })
    test: Dataset({
        features: ['title', 'labels', 'original_doc_id', 'vertexSet', 'sents'],
        num_rows: 453
    })
    train_mix: Dataset({
        features: ['title', 'labels', 'original_doc_id', 'vertexSet', 'sents'],
        num_rows: 5923
    })
})
```
The `train_mix` split is the original training set combined with its counterfactual counterpart (3,053 original + 2,870 counterfactual = 5,923 documents). We have also included four additional training set variations (`var-[06, 07, 08, 09]`), though they were not used in the evaluations presented in our paper.
The properties `title`, `labels`, `vertexSet`, and `sents` are structured similarly to those in the original DocRED & Re-DocRED datasets:

- `title`: Document title.
- `labels`: List of relations. Each entry indicates the relation between a head and a tail entity, with some entries also specifying evidence sentences.
- `vertexSet`: List of entity vertex sets. Each entry represents a vertex specifying all mentions of an entity by their position in the document, along with their type.
- `sents`: Tokenized sentences.
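As a quick orientation, the snippet below loads one training example and prints these fields (a minimal sketch; the exact inner structure of `labels` and `vertexSet` entries is assumed to follow the DocRED schema):

```python
from datasets import load_dataset

dataset = load_dataset("amodaresi/Re-DocRED-CF", "var-01")
ex = dataset["train"][0]

print(ex["title"])     # document title
print(ex["sents"][0])  # first sentence, as a list of tokens

# Each vertexSet entry groups all mentions of one entity; per the DocRED
# schema, a mention records its surface form, sentence id, token span,
# and entity type.
print(ex["vertexSet"][0])

# Relations between head and tail entities (indices into vertexSet).
print(ex["labels"])
```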
In counterfactually generated examples, the title includes a variation number, e.g. `AirAsia Zest ### 1`.
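If you need to separate the base title from the variation number, splitting on the `###` marker works; a minimal sketch (factual titles carry no suffix, hence `partition`):

```python
def split_title(title: str):
    """Split a Re-DocRED-CF title into (base_title, variation_number or None)."""
    base, sep, var = title.partition(" ### ")
    return base, int(var) if sep else None

print(split_title("AirAsia Zest ### 1"))  # ('AirAsia Zest', 1)
print(split_title("AirAsia Zest"))        # ('AirAsia Zest', None)
```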
The `original_doc_id` field denotes the index of the example in the original seed dataset, i.e., Re-DocRED.
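For example, a counterfactual document can be traced back to its seed document along these lines (a sketch; the local JSON path follows the cloned Re-DocRED repository layout described below, and treating `original_doc_id` as a list index is an assumption):

```python
import json
from datasets import load_dataset

cf_train = load_dataset("amodaresi/Re-DocRED-CF", "var-01")["train"]

# Assumed local path, based on the Re-DocRED repository layout.
with open("Re-DocRED/data/train_revised.json") as f:
    seed_train = json.load(f)

ex = cf_train[0]
seed_doc = seed_train[ex["original_doc_id"]]
print(ex["title"], "<-", seed_doc["title"])
```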
To generate counterfactuals, you will first need to download the seed DocRE dataset.
For instance, for Re-DocRED:
```bash
git clone https://github.com/tonytan48/Re-DocRED.git
```
After downloading the dataset, use the `code/fix_overlapping_entities.ipynb` notebook to apply the entity mention cleanup process and store the output dataset. Using the output dataset, run `code/generate_augmented_versions.py` to generate counterfactual sets. For the dev/test sets, use `code/generate_augmented_versions_val.py`, which also utilizes the training set to gather a larger entity replacement pool.
In the aforementioned files, you can edit the following hyperparameters (a sketch of how they might appear in the scripts follows this list):

- `N_CPUs`: Number of cores used for multiprocessing.
- `MAX_ALTERNATIVES`: Maximum number of alternatives to sample from for each node.
- `THR`: Affected relations threshold; an augmented document should have more than `THR` of its relations affected by the replacements.
- `T_SELFSIM` and `T_SELFSIM_UPPER`: Minimum and maximum entity mention similarity thresholds.
- `T_CONTEXTSIM`: Minimum context similarity threshold.
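For orientation, these presumably appear as constants near the top of the scripts; the values below are placeholders, not the defaults used in the paper:

```python
# Placeholder values; the actual defaults are set in
# code/generate_augmented_versions.py and may differ.
N_CPUs = 8               # cores used for multiprocessing
MAX_ALTERNATIVES = 10    # max. alternatives sampled per node
THR = 0.5                # min. share of relations affected by replacements
T_SELFSIM = 0.3          # min. entity mention similarity
T_SELFSIM_UPPER = 0.9    # max. entity mention similarity
T_CONTEXTSIM = 0.5       # min. context similarity
```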
The `code/utils` folder includes files to load Contriever, which are retrieved from https://github.com/facebookresearch/contriever.
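For reference, the same Contriever model can also be loaded directly from the HuggingFace Hub; this generic sketch is independent of the repository's own utility files:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Generic Contriever usage; code/utils may wrap the model differently.
tokenizer = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

inputs = tokenizer(["An example sentence about an entity."], return_tensors="pt")
with torch.no_grad():
    token_embs = model(**inputs).last_hidden_state
embedding = token_embs.mean(dim=1)  # mean pooling over tokens (single unpadded input)
```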
If you use the dataset, the CovEReD pipeline, or code from this repository, please cite the paper:

```bibtex
@inproceedings{modarressi-covered-2024,
    title = "Consistent Document-Level Relation Extraction via Counterfactuals",
    author = "Ali Modarressi and Abdullatif Köksal and Hinrich Schütze",
    year = "2024",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    address = "Miami, United States",
    publisher = "Association for Computational Linguistics",
}
```