AssociationSetFactory does not handle associations from multiple relations for a subject #370

@deepakunni3

When fetching associations, say gene to phenotype associations, AssociationSetFactory populates an association map for each subject.

But this code does not handle associations with multiple relations.
For example, HGNC:6764 has two sets of associations:

# objects from association with relation RO:0002200
'HP:0000238', 'HP:0100587', 'HP:0000364', 'HP:0000864', 'HP:0000130', 'HP:0006501', 'HP:0008678', 'HP:0001392', 'HP:0000047', 'HP:0001903', 'HP:0007400', 'HP:0007565', 'HP:0000582', 'HP:0005522', 'HP:0003220', 'HP:0006824', 'HP:0000252', 'HP:0001639', 'HP:0002997', 'HP:0001873', 'HP:0001199', 'HP:0001631', 'HP:0001347', 'HP:0100542', 'HP:0001636', 'HP:0001562', 'HP:0000483', 'HP:0000268', 'HP:0000813', 'HP:0002863', 'HP:0000286', 'HP:0000508', 'HP:0010293', 'HP:0002245', 'HP:0001679', 'HP:0000175', 'HP:0002827', 'HP:0004322', 'HP:0000486', 'HP:0000453', 'HP:0001871', 'HP:0002007', 'HP:0001510', 'HP:0002251', 'HP:0100760', 'HP:0001643', 'HP:0002023', 'HP:0000010', 'HP:0100026', 'HP:0000135', 'HP:0006254', 'HP:0002664', 'HP:0000324', 'HP:0007874', 'HP:0000027', 'HP:0000365', 'HP:0001882', 'HP:0001053', 'HP:0001000', 'HP:0001511', 'HP:0004209', 'HP:0002575', 'HP:0010469', 'HP:0004349', 'HP:0000520', 'HP:0000083', 'HP:0001824', 'HP:0001671', 'HP:0000340', 'HP:0000316', 'HP:0002823', 'HP:0001646', 'HP:0012745', 'HP:0000072', 'HP:0008572', 'HP:0006101', 'HP:0000568', 'HP:0000218', 'HP:0001537', 'HP:0002817', 'HP:0000478', 'HP:0012041', 'HP:0000505', 'HP:0012639', 'HP:0000504', 'HP:0006265', 'HP:0005528', 'HP:0008053', 'HP:0012210', 'HP:0100867', 'HP:0005344', 'HP:0001875', 'HP:0001263', 'HP:0001763', 'HP:0000079', 'HP:0002414', 'HP:0001770', 'HP:0000518', 'HP:0000035', 'HP:0001760', 'HP:0003022', 'HP:0000347', 'HP:0002650', 'HP:0000639', 'HP:0000492', 'HP:0001249', 'HP:0001172', 'HP:0002119', 'HP:0000028'

and

# objects from association with relation RO:0003304
'EFO:0004340', 'EFO:0004765'

When populating the association map, the earlier entry is overwritten because the keys used in the association map do not take the relation into account. This yields an incorrect representation of the fetched associations, and it affects any analysis performed downstream with this association set.
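A minimal reproduction of the overwrite, as a sketch: the records below are toy dicts that only mimic the fields used in the loop from this issue, and the object lists are abbreviated to the first two HP IDs listed above.

```python
# Toy records shaped like the ones iterated in AssociationSetFactory.
# The 'objects' lists are abbreviated; the real RO:0002200 list is much longer.
assocs = [
    {'subject': 'HGNC:6764', 'relation': 'RO:0002200',
     'objects': ['HP:0000238', 'HP:0100587']},
    {'subject': 'HGNC:6764', 'relation': 'RO:0003304',
     'objects': ['EFO:0004340', 'EFO:0004765']},
]

amap = {}
for a in assocs:
    # Keyed on subject only, so the second record replaces the first.
    amap[a['subject']] = a['objects']

print(amap)
# {'HGNC:6764': ['EFO:0004340', 'EFO:0004765']}
```

Only the objects from the last relation processed survive; the HP terms are lost.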

Block of code where this is happening:

for a in assocs:
    rel = a['relation']
    subj = a['subject']
    subject_label_map[subj] = a['subject_label']
    amap[subj] = a['objects']
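One possible direction, sketched under assumptions: `build_association_maps` is a hypothetical helper (not an existing function in the library) that accumulates objects per subject instead of assigning, and also keeps a second map keyed by `(subject, relation)` so the relation is not discarded. Whether objects from different relations should be merged or kept apart is a design decision for the maintainers; the sketch shows both.

```python
from collections import defaultdict


def build_association_maps(assocs):
    """Hypothetical helper: build maps without dropping earlier relations."""
    subject_label_map = {}
    amap = defaultdict(list)              # subject -> objects across all relations
    amap_by_relation = defaultdict(list)  # (subject, relation) -> objects

    for a in assocs:
        rel = a['relation']
        subj = a['subject']
        subject_label_map[subj] = a['subject_label']
        amap_by_relation[(subj, rel)].extend(a['objects'])
        # Accumulate rather than assign, skipping objects already recorded.
        for obj in a['objects']:
            if obj not in amap[subj]:
                amap[subj].append(obj)

    return subject_label_map, dict(amap), dict(amap_by_relation)
```

With this shape, a subject such as HGNC:6764 would keep both its RO:0002200 and RO:0003304 objects, and callers that care about a single relation could read from the `(subject, relation)` map.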

@cmungall FYI
