Contributor

@cthoyt cthoyt commented Aug 17, 2025

Closes #67

This PR automatically generates lexical matches to FAIRsharing and adds them to the xrefs field.
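For readers unfamiliar with lexical matching, here is a minimal sketch of the general idea, assuming exact matching on normalized names. It is not the code from this PR: the function names, the record identifiers, and the sample names are all hypothetical and made up for illustration.

```python
import re
import unicodedata


def normalize(label: str) -> str:
    """Lowercase, strip accents, and collapse punctuation/whitespace to single spaces."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def lexical_matches(resources: dict, fairsharing: dict) -> dict:
    """Map each resource id to a FAIRsharing CURIE when the normalized names are equal."""
    # Index FAIRsharing records by normalized name.
    index = {}
    for fs_id, fs_name in fairsharing.items():
        index.setdefault(normalize(fs_name), []).append(fs_id)

    matches = {}
    for res_id, res_name in resources.items():
        hits = index.get(normalize(res_name), [])
        if len(hits) == 1:  # skip ambiguous names rather than guess
            matches[res_id] = f"fairsharing:{hits[0]}"
    return matches


# Hypothetical records, invented for this example only.
resources = {"example-db": "Example-DB", "other": "Other Resource"}
fairsharing = {"FAIRsharing.abc123": "example db", "FAIRsharing.xyz789": "Unrelated"}

matches = lexical_matches(resources, fairsharing)
print(matches)
```

Skipping names with more than one hit keeps the output conservative; a real workflow would probably send those to manual curation instead.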

SSSOM would be a much better way to store these mappings, so this PR also has code to generate SSSOM, but Nico suggested just putting them in the ad-hoc xrefs list.
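To illustrate what storing such mappings as SSSOM could look like, here is a rough sketch that writes a small SSSOM-style TSV with the standard library. It is not the generator included in this PR. The column names are SSSOM slots, but the choice of `skos:exactMatch` as the predicate and `semapv:LexicalMatching` as the justification is an assumption, and the identifiers and labels are hypothetical.

```python
import csv
import io

# A small subset of SSSOM columns, chosen for this sketch.
SSSOM_COLUMNS = [
    "subject_id",
    "subject_label",
    "predicate_id",
    "object_id",
    "mapping_justification",
]


def to_sssom_tsv(matches: dict, labels: dict) -> str:
    """Serialize {subject_id: object_id} pairs as tab-separated SSSOM-style rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for subject_id, object_id in sorted(matches.items()):
        writer.writerow(
            {
                "subject_id": subject_id,
                "subject_label": labels.get(subject_id, ""),
                # Assumed predicate; a weaker one may fit unreviewed lexical matches.
                "predicate_id": "skos:exactMatch",
                "object_id": object_id,
                # Assumed justification for matches found by comparing names.
                "mapping_justification": "semapv:LexicalMatching",
            }
        )
    return buffer.getvalue()


# Hypothetical data, invented for this example only.
example_matches = {"example-db": "fairsharing:FAIRsharing.abc123"}
example_labels = {"example-db": "Example-DB"}

tsv = to_sssom_tsv(example_matches, example_labels)
print(tsv)
```

A complete SSSOM file would also carry metadata such as the prefix map, which this sketch leaves out.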

FYI, it took longer to figure out how to keep the YAML diff small than to actually make the contribution :/ There were still a few cases where I couldn't get the YAML serializer to produce standardized output, which suggests that there should be a standard linter and CI test for correct formatting (in a follow-up PR).

@matentzn
Contributor

@cthoyt. This is great, thanks. Two things:

  1. Can you share a suggestion of a good linter/formatter I can install in a GitHub Action to keep the YAML standardised?
  2. I don't know yet if I need to, but would you mind if I had to change the CURIE string to a URI for the xrefs field? I need some clarity from the maintainers here, because I don't see an obvious place to document the prefixes.

@sierra-moxon
Member

sierra-moxon commented Aug 18, 2025

Thanks for the edits!! awesome.

Note: there is a yamllint check in the GH Actions. I'd rather not have it auto-correct and commit the fixed file as part of the GH Action.

Instead of using xref, why don't we add "exact_mappings", or some other kind of mapping slot depending on the semantics intended, to the InformationResource class?

Generating SSSOM would be nice. If we did that based on a schema design pattern (e.g., "exact_mappings") that I could emulate in other LinkML schemas, this could be a generic tool.

@cthoyt
Contributor Author

cthoyt commented Aug 18, 2025

@matentzn

  1. Can you share a suggestion of a good linter/formatter I can install in a GitHub Action to keep the YAML standardised?

In every other project I've worked on, the approach has always been to have a custom, explicit YAML/JSON formatter script that's part of the repo.
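A minimal sketch of what such an in-repo formatter could look like, shown here for JSON using only the standard library. The function names are hypothetical. For YAML the same load-then-dump pattern would apply with a YAML library swapped in (for example PyYAML's `yaml.safe_load` and `yaml.safe_dump`), which is an assumption and not something this repo is known to use.

```python
import json


def format_json(text: str) -> str:
    """Return a canonical serialization: 2-space indent, sorted keys, trailing newline."""
    data = json.loads(text)
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def is_formatted(text: str) -> bool:
    """True when the text is already canonical; a CI test could assert this per file."""
    return format_json(text) == text


messy = '{"b": 1,   "a": [1,2]}'
clean = format_json(messy)

print(is_formatted(messy))
print(is_formatted(clean))
```

Because formatting is idempotent, contributors can run the script before committing and CI only has to check `is_formatted`, which keeps diffs small without the action committing anything itself.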

  2. I don't know yet if I need to, but would you mind if I had to change the CURIE string to a URI for the xrefs field? I need some clarity from the maintainers here, because I don't see an obvious place to document the prefixes.

I don't mind, this doesn't make a difference. Any reasonable data consumption workflow uses the bioregistry to standardize to CURIEs after the fact anyway ;)
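To show why the choice makes little difference, here is a small sketch of converting between CURIEs and URIs with a prefix map. It does not use the Bioregistry itself: the helper names are hypothetical, the URI prefix for FAIRsharing is an assumption, and the identifier is made up.

```python
# Assumed prefix map; the FAIRsharing URI prefix here is an assumption for illustration.
PREFIX_MAP = {"fairsharing": "https://fairsharing.org/"}


def expand(curie: str, prefix_map: dict = PREFIX_MAP) -> str:
    """Turn 'prefix:local_id' into a full URI using the prefix map."""
    prefix, _, local_id = curie.partition(":")
    return prefix_map[prefix] + local_id


def compress(uri: str, prefix_map: dict = PREFIX_MAP):
    """Turn a URI back into a CURIE, or return None if no URI prefix matches."""
    # Try longer URI prefixes first so the most specific one wins.
    for prefix, uri_prefix in sorted(prefix_map.items(), key=lambda kv: -len(kv[1])):
        if uri.startswith(uri_prefix):
            return f"{prefix}:{uri[len(uri_prefix):]}"
    return None


curie = "fairsharing:FAIRsharing.abc123"  # hypothetical identifier
uri = expand(curie)
print(uri)
print(compress(uri))
```

As long as the prefix map is documented somewhere, either form can be recovered from the other.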

@sierra-moxon does LinkML not have support for SSSOM? Something to consider. Otherwise, I think having an exact mapping slot would make a lot more sense (also using CURIEs)

@sierra-moxon
Member

There is an SSSOM schema element mapping generator, yes: https://github.com/linkml/linkml/blob/main/linkml/generators/sssomgen.py

I would consider these value mappings.

Development

Successfully merging this pull request may close these issues.

Add FAIRsharing references
3 participants