@jgilis jgilis commented Aug 25, 2025

Merging this PR has the same effect as merging the four suggested features (individual PRs) separately, so it is the simpler route if you agree with all proposed changes. The changes are summarized below:

Maintenance
A few functions in the current Medusa/development branch are outdated or now follow different conventions. This breaks parts of the package: for example, the first tutorial currently does not run, and the pytests cannot be run either. Merging this maintenance branch resolves this as follows:

  • replace from cobra.test import create_test_model with from cobra.io import load_model everywhere
  • replace python setup.py development with python setup.py develop
  • replace the removed pandas append method with pd.concat
  • replace rownames.contains(...) with the in operator (... in rownames)
  • In two scripts (test_ensemble.py and flux_balance.py), a small workaround was introduced. Originally, reaction objects were matched directly between ensemble.base_model and the textbook model. This sometimes leads to issues (we can discuss), which are circumvented by matching on reaction.id instead.
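The reaction.id workaround in the last bullet can be illustrated with a minimal, standard-library sketch. The PR does not state which issue object matching causes, so the failure shown here (two models holding separate instances of equivalent reactions) is only an assumed example, and the Reaction class is a hypothetical stand-in rather than the COBRApy class. In COBRApy itself, the corresponding id-based lookup would typically go through model.reactions.get_by_id.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity-based comparison between instances
class Reaction:
    """Hypothetical stand-in for a reaction object; not the COBRApy class."""
    id: str
    lower_bound: float = 0.0
    upper_bound: float = 1000.0


# Two independently built "models" that contain equivalent reactions.
base_model = [Reaction("PGI"), Reaction("PFK")]
textbook = [Reaction("PGI"), Reaction("PFK")]

# Matching on the objects themselves finds nothing, because each model
# holds its own instances.
object_matches = [rxn for rxn in textbook if rxn in base_model]

# Matching on the identifier string does not depend on object identity.
base_ids = {rxn.id for rxn in base_model}
id_matches = [rxn.id for rxn in textbook if rxn.id in base_ids]

print(object_matches)
print(id_matches)
```

Matching on the id string trades object identity for a stable key, which also keeps working after a model has been copied or reloaded from file.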

Refactor populate features base
Merging this PR will update ensemble.py, more specifically the _populate_features_base subroutine. This change considerably speeds up generating an ensemble from an existing list of GEMs: e.g., for my personal project with a list of 150 GEMs, the runtime drops from 1 min 30 s to 2 s. The speed gain can largely be attributed to pre-caching the model reaction attributes in dictionaries, thus avoiding repeated getattr calls.
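The pre-caching idea can be sketched as follows. This is not the actual _populate_features_base code: the function names, the attribute list, and the SimpleNamespace stand-ins for reactions are assumptions made for illustration, and the sketch assumes every model contains the same reaction ids.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the reactions of three models.
models = [
    [SimpleNamespace(id="R1", lower_bound=-10.0, upper_bound=10.0)],
    [SimpleNamespace(id="R1", lower_bound=0.0, upper_bound=10.0)],
    [SimpleNamespace(id="R1", lower_bound=-10.0, upper_bound=10.0)],
]
ATTRIBUTES = ("lower_bound", "upper_bound")  # assumed attributes of interest


def cache_attributes(model):
    """Read each attribute once and store it as {reaction_id: {attribute: value}}."""
    return {rxn.id: {attr: getattr(rxn, attr) for attr in ATTRIBUTES} for rxn in model}


def variable_features(models):
    """Return {(reaction_id, attribute): [value per model]} for attributes that vary."""
    caches = [cache_attributes(m) for m in models]  # one getattr pass per model
    features = {}
    for rxn_id in caches[0]:  # assumes all models share the same reaction ids
        for attr in ATTRIBUTES:
            values = [cache[rxn_id][attr] for cache in caches]  # plain dict lookups
            if len(set(values)) > 1:
                features[(rxn_id, attr)] = values
    return features


features = variable_features(models)
print(features)
```

Each attribute is read from its object exactly once per model. All later comparisons across models run on plain dictionaries.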

Fast bounds feature
One aspect hampering the use of Medusa for certain applications is the need to construct a list of GEMs prior to creating and analyzing the ensemble. Indeed, in many situations it is computationally less expensive to create and analyze each member on the fly. For one specific situation, this feature allows the ensemble to be populated directly, without first generating each individual member: the case where members differ only w.r.t. the lower and/or upper bounds of certain reactions. Note that because bounds can be set to zero, reaction reversibility and presence can also be affected. Changes made:

  • boundsEnsemble.py: workhorse
  • test_boundsEnsemble.py: test validity
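As a rough illustration of the idea (not the boundsEnsemble.py implementation, whose interface is not shown here), members can be represented as bound overrides on a shared base model and materialized one at a time. All names and values below are hypothetical.

```python
# Bounds of the shared base model: {reaction_id: (lower_bound, upper_bound)}.
base_bounds = {
    "EX_glc": (-10.0, 1000.0),
    "PFK": (0.0, 1000.0),
    "PGI": (-1000.0, 1000.0),
}

# Each member stores only the bounds that differ from the base model.
member_overrides = {
    "member_0": {"EX_glc": (-5.0, 1000.0)},
    "member_1": {"PFK": (0.0, 0.0)},     # both bounds zero: reaction effectively absent
    "member_2": {"PGI": (0.0, 1000.0)},  # lower bound zero: reaction made irreversible
}


def member_bounds(member_id):
    """Build one member's full bounds on demand by layering overrides on the base."""
    bounds = dict(base_bounds)  # shallow copy, so the base is never modified
    bounds.update(member_overrides[member_id])
    return bounds


def iter_members():
    """Yield (member_id, bounds) one at a time, so no list of full models is kept."""
    for member_id in member_overrides:
        yield member_id, member_bounds(member_id)


# Example analysis: which reactions are blocked in each member?
blocked = {
    member_id: [rxn for rxn, b in bounds.items() if b == (0.0, 0.0)]
    for member_id, bounds in iter_members()
}
print(blocked)
```

Only the overrides are stored per member, so memory grows with the number of differing bounds rather than with the number of full models.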

Happy to discuss if functions like these would be a good fit for Medusa, or if they should be ported elsewhere.

Fast BOF feature
One aspect hampering the use of Medusa for certain applications is the need to construct a list of GEMs prior to creating and analyzing the ensemble. Indeed, in many situations it is computationally less expensive to create and analyze each member on the fly. For one specific situation, this feature allows the ensemble to be populated directly, without first generating each individual member: the case where members differ only w.r.t. their biomass objective function (BOF) definition. Note that because BOF coefficients can be set to zero, metabolites can be omitted from or admitted to the BOF. Changes made:

  • bofEnsemble.py: workhorse
  • test_boundsEnsemble.py: test validity
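The BOF case can be sketched in the same spirit. This is an assumed illustration rather than the bofEnsemble.py code: the metabolite ids and coefficients are made up, and member_bof is a hypothetical helper.

```python
# Base biomass objective function: {metabolite_id: stoichiometric coefficient}.
# Negative coefficients mark metabolites consumed by the biomass reaction.
base_bof = {"atp_c": -59.8, "ala_c": -0.5, "gly_c": -0.6}  # illustrative values

# Each member stores only the coefficients that differ from the base definition.
member_coefficients = [
    {"ala_c": -0.4},
    {"gly_c": 0.0},    # zero coefficient: glycine is omitted from this member's BOF
    {"trp_c": -0.05},  # new nonzero coefficient: tryptophan is admitted
]


def member_bof(overrides):
    """Merge overrides into the base BOF and drop metabolites with a zero coefficient."""
    merged = {**base_bof, **overrides}
    return {met: coef for met, coef in merged.items() if coef != 0.0}


bofs = [member_bof(overrides) for overrides in member_coefficients]
for bof in bofs:
    print(bof)
```

Treating a zero coefficient as "not part of the BOF" lets a single override mechanism both reweight and add or remove biomass components.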

Happy to discuss if functions like these would be a good fit for Medusa, or if they should be ported elsewhere.
