Member

@SandyRogers SandyRogers commented May 16, 2025

This PR:

  • Adds a VIRify pipeline subflow, using the ASA pipeline's "downstream samplesheet" output
  • Adds the VIRify GFF to the Analysis's downloads
  • Refactors the import process of ASA, MAP and VIRify using a more schema-based solution.

There is a bit of work to do:

  • Partial states (e.g. ASA passes but VIRify fails)
  • Ingestion of more files from MAP and VIRify (as download files)

@SandyRogers SandyRogers changed the title from "Ass virify followup to ASA pipeline" to "VIRify followup to ASA pipeline" May 18, 2025
@mberacochea mberacochea self-assigned this May 23, 2025
SandyRogers and others added 19 commits June 4, 2025 21:36
* fix dev container build/deps

* prefect+pyslurm upgrade fixes

* updates for py3.12 / prefect 3.4
Often library_strategy metadata is wrong or missing in ENA.
These policies allow the Prefect user to select an option for
including, overriding, or trusting the provided metadata.
Otherwise, runs that are potentially AMPLICON are fetched from ENA
but not included in samplesheets.
Also added a CLAUDE.md file to help the robots.
Member Author

@SandyRogers SandyRogers left a comment

@mberacochea thanks for tackling the big schema question. I like the compositional schema approach a lot, and also the attempt at bringing DownloadFile and the file-rules File node closer. I think we should bring them completely together, and I left some other ideas about it scattered through the review too. Happy to talk in person if easier - ta!

Comment on lines +195 to +203
elif isinstance(file_item, tuple) and len(file_item) == 2:
    # If it's a tuple of (filename, rules), create a File with those rules
    filename, file_rules = file_item
    directory.add_file(filename, rules=file_rules)
elif isinstance(file_item, str):
    # If it's a string, create a File with default rules
    directory.add_file(file_item)
else:
    raise ValueError(f"Unsupported file specification: {file_item}")
Member Author

Neat, I like the default handling if passing strings.

However:

  • a pathlib.Path object is as likely as a str, so it probably makes sense to handle that the same way?
  • I've probably had too much pydantic Kool-Aid, but similar to the create_file factory, I would argue that Directory objects should be created directly rather than wrapped like this. In pydantic these conditions can be handled with pre-validators. Doing so also means the ValueError would happen automatically.
    (Unless I am missing something again, obviously)

Member

Agreed, I've deleted the factory method
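
A minimal sketch of the pre-validator idea discussed in this thread, assuming pydantic v2. `FileSpec` and `DirectorySpec` are hypothetical stand-ins, not the actual file-rules `File` and `Directory` nodes, and rules are reduced to plain strings for brevity. It shows `str`, `pathlib.Path` and `(filename, rules)` tuples being normalised before validation, with anything else rejected by pydantic itself.

```python
from pathlib import Path
from typing import Any, List

from pydantic import BaseModel, Field, field_validator


class FileSpec(BaseModel):
    """Hypothetical stand-in for a file node; not the real file-rules File."""

    path: Path
    rules: List[str] = Field(default_factory=list)  # rule names only, for brevity


class DirectorySpec(BaseModel):
    """Hypothetical stand-in for a directory node, constructed directly."""

    path: Path
    files: List[FileSpec] = Field(default_factory=list)

    @field_validator("files", mode="before")
    @classmethod
    def coerce_file_items(cls, value: Any) -> Any:
        # Runs before pydantic's own validation, so raw inputs can be normalised.
        if not isinstance(value, list):
            return value  # let pydantic report the type error
        coerced = []
        for item in value:
            if isinstance(item, (str, Path)):
                # Bare filename: a file with default (empty) rules
                coerced.append({"path": item})
            elif isinstance(item, tuple) and len(item) == 2:
                # (filename, rules) pair
                filename, rules = item
                coerced.append({"path": filename, "rules": rules})
            else:
                # FileSpec instances and dicts pass through; anything else
                # is rejected by pydantic with a ValidationError.
                coerced.append(item)
        return coerced


directory = DirectorySpec(
    path="results",
    files=["a.tsv", Path("b.gff"), ("c.html", ["exists"])],
)
```

In pydantic v2 `ValidationError` is a subclass of `ValueError`, so an unsupported item such as `files=[42]` fails without an explicit `raise`.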


if not allow_non_exist:
    glob_rules.append(GlobOfTaxonomyFolderHasHtmlAndKronaTxtRule)
class AssemblyDirectorySchema(BaseModel):
Member Author

Nice. I guess adding explicit subdirectory support to the file-rules Directory node would cover this, right?
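
One way the explicit subdirectory support suggested here might look, again assuming pydantic v2. `FileNode`, `DirectoryNode` and `iter_paths` are hypothetical names for illustration, not the existing file-rules API, and the directory and file names in the usage are made up.

```python
from pathlib import Path
from typing import Iterator, List

from pydantic import BaseModel, Field


class FileNode(BaseModel):
    """Hypothetical file node identified by its name only."""

    name: str


class DirectoryNode(BaseModel):
    """Hypothetical directory node that can nest other directories."""

    name: str
    files: List[FileNode] = Field(default_factory=list)
    # Self-reference by string; pydantic v2 resolves it once the class exists.
    subdirectories: List["DirectoryNode"] = Field(default_factory=list)

    def iter_paths(self, parent: Path = Path(".")) -> Iterator[Path]:
        """Yield the relative path of every file in this tree, depth first."""
        here = parent / self.name
        for file_node in self.files:
            yield here / file_node.name
        for sub in self.subdirectories:
            yield from sub.iter_paths(here)


# Made-up layout, for illustration only
assembly_dir = DirectoryNode(
    name="assembly",
    files=[FileNode(name="summary.tsv")],
    subdirectories=[
        DirectoryNode(
            name="taxonomy",
            files=[FileNode(name="krona.html"), FileNode(name="krona.txt")],
        )
    ],
)
all_paths = list(assembly_dir.iter_paths())
```

With nesting in the node itself, a per-pipeline schema class would mostly reduce to declaring which subdirectories and files are expected.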

Comment on lines +86 to +89
except FileExistsError:
    logger.warning(
        f"Download with alias {download.alias} already exists for analysis {analysis.accession}"
    )
Member Author

should this only be a warning?

@@ -0,0 +1,358 @@
import pytest
Member Author

file naming: why ztest? this probably breaks pytest discovery

Member

yeah, I've disabled that one as it's currently broken
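
For context on the discovery point in this thread: by default pytest collects test modules matching `test_*.py` or `*_test.py` (the `python_files` setting). The sketch below approximates that matching with `fnmatch`. The filenames are hypothetical, since the actual file names are not shown in this conversation.

```python
from fnmatch import fnmatch

# pytest's default `python_files` patterns
DEFAULT_PYTHON_FILES = ("test_*.py", "*_test.py")


def is_collected_by_default(filename: str) -> bool:
    """Approximate whether pytest's default patterns would pick up this module."""
    return any(fnmatch(filename, pattern) for pattern in DEFAULT_PYTHON_FILES)


# Hypothetical filenames, for illustration only
normal_name = is_collected_by_default("test_virify_flow.py")
prefixed_name = is_collected_by_default("ztest_virify_flow.py")
```

Here `normal_name` is `True` and `prefixed_name` is `False`, which is why a `ztest` prefix takes a module out of default discovery. If the goal is to park a broken test, a `pytest.mark.skip` marker with a reason keeps it visible in test reports instead.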

@@ -0,0 +1,282 @@
import pytest
Member Author

Same q: why ztest in file name?

@mberacochea mberacochea marked this pull request as ready for review September 12, 2025 15:23
3 participants