This project will add a module to the nf-core/sarek pipeline that filters out genomic contaminants. This is analogous to what the nf-core/rnaseq pipeline currently does with BBSplit. We will focus on using xengsort for this. The steps are:

  • finishing off the xengsort module
  • making a subworkflow to run xengsort
  • integrating this subworkflow into sarek

All skill levels welcome.

Goals

Add filtering of genomic contaminants as a feature of the nf-core/sarek pipeline.

category
pipelines