nf-cmgg/germline pipeline parameters
A Nextflow pipeline for calling and annotating small germline variants from short DNA reads for WES and WGS data.
Input/output options
Define where the pipeline should find input data and save output data.
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
input |
Path to comma-separated file containing information about the samples in the experiment. You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with samples and a header row. See usage docs. |
string |
True | ||
outdir |
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. | string |
True | ||
watchdir |
A folder to watch for the creation of files that start with watch: in the samplesheet. |
string |
|||
email |
Email address for completion summary. Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run. |
string |
|||
ped |
Path to a pedigree file for all samples in the run. All relational data will be fetched from this file. | string |
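
The input/output parameters can also be collected in a JSON file and handed to Nextflow with its `-params-file` option instead of being typed on the command line. The following is a minimal sketch, not part of the pipeline: the helper name `build_io_params` and all paths are hypothetical, and it assumes that `input` and `outdir` are the required parameters of this section.

```python
import json

# Assumption: `input` and `outdir` are the required parameters of this section.
REQUIRED_IO_PARAMS = ("input", "outdir")


def build_io_params(input_csv, outdir, ped=None, email=None):
    """Collect input/output parameters, dropping optional ones left unset."""
    params = {"input": input_csv, "outdir": outdir, "ped": ped, "email": email}
    missing = [name for name in REQUIRED_IO_PARAMS if not params[name]]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    return {name: value for name, value in params.items() if value is not None}


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    params = build_io_params("samplesheet.csv", "/data/results", ped="family.ped")
    print(json.dumps(params, indent=2))
```

Saving the printed JSON as, for example, `params.json` and running `nextflow run nf-cmgg/germline -params-file params.json` would then supply these values to the pipeline.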
Reference genome options
Reference genome related files and options required for the workflow.
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
genome |
Reference genome build. Used to fetch the right reference files. Requires a Genome Reference Consortium reference ID (e.g. GRCh38). |
string |
GRCh38 | ||
fasta |
Path to FASTA genome file. This parameter is mandatory if --genome is not specified. The path to the reference genome FASTA. |
string |
True | ||
fai |
Path to FASTA genome index file. | string |
|||
dict |
Path to the sequence dictionary generated from the FASTA reference. This is only used when haplotypecaller is one of the specified callers. |
string |
|||
strtablefile |
Path to the STR table file generated from the FASTA reference. This is only used when --dragstr has been given. |
string |
|||
sdf |
Path to the SDF folder generated from the reference FASTA file. This is only required when using --validate. |
string |
|||
elfasta |
Path to the ELFASTA genome file. This is used when elprep is part of the callers and will be automatically generated when missing. |
string |
|||
elsites |
Path to the elsites file. This is used when elprep is part of the callers. |
string |
|||
genomes |
Object for genomes | object |
True | ||
genomes_base |
Directory base for CMGG reference store (used when --genomes_ignore false is specified) |
string |
/references/ | ||
cmgg_config_base |
The base directory for the local config files | string |
/conf/ | True | |
genomes_ignore |
Do not load the local references from the path specified with --genomes_base |
boolean |
True | ||
igenomes_base |
Directory / URL base for iGenomes references. | string |
True | ||
igenomes_ignore |
Do not load the iGenomes reference config. Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config. |
boolean |
True |
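
Several reference files above are only used under specific settings. The sketch below restates those conditions as a small lookup; it is an illustration of the descriptions, not the pipeline's own logic. The function name is hypothetical, `fai` is assumed to be always relevant because no condition is documented for it, and `elprep` is treated as a possible caller name only because the `elfasta` and `elsites` descriptions mention it.

```python
def relevant_reference_params(params):
    """Return the reference parameters that the descriptions say are in use."""
    # The documented default for `callers` is "haplotypecaller".
    callers = [
        caller.strip()
        for caller in params.get("callers", "haplotypecaller").split(",")
        if caller.strip()
    ]
    # Assumption: `fasta` and `fai` are relevant for every run.
    relevant = ["fasta", "fai"]
    if "haplotypecaller" in callers:
        relevant.append("dict")          # sequence dictionary
    if params.get("dragstr"):
        relevant.append("strtablefile")  # STR table for DragSTR models
    if params.get("validate"):
        relevant.append("sdf")           # SDF folder, required for validation
    if "elprep" in callers:
        relevant += ["elfasta", "elsites"]
    return relevant


if __name__ == "__main__":
    print(relevant_reference_params({"callers": "vardict", "validate": True}))
```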
Pipeline specific parameters
Parameters that define how the pipeline works
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
scatter_count |
The amount of scattering that should happen per sample. Increase this number to increase the pipeline run speed, at the cost of more IO and disk space. The actual scatter count can differ from this value in some cases (especially with smaller files). This affects HaplotypeCaller, GenomicsDBImport and GenotypeGVCFs. |
integer |
40 | ||
merge_distance |
The merge distance for family BED files. Increase this parameter if GenomicsDBImport is running slowly. This defines the maximum distance between intervals that should be merged. The fewer intervals GenomicsDBImport gets, the faster it will run. |
integer |
100000 | ||
dragstr |
Create DragSTR models to be used with HaplotypeCaller. This can currently only run single-core per sample, so the process is very slow and yields only very small improvements to the analysis. |
boolean |
|||
validate |
Validate the found variants | boolean |
|||
filter |
Filter the found variants. | boolean |
|||
annotate |
Annotate the found variants using Ensembl VEP. | boolean |
|||
add_ped |
Add PED INFO header lines to the final VCFs. | boolean |
|||
gemini |
Create a Gemini database from the final VCFs. | boolean |
|||
mosdepth_slow |
Don't run mosdepth in fast mode. This is advised if you need exact coverage BED files as output. |
boolean |
|||
roi |
Path to the default ROI (regions of interest) BED file to be used for WES analysis. This will be used for all samples that do not have a specific ROI file supplied to them through the samplesheet. Don't supply an ROI file to run the analysis as WGS. |
string |
|||
dbsnp |
Path to the dbSNP VCF file. This will be used to set the variant IDs. | string |
|||
dbsnp_tbi |
Path to the index of the dbSNP VCF file. | string |
|||
somalier_sites |
Path to the VCF file with sites for Somalier to use. | string |
https://github.com/brentp/somalier/files/3412456/sites.hg38.vcf.gz | ||
only_call |
Only call the variants without doing any post-processing. | boolean |
|||
only_merge |
Only run the pipeline until the creation of the genomicsdbs and output them. | boolean |
|||
output_genomicsdb |
Output the genomicsDB together with the joint-genotyped VCF. | boolean |
|||
callers |
A comma-delimited string of the available callers. Current options are: haplotypecaller and vardict. |
string |
haplotypecaller | ||
vardict_min_af |
The minimum allele frequency for VarDict when no vardict_min_af is supplied in the samplesheet. |
number |
0.1 | ||
normalize |
Normalize the variants in the final VCFs. | boolean |
|||
only_pass |
Filter out all variants that don't have the PASS filter for vardict. This only works when --filter is also given. |
boolean |
|||
keep_alt_contigs |
Keep all additional contigs for calling instead of filtering them out beforehand. | boolean |
|||
updio |
Run UPDio analysis on the final VCFs. | boolean |
|||
updio_common_cnvs |
A TSV file containing common CNVs to be used by UPDio. | string |
|||
automap |
Run AutoMap analysis on the final VCFs. | boolean |
|||
automap_repeats |
BED file with repeat regions in the genome. This file will be automatically generated for hg38/GRCh38 and hg19/GRCh37 when this parameter has not been given. |
string |
|||
automap_panel |
TXT file with gene panel regions to be used by AutoMap. By default the CMGG gene panel list will be used. |
string |
|||
automap_panel_name |
The panel name of the panel given with --automap_panel. | string |
cmgg_bio | ||
hc_phasing |
Perform phasing with HaplotypeCaller. | boolean |
|||
min_callable_coverage |
The lowest callable coverage to determine callable regions. | integer |
5 | ||
unique_out |
Don't change this value | string |
True |
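
The `merge_distance` description says that intervals closer together than the given distance are merged so that GenomicsDBImport receives fewer intervals. The sketch below illustrates that idea on plain BED-style tuples. It is not the pipeline's implementation: the function name is hypothetical, and measuring the gap as `next start - previous end` on the same contig is an assumption.

```python
def merge_intervals(intervals, merge_distance=100000):
    """Merge (chrom, start, end) intervals whose gap is at most merge_distance."""
    merged = []
    for chrom, start, end in sorted(intervals):
        previous = merged[-1] if merged else None
        # Assumption: the gap is the distance from the previous end to this start.
        if previous and previous[0] == chrom and start - previous[2] <= merge_distance:
            previous[2] = max(previous[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(interval) for interval in merged]


if __name__ == "__main__":
    # Hypothetical regions, for illustration only.
    regions = [
        ("chr1", 100, 200),
        ("chr1", 50300, 50400),
        ("chr1", 500000, 500100),
        ("chr2", 10, 20),
    ]
    for region in merge_intervals(regions):
        print(*region, sep="\t")
```

With the default of 100000 the first two `chr1` regions collapse into one interval while the third stays separate; a larger value would merge all three, trading fewer intervals for more bases covered.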
Institutional config options
Parameters used to describe centralised config profiles. These should not be edited.
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
custom_config_version |
Git commit id for Institutional configs. | string |
master | True | |
custom_config_base |
Base directory for Institutional configs. If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter. |
string |
https://raw.githubusercontent.com/nf-core/configs/master | True | |
config_profile_name |
Institutional config name. | string |
True | ||
config_profile_description |
Institutional config description. | string |
True | ||
config_profile_contact |
Institutional config contact information. | string |
True | ||
config_profile_url |
Institutional config URL link. | string |
True |
Generic options
Less common options for the pipeline, typically set in a config file.
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
version |
Display version and exit. | boolean |
|||
publish_dir_mode |
Method used to save pipeline results to output directory. The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details. |
string |
copy | ||
email_on_fail |
Email address for completion summary, only when pipeline fails. An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully. |
string |
True | ||
plaintext_email |
Send plain-text email instead of HTML. | boolean |
True | ||
max_multiqc_email_size |
File size limit when attaching MultiQC reports to summary emails. | string |
25.MB | True | |
monochrome_logs |
Do not use coloured log outputs. | boolean |
True | ||
hook_url |
Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported. |
string |
|||
multiqc_title |
MultiQC report title. Printed as page header, used for filename if not otherwise specified. | string |
|||
multiqc_config |
Custom config file to supply to MultiQC. | string |
|||
multiqc_logo |
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file. | string |
|||
multiqc_methods_description |
Custom MultiQC yaml file containing HTML including a methods description. | string |
|||
validate_params |
Whether to validate parameters against the schema at runtime. | boolean |
True | True | |
pipelines_testdata_base_path |
Base URL or local path to location of pipeline test dataset files | string |
https://raw.githubusercontent.com/nf-core/test-datasets/ | True |
Annotation parameters
Parameters to configure Ensembl VEP and VCFanno
Parameter | Description | Type | Default | Required | Hidden |
---|---|---|---|---|---|
vep_chunk_size |
The number of sites per split VCF used as input to VEP. | integer |
50000 | ||
species |
The species of the samples. Must be lower case and have underscores as spaces. |
string |
homo_sapiens | ||
vep_merged |
Specify if the VEP cache is a merged cache. | boolean |
True | ||
vep_cache |
The path to the VEP cache. | string |
|||
vep_dbnsfp |
Use the dbNSFP plugin with Ensembl VEP. The '--dbnsfp' and '--dbnsfp_tbi' parameters need to be specified when using this parameter. |
boolean |
|||
vep_spliceai |
Use the SpliceAI plugin with Ensembl VEP. The '--spliceai_indel', '--spliceai_indel_tbi', '--spliceai_snv' and '--spliceai_snv_tbi' parameters need to be specified when using this parameter. |
boolean |
|||
vep_spliceregion |
Use the SpliceRegion plugin with Ensembl VEP. | boolean |
|||
vep_mastermind |
Use the Mastermind plugin with Ensembl VEP. The '--mastermind' and '--mastermind_tbi' parameters need to be specified when using this parameter. |
boolean |
|||
vep_maxentscan |
Use the MaxEntScan plugin with Ensembl VEP. The '--maxentscan' parameter needs to be specified when using this parameter. |
boolean |
|||
vep_eog |
Use the custom EOG annotation with Ensembl VEP. The '--eog' and '--eog_tbi' parameters need to be specified when using this parameter. |
boolean |
|||
vep_alphamissense |
Use the AlphaMissense plugin with Ensembl VEP. The '--alphamissense' and '--alphamissense_tbi' parameters need to be specified when using this parameter. |
boolean |
|||
vep_version |
The version of the VEP tool to be used. | number |
105.0 | ||
vep_cache_version |
The version of the VEP cache to be used. | integer |
105 | ||
dbnsfp |
Path to the dbNSFP file. | string |
|||
dbnsfp_tbi |
Path to the index of the dbNSFP file. | string |
|||
spliceai_indel |
Path to the VCF containing indels for spliceAI. | string |
|||
spliceai_indel_tbi |
Path to the index of the VCF containing indels for spliceAI. | string |
|||
spliceai_snv |
Path to the VCF containing SNVs for spliceAI. | string |
|||
spliceai_snv_tbi |
Path to the index of the VCF containing SNVs for spliceAI. | string |
|||
mastermind |
Path to the VCF for Mastermind. | string |
|||
mastermind_tbi |
Path to the index of the VCF for Mastermind. | string |
|||
alphamissense |
Path to the TSV for AlphaMissense. | string |
|||
alphamissense_tbi |
Path to the index of the TSV for AlphaMissense. | string |
|||
eog |
Path to the VCF containing EOG annotations. | string |
|||
eog_tbi |
Path to the index of the VCF containing EOG annotations. | string |
|||
vcfanno |
Run annotations with vcfanno. | boolean |
|||
vcfanno_config |
The path to the VCFanno config TOML. | string |
|||
vcfanno_lua |
The path to a Lua script to be used in VCFanno. | string |
|||
vcfanno_resources |
A semicolon-separated list of resource files for VCFanno. Please also supply their indices using this parameter. | string |
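
Each VEP plugin switch above names the file parameters that must accompany it, and `vcfanno_resources` packs several paths into one semicolon-separated string. The sketch below shows one way to check both before launching a run. It is an illustration built from the descriptions in this table, not code from the pipeline; the function names and example paths are hypothetical.

```python
# Companion parameters per plugin switch, as listed in the descriptions above.
PLUGIN_REQUIREMENTS = {
    "vep_dbnsfp": ["dbnsfp", "dbnsfp_tbi"],
    "vep_spliceai": [
        "spliceai_indel", "spliceai_indel_tbi",
        "spliceai_snv", "spliceai_snv_tbi",
    ],
    "vep_mastermind": ["mastermind", "mastermind_tbi"],
    "vep_maxentscan": ["maxentscan"],
    "vep_eog": ["eog", "eog_tbi"],
    "vep_alphamissense": ["alphamissense", "alphamissense_tbi"],
}


def missing_annotation_params(params):
    """Map each enabled plugin switch to the companion parameters left unset."""
    missing = {}
    for flag, required in PLUGIN_REQUIREMENTS.items():
        if params.get(flag):
            absent = [name for name in required if not params.get(name)]
            if absent:
                missing[flag] = absent
    return missing


def split_vcfanno_resources(value):
    """Split the semicolon-separated resource string into individual paths."""
    return [item.strip() for item in value.split(";") if item.strip()]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    params = {"vep_dbnsfp": True, "dbnsfp": "/refs/dbnsfp.gz"}
    print(missing_annotation_params(params))
    print(split_vcfanno_resources("/refs/a.vcf.gz;/refs/a.vcf.gz.tbi"))
```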