
Using Mendelian Inheritance Expectations to Assess Models

If you have trio-binned test genomes, TrioTrain can calculate the Mendelian Inheritance Error (MIE) rate using rtg-tools mendelian. First, however, you must create a Sequence Data File (SDF) for each reference genome. The SDF must be placed in a sub-directory called rtg_tools/ inside the directory that contains the reference genome. Additional details about rtg-tools can be found on GitHub or in its PDF documentation.
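For orientation, the underlying rtg-tools call can be sketched as below. This is a minimal illustration and not how TrioTrain itself invokes the tool: the function name check_trio and all file paths are hypothetical, and it assumes a multi-sample trio VCF plus a PED file describing the family. The -i, -t, and --pedigree options are those listed in the rtg-tools documentation for the mendelian command.

```shell
#!/bin/bash
# Sketch only: check_trio and every path passed to it are hypothetical.
# Usage: check_trio TRIO_VCF PED_FILE SDF_DIR
check_trio() {
    local trio_vcf="$1"   # multi-sample VCF containing child, father, and mother
    local ped_file="$2"   # PED file describing the trio relationships
    local sdf_dir="$3"    # SDF directory previously created with 'rtg format'

    # The SDF must exist before 'rtg mendelian' can be run
    if [ ! -d "$sdf_dir" ]; then
        echo "ERROR: reference SDF not found at $sdf_dir" >&2
        return 1
    fi

    # Summary statistics on inheritance consistency are printed to the terminal
    rtg mendelian -i "$trio_vcf" -t "$sdf_dir" --pedigree "$ped_file"
}

# Example call (hypothetical paths):
#   check_trio ./trio.vcf.gz ./trio.ped ./reference/rtg_tools
```

Checking for the SDF directory up front gives a clearer error message than letting the rtg command fail on a missing reference.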

Create a Reference Sequence Data File

Warning

This step is specific to the Human reference genome GRCh38. Cattle-specific input files are packaged with TrioTrain. If you are working with a new species, you will need to create this file for your reference genome.

After completing the tutorial walk-through, create the Human reference SDF by running the following at the command line:

source ./scripts/start_conda.sh     # Ensure the previously built conda env is active
bash scripts/setup/setup_rtg_tools.sh

For other species, use the following template:

Example | Creating the SDF
./scripts/setup/setup_rtg_tools.sh
#!/bin/bash
# scripts/setup/build_rtg_tools.sh

echo -e "=== scripts/setup/build_rtg_tools.sh > start $(date)"

##======= Create RTG-TOOLS SDF ======================================##
# required for using rtg-tools 'mendelian'
if [ ! -f ./triotrain/variant_calling/data/GIAB/reference/rtg_tools/reference.txt ]; then
    rtg format -o ./triotrain/variant_calling/data/GIAB/reference/rtg_tools/ ./triotrain/variant_calling/data/GIAB/reference/GRCh38_no_alt_analysis_set.fasta
else
    echo "$(date '+%Y-%m-%d %H:%M:%S') INFO: RTG-TOOLS SDF already exists... SKIPPING AHEAD"
fi

echo -e "=== scripts/setup/build_rtg_tools.sh > end $(date)"
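To adapt the template above to another species, the hard-coded GRCh38 paths can be replaced with a parameter. The following is a minimal sketch under stated assumptions: the function name build_sdf and the example path are hypothetical, and it skips the build when the rtg_tools/ directory already exists rather than checking for reference.txt as the template does. It follows the layout described earlier, placing the SDF in rtg_tools/ next to the reference FASTA.

```shell
#!/bin/bash
# Sketch only: build_sdf and the example path below are hypothetical.
# Usage: build_sdf REFERENCE_FASTA
build_sdf() {
    local ref_fasta="$1"
    local sdf_dir

    # Place the SDF in rtg_tools/ beside the reference genome
    sdf_dir="$(dirname "$ref_fasta")/rtg_tools"

    if [ -d "$sdf_dir" ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') INFO: RTG-TOOLS SDF already exists... SKIPPING AHEAD"
    else
        # 'rtg format' converts the FASTA into the SDF directory
        rtg format -o "$sdf_dir" "$ref_fasta"
    fi
}

# Example call (hypothetical path):
#   build_sdf ./path/to/your_species/reference.fasta
```

Deriving the output directory from the FASTA location keeps the SDF where the document says it should be, so only the reference path needs to change per species.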

Last update: July 28, 2023