
Getting Started

Installation

uSort-M requires Python 3.8 or later.
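If you are unsure which interpreter your environment uses, a short check like the one below can confirm it before you install. This is a minimal sketch: the helper name `python_is_supported` is hypothetical and not part of uSort-M. Only the 3.8 minimum comes from the requirement above.

```python
import sys

# Minimum version stated in the installation requirement above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = sys.version.split()[0]
    if python_is_supported():
        print("Python version OK:", found)
    else:
        print("Python 3.8 or later is required; found", found)
```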

# Install from source
git clone https://github.com/FordyceLab/usortm.git
cd usortm
pip install -e ".[all]"

Optional Dependencies

The base installation includes cost estimation and CLI tools. Optional dependencies provide additional functionality:

# Visualization tools (matplotlib, bokeh)
pip install -e ".[viz]"

# Demultiplexing tools (biopython, pysam)
pip install -e ".[demux]"

# Development (pytest)
pip install -e ".[dev]"

# Everything
pip install -e ".[all]"

External Tools (for demultiplexing)

The usortm demux command requires these tools to be installed separately:

  • dorado (1.3+) — Barcode demultiplexing (GitHub releases)
  • minimap2 (2.20+) — Reference alignment (brew install minimap2 or conda install minimap2)
  • samtools (1.16+) — BAM processing & consensus (brew install samtools or conda install samtools)

usortm auto-discovers dorado in common install locations. To point it at a specific binary, set the DORADO_PATH, MINIMAP2_PATH, or SAMTOOLS_PATH environment variable.
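To see which binaries would be picked up on your machine, you can run a small pre-flight check. The sketch below is not usortm's own discovery logic. It assumes that an environment variable, when set, takes priority over a plain PATH lookup, and it does not check the "common locations" usortm searches for dorado. The helper names `locate_tool` and `report_tools` are hypothetical.

```python
import os
import shutil

# Environment variable names come from the text above. The lookup order
# (variable first, then PATH) is an assumption, not usortm's documented logic.
TOOL_ENV_VARS = {
    "dorado": "DORADO_PATH",
    "minimap2": "MINIMAP2_PATH",
    "samtools": "SAMTOOLS_PATH",
}


def locate_tool(name, environ=None):
    """Return the configured or discovered path for a tool, or None."""
    environ = os.environ if environ is None else environ
    override = environ.get(TOOL_ENV_VARS[name])
    if override:
        return override
    # Fall back to searching the directories on PATH.
    return shutil.which(name)


def report_tools(environ=None):
    """Map each external tool name to its resolved path (or None)."""
    return {name: locate_tool(name, environ) for name in TOOL_ENV_VARS}


if __name__ == "__main__":
    for name, path in report_tools().items():
        print(f"{name}: {path or 'not found'}")
```

The check only reports locations. It does not verify the minimum versions listed above.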

Quick Start

1. Estimate Costs

Before starting, get a quick estimate of costs and timeline:

usortm estimate --library-size 500 --seq-length 300

This displays:

  • Cost breakdown (synthesis, cloning, sorting, barcoding, sequencing)
  • Comparison with traditional gene synthesis
  • Effort estimates (plates, time)
  • Projected timeline

2. Plan Your Experiment

Create a project from your variant list:

usortm plan variants.csv --output my_project/

In this example, each row of variants.csv gives a variant name and its codon change (original codon -> new codon):

name,mutation
K44A,AAG->GCG
G45A,GGC->GCG
T46G,ACC->GGC
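Before planning, it can help to sanity-check the file. The sketch below validates the name,mutation layout shown above. The accepted mutation pattern (three bases, `->`, three bases) is inferred from the example rows and may be stricter or looser than what usortm itself accepts. The function name `check_variants` is hypothetical.

```python
import csv
import io
import re

# Pattern for a codon change such as "AAG->GCG" (assumed from the example rows).
CODON_CHANGE = re.compile(r"^[ACGT]{3}->[ACGT]{3}$")


def check_variants(csv_text):
    """Return a list of problems found in name,mutation CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["name", "mutation"]:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        name, mutation = row["name"], row["mutation"]
        if not name:
            problems.append(f"line {line_no}: empty name")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate name {name}")
        seen.add(name)
        if not CODON_CHANGE.match(mutation or ""):
            problems.append(f"line {line_no}: bad mutation {mutation!r}")
    return problems


if __name__ == "__main__":
    with open("variants.csv", newline="") as handle:
        for problem in check_variants(handle.read()):
            print(problem)
```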

This generates:

  • my_project/usortm_project.json - Project state tracking
  • my_project/sorting_instructions.md - Wet lab guide
  • my_project/barcodes/ - Barcode assignments
  • my_project/mask_config.toml - Barcode flanking sequences (editable)
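A quick way to confirm that planning finished is to check that these outputs exist. This is a minimal sketch: the file and directory names are taken from the list above, and the helper `missing_outputs` is hypothetical rather than part of usortm.

```python
from pathlib import Path

# Paths listed above, relative to the project directory.
EXPECTED_OUTPUTS = [
    "usortm_project.json",
    "sorting_instructions.md",
    "barcodes",
    "mask_config.toml",
]


def missing_outputs(project_dir):
    """Return the expected outputs that are absent from project_dir."""
    root = Path(project_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_outputs("my_project")
    print("all outputs present" if not missing else f"missing: {missing}")
```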

3. Follow the Wet Lab Workflow

The generated sorting_instructions.md contains detailed protocols for:

  • Day 1: Pooled assembly and transformation
  • Day 2+: FACS isolation into 384-well plates
  • Day N: PCR barcoding and pooling
  • Submit: Send for sequencing

4. Process Sequencing Data

After receiving sequencing data:

# Demultiplex reads (using library CSV for variant calling)
usortm demux my_project/ --fastq data.fastq --library-csv variants.csv

# Generate hit-picking list
usortm pick my_project/

# Create final report
usortm report my_project/
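If you prefer to drive these three steps from a script, the sketch below runs them in order and stops at the first failure. It uses only the subcommands and flags shown above. The wrapper functions `build_commands` and `run_pipeline` are hypothetical, and the sketch assumes usortm is on your PATH and exits with a non-zero status on error.

```python
import subprocess


def build_commands(project_dir, fastq, library_csv):
    """Return the demux, pick, and report commands as argument lists."""
    return [
        ["usortm", "demux", project_dir,
         "--fastq", fastq, "--library-csv", library_csv],
        ["usortm", "pick", project_dir],
        ["usortm", "report", project_dir],
    ]


def run_pipeline(project_dir, fastq, library_csv, dry_run=False):
    """Run each step in order; check=True raises if a step fails."""
    for cmd in build_commands(project_dir, fastq, library_csv):
        print("+", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # dry_run=True only prints the commands; set it to False to execute them.
    run_pipeline("my_project/", "data.fastq", "variants.csv", dry_run=True)
```

Passing the arguments as lists avoids shell quoting problems if a path contains spaces.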

Example Workflow

Here's a complete example for a 500-variant library (the variant list is abbreviated to two rows with truncated sequences):

# Create variant list
cat > variants.csv << EOF
name,sequence
variant_001,ATGAAG...
variant_002,ATGGCG...
EOF

# Plan experiment
usortm plan variants.csv --output acyp_library/ --seq-length 297

# [Perform wet lab steps]
# [Receive sequencing data]

# Process results
usortm demux acyp_library/ --fastq nanopore_data.fastq --library-csv variants.csv
usortm pick acyp_library/
usortm report acyp_library/

# Final outputs in acyp_library/report/

Next Steps