VARIANT

Viral mutAtion trackeR aImed At GeNome and proTein-level

Comprehensive Virus Mutation Analysis Tool

How to Upload Your Data

You need to upload three types of files for each virus:

Reference Genome

Complete genome sequence (.fasta)

Reference Proteome

Protein sequences (.fasta)

MSA File

Multiple sequence alignment (.txt)

Create Custom Virus
Enter a unique name for your virus
Custom Virus Benefits:
  • Upload your own virus data
  • No need for pre-configuration
  • Support for any virus type
  • Automatic file organization
Upload Data Files
One-click default setup for trying analysis and visualization quickly

Use one click instead of opening the file type dropdown each time.
Choose the virus you want to analyze
Only appears for multi-segment viruses like H3N2
Choose the type of file you're uploading
Choose the file to upload from your computer. If the file type is left empty, VARIANT will try to auto-detect it.
Data Requirements
Required Files for Analysis:
  • Reference Genome: FASTA format (.fasta)
  • Reference Proteome: FASTA format (.fasta)
  • MSA File: Multiple sequence alignment (.txt)
What Correct Input Looks Like (Examples)

Examples only. Upload your real files above.

Reference Genome FASTA (example)
>NC_045512 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAA
CGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAAC
TAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTG
TTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTC
CCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTAC
GTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG
CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAAACGTTCGGAT
GCTCGAACTGCACCTCATGGTCATGTTATGGTTGAGCTGGTAGCAGAACTCGAAGGCATTCAGTACGGTC
GTAGTGGTGAGACACTTGGTGTCCTTGTCCCTCATGTGGGCGAAATACCAGTGGCTTACCGCAAGGTTCT
TCTTCGTAAGAACGGTAATAAAGGAGCTGGTGGCCATAGTTACGGCGCCGATCTAAAGTCATTTGACTTA
GGCGACGAGCTTGGCACTGATCCTTATGAAGATTTTCAAGAAAACTGGAACACTAAACATAGCAGTGGTG
TTACCCGTGAACTCATGCGTGAGCTTAACGGAGGGGCATACACTCGCTATGTCGATAACAACTTCTGTGG
CCCTGATGGCTACCCTCTTGAGTGCATTAAAGACCTTCTAGCACGTGCTGGTAAAGCTTCATGCACTTTG
TCCGAACAACTGGACTTTATTGACACTAAGAGGGGTGTATACTGCTGCCGTGAACATGAGCATGAAATTG
CTTGGTACACGGAACGTTCTGAAAAGAGCTATGAATTGCAGACACCTTTTGAAATTAAATTGGCAAAGAA
ATTTGACACCTTCAATGGGGAATGTCCAAATTTTGTATTTCCCTTAAATTCCATAATCAAGACTATTCAA
CCAAGGGTTGAAAAGAAAAAGCTTGATGGCTTTATGGGTAGAATTCGATCTGTCTATCCAGTTGCGTCAC
CAAATGAATGCAACCAAATGTGCCTTTCAACTCTCATGAAGTGTGATCATTGTGGTGAAACTTCATGGCA
GACGGGCGATTTTGTTAAAGCCACTTGCGAATTTTGTGGCACTGAGAATTTGACTAAAGAAGGTGCCACT
... (remaining lines omitted for brevity in this patch context)
Proteome FASTA (example)
>YP_009724390.1|spike_surface_glycoprotein|21563..25384
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHV
SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPF
LGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPI
NLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN
ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASV
YAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD
YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYF
PLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLT
PTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLG
AENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGI
AVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIG
VTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDI
LSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLM
SFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVA
KNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD
SEPVLKGVKLHYT
MSA text/FASTA (example)
CLUSTAL O(1.2.4) multiple sequence alignment


NC_045512.2                                                        ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCT	60
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      -TTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTTGATCTCTTGTAGATCT	59
                                                                    ****************************************** ****************

NC_045512.2                                                        GTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACT	120
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      GTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACT	119
                                                                   ************************************************************

NC_045512.2                                                        CACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATC	180
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      CACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATC	179
                                                                   ************************************************************

NC_045512.2                                                        TTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTT	240
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      TTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTT	239
                                                                   ************************************************************

NC_045512.2                                                        CGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAAC	300
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      TGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAAC	299
                                                                    ***********************************************************

NC_045512.2                                                        ACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTACGTGGCTTTGG	360
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      ACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTACGTGGCTTTGG	359
                                                                   ************************************************************

NC_045512.2                                                        AGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG	420
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19      AGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG	419
                                                                   ************************************************************

... (remaining lines omitted for brevity in this patch context)
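
The Clustal layout shown above can be read back into per-sequence strings with a few lines of Python. This is a minimal sketch, not VARIANT's own parser: the function name `parse_clustal` is hypothetical, and it assumes that sequence IDs contain no whitespace, that the header is the first non-blank line, and that conservation lines start with whitespace.

```python
def parse_clustal(text):
    """Return {sequence_id: aligned_sequence} from Clustal-formatted text.

    Hypothetical helper for illustration; VARIANT's parser may differ.
    """
    sequences = {}
    header_seen = False
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between alignment blocks
        if not header_seen:
            header_seen = True
            if line.startswith("CLUSTAL"):
                continue  # skip the "CLUSTAL ..." header line
        if line[0].isspace():
            continue  # conservation line ("***** ***")
        parts = line.split()
        if len(parts) < 2:
            continue  # not an "ID  SEQUENCE [count]" line
        seq_id, chunk = parts[0], parts[1]  # a trailing residue count is ignored
        sequences[seq_id] = sequences.get(seq_id, "") + chunk
    return sequences


example = (
    "CLUSTAL O(1.2.4) multiple sequence alignment\n\n\n"
    "ref   ATTA-A\t6\n"
    "s1    -TTAGA\t5\n"
    "       *** *\n\n"
    "ref   GG\t8\n"
    "s1    GC\t7\n"
    "      * \n"
)
alignment = parse_clustal(example)
```

Every sequence in the returned dictionary should have the same length; a mismatch usually means the file was truncated or is not a true alignment.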
Important Notes:
  • Files are organized by virus type and segment (if applicable)
  • Reference genomes and proteomes go to the refs/ folder
  • MSA files go to the clustalW/ folder
  • Upload files before starting analysis
Upload Checklist:
Start New Analysis
Analysis Configuration
Leave empty to process all genomes in the MSA file, or enter a specific genome ID
Select a virus with all required data files uploaded to enable analysis.

Analysis in progress... This may take several minutes.

Generate Visualization
Leave empty to process the first genome, or enter a specific genome ID
Select a virus with completed analysis results to generate visualizations.
This will generate all three visualization types
Visualization Types:
Mutation Analysis

Combined genome and protein mutation analysis.

Row/Hot Mutations

Visualization of row (clustered) and hot (high-frequency) mutations on protein bars.

PRF Regions

Programmed Ribosomal Frameshifting regions analysis.
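
How VARIANT's PRF scanner selects candidate sites is not described here. As an illustration only, the sketch below scans a sequence for the canonical -1 frameshift "slippery" heptamer X XXY YYZ, where XXX is any three identical bases, YYY is AAA or TTT (UUU in RNA), and Z is any base except G. The function name `find_slippery_sites` is hypothetical, and a real scanner would typically also examine the downstream RNA structure.

```python
import re

# Lookahead so that overlapping heptamers are all reported.
# Group 1 is the whole heptamer, group 2 is the repeated base X.
SLIPPERY = re.compile(r"(?=(([ACGT])\2\2(?:AAA|TTT)[ACT]))")


def find_slippery_sites(genome):
    """Return (1-based position, heptamer) for each X XXY YYZ motif.

    Illustrative heuristic only; not VARIANT's PRF scanner.
    """
    seq = genome.upper().replace("U", "T")  # accept RNA or DNA input
    return [(m.start() + 1, m.group(1)) for m in SLIPPERY.finditer(seq)]


sites = find_slippery_sites("ACGTTTAAACGTA")
```

The example input contains the heptamer TTTAAAC, which is the slippery sequence of the SARS-CoV-2 ORF1a/ORF1b frameshift site.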

Interactive Visualization
No Visualization Available

Generate a visualization above to see it here.

RNA Secondary Structure: Dual Graph Assignment

Submit an RNA secondary structure in dot-bracket notation to assign its dual graph topology ID. The pipeline converts your structure through DSSR format → CT file → dual graph library match.

Used as the file base name throughout the pipeline.
Upload a .dbn file. The Structure ID is auto-filled from the filename.
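
In dot-bracket notation each base pair is written as a matching "(" and ")" and each unpaired base as ".". The sketch below checks that a structure is balanced before you submit it and lists the base pairs it encodes. It is an illustration under stated assumptions, not the pipeline's code: the function name `dot_bracket_pairs` is hypothetical, and only round brackets are handled (no pseudoknot symbols).

```python
def dot_bracket_pairs(structure):
    """Return sorted (i, j) base pairs, 1-based, from a dot-bracket string.

    Raises ValueError if brackets are unbalanced or an unknown symbol appears.
    """
    stack = []   # positions of "(" still waiting for a partner
    pairs = []
    for i, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


hairpin = dot_bracket_pairs("((..))")
```

For the hairpin "((..))" the pairs are positions 1 with 6 and 2 with 5, leaving positions 3 and 4 unpaired in the loop.
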
Track Your Jobs
Enter a Job ID to check the status of a specific analysis job.
Analysis History


VARIANT Documentation

Features

Genome Analysis

Comprehensive mutation tracking across entire viral genomes

Protein Impact

Analysis of mutations on protein structure and function

Frameshift Detection

Identification of potential ribosomal frameshifting sites

Interactive Visualizations

Generate all three visualization types with one click

Streamlined Workflow
  • Auto-populated virus selection
  • One-click visualization generation
  • Real-time job tracking
  • Automatic frameshift detection
Advanced Visualizations
  • Interactive HTML visualizations
  • Fullscreen viewing mode
  • Multiple visualization types
  • Export capabilities

Quick Start

Get Started in 4 Steps

Follow these simple steps to analyze your virus data

Step 1: Data Upload

  1. Go to Data Upload tab
  2. Select virus type (or create custom)
  3. Upload reference genome (.fasta)
  4. Upload reference proteome (.fasta)
  5. Upload MSA file (.txt)

Step 2: Analysis & Visualization

  1. Go to Analysis & Visualization tab
  2. Select your virus (auto-populated)
  3. Optional: Enter specific genome ID
  4. Click "Start Analysis"

Step 3: Analysis & Visualizations

  1. Wait for analysis completion
  2. Click "Generate All Visualizations"
  3. View all three visualization types
  4. Use fullscreen for detailed view

Step 4: RNA Dual Graph

  1. Go to RNA Dual Graph tab
  2. Paste dot-bracket input
  3. Run dual-graph assignment
  4. View RNA Dual Graph IDs and plot

Output Files

Results are written under result/{virus_name}/ and downloadable from History.

File Pattern                                            Description
{genome_id}_mutation_summary.csv                        Position-level nucleotide and amino-acid mutation table.
{genome_id}_row_hot_mutations.csv                       Clustered and high-frequency mutation report.
{genome_id}.txt                                         Detailed per-genome mutation text report.
potential_PRF.csv or prf_analysis.prf_candidates.csv    Candidate PRF sites from PRF scanner.
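
To locate one genome's results after downloading them, the file patterns above can be expanded into concrete paths. This is a sketch only: the function name `expected_outputs` is hypothetical, the result/ directory is assumed to be relative to wherever the results were unpacked, and the genome ID is assumed to be usable as a file name as-is.

```python
from pathlib import PurePosixPath


def expected_outputs(virus_name, genome_id):
    """Map each documented output to its expected path under result/{virus_name}/.

    Illustrative helper; it does not check that the files exist.
    """
    base = PurePosixPath("result") / virus_name
    return {
        "mutation_summary": base / f"{genome_id}_mutation_summary.csv",
        "row_hot_mutations": base / f"{genome_id}_row_hot_mutations.csv",
        "text_report": base / f"{genome_id}.txt",
        # The PRF candidates file is documented under two possible names.
        "prf_candidates": [
            base / "potential_PRF.csv",
            base / "prf_analysis.prf_candidates.csv",
        ],
    }


outputs = expected_outputs("SARS-CoV-2", "NC_045512.2")
```

Because the PRF candidate file may appear under either name, the helper returns both and leaves it to the caller to test which one is present.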

Contact

Questions or issues: rw3594@nyu.edu or GitHub.

File Requirements

Reference Genome
Format: .fasta
Content: Complete genome sequence
  • Single sequence per file
  • Standard FASTA format
  • Complete genome length
Reference Proteome
Format: .fasta
Content: Protein sequences
  • Multiple proteins allowed
  • Standard FASTA format
  • All expected proteins
MSA File
Format: .txt
Content: Multiple sequence alignment
  • ClustalW format preferred
  • Contains all sample sequences
  • Aligned sequences
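
The FASTA requirements above can be checked locally before uploading. The sketch below is an illustration, not VARIANT's own validation: the helper names `read_fasta` and `check_reference_genome` are hypothetical, and the set of accepted nucleotide letters (the IUPAC codes) is an assumption about what the tool tolerates.

```python
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def read_fasta(text):
    """Return a list of (header, sequence) records from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_reference_genome(text):
    """Check the 'single sequence per file' rule and return the genome length."""
    records = read_fasta(text)
    if len(records) != 1:
        raise ValueError(f"expected exactly 1 sequence, found {len(records)}")
    _, seq = records[0]
    if not seq:
        raise ValueError("the sequence is empty")
    invalid = set(seq.upper()) - IUPAC_NUCLEOTIDES
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return len(seq)


genome_length = check_reference_genome(">ref\nATTA\nAAGG\n")
```

A proteome file can be read with the same `read_fasta` helper; it is expected to return more than one record, one per protein.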