How to Upload Your Data
You need to upload 3 types of files for each virus:
Reference Genome
Complete genome sequence (.fasta)
Reference Proteome
Protein sequences (.fasta)
MSA File
Multiple sequence alignment (.txt)
Create Custom Virus
Custom Virus Benefits:
- Upload your own virus data
- No need for pre-configuration
- Support for any virus type
- Automatic file organization
Upload Data Files
Data Requirements
Required Files for Analysis:
- Reference Genome: FASTA format (.fasta)
- Reference Proteome: FASTA format (.fasta)
- MSA File: Multiple sequence alignment (.txt)
What Correct Input Looks Like (Examples)
Examples only. Upload your real files above.
Reference Genome FASTA (example)
>NC_045512 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAA
CGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAAC
TAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTG
TTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTC
CCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTAC
GTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG
CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAAACGTTCGGAT
GCTCGAACTGCACCTCATGGTCATGTTATGGTTGAGCTGGTAGCAGAACTCGAAGGCATTCAGTACGGTC
GTAGTGGTGAGACACTTGGTGTCCTTGTCCCTCATGTGGGCGAAATACCAGTGGCTTACCGCAAGGTTCT
TCTTCGTAAGAACGGTAATAAAGGAGCTGGTGGCCATAGTTACGGCGCCGATCTAAAGTCATTTGACTTA
GGCGACGAGCTTGGCACTGATCCTTATGAAGATTTTCAAGAAAACTGGAACACTAAACATAGCAGTGGTG
TTACCCGTGAACTCATGCGTGAGCTTAACGGAGGGGCATACACTCGCTATGTCGATAACAACTTCTGTGG
CCCTGATGGCTACCCTCTTGAGTGCATTAAAGACCTTCTAGCACGTGCTGGTAAAGCTTCATGCACTTTG
TCCGAACAACTGGACTTTATTGACACTAAGAGGGGTGTATACTGCTGCCGTGAACATGAGCATGAAATTG
CTTGGTACACGGAACGTTCTGAAAAGAGCTATGAATTGCAGACACCTTTTGAAATTAAATTGGCAAAGAA
ATTTGACACCTTCAATGGGGAATGTCCAAATTTTGTATTTCCCTTAAATTCCATAATCAAGACTATTCAA
CCAAGGGTTGAAAAGAAAAAGCTTGATGGCTTTATGGGTAGAATTCGATCTGTCTATCCAGTTGCGTCAC
CAAATGAATGCAACCAAATGTGCCTTTCAACTCTCATGAAGTGTGATCATTGTGGTGAAACTTCATGGCA
GACGGGCGATTTTGTTAAAGCCACTTGCGAATTTTGTGGCACTGAGAATTTGACTAAAGAAGGTGCCACT
... (remaining lines omitted for brevity in this patch context)
Proteome FASTA (example)
>YP_009724390.1|spike_surface_glycoprotein|21563..25384
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHV
SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPF
LGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPI
NLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN
ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASV
YAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD
YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYF
PLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLT
PTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLG
AENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGI
AVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIG
VTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDI
LSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLM
SFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVA
KNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD
SEPVLKGVKLHYT
MSA text/FASTA (example)
CLUSTAL O(1.2.4) multiple sequence alignment
NC_045512.2 ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCT 60
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 -TTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTTGATCTCTTGTAGATCT 59
****************************************** ****************
NC_045512.2 GTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACT 120
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 GTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACT 119
************************************************************
NC_045512.2 CACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATC 180
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 CACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATC 179
************************************************************
NC_045512.2 TTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTT 240
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 TTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTT 239
************************************************************
NC_045512.2 CGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAAC 300
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 TGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAAC 299
***********************************************************
NC_045512.2 ACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTACGTGGCTTTGG 360
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 ACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTACGTGGCTTTGG 359
************************************************************
NC_045512.2 AGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG 420
hCoV-19/Shanghai/SJTU-236325/2022|EPI_ISL_16327572|2022-12-19 AGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG 419
************************************************************
... (remaining lines omitted for brevity in this patch context)
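To check an MSA file before uploading, the Clustal layout shown above can be read back into one aligned string per sequence. The following is a minimal sketch, not the parser VARIANT itself uses; the function name `parse_clustal` is hypothetical, and it assumes sequence IDs contain no spaces (as in the example).

```python
def parse_clustal(text):
    """Return {sequence_id: aligned_sequence} from Clustal-formatted text."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines and the "CLUSTAL ..." header line.
        if not line.strip() or line.startswith("CLUSTAL"):
            continue
        # Skip consensus lines: they start with whitespace or contain
        # only the conservation symbols '*', ':', '.' and spaces.
        if line[0].isspace() or set(line) <= set("*:. "):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # parts[0] is the ID, parts[1] the aligned chunk; a trailing
        # residue count (parts[2]) is ignored if present.
        seq_id, chunk = parts[0], parts[1]
        sequences[seq_id] = sequences.get(seq_id, "") + chunk
    return sequences


def check_alignment(sequences):
    """Raise ValueError unless all aligned sequences have the same length."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) > 1:
        raise ValueError(f"aligned sequences differ in length: {sorted(lengths)}")
    return sequences
```

Calling `check_alignment(parse_clustal(open("my_alignment.txt").read()))` on a well-formed file returns the dictionary unchanged; the file name here is only a placeholder.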
Important Notes:
- Files are organized by virus type and segment (if applicable)
- Reference genomes and proteomes go to the refs/ folder
- MSA files go to the clustalW/ folder
- Upload files before starting analysis
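The folder rules above can be pictured with a small path helper. This is only an illustration: the base directory, the nesting order (virus, then optional segment, then folder), and the function name `destination` are assumptions, not the application's documented layout. Only the `refs` and `clustalW` folder names come from the notes above.

```python
from pathlib import PurePosixPath

# Folder per file type, as stated in the notes above.
FOLDER_FOR_KIND = {"genome": "refs", "proteome": "refs", "msa": "clustalW"}


def destination(base_dir, virus, kind, filename, segment=None):
    """Build a destination path for an uploaded file (assumed layout)."""
    if kind not in FOLDER_FOR_KIND:
        raise ValueError(f"unknown file kind: {kind!r}")
    path = PurePosixPath(base_dir) / virus
    if segment is not None:
        # Segmented viruses get one sub-folder per segment (assumption).
        path = path / segment
    return path / FOLDER_FOR_KIND[kind] / filename
```

For example, `destination("data", "SARS-CoV-2", "msa", "aln.txt")` yields `data/SARS-CoV-2/clustalW/aln.txt` under these assumptions.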
Start New Analysis
Generate Visualization
Visualization Types:
Mutation Analysis
Combined genome and protein mutation analysis.
Row/Hot Mutations
Visualization of row and hot mutations on protein bars.
PRF Regions
Programmed Ribosomal Frameshifting regions analysis.
Interactive Visualization
RNA Secondary Structure → Dual Graph Assignment
Submit an RNA secondary structure in dot-bracket notation to assign its dual graph topology ID. The pipeline converts your structure through DSSR format → CT file → dual graph library match.
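Before submitting, it can help to confirm that the dot-bracket string is balanced. The sketch below is a generic stack-based check and is not part of the pipeline; `parse_dot_bracket` is a hypothetical name, and it assumes only `(`, `)` and `.` are used (pseudoknot brackets such as `[` `]` are rejected here, although the pipeline may treat them differently).

```python
def parse_dot_bracket(structure):
    """Return sorted 0-based (i, j) base pairs, or raise ValueError."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(parse_dot_bracket("((..))"))  # → [(0, 5), (1, 4)]
```

An unbalanced input such as `"(()"` raises `ValueError`, which points to the position that needs fixing.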
Track Your Jobs
Analysis History
VARIANT Documentation
Viral mutAtion trackeR aImed At GeNome and proTein-level
Features
Genome Analysis
Comprehensive mutation tracking across entire viral genomes
Protein Impact
Analysis of mutations on protein structure and function
Frameshift Detection
Identification of potential ribosomal frameshifting sites
Interactive Visualizations
Generate all three visualization types with one click
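To make the genome-level mutation tracking listed above concrete, here is a minimal sketch of how substitutions can be read off a pair of aligned sequences. It is a generic illustration, not VARIANT's implementation; the function name `call_substitutions` is hypothetical, and it reports substitutions only (gaps in either sequence are skipped).

```python
def call_substitutions(ref_aln, sample_aln):
    """List (position, ref_base, sample_base) for each substitution.

    Positions are 1-based and count ungapped reference bases only.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have equal length")
    position = 0
    substitutions = []
    for ref_base, sample_base in zip(ref_aln, sample_aln):
        if ref_base == "-":
            # Insertion relative to the reference: no reference coordinate.
            continue
        position += 1
        if sample_base != "-" and sample_base.upper() != ref_base.upper():
            substitutions.append((position, ref_base, sample_base))
    return substitutions


print(call_substitutions("ATTC-G", "-TTTAG"))  # → [(4, 'C', 'T')]
```

In the usual nucleotide shorthand, the tuple `(4, 'C', 'T')` would be written C4T.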
Streamlined Workflow
- Auto-populated virus selection
- One-click visualization generation
- Real-time job tracking
- Automatic frameshift detection
Advanced Visualizations
- Interactive HTML visualizations
- Fullscreen viewing mode
- Multiple visualization types
- Export capabilities
Quick Start
Get Started in 4 Steps
Follow these simple steps to analyze your virus data
Step 1: Data Upload
1. Go to Data Upload tab
2. Select virus type (or create custom)
3. Upload reference genome (.fasta)
4. Upload reference proteome (.fasta)
5. Upload MSA file (.txt)
Step 2: Analysis & Visualization
1. Go to Analysis & Visualization tab
2. Select your virus (auto-populated)
3. Optional: Enter specific genome ID
4. Click "Start Analysis"
Step 3: Analysis & Visualizations
1. Wait for analysis completion
2. Click "Generate All Visualizations"
3. View all three visualization types
4. Use fullscreen for detailed view
Step 4: RNA Dual Graph
1. Go to RNA Dual Graph tab
2. Paste dot-bracket input
3. Run dual-graph assignment
4. View RNA Dual Graph IDs and plot
Output Files
Results are written under result/{virus_name}/ and downloadable from History.
| File Pattern | Description |
|---|---|
| {genome_id}_mutation_summary.csv | Position-level nucleotide and amino-acid mutation table. |
| {genome_id}_row_hot_mutations.csv | Clustered and high-frequency mutation report. |
| {genome_id}.txt | Detailed per-genome mutation text report. |
| potential_PRF.csv or prf_analysis.prf_candidates.csv | Candidate PRF sites from PRF scanner. |
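If the results folder has been downloaded locally, the file patterns in the table can be checked with a short script. This is a sketch under assumptions: `list_outputs` is a hypothetical helper, and `result_root` stands for wherever the `result/` directory sits on your machine. Column names inside the CSV files are not documented here, so the sketch only locates the files.

```python
from pathlib import Path


def list_outputs(result_root, virus_name, genome_id):
    """Return {file_name: path} for expected output files that exist."""
    folder = Path(result_root) / virus_name
    expected = [
        f"{genome_id}_mutation_summary.csv",
        f"{genome_id}_row_hot_mutations.csv",
        f"{genome_id}.txt",
        # The PRF report may appear under either of these two names.
        "potential_PRF.csv",
        "prf_analysis.prf_candidates.csv",
    ]
    return {
        name: folder / name
        for name in expected
        if (folder / name).is_file()
    }
```

A missing entry in the returned dictionary means that file was not produced or not downloaded; an empty dictionary usually means the virus name or genome ID does not match the folder contents.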
Contact
Questions or issues: rw3594@nyu.edu or GitHub.
File Requirements
Reference Genome
- Single sequence per file
- Standard FASTA format
- Complete genome length
Reference Proteome
- Multiple proteins allowed
- Standard FASTA format
- All expected proteins
MSA File
- ClustalW format preferred
- Contains all sample sequences
- Aligned sequences
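The FASTA requirements above (one sequence for the genome, one or more for the proteome) can be checked locally before upload. The following is a minimal sketch with hypothetical function names; it is not the validation the application performs, and it does not check that the genome is complete or that all expected proteins are present.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_reference_genome(text):
    """Require exactly one non-empty sequence."""
    records = read_fasta(text)
    if len(records) != 1:
        raise ValueError(f"expected exactly 1 sequence, found {len(records)}")
    if not records[0][1]:
        raise ValueError("the genome sequence is empty")
    return records[0]


def check_reference_proteome(text):
    """Require at least one protein, each with a non-empty sequence."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no protein sequences found")
    empty = [header for header, seq in records if not seq]
    if empty:
        raise ValueError(f"proteins with empty sequence: {empty}")
    return records
```

Both checks return the parsed records on success, so the same call can be reused to report sequence counts or lengths.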