Background: COVID-19 is caused by a novel coronavirus, SARS-CoV-2, which first emerged in China. It spread rapidly throughout the world and was later declared a pandemic by the WHO. Over time, SARS-CoV-2 has mutated for survival advantages, leading to multiple variants. Multiple studies identifying mutations in SARS-CoV-2 have been published, covering extensive sampling areas. The purpose of this study was to limit the sampling area to the state of Georgia in the U.S. and to analyze the genome sequences for mutation profiling across the genome and for the origin of variants.
Methods: The genome sequences (n = 3,970) were obtained from the NCBI database as of June 12, 2021, filtered for completely sequenced genomes, a Homo sapiens host, and origin in the state of Georgia, U.S. NextClade, an online tool, was used to analyze the sequences with Wuhan-Hu-1/2019 as the reference genome. The workflow consisted of sequence alignment, translation, mutation calling, phylogenetic placement, clade assignment, and quality control (QC). Thirty-six samples with bad QC were removed from the mutational analysis.
Results: A total of 117,743 nucleotide mutations were identified (averaging 31.5 mutations per sample). The mutations A23403G, C3037T, C241T, and C14408T were detected in 98% of the samples. In addition, a total of 75,517 amino acid mutations were identified (averaging 20.2 mutations per sample). The mutations D614G and P314L were identified in >97% of samples, whereas R203K, G204R, P681H, and N501Y were detected in >50% of samples. The analysis also revealed 16 different clades, with 20I (49.6%), 20G (24.2%), and 20A (5.5%) being the most abundant, indicating that SARS-CoV-2 in the state of Georgia originated mainly from Southeast England, other parts of the U.S., and several countries in Western Europe.
Conclusion: By examining the three most common variants in the state of Georgia, U.S., we could determine the primary locations of transmission or origin of the virus. Our analyses indicate that the majority of cases originated from Southeast England (Clade 20I), the U.S. itself (Clade 20G), and Western Europe (Clade 20C).
- Coronavirus disease 2019
- Genome sequences
- Severe acute respiratory syndrome coronavirus-2
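
The tallying step summarized in the Methods and Results (drop bad-QC samples, then count mutations and clades per sample) can be illustrated with a minimal sketch. It is not the study's pipeline: the records, field names (`qc`, `clade`, `substitutions`), and the `profile` helper are hypothetical stand-ins for per-sample output of a mutation-calling tool, and the toy values are invented for illustration.

```python
from collections import Counter

# Hypothetical toy records; field names and values are assumptions,
# not the study's data or any tool's actual output format.
samples = [
    {"name": "s1", "qc": "good", "clade": "20I", "substitutions": "C241T,C3037T,A23403G"},
    {"name": "s2", "qc": "good", "clade": "20G", "substitutions": "C241T,A23403G"},
    {"name": "s3", "qc": "bad", "clade": "20A", "substitutions": "C241T"},
    {"name": "s4", "qc": "good", "clade": "20I", "substitutions": "A23403G,C14408T"},
]


def profile(records):
    """Drop bad-QC samples, then tally mutations and clades over the rest."""
    kept = [r for r in records if r["qc"] != "bad"]
    n = len(kept)

    mutation_counts = Counter()
    for r in kept:
        # set() so a mutation is counted at most once per sample;
        # filter(None, ...) skips empty strings from samples with no calls.
        mutation_counts.update(set(filter(None, r["substitutions"].split(","))))

    clade_counts = Counter(r["clade"] for r in kept)
    total = sum(mutation_counts.values())
    return {
        "n_samples": n,
        "total_mutations": total,
        "mean_per_sample": total / n if n else 0.0,
        # Percentage of retained samples carrying each mutation / clade.
        "mutation_pct": {m: 100 * c / n for m, c in mutation_counts.items()},
        "clade_pct": {c: 100 * k / n for c, k in clade_counts.items()},
    }


result = profile(samples)
```

Here `result["mutation_pct"]` gives the share of retained samples carrying each mutation, which is the kind of figure reported for A23403G or D614G, and `result["clade_pct"]` gives the clade shares.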