NYU Dataset

Dataset and Code to Model Dispersal Dynamics of SARS-CoV-2 Lineages During the First Epidemic Wave in New York City

UID: 10441
To construct a fixed time-scaled phylogenetic tree, the investigators combined 828 viral genome sequences obtained from COVID-19 patients at NYU Langone Health between March and May 2020 with 1,899 background sequences contributed to the Nextstrain repository. The dataset therefore contains a total of 2,727 SARS-CoV-2 genomic sequences. The supporting R code included with the dataset produces phylogenetic trees using discrete and continuous Bayesian phylogeographic methods.
Geographic Coverage
New York (State) - New York City
Subject of Study
Subject Domain


Free to All
The R code and datasets are publicly available through the access link to the GitHub repository.
Access via GitHub

Datasets and code

Associated Publications
Data Type
Study Type
Grant Support
12U7121N/Research Foundation - Flanders
G0D5117N/Research Foundation - Flanders
G0E1420N/Research Foundation - Flanders
G098321N/Research Foundation - Flanders
C14/18/094/Internal Funds KU Leuven
Related Datasets