Summary: The Macroecology of Infectious Disease Research Coordination Network (RCN) is a working group jointly funded by the NSF, NIH, and USDA. Our group brings together experts in macroecology, infectious disease ecology, machine learning, GIS analysis, and phylogenetic comparative methods to conduct some of the first studies quantifying and exploring the drivers of global-scale patterns of pathogen biodiversity and disease emergence.

Disease ecology is a vital and rapidly maturing area of biological research. Some of the most pressing questions in disease ecology concern the origins of patterns at large (global) spatial and long (macroevolutionary) temporal scales. For example, what factors determine cross-species transmission and pathogen shifts into new host populations or species? What impact can we expect global warming to have on large-scale patterns of pathogen prevalence and diversity? What drives the evolution of host breadth and virulence in parasitic organisms? To date, relatively few studies have addressed these questions at large spatial scales or from a macroevolutionary perspective. Our goal is to better understand large-scale patterns of infectious disease biodiversity and emergence using global-scale data sets and advanced quantitative approaches.

Macroecology is the field of ecology that most directly addresses large-scale patterns of biodiversity. The emergence of macroecology as a discipline within ecology has been driven by the realization that understanding the origins of many patterns in nature requires a perspective encompassing large spatial scales, long temporal scales, and large data sets. While traditional ecological approaches such as experimental manipulation provide great insight into interactions between small sets of species in particular settings, it is difficult or impossible to “scale up” such methods to investigate the factors that operate at higher levels of organization such as populations, communities, or regions. A macroecological approach is also necessary to study the impacts of anthropogenic forces such as habitat destruction and climate change because these processes tend to manifest at large scales.

Applying a macroecological perspective to understanding the ecology and evolution of infectious diseases requires discourse and collaboration among researchers in several fields. Macroecologists have rarely considered patterns of parasite biodiversity; at the same time, disease ecologists do not often examine pathogen variation at large taxonomic, spatial, and temporal scales. Recently, host-parasite occurrence data (needed to quantify large-scale patterns of parasite biodiversity) have been collected independently by a number of research groups studying diseases of mammals, amphibians, fish, insects, and humans. The time is now ripe for interaction between research teams to formulate a general set of questions and approaches that could be applied universally across data sets. Methodological advances are also needed to address problems such as hidden correlations, non-linear relationships between variables, phylogenetic non-independence of trait data, differences in sampling effort among both parasite and host species, and the issue of missing data. Machine learning (ML) methods such as boosted regression trees offer promising solutions to these methodological issues, although ecologists have only recently begun to apply these approaches. By bringing together experts in macroecology, infectious disease ecology, machine learning methods, and phylogenetic comparative methods, members of our RCN will quantify and test hypotheses about global-scale patterns of parasite biodiversity and emergence.

As a catalyst for our work, we will focus on quantifying global patterns of parasite and disease biodiversity for groups of hosts and parasites for which data have previously been collected. Using resources such as the Global Mammal Parasite Database, a common set of questions will be used to explore patterns of parasite biodiversity within and between groups:

1. How should host parasite diversity and parasite host breadth be quantified?

A problem with any database of host-parasite occurrences is that there is wide variation in the degree to which hosts have been sampled, and also wide variation in the degree to which different parasite species have been sampled. Traditional ecological approaches to quantifying parasite diversity rely on correcting for sampling effort (e.g., the number of published studies that consider a given host). An alternative approach (derived from evolutionary theory) is to quantify how diversity at higher and lower taxonomic levels is correlated, and to use the higher taxonomic diversity of an assemblage (which is likely to be accurately known) to estimate its species-level diversity. Finally, there are machine learning methods developed specifically for estimating presence/absence data from incomplete samples that so far have not been applied in either ecology or evolutionary biology. Which of these broad classes of methods tends to be most accurate, and the degree to which they yield congruent results, are presently unknown. Using both randomized sub-sampling of empirical data and data simulated under several sets of assumptions, we will determine which methods are most likely to yield accurate estimates of “true” host parasite diversity and parasite host breadth.
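The randomized sub-sampling comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the network's chosen method: the occurrence records are invented, the function names are hypothetical, and the bias-corrected Chao1 estimator stands in for "a method that corrects for sampling effort" simply because it is a widely used richness estimator. The sketch thins a well-sampled set of records to a smaller effort many times and reports how far each estimator falls from the richness of the full data set.

```python
import random
from collections import Counter


def observed_richness(records):
    """Number of distinct parasite species among the records."""
    return len(set(records))


def chao1(records):
    """Bias-corrected Chao1 richness estimate.

    f1 and f2 are the numbers of species recorded exactly once and twice.
    Used here as a stand-in estimator (an assumption of this sketch).
    """
    counts = Counter(records)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def mean_subsample_error(records, true_richness, n_keep, estimator,
                         n_reps=200, seed=0):
    """Mean (estimate - true richness) over random sub-samples of size n_keep."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        sample = rng.sample(records, n_keep)  # sampling without replacement
        total += estimator(sample) - true_richness
    return total / n_reps


# Invented records for one host: species "p0" is reported once,
# "p1" twice, ..., "p19" twenty times (210 records, 20 species).
records = [f"p{i}" for i in range(20) for _ in range(i + 1)]
true_richness = observed_richness(records)

if __name__ == "__main__":
    for n_keep in (20, 50, 100):
        naive = mean_subsample_error(records, true_richness, n_keep,
                                     observed_richness)
        corrected = mean_subsample_error(records, true_richness, n_keep, chao1)
        print(n_keep, round(naive, 2), round(corrected, 2))
```

Because the raw species count can never exceed the richness of the full data set, its mean error is always zero or negative; comparing that shortfall with the error of a corrected estimator at each sampling effort is the kind of evidence the sub-sampling exercise is meant to provide. The same harness could, in principle, wrap a taxonomic-surrogate or machine learning estimator in place of `chao1`.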

2. What is the global distribution of parasite diversity?

Once methods for accurately assessing host parasite diversity have been established, it will be possible to build some of the first large-scale maps of parasite species richness and address some basic questions in disease macroecology. For example, are regional patterns of parasite diversity similar among different types of host species, or idiosyncratic? Similarly, do different parasite taxonomic groups (e.g., viruses, protozoa, helminths) show similar or dissimilar patterns of biodiversity? Are hotspots of disease biodiversity the same as hotspots of emerging infectious diseases in humans and domestic animals?

3. What drives variation in parasite biodiversity and endemism?

Many environmental factors are known to be correlated with regional variation in the species richness of host species (e.g., terrestrial vertebrates). However, the factors that drive regional variation in parasite diversity have rarely been explored, and many questions are outstanding. For example, are host diversity and parasite diversity generally correlated, and is there any direct effect of environmental variation on parasite biodiversity? What host characteristics drive variation in parasite species richness among host species? What biological, environmental, or geographic factors promote endemism in parasite assemblages within and between host species? What drives the evolution of host breadth in parasites? What factors promote cross-species transmission and pathogen shifts into new host populations or species?

4. How will patterns of parasite biodiversity change in the future?

Once the factors that drive modern patterns of parasite biodiversity are illuminated, it should be possible to make predictions about how patterns of parasite diversity might change in the future. For example, as climate change alters the distributions of habitats and host species, how will the distribution of parasite biodiversity change? Should there prove to be a strong correlation between spatial patterns of disease diversity and disease emergence, it would also be possible to infer areas with high probabilities of future emerging infectious disease (EID) events.