Connecting shared genes and shared traits across species

Clark and Chikina labs employ CRC in exploring convergent evolution


The Evolutionary Tree of Life. Note Charles Darwin at the top of the hominid branch.

The star-nosed mole of North America is functionally blind. So is the naked mole rat of East Africa. Different species separated by thousands of miles underwent the same evolutionary adaptation to living underground. Similarly, marine mammals such as manatees and whales underwent shared adaptations to aquatic life. The phenomenon is termed phenotypic convergence: similar physical changes arising independently in different species. What could convergence reveal about the evolution of the genes responsible for those physical changes?

The labs of Maria Chikina and Nathan Clark in the School of Medicine Department of Computational and Systems Biology explore the evolution of phenotype and genotype, relying on Center for Research Computing resources for heavy computation tasks from sequencing genomic data to developing and testing statistical tools. At the core of their work is comparing rates of evolution for a gene in one species to rates of evolution for a gene in another species – gene pairs with similar rates likely share similar evolutionary histories.

The Chikina-Clark collaboration has led to a number of papers in scientific journals, media coverage, and the 2019 Chancellor’s Distinguished Research Award for Clark.

“We ask what reproducible things you find across many animals,” explains assistant professor Chikina. “We look at natural occurrences where the same phenotype arose at the same time and make comparisons in the genotype. The tools we create look at likelihoods, not specific gene sequences but instead the rate of change on a particular branch of a species relative to an average expected value.”
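One way to picture a rate "relative to an average expected value" is a short Python sketch. It assumes each gene has a branch length for every branch of the species tree, and that the expected value comes from a straight-line fit against genome-wide average branch lengths. The function name `relative_rates` and all numbers are hypothetical. This is a simplified illustration, not the labs' actual procedure.

```python
def relative_rates(gene_branch_lengths, average_branch_lengths):
    """Return per-branch residuals from a least-squares line through the origin.

    A positive value means the gene changed faster on that branch than the
    genome-wide average predicts; a negative value means it changed slower.
    """
    if len(gene_branch_lengths) != len(average_branch_lengths):
        raise ValueError("both lists must cover the same branches")
    # Slope of the best-fit line through the origin: sum(x*y) / sum(x*x).
    numerator = sum(
        x * y for x, y in zip(average_branch_lengths, gene_branch_lengths)
    )
    denominator = sum(x * x for x in average_branch_lengths)
    if denominator == 0:
        raise ValueError("average branch lengths must not all be zero")
    slope = numerator / denominator
    # Residual = observed change minus the change the average tree predicts.
    return [
        y - slope * x
        for x, y in zip(average_branch_lengths, gene_branch_lengths)
    ]


# Hypothetical branch lengths for four branches of a species tree.
average = [0.10, 0.20, 0.05, 0.40]
gene = [0.05, 0.10, 0.025, 0.60]
rates = relative_rates(gene, average)
```

In this made-up example the gene changes faster than expected on the last branch and slower on the other three, so only the last relative rate is positive.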

Clark describes the collaboration as a bridge between computation and experimentation. “We are the end user of the algorithms developed by Maria and the people in her lab. She has a knack for finding signals and turbo-charging the tools.”

Exploration is central for Clark, now an adjunct professor (in fall 2019 he began dividing time between Pitt and the University of Utah). “We find partners of a gene by testing against known genes – sometimes we find genes we didn’t expect. We hope for surprises. Computationally, that means we must be able to adjust course. It’s good to have flexible CRC resources.”

Amanda Kowalczyk, a fourth-year graduate student in the Clark and Chikina labs, helped develop the labs’ new RER Converge tool, built with CRC support using the R programming language.

Kowalczyk explains: “We begin with the method of Evolutionary Rate Covariation (ERC), comparing all the rates of one species to all the rates of another for 63 mammal species – which means comparing 9,000 evolutionary trees across 63 species per tree. RER Converge starts on that basis but also creates what we call a vector trait phenotype representation, which means a representation including information about the ancestral species. It’s a tool we can make usable for biologists who are not computational.”
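The covariation idea can be sketched in a few lines of Python. The sketch assumes each gene is summarized by one rate per branch, measured on the same branches, and it scores a gene pair with a Pearson correlation, which is an assumed choice here. The function name `rate_covariation` and the rate values are hypothetical, and the code is not taken from RER Converge.

```python
import math


def rate_covariation(rates_a, rates_b):
    """Pearson correlation between two genes' per-branch rates."""
    if len(rates_a) != len(rates_b) or len(rates_a) < 2:
        raise ValueError("both genes need rates for the same two or more branches")
    mean_a = sum(rates_a) / len(rates_a)
    mean_b = sum(rates_b) / len(rates_b)
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(rates_a, rates_b))
    var_a = sum((a - mean_a) ** 2 for a in rates_a)
    var_b = sum((b - mean_b) ** 2 for b in rates_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("rates must vary across branches")
    return covariance / math.sqrt(var_a * var_b)


# Hypothetical per-branch rates for three genes on the same five branches.
gene_x = [0.8, -0.3, 0.5, -1.1, 0.2]
gene_y = [0.7, -0.2, 0.4, -0.9, 0.1]   # speeds up and slows down with gene_x
gene_z = [-0.6, 0.4, -0.5, 1.0, -0.1]  # moves opposite to gene_x

score_xy = rate_covariation(gene_x, gene_y)
score_xz = rate_covariation(gene_x, gene_z)
```

A score near 1 marks two genes whose rates rise and fall together across branches. Repeating this comparison across thousands of gene trees is what makes the workload heavy.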

Raghavendran Partha carried out much of the lab’s computation as a graduate student. Partha was awarded a PhD at Pitt in the spring and now works at Ancestry.com. He recalls:

“When CRC was still called Center for Simulation and Modeling, I used the old cluster known as Frank, where I could do a lot of jobs, but it was slow. In 2018, CRC upgraded all the computers and suddenly I could run 500 jobs simultaneously. The CRC team – Kim Wong, Fangping Mu, and Barry Moore – helped install and debug the R packages we used to build RER Converge.”

“Working with CRC has been really great,” Clark adds. “The center adds an enormous amount of muscle. Once Raghavendran submitted jobs on Friday that he expected to be finished in a week. He came in Monday and the computation had been done over the weekend.”

Developing tools to study convergent evolution is a statistical challenge.

“The statistics are complicated depending on the data,” Chikina explains. “It’s hard to get a p-value and the data can show systematic biases. We computed the statistics by permutation – repeated runs. Intellectually simple, but computationally very expensive.”
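A permutation test of the kind Chikina describes can be sketched in Python. The sketch assumes one rate per branch, a yes/no trait label per branch, and a difference in mean rates as the test statistic. It shuffles the labels freely, which ignores how species are related on the tree, so a real analysis would need more care. The function names, the data, and the choice of statistic are all hypothetical.

```python
import random


def mean_difference(rates, has_trait):
    """Mean rate on trait branches minus mean rate on all other branches."""
    with_trait = [r for r, t in zip(rates, has_trait) if t]
    without_trait = [r for r, t in zip(rates, has_trait) if not t]
    return sum(with_trait) / len(with_trait) - sum(without_trait) / len(without_trait)


def permutation_p_value(rates, has_trait, n_permutations=10000, seed=0):
    """Estimate a two-sided p-value by shuffling trait labels across branches."""
    if not any(has_trait) or all(has_trait):
        raise ValueError("need at least one branch with and one without the trait")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    observed = abs(mean_difference(rates, has_trait))
    labels = list(has_trait)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between trait and rate
        if abs(mean_difference(rates, labels)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate from ever being exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical relative rates for ten branches; the first three carry the trait.
rates = [1.2, 0.9, 1.1, -0.2, 0.1, -0.3, 0.0, -0.1, 0.2, -0.4]
has_trait = [True, True, True, False, False, False, False, False, False, False]
p_value = permutation_p_value(rates, has_trait)
```

Each gene needs its own set of shuffles, so the cost grows with the number of genes times the number of permutations. That is why a simple idea becomes an expensive computation.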

Could convergent evolution shed light on the evolution of human genes?

Clark says yes. In a 2015 study of genes contributing to cancers and developmental disorders, co-evolution markers linked known genes that contribute to specific diseases. By extension, those markers could link previously unknown genes to those same diseases.

“We are already using convergent evolution to guide the creation of a new genetic screening panel for congenital eye diseases,” Clark explains. “By using noncoding regions of the genome with accelerated mutation in blind mammals, we created a list of possible regions controlling eye development. We are recruiting patients with unexplained congenital eye disease to determine if their chromosomes have mutations specifically in those ‘blind-species-accelerated’ regions. This could mean real insight into the evolution of human disease genes.”

Contact:
Brian Connelly
Pitt Center for Research Computing
bgc14@pitt.edu
412-383-0459