ARC23 Poster Competition
At the conclusion of the ARC23 symposium, students, speakers and contest judges gathered for a poster contest at the University Club. We heard from 18 Pitt graduate and undergraduate students competing for one of three $750 travel scholarships.
The students presented an impressive array of research from across schools and majors. Judges had a difficult time choosing among the range of impactful work and polished presentations.
We are proud to have showcased outstanding work by all the researchers. Congratulations and thanks to all the participants and winners.
Siddarth Achar, Chemical Engineering
"Using Deep Learning Potentials and Graph Lattice Models to Engineer Optimal Proton Conducting Membranes for Fuel Cells"
Development of new materials capable of conducting protons in the absence of water is crucial for improving the performance, reducing the cost, and extending the operating conditions of proton exchange membrane fuel cells. We present an atomistic simulation-based workflow to computationally design fuel cell membranes using deep learning potentials (DPs) and graph lattice models (GLMs). Our workflow was used to demonstrate that graphanol (hydroxylated graphane) will conduct protons anhydrously with very low diffusion barriers. First, DPs were trained for graphanol that have near-density functional theory accuracy but require a very small fraction of the computational cost. These DPs were used to calculate the overall barrier to proton diffusion by performing thousands of molecular dynamics simulations. We used GLMs to show that the overall barrier for proton diffusion is dictated by the occurrence of an adversarial phenomenon called “Grotthuss chains”. Results from our workflow have enabled us to set specific design rules for developing next-generation proton conducting membranes with even lower diffusion barriers.
Meiirbek Islamov, Chemical and Petroleum Engineering
"A Data-Driven Exploration of Structure-Property Relationships of Thermal Transport in Metal-Organic Frameworks"
Metal-Organic Frameworks (MOFs) are a promising class of highly porous and crystalline materials with vast potential for use in various gas storage, separation, catalysis, and thermoelectric applications due to their large surface area. However, to be practical, their ability to rapidly disperse exothermic heat generated during the gas adsorption process or suppress directed heat flow for thermoelectric applications must be addressed. Despite their importance, MOFs' thermal transport properties have received little attention, resulting in limited understanding of structure-thermal conductivity relationships for designing MOFs with tailored thermal conductivity. To establish these relationships, we conducted a high-throughput screening of hypothetical MOFs for thermal conductivity using classical molecular dynamics simulation and the Green-Kubo method. We carefully selected 10,194 hypothetical MOFs from the ToBaCCo database and analyzed important structural and compositional characteristics: density, pore size, surface area, void fraction, node-linker mass mismatch, and metal node connectivity. Our analysis showed that high thermal conductivity favors small pores, high density, small node-linker mass mismatch, and four-connected metal clusters, while low thermal conductivity favors large pores and high porosity. We identified six hypothetical MOF structures with high average thermal conductivity and 36 with ultralow thermal conductivity. Interestingly, the six ultrahigh thermal conductivity MOFs share four-fold coordinated metal nodes, through which the organic linkers connect in a perpendicular orientation, suggesting that topology primarily determines MOFs' thermal conductivity limits.
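For readers unfamiliar with the Green-Kubo method mentioned above, its standard textbook form for an isotropic thermal conductivity is given here. The abstract does not state the exact expression or conventions the authors used, so this should be read as the general relation rather than their specific implementation.

```latex
\kappa = \frac{V}{3 k_B T^2} \int_0^{\infty} \left\langle \mathbf{J}(0) \cdot \mathbf{J}(t) \right\rangle \, dt
```

Here V is the simulation cell volume, k_B is the Boltzmann constant, T is the temperature, J(t) is the heat flux vector (heat current per unit volume) sampled from an equilibrium molecular dynamics trajectory, and the angle brackets denote an ensemble average. The factor of 3 averages over the three Cartesian directions.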
Maya Salem, Chemical Engineering
"Understanding the Segregation Energy Behavior of Single Atom Alloys in the Presence of Ligands"
Surface segregation is one of the two key parameters that determine the stability of single atom alloys (SAAs). The formation of the isolated dopant on the surface of the alloy is governed by (but not limited to) the presence of adsorbates. Our work fills the knowledge gap by addressing the effect of ligands commonly used in colloidal nanoparticle synthesis, such as methylthiolate (R-S) and methylamine (R-NH2). We then extended our study to incorporate a reaction intermediate (R-NH) on d8 (Ni, Pd, Pt) and d9 (Ag, Au, Cu) metals on (111) and (100) surfaces. Through this study, we gained an understanding of how the binding strength and adsorption site of the ligand affect the overall segregation energy trends. Finally, we collected the DFT data and built a four-feature neural network (a multilayer perceptron regression model) that accurately captures the segregation energy (Eseg) trends, accelerating predictions across a vast material space.
Daniel Banko-Ferran, Economics
"Peer Comparison Effect on Effort in Choice Environments"
Although information about an individual’s performance relative to one’s peers can motivate behavior in a cost-effective way, it can also cause an oppositional reaction or “boomerang effect” which decreases socially desired behavior. This paper uses an experimental design in a controlled laboratory setting to measure if selection into receiving peer information (as opposed to forced delivery) attenuates this demotivational effect. This is the first causal study to explore the selection channel to improve the efficacy of peer information interventions (PIIs). I find that being forced to receive peer comparison information leads to worse performance on average on a real-effort task, and selection ability eliminates this adverse response. Furthermore, most participants still opted into receiving peer information even among below average performers. I discuss possible reasons for this surprising result and future research directions.
Lianjin Cai, School of Pharmacy
"Screening of Flavonoid Nutrients inhibiting SARS-CoV-2 3CL-Pro using Machine Learning trained Classifiers and Regression Models"
This work developed a novel multi-step virtual screening strategy to reliably identify naturally available and abundant flavonoid compounds against 3CLpro, a well-studied antiviral target. We utilized a novel class of molecular descriptors, the ligand-residue interaction profile, which accounts for the heterogeneity of protein-ligand binding, to construct machine learning (ML) models that significantly outperformed those built on routinely used descriptors. The consensus of multiple models trained by various ML algorithms achieved strong screening power (accuracy of 93.9% and false-positive rate of 6.4%) for the classification modeling and a low root-mean-square error of 1.18 kcal/mol for the binding affinity prediction. A flavonoid is recognized as an inhibitor only when it passes both tests, which consist of 10 ML-trained regression models and 25 classifiers. Of 6,000 flavonoids, 140 survived both tests. Moreover, the top flavonoid candidates remained stable during 100-ns molecular dynamics simulations. Among the 140 promising flavonoids, we found 8 predominant dietary supplements according to the USDA database.
Beihong Ji, School of Pharmacy
"Development of Machine Learning Models by Exploring Varieties of Molecular Representations to Predict the Anti-SARS-CoV-2 Activities of a Chemical"
To accelerate the discovery of novel drug candidates for Coronavirus Disease 2019 (COVID-19) therapeutics, we report a series of machine learning (ML)-based models to accurately predict the anti-SARS-CoV-2 activities of screening compounds. These models were trained and evaluated using the experimental data deposited in the COVID-19 OpenData Portal hosted by NCATS. We explored 6 popular ML algorithms in combination with 15 molecular descriptors for molecular structures from 9 screening assays. The models constructed using the k-nearest neighbors method and the hybrid molecular descriptor, GAFF+RDKit, achieved the best performance. The best models for all 9 assays were collected and implemented in the COVID-19-CP webtool. We evaluated the performance of the COVID-19-CP models using four external datasets, including a dataset of 28 drugs that have been used in clinical trials of COVID-19 treatments. The overall performance of our models was significantly better than that of REDIAL-2020. We applied the Shapley additive explanations algorithm to analyze the contribution of individual descriptors to our best models. A web server (https://clickff.org/amberweb/covid-19-cp) was developed to enable users to forecast anti-SARS-CoV-2 activities of arbitrary compounds using the COVID-19-CP models. Besides the descriptor-based machine learning models, we also developed graph-based Attentive FP models for the 9 assays. We found that the Attentive FP models achieved performance comparable to that of COVID-19-CP and outperformed the REDIAL-2020 models. Very encouragingly, consensus prediction utilizing both COVID-19-CP and Attentive FP can significantly boost the prediction accuracy and thus can ultimately improve the success rate of COVID-19 drug discovery.
Jinghang Li, Bioengineering
"Robust and Automatic White Matter Hyperintensity Segmentation on FLAIR Images at 1.5 Tesla, 3 Tesla and 7 Tesla Magnetic Field Strengths"
White matter hyperintensities (WMHs) are hyperintense lesion clusters on Fluid-Attenuated Inversion Recovery (FLAIR) magnetic resonance (MR) images that are commonly associated with various neurological disorders. Accurate segmentation of WMHs is critical for diagnosis, treatment planning, and monitoring disease progression. In recent years, various Unet models have shown great promise for segmenting WMHs [1, 2]. In this study, we implemented various data augmentations to improve the robustness of the Unet model for WMH segmentation. Additionally, our work is one of the very few in the field that incorporates training images from diverse magnetic field strengths.
We used a diverse dataset of more than 300 FLAIR scans from many different universities and institutes, including University of Pittsburgh, University of Nottingham, UT Health San Antonio, UMC Utrecht, NUHS Singapore, and VU Amsterdam, to train and evaluate our transformer-based Unet model for WMH segmentation. We augmented the training data with artificial MRI artifacts on FLAIR images acquired at 1.5 Tesla, 3 Tesla, and 7 Tesla. The artifacts include motion, noise, inhomogeneity, and ghosting (Figure 3). We evaluated the performance of the model using the Jaccard index, which measures the overlap between the predicted and ground truth WMH segmentations. Furthermore, we tested the model performance on previously collected 7 Tesla and 3 Tesla FLAIR images and compared the segmentation results with those of the widely acknowledged neuroimage processing tool FreeSurfer. Our results demonstrate that incorporating artificial MRI artifacts from a diverse range of MRI images and field strengths improved the robustness of the Unet model for WMH segmentation. The model achieved a state-of-the-art Jaccard index of 0.84, outperforming the model that did not use data augmentation.
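As an illustration of the Jaccard index used as the evaluation metric above, here is a minimal sketch that computes it for two binary masks. The function name `jaccard_index`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for this example; they are not taken from the authors' pipeline.

```python
def jaccard_index(pred_mask, true_mask):
    """Jaccard index |A ∩ B| / |A ∪ B| for two equal-shaped 2D binary masks.

    Masks are nested lists of 0/1 values. Returning 1.0 when both masks
    are empty is a convention chosen for this sketch.
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    return intersection / union if union else 1.0


# Toy masks (hypothetical): 2 overlapping lesion pixels, 4 pixels in the union.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = jaccard_index(predicted, ground_truth)
```

A score of 1.0 means the predicted and ground truth segmentations coincide exactly, and 0.0 means they share no pixels.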
Thiago Brito Matos, Short term scholar visitor at the Durrant Lab, University of São Paulo
"Computer-Aided Drug Design for Cysteine Protease Inhibitors"
Cysteine proteases from the papain family are involved in many pathophysiological processes, acting as catalysts for peptide-bond cleavage. This work applies computer-aided drug design (CADD) to cysteine protease targets using molecular docking and DeepFrag. DeepFrag is a machine-learning model that suggests lead optimization strategies to improve putative enzyme inhibitors. I will specifically focus on the targets studied by the NEQUIMED group at the University of São Paulo under the supervision of Professor Andrei Leitão, starting with cathepsin L (prostate cancer). The main objective of this work is to identify novel drug candidates to prioritize subsequent synthesis and study.
Ektha Parchuri and Sohan Rao, School of Public Health, Department of Epidemiology
"Covid-19 Associated Environmental and Socioeconomic Risks in Pennsylvania State Prisons"
The breadth and severity of the COVID-19 pandemic have shed light on environmental and socioeconomic risks to health and mortality. Exposures to particulate matter and gaseous pollutants, characterized by elevations in respiratory hazard indices, raise COVID-19 mortality by 9%. These interfaces specifically threaten communities of color. The United States holds one of the largest incarcerated populations, who face higher burdens of disease than their non-incarcerated counterparts. Correctional facilities often struggle with staffing shortages and limited resources, fostering a culture of discrimination based on race, gender, and stigma of disease. These rifts have expanded during the COVID-19 pandemic, with inmates having 3.3 times the incidence rate and 2.5 times the mortality rate of the non-incarcerated U.S. population. Interlinked with these rates are issues of overcrowding, poor sanitation and ventilation, environmental hazards, inadequate healthcare, and lack of prison-specific CDC isolation and quarantine guidelines. While recent studies examine the factors that exacerbate COVID-19 outbreaks in prisons, there remains a lack of work exploring the intersections of COVID-19, incarcerated populations, and poor environmental conditions of correctional facilities. By piloting an analysis using the state correctional institutes (SCI) of Pennsylvania, we hypothesize that poor environmental indices, as defined by the Environmental Protection Agency (EPA), directly heighten COVID-19 associated infection and mortality risk amongst inmates in the state of Pennsylvania.
Twenty-four SCI facilities in Pennsylvania were identified. For each facility, the Department of Corrections COVID-19 Dashboard reported inmate and staff demographic and COVID-19 related metrics. The Pennsylvania COVID-19 Dashboard published by the Department of Health reported cumulative case/death counts in the counties surrounding each SCI facility. The EPA Environmental Justice and Mapping Tool was used to determine national metrics on various EJ and socioeconomic variables, grouping data by subsetted geographical blocks. The latitude and longitude coordinates of each SCI facility, with points storing the respective COVID-19 data, were plotted on a map in ArcGIS Pro. EJ Screening data was uploaded and overlaid with facility coordinates. For each COVID-19 metric, exploratory regression analysis was performed with EJ metrics selected as explanatory variables. Predictors from this model were used in an ordinary least squares (OLS) regression framework. Default parameters were used. For significant variables, empirical Bayesian kriging (EBK) was used to visualize spatial gradients of COVID-19 metrics. A subset size of 100, overlap factor of 5.0, 10,000 simulations, empirical transformation and K-Bessel semivariogram type were used. The remaining parameters held their default values.
Twelve OLS models were used in total. OLS regression identified % under age 5 (UNDER5PCT, Coefficient=-70.804, p=0.0118) as the only statistically significant variable reducing mortality rate ratios (inmate/surrounding township) in PA SCIs. In contrast, % over age 64 (OVER64PCT, Coefficient=18.672, p=0.0004), 2017 air toxics respiratory hazard index (RESP, Coefficient=10.801, p=0.011), and Superfund proximity (PNPL, Coefficient=11.946, p=0.0001) promote greater mortality in these prisons compared to surrounding townships.
Twelve EBK models were constructed, and their cross-validation statistics were reported. Of the models, those for the prevalence of partially vaccinated inmates and the Hispanic inmate mortality rate yielded no detectable spatial gradients. The EBK models indicate elevated mortality rate ratios in major PA cities, such as the Pittsburgh, Philadelphia and Scranton areas. The difference in mortality rates between Hispanic and white prisoners rises in facilities located close to urban centers. The mortality rate difference between black and white prisoners, black mortality rate, ages 55-64 mortality rate, and ages 45-54 mortality rate show intense elevations in the Philadelphia area. OLS regression shows that OVER64PCT and 2017 diesel particulate matter (DSLPM) elevate differences in Hispanic (OVER64PCT Coefficient=6.404 and p=0.001, DSLPM Coefficient=7.760 and p=0.0003) and black (OVER64PCT Coefficient=6.462 and p=0.001, DSLPM Coefficient=7.821 and p=0.0003) mortality rates from those of white prisoners.
Age and exposure to respiratory hazards dominate COVID-19 mortality risks in Pennsylvania SCIs. These factors impact inmates more than the surrounding populations, with the discrepancies widening in urban centers and among inmates of color. These predictors are so strong that they may outweigh the protection of vaccines, as is the case in the Philadelphia area. We thus highlight the importance of adequate healthcare services to manage age-related chronic conditions and proper ventilation to ensure good air quality in correctional facilities.
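To make the OLS coefficients reported in this abstract easier to interpret, the sketch below fits an ordinary least squares line with a single explanatory variable. It is only an illustration: the study ran its regressions in ArcGIS Pro with several EJ variables, and the function `ols_fit`, the variable names, and the numbers here are hypothetical and are not data from the study.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values for illustration only (not taken from the study).
hazard_index = [1.0, 2.0, 3.0, 4.0]
mortality_ratio = [3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_fit(hazard_index, mortality_ratio)
```

A positive slope, like the positive coefficients reported for OVER64PCT, RESP, and PNPL, means the outcome rises as the explanatory variable rises. A negative one, like the UNDER5PCT coefficient, means it falls.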
Jatin Singh, UPMC Radiology
"CT-Derived Features are Predictors of Systemic Sclerosis-Related Lung Transplant Survival"
Survival following systemic sclerosis-related lung transplants (Ssc-LTx) is unsatisfactory. This study aims to identify radiographic factors from chest computed tomography (CT) scans that are associated with survival after Ssc-LTx.
We conducted a retrospective study based on a cohort of 102 patients with preoperative chest CT scans who underwent Ssc-LTx. Two categories of CT image features were automatically segmented and computed from the CT images. Univariate and multivariate Cox proportional hazards regression analyses were used to identify the variables associated with post-transplant survival and integrate them into a computer model to predict post-transplant survival.
The identified CT features include Muscle Ratio, Bone Density, Artery Vein Ratio, and Heart Ratio (p<0.01). The composite model without CT-derived features achieved an AUC of 0.69 (95% CI: 0.59-0.79), while the model with CT-derived features achieved an AUC of 0.89 (95% CI: 0.83-0.95).
CT-derived features may be useful predictors of survival after Ssc-LTx.
Jason Dou, Electrical and Computer Engineering
"Learning More Effective Representations Efficiently"
Sampling is ubiquitous in machine learning methodologies. Due to the growth of large datasets and model complexity, we want to learn and adapt the sampling strategies to best fit the representation learning process. Towards achieving this grand goal, a variety of sampling techniques have been proposed. However, most of them either use a fixed sampling scheme or adjust the sampling scheme based on simple heuristics, which makes it difficult to find the best sampling methods for the different stages of model training. Inspired by "Thinking, Fast and Slow" in cognitive science, we propose a reward-guided sampling strategy called Adaptive Sample with Reward (ASR) to tackle this challenge. Our approach adjusts the sampling process to achieve optimal performance by introducing a reward function based on downstream task performance benchmarks and a policy network that controls sampling distributions. We explore geographical relationships among samples by distance-based sampling to maximize the overall cumulative reward. We apply ASR to the sampling problems in ranking-based loss functions, which require triplet sampling. Empirical results in information retrieval and clustering demonstrate ASR’s superb performance across different datasets and its ability to help models learn more effective representations efficiently. We also discuss an interesting phenomenon observed in experiments, which we name the "ASR gravity well", and provide an escape time lower bound for the observations.
Degan Hao, School of Computing and Information, Intelligent Systems Program
"Survival Prediction of Long COVID Patients with Spatiotemporal Attention"
Long COVID is a general term for the post-acute sequelae of COVID-19. Patients with long COVID can endure long-lasting symptoms including fatigue, headache, dyspnea, and anosmia. Identifying the cohorts with severe long-term complications of COVID-19 could benefit treatment planning and resource allocation for future patient care. However, due to the heterogeneous phenotype presented in long COVID patients, it is difficult to predict their survival from their longitudinal data. In this study, we proposed a spatiotemporal attention mechanism to weigh feature importance jointly from the temporal dimension and feature space. Considering that medical examinations can have interchangeable orders in adjacent time points, we restricted the learning of short-term dependency with a Local-LSTM and the learning of long-term dependency with the joint spatiotemporal attention. We also compared the proposed method with several state-of-the-art methods and a method used in clinical practice. The methods were evaluated on a hard-to-acquire clinical dataset of patients with long COVID. Experimental results show the Local-LSTM with joint spatiotemporal attention outperformed related methods in outcome prediction. The proposed method provides a clinical tool for the severity assessment of long COVID.
Shu Liu, Physics and Astronomy
"Testing and Improving the Difference Image Analysis Pipeline"
Difference Image Analysis (DIA) is a technique for detecting transient sources. By subtracting a template image from a science image, we can obtain a difference image for source detection. In this work, we evaluate the performance of the Dark Energy Science Collaboration (DESC) DIA pipeline by adding synthetic sources to images. Tracing the detection status of synthetic sources shows that the DESC DIA pipeline performs well in terms of detection efficiency and flux measurement. We explore the morphology and origins of subtraction artifacts in difference images. We suggest that applying a set of flexible bases with spatial variation could help yield cleaner difference images.
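The idea of injecting synthetic sources and tracing whether they are detected can be sketched in a few lines. This toy example is not the DESC DIA pipeline: it omits PSF matching, noise, and real source detection, and all function names, pixel values, and the threshold are hypothetical choices made for illustration.

```python
def inject_sources(image, sources):
    """Return a copy of image with each (row, col, flux) source added to one pixel."""
    out = [row[:] for row in image]
    for r, c, flux in sources:
        out[r][c] += flux
    return out


def difference_image(science, template):
    """Pixel-wise science minus template (assumes the images are already aligned)."""
    return [[s - t for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(science, template)]


def detect(diff, threshold):
    """Return the set of (row, col) pixels whose difference flux exceeds threshold."""
    return {(r, c)
            for r, row in enumerate(diff)
            for c, value in enumerate(row)
            if value > threshold}


def detection_efficiency(sources, detections):
    """Fraction of injected sources recovered at their injected position."""
    found = sum((r, c) in detections for r, c, _ in sources)
    return found / len(sources)


# Flat 4x5 template with one bright and one faint synthetic source injected.
template = [[10.0] * 5 for _ in range(4)]
sources = [(1, 2, 50.0), (3, 4, 2.0)]
science = inject_sources(template, sources)
diff = difference_image(science, template)
detections = detect(diff, threshold=5.0)
efficiency = detection_efficiency(sources, detections)
```

In this toy setup only the bright source exceeds the threshold, so half of the injected sources are recovered. A real efficiency study would repeat this over many injected fluxes and positions.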
Lauren Luciani, Shoemaker Immunosystems Lab, Department of Chemical and Petroleum Engineering
"Mathematical Model of Influenza Infection in Juvenile Mice Suggests Increased Production of Type 1 Interferon by Infected Cells is Associated with Severe Infection"
Each year, influenza infection can cause 290,000-650,000 deaths worldwide. Children are uniquely susceptible to severe influenza infection which can result in pneumonia, a leading cause of death in children worldwide. Murine studies have revealed that while viral burden is similar between juveniles and adults, juveniles experience higher levels of inflammation, lung injury, and mortality. To understand the underlying mechanisms driving severe pediatric infection, we developed a mathematical model consisting of six ordinary differential equations (ODE). Our ODE model focuses on the innate immune response to influenza infection and incorporates neutrophils, monocytes, and type 1 interferons, of which the latter two are observed to be significantly elevated in pediatric influenza infection. Using our model, we capture the observed juvenile immune response to influenza infection and can test whether the recruitment of inflammatory monocytes, production of interferon, or viral replication are drivers of differences in pathologies between the adults and children.
Chinmay Mhatre, Chemical and Petroleum Engineering
"Molecular Simulations for Designing Adsorbents and Adsorption Processes"
Fast capture and destruction of chemical warfare agents is an active research area with tremendous practical significance. Experiments have shown that the UiO-66 metal-organic framework is a very promising candidate for both the adsorption and destruction of chemical warfare agents. It is known from experiments that missing linker defects in UiO-66 play an important role in adsorption, diffusion, and chemical reactivity. However, how these defects work at the atomic level is not known. A detailed understanding of how defects impact the adsorption and diffusion of chemical warfare agents is crucial to being able to optimize UiO-66. We are addressing this need by using computational methods to characterize the adsorption and diffusion of chemical warfare agents and common simulants in pristine and defective UiO-66. We have developed models of defective UiO-66 by removing linkers in systematic and random ways. We examine the impact of different ways of accounting for framework charges. We have calculated the amount adsorbed at very low pressure and at saturation pressure for two chemical warfare agents, sarin and soman, and for two chemical warfare agent simulants. We observe that the amount adsorbed is higher in pristine than in defective UiO-66 at very low pressures, whereas the opposite is true for saturation pressures. To study transport, we calculate the self-diffusivities of chemical warfare agents and their simulants at infinite dilution for hydroxylated pristine UiO-66 using the dynamically corrected Transition State Theory approach. The parameterization of the force field to include the hydrogen bonding between the chemical warfare agents/simulants and framework atoms impacts the zero-loading self-diffusivities of both chemical warfare agents and their simulants. Our results provide the first atomic-level insights into the role of defects in adsorption and diffusion in UiO-66.
Anjnesh Sharma, Informatics and Networked Systems (MSIS)
"Predicting Patients with Hearing Difficulties from Comprehensive Audiological Data"
Increasing numbers of Veterans are presenting with blast-related mild traumatic brain injury (mTBI), posttraumatic stress disorder (PTSD), or both. These challenges are associated with deficits that contribute to self-reported hearing difficulties that fail to manifest on standard hearing tests. This study aimed to identify a combination of measures that best explain the self-perceived auditory and auditory-related cognitive deficits in this population, using a machine learning approach.
A large battery of tests was administered to 217 Veterans, including measures of central auditory processing, speech and language processing, auditory pathway health, and quality of life. Supervised machine learning was used to perform feature selection, including univariate, model-based, and recursive feature selection. The selected features were then used in multiple machine learning algorithms, such as decision tree, random forest, gradient boosting, and XGBoost.
The results of the machine learning analysis were used to categorize patients in a three-stage process. First, binary classification was used to identify whether patients were either healthy or presented with one of three pathologies: mTBI-only, PTSD-only, or mTBI+PTSD. Then, multiclass classification was used to identify the type of pathology. Finally, binary classification was used for a specific class to provide a conclusive prediction.
Quality of life, psychological health, and quality of hearing were identified as significant features explaining mTBI and/or PTSD in Veterans with concomitant self-perceived hearing difficulties. The subset of questionnaires that captured these factors can be easily administered in audiology clinics, making them candidate screeners for the auditory-related challenges in these Veterans. When the machine learning analysis was re-run after excluding these questionnaire-derived features, several indices of auditory pathway health, speech processing in noise, and binaural interaction of auditory processing were identified as significant features explaining Veterans’ auditory-related pathologies. The machine learning approach used in this study provides a valuable tool for identifying the underlying factors contributing to these difficulties, which can inform targeted interventions and improve outcomes for Veterans. Ongoing efforts to compile a larger dataset using similar data from other study sites will improve model accuracy in these preliminary analyses.
Tatum McGeary, Chemical and Petroleum Engineering
"Identifying the mechanisms linking sex and influenza infection using computational modeling"
In humans, differences in the immune response between males and females greatly influence influenza virus infection outcomes. During the 2009 H1N1 pandemic, females made up 53.2% of the total hospitalizations, and adult females specifically were at greater risk than their male, age-matched counterparts for hospitalization and death. The innate immune response is a key factor in these sex differences; sex hormones have multiple, specific effects on innate immunity, and estradiol treatment can reduce excess inflammation and mortality in mice through stimulation of interferon production pathways, resulting in promotion of anti-inflammatory activity. We modified an existing mathematical model published by our group and fit the model to sex-specific murine data. Several parameterization exercises were performed to determine which immune processes are differently regulated in males and females, indicating that the immune system and viral replication both contribute to sex-specific immune responses and disease severity. These specific mechanisms could be targeted by novel or existing drugs to treat and prevent severe influenza infection. Currently, our collaborators are generating high-quality, comprehensive, sex-specific murine data that we plan on using to validate immune models incorporating additional immune cells identified as significant for infection resolution. We will use this data to validate our conclusions from the current model. We aim to include hormone concentrations in the future to analyze hormone-specific impacts on immune response in males and females.
Rishal Aggarwal, Computational and Systems Biology
"Pharmacophore generation using Deep Learning"