Faculty Research Collaborations with Rising Researchers - List of AI Proposals - PSU Institute for Computational and Data Sciences

Click any of the AI proposal summaries below for more information and to apply as a Rising Researcher.

Your deadline to apply is June 16.

Enhancing Rural Supply Chains with AlphaFold-Inspired Deep Learning (Soundar Kumara)

This project promises to substantially enhance rural supply chain resilience and efficiency. (Learn more and apply – requires Penn State login)

Accelerating AlphaFold3 for High-Throughput Protein Design (Soundar Kumara)

The proposed project targets major inference speed-ups for diffusion-based protein prediction models like AlphaFold 3, Boltz, and Chai-1 without significant loss of accuracy, enabling the evaluation of thousands of protein designs in silico. (Learn more and apply – requires Penn State login)

Enhancing Satellite Deformation Measurements using Deep Learning (Christelle Wauthier)

The use of deep learning approaches to maximize the output of satellite deformation measurements using realistic atmospheric models is still in its infancy. (Learn more and apply)

Dynamically Adjustable Queue to Optimize the Roar GPU Cluster (Guido Cervone)

The goal of this research is to optimize the queue for the Roar GPU cluster. (Learn more and apply)

Large Language Models as Noisy Oracles for Constructing Causal Models (Vasant Honavar)

This project aims to explore the use of LLMs as noisy oracles that can answer conditional independence queries or interventional queries to construct causal models. (Learn more and apply)

Resource Request for RAG-Based HPC User Support Chatbot (Lindsay Wells)

This project aims to improve the accessibility of ICDS documentation and reduce the support load on client-facing teams by deploying an intelligent chatbot that provides accurate, context-aware answers to user questions using a consolidated knowledge base (KB) and RAG-based LLM architecture. (Learn more and apply)

Development of a Digital Twin Model for Stirred Milling Process by Integrating Machine Learning Models and Discrete Element Method Simulations (Olumide Ogunmodimu)

This project aims to contribute to the evolution of digital twin applications by integrating a combined approach of machine learning models, including Support Vector Machines (SVM), Convolutional and Graph Neural Networks (CNN, GNN), Physics-Informed Neural Networks (PINN), and Discrete Element Method (DEM) simulations. (Learn more and apply)

Learning on the Edge With Hyperdimensional Computing (Vasant Honavar)

This project aims to develop and evaluate a lightweight, hyperdimensional (HD) computing-based machine learning framework for learning on the edge, that is, learning predictive models from data being acquired by edge devices. The resulting methods will also help significantly reduce the carbon footprint of machine learning. (Learn more and apply)

Federated estimation of causal effects from observational data (Vasant Honavar)

A long-term goal of this project is to develop robust federated algorithms for causal effect estimation for a broad range of applications in healthcare, education, public policy, etc., where it is generally neither feasible nor desirable to aggregate data collected by independent entities into a centralized repository. (Learn more and apply)

Efficient Adaptation of Trained Models When Utilities of Model Predictions Change (Vasant Honavar)

The long-term goal of this project is to develop methods for efficient adaptation of predictive models trained using machine learning when the utilities of model predictions change, with practical applications across a broad range of real-world applications. (Learn more and apply)

Machine Learning for Health Risk Prediction from Longitudinal Health Data (Vasant Honavar)

The long-term goal of this project is to establish a unified, modular framework for temporal clinical modeling that is generalizable across datasets, interpretable for clinicians, and adaptable to other domains of risk prediction. (Learn more and apply)

Data-Driven Discovery of Regulatory Mechanisms and Cellular Resource Allocation via Multi-Modal Data Integration (Vasant Honavar)

This project supports a working group within the U.S. National Science Foundation (NSF) National Synthesis Center for Emergence in the Molecular and Cellular Sciences (NCEMS) at Penn State. NCEMS aims to drive multidisciplinary collaboration by synthesizing publicly available research data to address fundamental scientific questions at the intersection of data science and molecular and cellular biology. (Learn more and apply)

Estimating the causal effects of clinical interventions from observational electronic health records (Vasant Honavar)

This project aims to develop effective methods for causal effect estimation from longitudinal data under different scenarios: point interventions and point effects, longitudinal interventions and point effects, point interventions and longitudinal effects, and longitudinal interventions and longitudinal effects. (Learn more and apply)

Explorations in Quantum Machine Learning (Vasant Honavar)

The long-term goal of this research is to develop innovative quantum machine learning (QML) algorithms that offer substantial advantages over their classical machine learning (CML) counterparts for a broad range of real-world applications. This project lays the groundwork by allowing the PI and his team to develop the necessary experience with QML and gather preliminary data to support competitive collaborative QML research proposals. (Learn more and apply)

Geodetic inversion and optimization using physics-based FEM models and AI (Christelle Wauthier)

We will develop and apply AI and computational modeling methods to volcanic processes that will have broader impacts on forecasting. (Learn more and apply)

Forecasting volcanic eruptions using data fusion (Christelle Wauthier)

We will develop and apply data sciences and AI methods to volcanic hazards processes and hope to improve eruption forecasting globally. (Learn more and apply)

Using AI to learn and generate physically consistent and realistic landscape topography and fluvial river bathymetry (Xiaofeng Liu)

The objectives of the project are (1) to investigate the inherent structural relationships between topography, river bathymetry, physiography, climate, precipitation, and river discharge, and (2) to develop AI and ML models capable of generating synthetic, physically realistic landscape topography and river bathymetry. (Learn more and apply)

Transfer Learning for Predicting Local Atomic Order in Multi-Principal Element Alloys (Mia Jin)

This project aims to develop a machine learning framework that leverages transfer learning from binary alloy datasets to predict chemical short-range order (CSRO) in multi-principal element alloys (MPEAs), such as high-entropy alloys (HEAs), where data are scarce. (Learn more and apply)

Normalizing flows for Bayesian Model Comparison: Detecting Extrasolar Planets (Eric Ford)

This project compares the robustness and efficiency of different computational methods for performing Bayesian uncertainty quantification and model comparison to improve the sensitivity and robustness of surveys to discover and characterize low-mass planets. (Learn more and apply)

Using Artificial Intelligence (AI) to Understand Neural and Behavioral Variability (Xiao Liu)

We will develop and apply state-of-the-art AI models to understand brain functions. The project also seeks to understand artificial neural networks (ANNs) from the perspective of brain science. (Learn more and apply)

Improving economic outcomes via AI-powered bank monitoring and risk management (Nonna Sorokina)

By integrating expertise in finance, economics, regulatory policy, and artificial intelligence, the initiative aims to build an AI-powered monitoring framework for banking risk management—particularly vital in today’s volatile interest rate environment. (Learn more and apply)

De-risking the commercialization of advanced nuclear reactors through innovative financing vehicles (Nonna Sorokina)

By developing innovative financial mechanisms—including pooled investment models, securitization strategies, and CDS-like instruments—this research synthesizes technical reactor design considerations with sophisticated computational modeling of risk and return. (Learn more and apply)

Your next-door neighbor, nuclear reactor: real estate and societal readiness (Nonna Sorokina)

By examining real estate dynamics around nuclear power plants and incorporating novel measures of public sentiment and societal readiness, the research brings together expertise from economics, urban planning, nuclear engineering, and computational social science. (Learn more and apply)

LLM-Augmented Digital Twin Framework for Building Material Reuse and Recycling Assessment (Yuqing Hu)

This project proposes to develop a digital twin framework powered by large language models (LLMs) and large vision models (LVMs) to support component-level material reuse and recycling assessment. (Learn more and apply)

Privacy-Preserving Linear Regression and Synthetic Data for Reproducible Social Science Research (Aleksandra Slavkovic)

This project aims to develop a novel method for differentially private (DP) linear regression that enables valid statistical inference and supports synthetic data generation. (Learn more and apply)

Linking Multidimensional Sleep Health to Cognitive Function in Older Adults Using Machine Learning (Sayed Reza)

This project will evaluate the relationship between sleep health and cognitive function in older adults by leveraging wearable device time series data and applying interpretable AI/ML techniques. (Learn more and apply)

Non-Invasive Turkey Body Weight Monitoring and Prediction via Deep Visual Time Series Analysis (Enrico Casella)

This project aims to develop a novel hybrid deep learning model that leverages longitudinal visual data, potentially combined with historical flock-level time series information, to estimate current body weight, predict future body weight trajectories, and ultimately forecast final carcass weight in turkeys. (Learn more and apply)

Evaluating Generative AI Tools for Qualitative Analysis (Tim Brick)

The goal of this project is to develop a pipeline that can leverage zero-shot and few-shot learning with Retrieval Augmented Generation (RAG) in Large Language Models (LLMs) to partially automate qualitative coding of conversational transcript data. (Learn more and apply)

Millions of Galaxies but No Time: Rapid Inference of Galaxy Properties with Neural Density Estimators (Joel Leja)

We seek an ICDS Junior Researcher who will perform the first, pioneering application of our SBI++ algorithm to millions of galaxies from the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), in which Penn State has a leadership role. (Learn more and apply)

Classifying Weakly Detected Gamma-ray Transients (James DeLaunay)

This project will consist of finding the optimal way to perform classification on these weakly detected gamma-ray transients, by exploring different AI techniques, inputs, and training data. (Learn more and apply)

Interpreting the biological concepts learned by neural networks in genomic predictive tasks (Shaun Mahony)

In this project, we aim to develop an alternative approach for feature interpretation in genomics neural networks. Our goal is to implement and test the Testing with Concept Activation Vectors approach to assess how genomic “concepts” are used by neural networks as opposed to focusing on individual DNA element features. (Learn more and apply)

Mapping Language Model Failures Through Community Experience: A Study of Multilingual Researchers (Dana Calacci)

This project investigates how English as a Second Language (ESL) graduate students interact with Large Language Models (LLMs) like ChatGPT, focusing on how language proficiency shapes their experience of model failures, biases, and harms. (Learn more and apply)

PolliSense: AI-powered Habitat Quality Assessment and Biodiversity Improvement (Mehrdad Mahdavi)

The long-term vision for this project is to create an accessible, user-friendly tool that empowers land managers, farmers, and conservationists to quickly and easily assess their landscapes, make informed decisions, and collaborate on creating environments that support both human and ecological health. (Learn more and apply)

Predicting genomic regulatory elements across species using domain adaptive neural networks (Shaun Mahony)

In this project, our goal is to implement additional domain adaptation strategies to enable accurate cross-species gene regulatory predictions. We are particularly interested in the multi-source training scenario, where we have labeled training data from multiple genomes/domains. (Learn more and apply)

Machine learning techniques to identify ‘state-changing’ and anomalous behavior in astrophysical time series (Charlotte Ward)

In this project, we aim to identify the architectures most capable of identifying ‘state changes’ in time series, whether they be outlier detection techniques or extensions of methods that learn and predict time series behavior. (Learn more and apply)

Algorithmic Affidavits and Automation Bias: Empirical Evaluation of Generative AI in Police Report Writing (Dana Calacci)

This project investigates the full lifecycle of hype surrounding generative AI technologies in policing—from exaggerated capability claims by vendors, to procurement by law enforcement agencies, to courtroom use of AI-generated evidence. (Learn more and apply)

Deciphering systemic biological networks through AI-driven multi-omic integration (Gustavo Nader)

We will integrate publicly available multi-organ, multi-omic datasets to investigate the molecular hierarchies that establish organ cross-talk and optimal organismal physiological adaptations and function. (Learn more and apply)

Analyzing Human and Social Dynamics Through Social Sensing (Xi Gong)

This project aims to expand the current study, which uses social sensing to understand spatial social networks and public perspectives on controversial social topics, while also exploring ways to address the challenges inherent in social sensing research. (Learn more and apply)

Embedding Intelligence in 3D Modeling Workflows: An Approach for Large Language Models to Assist Users in Modeling Using Natural Language (Felicia Ann Davis)

This research proposes the development of an AI-augmented interface that enables users to interact with 3D modeling applications through natural language. The project seeks to reimagine how designers engage with complex software systems by embedding large language models (LLMs) within the modeling environment. (Learn more and apply)

Enhancing Road Safety Through Real-Time AI-Powered Drowsiness Detection and Alert System Using EEG Eye-Blink Artifacts (Daniel Otchere)

This proposal seeks to develop an innovative AI-powered system for detecting driver drowsiness through real-time analysis of EEG eye-blink artifacts. (Learn more and apply)

Application of Transformer-Based Machine Learning Models to Whole Organism Computational Phenomics (Keith Cheng)

To enable the first 3-dimensional whole-organism phenotyping that encompasses all cell types and organ systems, we propose to develop and optimize Transformer-based machine learning (ML) models capable of automatically segmenting and labeling regions of interest from high-resolution 3D micro-CT scans at unprecedented resolutions. (Learn more and apply)

Benchmarking for Quantum Machine Learning (Mahmut Taylan Kandemir)

The goal of QML benchmarking is to establish a rigorous, standardized, and practical (easy to use) framework for systematically evaluating and comparing QML systems—spanning algorithms, hardware systems, and application domains. (Learn more and apply)

AI-Supported Cyber Safety Curriculum for Youth: Design, Development, and Evaluation (Ellen Wenting Zou)

Project objectives include the design and prototyping of four interactive curriculum modules, the development of AI-powered learning scenarios, and initial user testing with middle and high school students to refine both content and interface. (Learn more and apply)

Building Digital Twins of Personalized Models for Alzheimer’s Disease Prevention and Treatment (Zi-Kui Liu)

The proposed project aims to develop a Zentropy-Enhanced Neural Network (ZENN) that learns the configurations, total energy, and entropy of brain states using data related to Alzheimer’s disease (AD). (Learn more and apply)

Predicting HIV care loss-to-follow-up using machine learning (Kathryn Risher)

Our project aims to develop an ML model to predict patient loss to follow-up (LTFU) from HIV care, trained on data from people living with HIV (PLHIV) in the Penn State Comprehensive Care Clinic and TriNetX. (Learn more and apply)

AI-Enabled System for UAV Precision Descent and Touchdown (Dhananjay Singh)

Combining data from onboard cameras and inertial measurement units (IMUs), the proposed system will integrate computer vision, artificial intelligence/ML algorithms, and sensor fusion approaches. (Learn more and apply)

Adaptive Smart Homes for the Elderly: AI, VR, and IoT for Independent Living (Dhananjay Singh)

The project will present a functional prototype evaluated in a simulated environment by the end of the year, proving how artificial intelligence may improve aged care, lower healthcare costs, and advance well-being. (Learn more and apply)

Better Left Unsaid: Preventing Hallucinations by Learning Abstention (Dongwon Lee)

The project aims to explore a few ideas and produce a prototype with preliminary results. The participating junior researchers will have an opportunity to contribute to scientific publications in top AI venues, while the PI aims to use the preliminary findings to pursue an external grant program at the NSF. (Learn more and apply)

Exoplanet Demographics Combining Multiple Detection Methods (Eric Ford)

This project aims to develop simulation-based inference (SBI) tools for characterizing the intrinsic distribution of exoplanets while combining observational constraints from multiple exoplanet detection techniques. (Learn more and apply)

Neural-Network based optimization of wave functions of interacting electrons (Jainendra Jain)

In this project, we will initialize our model on CF trial wavefunctions and fine-tune to capture only the residual LL mixing effects—a transfer-learning strategy that slashes parameter requirements and opens the door to much larger systems. (Learn more and apply)

Development of Data-based AI-driven Toolkits for Energy Industry Using Distributed Fiberoptic Sensing (Shimin Liu)

In this project, we aim to develop and optimize a robust data analytics pipeline tailored specifically for high-volume DAS datasets generated by industry in mining and oil and gas fields. (Learn more and apply)

Predict Arctic Sea Ice Variability from Atmospheric River Activities and the Time of Arrival of Ice-free Arctic (Laifang Li)

In this project, we propose to apply deep learning models (e.g., convolutional neural networks, or CNNs) to predict Arctic sea ice variability based on the life cycle of atmospheric rivers (ARs). (Learn more and apply)

Designing Adaptive Reservoir Operations Using Multi-Objective Reinforcement Learning (Antonia Hadjimichael)

This project will develop dynamically adaptive and state-aware reservoir operation policies for the Conowingo Dam that explicitly address saltwater intrusion while balancing other management objectives under deeply uncertain future climate conditions. (Learn more and apply)

AI-Enhanced Dynamic Assessment of ESL Academic Writing (Matthew Poehner)

The proposed project will develop an AI-enhanced dynamic assessment (DA) system that students can access to receive updated diagnoses of their L2 English writing progress, including areas needing improvement. (Learn more and apply)

Informing the detection of flash drought events by mining and modeling media reports (Antonia Hadjimichael)

The project directly supports ICDS’s mission to advance computational and data science approaches to pressing societal challenges, by demonstrating the value of integrating diverse data sources for the detection of hydroclimatic hazards. (Learn more and apply)

Unrestricted Mean-Field Analysis for Quantum Materials via Machine Learning (Zhen Bi)

By combining quantum materials expertise with machine learning tools, this project will produce an open-source package that automates and speeds up unrestricted mean-field calculations for interacting electronic systems. (Learn more and apply)

Develop machine learning models to study cell-type-specific aging using single-cell methylation data in the Uzun Lab (Yasin Uzun)

Our goal is to develop a deep learning-based framework to predict cell-type-specific epigenetic age using single-cell methylation data. (Learn more and apply)

Learning Linear Temporal Logic under Uncertainty for Sustainable Behavioral Change Interventions (Romulo Meira Goes)

This interdisciplinary project combines the fields of artificial intelligence and data science to address an important question in the field of behavioral change: which behavior patterns explain behavioral change? (Learn more and apply)

Machine-Learning Angle-Resolved Photoemission Spectroscopy under Tunable Magnetic Fields (Chaoxing Liu)

This project closely aligns with the objectives of CENSAI by leveraging a machine learning approach to guide the design of cutting-edge experiments and drive foundational advances in quantum materials research. (Learn more and apply)

Advancing Air Pollution Exposure Assessment with Machine Learning Techniques (Xi Gong)

We will develop and apply data science and ML/AI methods to environmental health science to advance understanding, response, and mitigation of air pollution’s adverse health effects. (Learn more and apply)

Developing Functionally Equivalent Proxy Systems for AI: A Framework for Code Similarity Analysis, Asynchronous Digital Twin Proxies, and Proxy Repository Implementation (Joanna F. DeFranco)

The research team will investigate targeted techniques for analyzing AI systems to develop functionally equivalent proxy systems. (Learn more and apply)

Fire and climate change impacts in a tropical biodiversity hotspot (Rwenzori Mtns, Uganda): remote sensing to understand abrupt ecosystem change (Sarah Ivory)

This project will use remote sensing data (primarily Landsat, MODIS, and ASTER) to reconstruct fire-burned areas on a remote mountain. (Learn more and apply)

Development of an AI image classifier for detecting vulnerability of African ecosystems under changing climates using ancient data (Sarah Ivory)

In this project, we seek to develop a proof of concept for AI image classification of four African pollen taxa common in the fossil record. (Learn more and apply)

Development of a Web-Based Platform for Structured CryoEM Data Collection and Metadata Management (Jean-Paul Armache)

This project aims to develop a secure, user-friendly web-based platform to collect, store, and manage cryoEM data collection parameters through complementary automated and manual approaches. (Learn more and apply)

Reduced order modeling for supersonic and hypersonic aerodynamic flows via probabilistic machine learning (Ashwin Renganathan)

We will develop probabilistic AI/ML methods to reduce, interpret, and learn data. This project will include both large-scale data generation by running finite-volume based multiphysics codes on Roar Collab, as well as developing AI/ML methods on that data with GPU acceleration. (Learn more and apply)

ML-Enhanced Multiphysics Modeling for Packed-Bed Thermal Energy Storage Optimization (Olumide Ogunmodimu)

This research introduces a comprehensive multiphysics modeling framework that integrates machine learning and digital learning tools for enhanced simulation and analysis. (Learn more and apply)
