The Luddy AI Center in Indianapolis combines core AI theory and algorithms (e.g., machine learning, computer vision, and natural language processing) with a full range of other technologies, such as human-computer interaction, visualization, and cyber-physical systems, to provide enhanced AI capabilities for solving large-scale, complex application problems.
Multi-disciplinary artificial intelligence research
AI and health care: Professor’s paper explores how systems assess patient risks and medical coding
In a paper published in the Journal of Medical Internet Research, Luddy Indianapolis Associate Professor Saptarshi Purkayastha and other researchers assessed how various large language models would make health care decisions.

Projects by area
Visual analytics of neuroimaging data
Collaborating with the Department of Radiology and Imaging Sciences at Indiana University, we developed image analysis and visualization tools for neuroimaging data. These tools facilitate visual exploration and analysis, aiding in the detection of diagnostic biomarkers. By harnessing the power of visual analytics, we aim to enhance our understanding of brain health and contribute to improved clinical outcomes.
Gene co-expression underlying the connectomic alterations in Alzheimer’s disease
This project aims to investigate the relationship between gene expression patterns and changes in brain networks in Alzheimer's disease (AD). By analyzing brain-wide transcriptome data, we seek to identify genes crucial to the connection between co-expression networks and AD-altered networks, thereby improving our understanding of the mechanisms underlying the neural alterations and advancing diagnostic, therapeutic, and preventive strategies for the disease.
Deep Fusion of Brain Structure-Function in Mild Cognitive Impairment
Deep Connectome is a graph-based neural network that integrates brain structure and function into a disease-related network for individuals. The network topology is initialized with an individual's structural network and iteratively updated with disease-related functional information to optimize MCI classification. The resulting Deep Brain Connectome captures “deep relations” between brain structure and function in individuals, representing their disease status and offering insights into the disease's impact on brain networks.
Cortex2vector: Anatomical Embedding of Cortical Folding Patterns
Current brain mapping methods rely heavily on anatomical regularity and often overlook individualized structures. To simultaneously encode commonality and individuality, we developed a framework to establish correspondences of individual cortical folding patterns based on 3-hinge gyrus (3HG), encoding regularity in learned embedding vectors (cortex2vector), while preserving individuality by multi-hop combination coefficients. Each 3HG is then represented as an individually specified combination of embedding vectors.
Designing Autonomy Preserving Interactions in Intelligent Assistants for Older Adults
Aqueasha Martin-Hammond and Davide Bolchini
Intelligent assistants (IAs) hold promise in aiding older adults with daily care tasks. However, current approaches fall short in giving older adults a sense of control over their data and interactions, particularly in health-related tasks. This project aims to enhance the design of intelligent assistants for older adults by identifying tailored design strategies, evaluating their impact on autonomy and acceptance, and ultimately improving older adults' sense of autonomy when using these systems for health-related activities at home.
Conversational User Interfaces to Support Older Adults' Social Wellness
This project aims to explore how conversational user interfaces such as voice assistants and chatbots can assist older adults in maintaining social connections and enhancing their social wellness. Applying social-behavioral theories, this project seeks to develop strategies for creating personalized conversational interfaces that leverage real-world social connections and motivate older adults to engage in social activities.
Health Freedom Path to Wellness: A Culturally-Relevant and Patient-Centered mHealth Intervention to Promote Cardiovascular Health Equity
African Americans (AAs) and urban communities experience disproportionately high rates of cardiovascular diseases (CVD). This project aims to develop a mobile health (mHealth) tool that supports CVD prevention and self-management for these high-risk populations by providing educational and behavioral interventions.
Measuring learning gains in man-machine assemblage when augmenting radiology work with artificial intelligence
We developed a platform that integrates AI image analysis models into clinical workflows. Our study demonstrates that the platform enhances diagnostic accuracy by facilitating a synergistic partnership between radiologists and AI: radiologists benefit from real-time insights provided by AI models during image interpretation, while AI models exhibit enhanced performance through the corrections and feedback provided by radiologists.
Hierarchical Clustering and Multivariate Forecasting for Health Econometrics
This study utilizes machine learning methods to forecast the long-term impact of socio-economic changes on health indicators. It employs Hierarchical Cluster Analysis to group countries and the Multivariate Prophet model for time series analysis. By generating "what-if" scenarios and forecasted impacts, policymakers and healthcare practitioners can make informed decisions and implement targeted interventions to effectively address health-related challenges.
Generalizability of Human Activity Recognition Machine Learning Models from non-Parkinson's to Parkinson's Disease Patients
High-intensity exercises have shown promise in alleviating tremors and stiffness in Parkinson's disease (PD). This project uses machine learning (ML) models to automate human activity recognition, enabling personalized activity intensity monitoring. By quantifying and evaluating activities of healthy individuals to establish a gold standard, we aim to develop generalized ML models applicable to both non-PD and PD patients for identifying activities and their respective intensities.
Community-Based AI Chatbots for Mental Health
Emerging AI technologies, particularly large language models, have generated significant interest for their potential to address persistent challenges in mental health care, such as limited resources, lack of awareness, and stigma. At the same time, these technologies were not originally designed for mental health contexts and carry numerous potential harms, as well as ethical and safety concerns. This project works with community partners, including NAMI Greater Indianapolis, to address these challenges and to envision safe, effective applications of mental health AI that are developed and deployed in ways that genuinely serve the needs of individuals and communities. The research employs qualitative and exploratory methods, including interviews, design workshops, and tool development, followed by long-term deployment in community settings.
AI Chatbots to Support Family Caregivers of People with Alzheimer’s Disease and Related Dementias (ADRD)
Family caregivers of people with Alzheimer’s disease and related dementias (ADRD) face numerous challenges, often leaving them with limited time and resources to care for their own mental wellbeing. AI chatbots offer an accessible gateway to support caregivers in managing their wellbeing and connecting to relevant resources when needed. In collaboration with the University of Massachusetts and the University of Illinois, this project explores how AI chatbots can incorporate evidence-based approaches, such as cognitive behavioral therapy, to support the wellbeing of family caregivers. The project is supported by PennAITech/NIA’s A2 Pilot Awards.
Mental Health AI for Trajectory Work
Drawing on the concept of trajectory work from sociology and Computer-Supported Cooperative Work (CSCW), this project investigates how people living with mental health conditions, along with clinicians, peer supporters, and caregivers, navigate evolving experiences, shifting diagnoses, and complex treatment journeys. We explore how AI tools, such as chatbots and generative models, might support reflection, communication, and planning across these nonlinear paths, while also identifying where such technologies risk oversimplifying, misaligning, or distorting personal narratives. Through participatory design, qualitative inquiry, and ethical analysis, AI for Trajectory Work aims to inform the development of mental health technologies that are attuned to the temporal, relational, and deeply human dimensions of care.
Disease2Vec: Encoding Alzheimer’s Progression via Disease Embedding Tree
The continuous nature of Alzheimer’s disease (AD) development has typically been overlooked in AD prediction models. Our proposed Disease2Vec framework learns the intrinsic relations among AD stages from fMRI functional connectivity data and generates a disease embedding tree (DETree).
DETree modeling of continuous AD progression. Training: small and large bubbles represent individual and group embeddings, respectively, and their color indicates clinical stage. Prediction: individuals are projected onto the DETree, with bubble location indicating predicted status and bubble color indicating true clinical stage.
Advancing Health Information Exchange (HIE) Interoperability with LLM-Driven Standardization
This project uses Large Language Models (LLMs) to standardize medical text into compatible coding systems such as ICD, CPT, and SNOMED CT. It streamlines data exchange, reduces errors, and improves HIE system efficiency, addressing current interoperability challenges. The integration of these LLM capabilities into existing blockchain-based HIE frameworks facilitates secure and seamless data exchange, prioritizing privacy, empowering patients with control over their health data, and improving patient care outcomes.
Mapping RNA protein interaction networks in the human genome
An increasing number of RNA-binding proteins (RBPs) have been implicated in human diseases, but many RBPs and their cognate motifs are still unknown. This project aims to develop robust computational techniques for predicting RBP motifs. By integrating expression associations, sequence information, and RBP-centric features, these techniques will facilitate the construction of tissue-specific RBP-RNA networks using genome-wide data from protein protection assays (POP-seq).
Graph-based Spatial Transcriptomics Computational Methods in Kidney Diseases
Chronic Kidney Disease (CKD) and Acute Kidney Injury (AKI) are intersecting diseases affecting a significant portion of the global population. Spatial transcriptomics technology provides insights into cell type heterogeneity, but identifying colocalizing cell types and understanding fibrosis, immune interactions, and epithelial repair in kidney spatial transcriptomics data present computational challenges. In this project, we use AI-based spatial transcriptomics methods with multi-omics cell atlas data to understand the pathogenesis of kidney disease.
Dimension-agnostic and granularity-based spatially variable genes identification
Identifying spatially variable genes (SVGs) is critical in linking molecular cell functions with tissue phenotypes. We developed BSP (big-small patch), a non-parametric model that compares gene expression patterns at two spatial granularities to identify SVGs from two- or three-dimensional spatial transcriptomics data in a fast and robust manner. We will apply BSP to support biological discoveries in cancer, neuroscience, rheumatoid arthritis, and kidney studies across various types of spatial transcriptomics technologies.
Computational strategies for incompleteness and heterogeneity in multi-omic data
Large-scale multi-omic data collected from multiple projects is often heterogeneous and incomplete, impeding data integration for joint analysis. To address these issues, we developed a model for joint network module detection and feature selection that identifies multi-omic subnetworks as disease biomarkers. Furthermore, we created a sparse association model to select associated features between heterogeneous -omics layers. These approaches eliminate the need for data exclusion and enhance disease biomarker discovery.
A Novel Wireless Sensor for Continuous Monitoring of Patients with Chronic Diseases
Chronic diseases such as heart failure, eye disease, and brain injury often require continuous monitoring of physiological signals in human bodies to improve patient outcomes and quality of life. Wireless biomedical implants have shown promise for enabling such monitoring, but existing systems face major challenges in achieving reliable sensing using radio-frequency (RF) signals. This project brings together commercial, research, and education efforts across disciplines to develop a high-performance, energy-efficient, and reliable biotelemetric system. Our goal is to enable real-time acquisition and wireless transmission of biological signals from wearable medical devices and bioimplants.
CyberWater - A sustainable data/model integration framework
CyberWater2 is an open and sustainable data/model integration framework to facilitate collaboration across disciplines within the water domain. The framework aims to integrate heterogeneous data sources and enable two-way couplings among diverse computational models, eliminating the need for complex glue code. CyberWater2 also introduces a novel web service architecture that makes it easily accessible via web browsers, along with adaptable AI-driven data agents that accommodate future modifications to external data sources.
CI Compass
CI Compass is the NSF Center of Excellence for Navigating the Major Facilities (MFs) Data Lifecycle. MFs are the largest-scale scientific cyberinfrastructure (CI) efforts that the NSF supports, and they serve scientists, researchers, and the public by capturing and curating complex data from a variety of scientific instruments (from telescopes to sensors to research vessels). CI Compass brings together expertise from multiple disciplines to accelerate the MFs data lifecycle, to facilitate knowledge sharing and discovery, and to ensure the integrity and effectiveness of the CI upon which research and discovery depend.
Examining the political bias of LLMs
Will LLM-powered AIs guide us to the truth or further away from it? Many research groups are competing to release new AIs (or new versions of previous AIs), but our knowledge of how they are trained remains limited. By studying how AIs answer value-laden questions, this project examines the political bias of LLMs.
Data lifecycle for LLMs
Information scientists have developed a framework for how research data is collected, analyzed, and shared with academic communities. Data reuse in particular is emphasized as a way to reduce costs and improve efficiency. In the age of AI, however, data can be concatenated, merged, and layered to create another model. It is therefore time to revisit the existing data lifecycle model and develop a new one appropriate for AI models. This project examines reuse practices on Hugging Face, where datasets and models are reused.
CATpc: AI chatbot to increase cultural relevancy of STEM lessons, engage marginalized students
The project adopts a community-based participatory approach to develop an AI chatbot that generates culturally aware text. The project will use focus groups to elicit training data, incorporate cultural nuances, and use sentiment analysis to ensure appropriate responses. Human-centered AI methods will continuously incorporate user feedback to improve the chatbot's performance. The project aims to create a model for developing community-sourced AI language learning models that can be refined and researched, resulting in a beta-level chatbot that teachers can use to improve their lesson plans and learning environments.
A Novel AI-based Approach to Facilitate Code-Switching for AAVE Speaking Students
Code-switching is a skill for working between dialects of a language, such as African American Vernacular English (AAVE) and Standard American English (SAE). This project aims to build an automatic dialect-to-dialect translation model between AAVE and SAE, which will help AAVE-speaking students develop code-switching skills critical to academic success.
FazBoard: AI Enhanced Teaching & Learning Systems to Improve Student Engagement, Connectedness, and Learning Equity
FazBoard combines a digital canvas with an AI Assistant to create a dynamic educational environment. The digital canvas serves as a versatile space for teachers to present materials, engage with students, and foster collaboration among individuals and groups. Meanwhile, the AI Assistant offers round-the-clock support by providing instant responses to student queries and streamlining the collection of inquiries and learning data for enhanced analytics.
EdTech Governance
Educational technology (EdTech) is increasingly adopted in higher education, datafying student life to enable examination and intervention using analytics. This raises ethical concerns related to student privacy due to poor data management, gaps in infrastructure strategy, and a lack of effective governance. This project aims to create a widely transferable governance framework to enhance EdTech policies and data practices in higher education, serving as a model for public/private socio-technical system crossover.
Blockchain-AI Synergy in Special Education: Enhancing STEM Learning for Students with Disabilities
This project addresses the data breaches and inconsistencies in managing Individualized Education Programs (IEP) and lifelong learning portfolios. The project has three objectives: 1) develop a secure and efficient blockchain framework for IEP management; 2) create a blockchain-based system for verified and private academic records; and 3) implement an AI-driven platform for personalized learning experiences that nurture students' innate STEM talents. This project aims to provide secure, personalized, and accessible STEM education for students with disabilities.
An Interdisciplinary Approach to the Discovery, Analysis, and Disruption of Wildlife Trafficking Networks
The project aims to combat online wildlife trafficking using advanced web crawling, machine learning, and text and visual language models. By compiling a dataset of 235 species in 20 languages and providing open-source tools to analyze nearly a million ads across multiple e-commerce platforms, we have improved data collection precision and provided valuable information to researchers, conservationists, and law enforcement to protect endangered species and disrupt trafficking networks.
Detecting Hotspots of Human-Wildlife Conflicts in India using News Articles and Aerial Images
This project developed an automated machine learning pipeline to collect and analyze data on human-wildlife conflict (HWC) incidents. We constructed a knowledge base from historical news articles and achieved 90% accuracy in identifying major causes of HWC. By incorporating satellite imagery, we identified proximity zones of human settlements to forested areas and pinpointed human-elephant conflict hotspots in West Bengal, India. Our findings provide insights into the spatial and temporal trends of HWC across India and allow the public and policymakers to make more informed decisions in conservation efforts.
Representation through Order Embedding of Binary Vectors
The BINDER algorithm uses binary operations that effectively represent the logical 'is-a' relation that standard word embedding models often fail to capture. Also, it applies randomized optimization as an innovative learning algorithm in the discrete domain, flipping bits in each epoch with a carefully calculated probability generated from a proxy gradient designed for binary space.
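As a rough illustration of how binary operations can encode an 'is-a' relation, the sketch below assumes a bit-containment convention: a more specific concept carries every bit set in its more general parent, plus extra bits of its own. The function name `is_a`, the concept names, and the 4-bit codes are all hypothetical and hand-picked. The randomized bit-flipping procedure that learns such codes is not shown.

```python
def is_a(specific: int, general: int) -> bool:
    """Return True if every bit set in `general` is also set in `specific`.

    Assumed convention for this sketch: a child concept inherits all of
    its parent's bits, so the check is a single bitwise AND.
    """
    return (specific & general) == general


# Hypothetical hand-written 4-bit codes (not learned embeddings).
animal = 0b0001
dog = 0b0011     # animal bits + one extra bit
poodle = 0b0111  # dog bits + one extra bit
cat = 0b0101     # animal bits + a different extra bit

print(is_a(dog, animal))     # dog is-a animal
print(is_a(poodle, animal))  # holds transitively through dog
print(is_a(cat, dog))        # cat lacks one of dog's bits
print(is_a(animal, dog))     # the relation is not symmetric
```

Under this convention the relation is transitive by construction, which is one property a real-valued similarity score does not give for free.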
Uncovering imaging signatures of diabetic kidney disease in the renal epithelium with deep learning
A CNN deep learning model can achieve 80% accuracy in classifying major cell types in the human kidney cortex when trained on biopsy images of reference nuclei cells, but it fails to classify cells from diabetic patients, likely due to structural deformations and other biological changes unseen during training. We propose a novel Dual Encoder-Unseen Class Score (DE-UCS) outlier detection method, which embeds both images and cell transcriptomics (side information) and links them in a latent space. A test instance is considered an outlier if the distance between its embedding and the seen classes' embeddings is greater than a threshold.
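The thresholding rule for outliers can be sketched in a few lines. This is a minimal illustration, not the DE-UCS implementation: it assumes each seen class is summarized by a single embedding vector, uses Euclidean distance, and compares the distance to the nearest seen class against the threshold. The class names, 2-D vectors, and threshold value are hypothetical.

```python
import math


def is_outlier(embedding, seen_class_embeddings, threshold):
    """Flag an instance whose nearest seen-class embedding is too far away.

    Assumptions for this sketch: one embedding per seen class and
    Euclidean distance in the shared latent space.
    """
    nearest = min(
        math.dist(embedding, class_vec)
        for class_vec in seen_class_embeddings.values()
    )
    return nearest > threshold


# Hypothetical 2-D class embeddings; real embeddings would come from the encoders.
seen = {
    "proximal_tubule": [0.0, 0.0],
    "podocyte": [4.0, 0.0],
}

near_seen = is_outlier([0.5, 0.0], seen, threshold=1.0)   # close to a seen class
far_from_all = is_outlier([2.0, 3.0], seen, threshold=1.0)  # far from both classes
print(near_seen, far_from_all)
```

In practice the threshold would be tuned on held-out data, since it trades off missed outliers against false alarms on seen cell types.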
Multi-level Spatiotemporal Transformer for Early Prediction of Breast Cancer Response to Neoadjuvant Therapy
Neoadjuvant Systemic Therapy (NST) is the primary method to reduce tumor size and metastasis in breast cancer before surgery, potentially enabling breast-conserving procedures. Deep learning models can predict the effectiveness of NST using MRI scans, but many rely on manual tumor segmentation by experts. To improve practical use in clinical settings, we developed a multi-level spatiotemporal transformer (MST-Former) model that directly uses raw MRI scans and clinical reports. Our model demonstrates state-of-the-art performance on the publicly available I-SPY-1 dataset.
AI-Based Radio-Pathomic Nomogram for Prostate Cancer Prognosis after Radical Prostatectomy
Patients who experience recurrence after radical prostatectomy face potential risks of metastasis and mortality. To predict cancer recurrence, we have developed an AI-based method that integrates multiple data modalities from routine standard of care, combining MRI (radiology) and digitized biopsy (pathology) to identify patterns associated with prostate cancer outcomes. By demonstrating its feasibility in addressing disparities in prostate cancer prognosis, particularly for minority populations such as African American men, we pave the way for future research targeting health disparities in prostate cancer.
Federated training of medical image segmentation and classification AI models
Deep learning (DL) model performance is impaired by MRI variation across sites and scanners, as well as by data sharing restrictions due to privacy concerns. Federated learning addresses both issues by enabling privacy-preserving model training on decentralized data. In this study, we train DL models for prostate gland and cancer segmentation using a federated computing platform, resulting in more robust and generalizable models compared to those trained on single-site data. This approach demonstrates the potential to improve diagnostic and treatment workflows for prostate cancer while maintaining patient privacy.
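To make the federated idea concrete, the sketch below shows one aggregation round using sample-size-weighted averaging of model parameters (the widely used FedAvg rule). The source does not say which aggregation rule the platform uses, so this is an assumption, and the function name, site weights, and site sizes are hypothetical. Only parameters leave each site; the images themselves are never shared.

```python
def federated_average(site_weights, site_sizes):
    """Combine per-site parameter vectors into one global vector.

    Each site's parameters are weighted by its number of training
    samples (FedAvg-style aggregation, assumed for this sketch).
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical flattened parameters from two hospitals after local training.
site_weights = [[1.0, 2.0], [3.0, 6.0]]
site_sizes = [100, 300]  # number of local training scans per site

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

A full training run would repeat this round many times: broadcast the global parameters, train locally at each site, then aggregate again.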
Capturing Dynamism in Causal Relationships: A New Paradigm for Relationship Extraction from Text
Our project aims to develop a deep learning model that extracts causal relationships from text data, capturing the dynamism of these relationships by combining semantic and syntactic cues. By analyzing large text datasets related to public health, our paradigm offers practitioners new ways to comprehend the evolving nature of causal connections and the strength of the relationships from textual descriptions. We apply our findings to different domains, including health, where we investigate how causal relationships extracted from text can help identify latent factors of a disease from scholarly articles on that disease.
Interpreting and generating visual metaphors
Visual metaphors are powerful communication tools that can be used to convey persuasive messages. They are often more impactful than verbal explanations, as they can appeal to the senses and trigger emotions. For example, smoking visual metaphors can be more effective in depicting the harmfulness of smoking than simply stating the facts. Despite their importance, visual metaphors have not been studied extensively. This research project aims to build computational models that can interpret and generate visual metaphors.
Developing Metrics for Trustworthy AI
This project aims to develop metrics to evaluate the trustworthiness of AI systems involved in decision-making processes and to help calibrate trust in and acceptance of these systems.
Interactive Machine Learning and Explainable-AI
This project aims to create a visualization framework that reveals the shape patterns of machine learning models. These models, often perceived as black boxes operating in high-dimensional spaces, become more comprehensible to users through the framework’s multiple viewing options in lower-dimensional spaces. This framework not only helps users interpret model behavior but also promotes safety and instills trust in the models' use.
Uncanny Valley
The uncanny valley (UV) effect is a negative affective response to human-looking artificial entities that hinders comfortable and trusting interactions with android robots and virtual characters. This project investigates the perceptual and cognitive mechanisms of the UV effect, establishes the theoretical and methodological basis for UV research, and examines its impact on the adoption of virtual clinical consultations and storytelling robots for children.
AI pitfalls and what not to do: mitigating bias in AI
The increasing deployment of artificial intelligence (AI) applications in healthcare systems has also revealed pitfalls of these models in perpetuating bias. This study highlights bias during AI development and provides suggestions on bias evaluation and mitigation.
