Current Project Highlights
Trustworthy AI to address health disparities in Under-Resourced Communities (AI-For-U)
A concern in the biomedical research community is how to develop trustworthy artificial intelligence (AI) tools that address diversity challenges in the health domain for both expert and stakeholder communities. The overall goal of the proposed AI-For-U project is to develop trustworthy AI tools for under-resourced communities, and to increase the AI capacity in these communities through the development process. We will test the hypothesis that focused community engagement with risk mitigation and explainable AI technologies can improve trust in AI technologies in healthcare.
Improving health data quality by assessing and enhancing semantic integrity
As terminologies change and are used over time by different entities, the meaning represented by a single code or a set of codes can shift and diverge. If multiple codes or combinations of codes can represent the same phenotype, cohort identification or cohort variable assignment based on those codes becomes problematic. Because numerous research projects use large electronic health record (EHR) datasets containing standardized terminology codes, violations of representational semantic (RS) integrity would be expected to propagate errors into subsequent analyses and findings. We propose to develop novel data-driven methods to analyze the temporal pattern and context of EHR variables. We will also assess the impact on predictive modeling.
Biomarker for AD/ADRD Risk Supplement
AD/ADRD is a growing national public health crisis as the number of Americans ≥65 years is projected to double by 2050. Our parent grant aims to assess cardiorespiratory fitness (CRF) as a biomarker of physical activity in a large-scale study of nearly 1 million Veterans using advanced AI and VA’s electronic health record. Understanding AI models is crucial for their ethical use, as it helps detect biases and builds trust among stakeholders. This supplement will test how AI model explanations affect trust and bias detection in a simulated environment, with the goal of advancing ethical AI use in biomedical and behavioral sciences.
Magnesium supplement and vascular health: Machine learning from the longitudinal medical record
Over half of adult Americans use dietary supplements, but their safety and effectiveness are largely unknown because they are not approved by the FDA and are subject to limited post-marketing surveillance. The NIH Office of Dietary Supplements (ODS) seeks to fill that gap and has identified electronic health record (EHR) data as a potential tool to advance that goal. Preliminary findings from our pilot study suggest that magnesium supplements may reduce the risk of heart failure (HF) in people with diabetes mellitus (DM) and may improve outcomes in those with HF. We aim to test whether magnesium supplementation lowers the risk of HF and mortality in patients with DM and HF, while also developing a deep learning-based prediction model to identify individuals who would benefit most from magnesium supplements. The study will use data from the Veterans Affairs national EHR, incorporating machine learning to minimize biases, and will validate the findings using data from Cerner Health Facts® to assess generalizability to non-Veteran populations.
MWAS+ – A Novel Drug Repurposing Strategy for ADRD Prevention
An estimated 5 to 6 million older Americans (≥65 years) suffer from dementias that affect cognition, memory, language, and executive function. Nearly two-thirds of dementias are Alzheimer's disease (AD) and the rest are AD-related dementias (ADRD). Finding drugs that may prevent dementia requires advanced analytical approaches. Our innovative three-step study called "medication-wide association study plus (MWAS+)" will accelerate the discovery of approved drugs that have the potential to delay dementia.
AI-for-You: Data Mining to Improve Mental Health Through Shared Decision-Making in Minority Adolescents and Their Parents
Preparing diverse students for the future healthcare workforce is fundamental to addressing health disparities, increasing cross-cultural communication, and positively impacting health equity while promoting diversity, equity, and inclusion in practice. The demonstration and hands-on activities will center around student participation in developing a "Teens Like Me" app, employing state-of-the-art explainable AI-For-You technology. Specifically, the app will retrieve matching patient cases from a national EHR data repository, perform risk prediction, identify modifiable risk factors, and compare treatment options. A software developer will lead the app design and development while providing students with opportunities to participate in the design, implementation, and usability testing.
Past Projects
Representational Semantic Integrity in Coded, Longitudinal Heterogeneous EHR Data
Numerous research projects use electronic health record datasets containing standardized codes without tools to assess whether the codes are used with consistent meaning. We have termed this consistent use of the codes "representational semantic integrity." Violations of representational semantic integrity will propagate errors in subsequent large-scale data analyses of patient records. Thus, the goal of this project is to develop and validate advanced statistical and machine learning methods to assess and improve representational semantic integrity in large clinical databases.
Healthy Play for Teens by Teens
The Biomedical Informatics Center is collaborating with T.C. Williams High School students to create a suite of video games that promote healthy behavior among teens. By promoting learning tools that are created by teens for their peers, this project aims to provide information that is relatable, credible, and engaging to the learners. The goal of these games is not only to increase healthy behaviors among teens, but also to reduce the disparity in health behaviors among teens from varying backgrounds (e.g., socioeconomic, racial, cultural, religious).
Informatics - CTSI at CNMC
This research initiative aims to develop a dynamic assessment of pediatric risk of mortality that quantifies physiological status and can be used to trigger earlier interventions, with the goals of significantly increasing patient survival rates and improving the efficiency of resource use. The Biomedical Informatics Center has access to the Health Facts database and uses statistical and machine learning methods to develop this dynamic risk assessment.
Use Frailty Status to Predict Postoperative Outcomes in Elderly Patients
Frailty is an age-related state of increased vulnerability to stressors. It is an important predictor of many health outcomes, but is rarely collected in a quantitative and systematic fashion in routine health care. This project aims to extract frailty status from free-text clinical notes using natural language processing tools. The extracted frailty status will be mapped to an ontology and then applied to predicting the outcomes of major cardiovascular procedures in elderly patients.
Protect Patient Safety Through Herb-Drug Disease Interaction Detection and Alert
Doctors and pharmacists routinely check for potentially harmful drug interactions when prescribing medications for patients. However, herbal supplements have also been found to interact with some drugs. The Biomedical Informatics Center is collaborating with a cardiologist and a team of pharmacists who specialize in herbal supplements to develop a tool for detecting possible herb-drug interactions. An online survey is being developed that allows patients to enter their medications and herbal supplements and alerts them to any potentially harmful interactions.
Data Mining on Veterans with Severe Mental Illnesses
The goal of this project is to identify patients at high risk of developing adverse outcomes (e.g., death, hospitalization, readmission) among US veterans with severe mental illnesses, focusing on bipolar disorders. Unlike traditional studies, which usually use patient baseline characteristics as predictors, this study uses patient temporal phenotypic features to predict outcomes. Multiple data mining techniques, such as topic modeling and data visualization, are applied to process temporal data from the US national VA databases, including free-text medical notes. The pilot study has provided evidence that data mining techniques using temporal phenotypic features would dramatically improve predictive performance.
NLP Support for Digital Pathology
Surgical pathology reports can vary greatly in appearance and format, and tend to lack a uniform style across institutions or clinicians. Traditionally, this has created a need for report information to be manually extracted in order to capture it accurately in machine-readable form. However, the accuracy of this approach is still far from perfect, which skews any statistical inferences drawn from such data. The goal of this project is to implement an automated lexical analysis of surgical pathology reports that attains a high degree of accuracy.