Co-Innovation Research Exchange

HPI·MS has developed an interdisciplinary research exchange and educational program wherein students of the HPI Digital Health program apply advanced data engineering and machine-learning approaches to interrogate Mount Sinai clinical data in research projects supervised jointly by Mount Sinai and HPI faculty.

Ongoing Research Project

Personalized Decision Support Interventions to Promote Behavior Change for People with Low Back Pain

In this master project, we will develop a recommender and decision support system, which can be embedded into existing apps and will be tested in a small pilot study. By also drawing insights from behavioral economics theory, we will explore how we can nudge people to change their routine and adhere to guidelines when suffering from chronic low back pain. Moreover, we will address several challenges that arise throughout the development of such a recommender system. For instance, as we might not have enough information about the preferences of a person in the beginning, how do we deal with the so-called cold-start problem? Or, as the project proceeds, how do we deal with people who do not react to the intervention messages, so-called non-responders? And how do we address temporal non-responders, who do not adhere to the program because of other environmental factors, which again might pose a challenge to our learning system? Therefore, a further goal of this master project is to review solutions already available in the scientific literature and find an appropriate way to overcome these problems.
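
One common way to approach the cold-start problem mentioned above is to fall back on population-level feedback until enough personal feedback has been collected. The following is only a minimal sketch of that idea, not the method used in the project; the function name, the rating format, and the threshold of three ratings are assumptions made for illustration.

```python
from statistics import mean

MIN_RATINGS = 3  # hypothetical threshold for trusting personal feedback


def recommend(user_ratings, population_ratings, min_ratings=MIN_RATINGS):
    """Return the intervention with the highest mean rating.

    user_ratings / population_ratings: dict mapping an intervention name
    to a list of numeric ratings (hypothetical data format).
    """
    n_personal = sum(len(r) for r in user_ratings.values())
    if n_personal < min_ratings:
        source = population_ratings  # cold start: use population feedback
    else:
        source = user_ratings        # enough personal feedback available
    scored = {name: mean(r) for name, r in source.items() if r}
    if not scored:
        raise ValueError("no ratings available")
    return max(scored, key=scored.get)


# Hypothetical example data
population = {"stretching": [4, 5, 4], "walking": [3, 4, 3]}
new_user = {}
returning_user = {"walking": [5, 5], "stretching": [2]}

print(recommend(new_user, population))        # falls back to population data
print(recommend(returning_user, population))  # uses personal ratings
```

A fixed threshold is the simplest possible switch; approaches that blend personal and population estimates gradually are an alternative.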

Supervisors

Erwin Bottinger

Micol Zweig

Tamara Slosarek

Students

Eugenia Alleva, Ipsita Bhaduri, Larisse Hoffäller, Tillmann Int-Veen, Juliane Kleinknecht, Theresa Lasarow

Completed Research Project

App-based N-of-1 trials in a clinically-relevant context for personalizing digital health interventions

Digital health tools like apps and mobile sensor devices enable individuals to track and quantify their own health. The goal of this project is to empower individuals to perform app-guided health interventions that improve their health status, for example to reach a fitness goal or to manage chronic back pain, while longitudinally quantifying putative health benefits in the app. This setup represents so-called N-of-1 trials, meaning that the sample size of this form of interventional trial is 1, i.e., the individual himself or herself. Statistically speaking, N-of-1 trials are multi-crossover randomized controlled trials in a single participant, i.e., trials in which the participant uses the different interventions of the study in a pre-specified order. This allows investigating which intervention works better for each single participant of the study and deriving personalized medicine approaches in a clinical context. Also, N-of-1 trials of single participants can be aggregated to obtain population-level intervention estimates [1–4]. N-of-1 trials are most powerful when they can integrate sensor data from wearables and be linked to electronic health records in a clinical context. This can be done seamlessly by developing mobile apps for digital N-of-1 trials. Moreover, such an app helps to empower citizens by directly feeding back the study results. In this master project, we will design and build such an app with an appropriate user-friendly interface.
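
To make the idea of individual and aggregated estimates concrete, here is a small sketch that computes a per-participant effect as the difference in mean outcome between two interventions and then averages these effects across participants. The labels "A" and "B", the pain scores, and the unweighted averaging are illustrative assumptions; real N-of-1 analyses typically use more careful statistical models.

```python
from statistics import mean


def individual_effect(observations):
    """Mean outcome difference (B minus A) for one participant.

    observations: list of (intervention_label, outcome) pairs collected
    across the alternating periods of a single N-of-1 trial.
    """
    a = [y for label, y in observations if label == "A"]
    b = [y for label, y in observations if label == "B"]
    if not a or not b:
        raise ValueError("each intervention needs at least one observation")
    return mean(b) - mean(a)


def population_effect(trials):
    """Naive aggregate: unweighted mean of the individual effects."""
    return mean(individual_effect(obs) for obs in trials.values())


# Hypothetical pain scores (0-10) over an A-B-A-B schedule
trials = {
    "p1": [("A", 6), ("B", 4), ("A", 7), ("B", 5)],
    "p2": [("A", 5), ("B", 5), ("A", 4), ("B", 6)],
}
effects = {pid: individual_effect(obs) for pid, obs in trials.items()}
print(effects)                    # per-participant estimates
print(population_effect(trials))  # aggregated estimate
```

In this made-up data, intervention B lowers the pain score for p1 but not for p2, which is exactly the kind of difference a population average alone would hide.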

Students

Florian Henschel, Manisha Manaswini, Fabian Pottbäcker, Ferenc Darius Rüther, Nils Strelow, Alexander Maximilian Zenner

Completed Research Project

Process Mining in Personalized Medicine

Process mining has been applied successfully in a variety of domains, e.g., production, banking, logistics, and many more. Among other things, its techniques help to discover actual business processes and make them accessible to multiple stakeholders like process owners and participants. This often leads to surprising insights showing that, in many cases, the expected behavior may be very different from reality. Furthermore, process mining can uncover performance issues, the use of resources, and hidden costs. All of this is also of particular interest in healthcare processes. The project will be conducted in cooperation with the clinic network Mount Sinai (MS). Located in the New York metropolitan area, Mount Sinai combines the Icahn School of Medicine at Mount Sinai, eight hospital campuses and 13 free-standing joint venture centers. As part of this cooperation, Mount Sinai provides a unique data set containing detailed personalized medical data of more than eight million patients and over 120 million encounters. This year’s master project gives you the opportunity to work with real-life patient data. In a previous project, a tool was developed to transform data of patient cohorts into a process-mining-ready format based on event logs. Using this tool, the focus of this semester's project will be on analyzing similar patient groups based on their treatment paths through the MS Health System. Technically, we are interested in the different variations that a treatment process might cover. You will develop and apply clustering methods for patient flows and compare treatment processes across wards and departments.
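
The notion of process variations in an event log can be illustrated with a short sketch: events are grouped per patient, ordered by time, and the resulting activity sequences are counted. The tuple layout (case identifier, activity, timestamp) and the activity names are assumptions for illustration and do not reflect the format produced by the tool mentioned above.

```python
from collections import Counter, defaultdict


def trace_variants(event_log):
    """Count distinct activity sequences (variants) across cases.

    event_log: iterable of (case_id, activity, timestamp) tuples.
    """
    by_case = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        by_case[case_id].append((timestamp, activity))
    variants = Counter()
    for events in by_case.values():
        events.sort()  # order each case's events by timestamp
        variants[tuple(activity for _, activity in events)] += 1
    return variants


# Hypothetical event log; events may arrive out of order
event_log = [
    ("pat1", "Admission", 1), ("pat1", "Imaging", 2), ("pat1", "Discharge", 3),
    ("pat2", "Admission", 1), ("pat2", "Discharge", 2),
    ("pat3", "Imaging", 5), ("pat3", "Admission", 4), ("pat3", "Discharge", 6),
]
variants = trace_variants(event_log)
for sequence, count in variants.most_common():
    print(count, " -> ".join(sequence))
```

Variant counts like these are one possible starting point for grouping patients with similar treatment paths before applying a clustering method.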

Supervisors

Jan Philipp Sachs

Students

Oliver Clasen, Adrian Jobst, Wiktoria Staszak

Completed Research Project

FIBER: Enabling Flexible Retrieval of Electronic Health Records Data for Clinical Predictive Modeling

In health care, the development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from data scientists and engineers, especially for the data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning Electronic Health Records (EHRs) stored in star-schema data warehouses, an approach often adopted in practice. In this paper, we introduce the FlexIBle EHR Retrieval (FIBER): a Python-based library built on top of a star-schema clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames. To illustrate its capabilities, we applied FIBER to a clinical predictive modeling task in the EHR data warehouse of a large health system. As such, FIBER reduces time-to-modeling, helping to streamline the clinical modeling process.

Supervisors

Erwin Bottinger

Jan Philipp Sachs

Students

Christoph Anders, Philipp Bode, Jonas Kopka, Tom Martensen

Completed Research Project

Process Mining in Personalized Medicine

Business Process Technology aims to formally describe and analyze real-world business process management problems. With the help of process models, these problems can be made accessible to multiple stakeholders. Through process mining, one can discover and model underlying processes from historical data, like electronic health records. Based on that, further aspects are considered, like performance analysis, bottlenecks, prediction of the outcome of a process, or the utilization of resources. In our project, we analyzed the patient flows of a lower back pain sub-cohort, containing about 85,000 patients, from the Mount Sinai Data Warehouse. Furthermore, we compared the discovered behavior with existing clinical guidelines. We will present the results of our project by focusing on data extraction, transformation, and process modeling. Moreover, we will address the challenges of applying process mining to big data in healthcare.

Supervisors

Jan Philipp Sachs

Students

Arne Boockmeyer, Finn Klessascheck, Tom Lichtenstein, Martin Meier, Francois Peverali, Simon Siegert

Completed Research Project

Natural-Language Processing on Clinical Notes for Phenotyping Depression

In current clinical handbooks, mental disorders are described mainly based on symptoms. However, many patients with different disorders share the same biological underpinnings, and patients of the same diagnostic category react very differently to the same treatment. Clinical notes of psychiatric patients are a rich resource for gaining a better understanding of mental disorders and developing better phenotypes. In this master project, we use natural language processing on clinical notes from electronic health records (EHRs) to develop meaningful language-based representations of patients with depression. With unsupervised machine learning methods, we aim to find categories that are closer to underlying biological or neurological mechanisms as well as subcategories that could inform treatment decisions.

Students

Andrea Eoli, Mirko Krause, Samuel Matthews, Hao Nguyen

Completed Research Project

Prediction of Hypertension Onset by Leveraging EHR Data with Machine Learning

Hypertension is one of the most prevalent medical conditions worldwide. With the normal physiological regulation of blood pressure being impaired, this condition is one of the main risk factors for a broad spectrum of cardiovascular diseases. Since high blood pressure can typically be prevented by lifestyle interventions, a timely prediction is essential. In this project, we leverage the power of state-of-the-art machine learning models by applying them to longitudinal clinical information from the Mount Sinai Data Warehouse (MSDW). We use different machine learning approaches such as LightGBM, random forests and neural networks to predict, six months in advance, whether a patient will develop hypertension or not. We also investigate which clinical parameters play the major role in differentiating the two cohorts, i.e., hypertensive (cases) and normotensive (controls) patients. The hypertensive patients were identified from the MSDW data using an already validated phenotyping algorithm. In this presentation, you will see the results of each approach in cohorts of different sizes and a discussion of the clinical parameters found to be most significant for predicting hypertension based on our analysis.
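
Predicting six months in advance implies that the model may only see data recorded at least six months before the date of interest. The sketch below shows one way such a cutoff could be applied when assembling input data. It is an illustration only, not the project's pipeline: the function name, the observation format, and the approximation of six months as 182 days are all assumptions.

```python
from datetime import date, timedelta

PREDICTION_GAP = timedelta(days=182)  # assumption: six months is about 182 days


def observations_before_gap(observations, index_date, gap=PREDICTION_GAP):
    """Keep only observations recorded on or before index_date minus gap.

    observations: list of (observation_date, name, value) tuples.
    index_date: date of hypertension onset (cases) or a matched
    reference date (controls); both are hypothetical inputs here.
    """
    cutoff = index_date - gap
    return [obs for obs in observations if obs[0] <= cutoff]


# Hypothetical patient record
record = [
    (date(2019, 12, 15), "systolic_bp", 128),
    (date(2020, 1, 1), "bmi", 27.4),
    (date(2020, 3, 1), "systolic_bp", 139),
]
usable = observations_before_gap(record, index_date=date(2020, 7, 1))
print(usable)  # only entries dated on or before the cutoff remain
```

Applying the same cutoff to cases and controls keeps information from the six months before the index date out of the model inputs.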

Students

Jonas Cremerius, Margaux Gatrio, Melanie Hackl, Nina Kiwit

Completed Research Project

Phenotyping and Subgroup Identification in a Non-Specific Back Pain Patient Cohort

One of the most common health problems affecting humans is non-specific back pain, which is the leading cause of absence from work and disability worldwide. Since by definition no underlying pathology can be identified, it holds huge potential for data-driven knowledge discovery approaches. At the same time, large quantities of real-world medical information stored in electronic health record (EHR) databases promise to reveal new insights into the epidemiology, pathophysiology, and therapy of a variety of diseases. In this master project around the Mount Sinai Data Warehouse (MSDW), we want to leverage that potential: We aim at identifying clinically relevant, yet undiscovered subgroups of non-specific back pain patients and at defining a robust phenotyping algorithm for future EHR-based research. With no such algorithm available, we develop criteria to define different subsets of back pain patients from the data contained in EHRs, and we compare the epidemiological features of the resulting datasets with other previously described cohorts. Lastly, we describe how we feed the data from all patients with non-specific back pain into an unsupervised clustering approach, yielding the first clinical data-driven subclassification of non-specific back pain.

Supervisors

Erwin Bottinger

Jan Philipp Sachs

Students

Dennis Kipping, Stephan Krumm, Julian Sass, Antonia Winne