Transforming Care Delivery with AI and ML: Using Patient Data to Prevent Hospital Visits
The most exciting and concrete use case for AI in healthcare
The U.S. healthcare system faces a significant challenge: many hospital visits and emergency department (ED) trips could be prevented with better proactive care. These potentially preventable hospitalizations (PPH) and ED visits (PPED) cost the healthcare system approximately $30 billion to $100 billion annually [1]. Estimates suggest that approximately 13% of adult and 8% of pediatric hospitalizations are potentially preventable [2]. Emerging technologies and artificial intelligence could help address this challenge by changing how we identify and help at-risk patients before they need acute care.
Understanding the Challenge
Ambulatory care sensitive conditions (ACSCs) are health issues that can typically be managed effectively in outpatient settings through proper primary care. They are important to address because clinical tools already exist to manage them adequately. These conditions, such as diabetes, hypertension, and asthma, often lead to hospital visits that could have been prevented with earlier intervention.
Ultimately, the problem we are trying to solve here is the per capita cost of health care in the United States, which is much higher than in our economic peer countries even though the system performs worse. The scope of this challenge becomes clear when examining health care utilization patterns. Studies show that approximately 5% of the U.S. population accounts for nearly half of all health care spending, while the top 10% of health care utilizers account for 67% of expenditures. These high-cost utilizers (HCUs) represent a diverse group of patients who frequently use health services due to unmet health care needs. The proportion of this cost that is avoidable is highly debated, but the consensus is that it is significant.
As indicated above, about 13% of adult hospitalizations and 8% of pediatric hospitalizations are generally considered potentially preventable.
What makes this challenge particularly complex is the heterogeneous nature of the high-cost utilizer population. Some patients show clear clinical patterns that predict future high utilization, while others do not display obvious warning signs. This variability makes it difficult to identify who might become a high-cost utilizer before it happens, limiting healthcare providers' ability to intervene proactively.
There are three overarching (and easier said than done) ways to solve this “high-utilizer” and “potentially avoidable” acute care problem. First, we can invest in effective public health interventions to prevent disease in the first place. A healthier population with a better diet, more exercise, less smoking, and improved mental wellbeing would use less acute care. Second, it is critical to identify acute care utilization risk before it occurs so “potentially preventable” becomes *actually* preventable. The methodologies for doing this are significant areas of research and innovation. And, third, once a patient is identified, somebody must act on the information and apply the most effective intervention.
Thus, methodologies and technologies that enable cost-effective patient identification and systematic intervention are critical to solving this problem.
The health care system has traditionally struggled to identify precisely which patients are at highest risk for these preventable visits. Current methods rely on relatively simple algorithms or basic clinical rules of thumb, which lack the precision needed to make intervention programs cost-effective. For example, one study demonstrated that for every 20 patients classified as high-risk for hospitalization using traditional methods, only one actually experiences a hospitalization during the study period. This high "number needed to benefit" (NNB) ratio makes it challenging to implement cost-effective intervention programs.
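To make that arithmetic concrete, here is a minimal sketch that turns the one-in-20 figure into a positive predictive value and a count of flagged patients per actual hospitalization. The function name and the per-patient outreach cost are hypothetical, chosen only for illustration. A true NNB would also depend on how often the intervention works, which this sketch does not model.

```python
def flagged_per_event(flagged: int, events: int) -> float:
    """Patients flagged as high-risk for each one who is actually hospitalized."""
    if events <= 0:
        raise ValueError("events must be positive")
    return flagged / events


flagged, hospitalized = 20, 1      # figures quoted in the text
ppv = hospitalized / flagged       # positive predictive value of the high-risk flag
per_event = flagged_per_event(flagged, hospitalized)

# Hypothetical outreach cost per flagged patient, for illustration only.
ASSUMED_COST_PER_PATIENT = 500.0
cost_per_event_reached = per_event * ASSUMED_COST_PER_PATIENT

print(f"PPV of the high-risk flag: {ppv:.0%}")
print(f"Flagged patients per hospitalization: {per_event:.0f}")
print(f"Outreach spend per hospitalization reached: ${cost_per_event_reached:,.0f}")
```

Under these assumed numbers, a program pays for 20 outreach efforts to reach the one patient who would have been hospitalized, which is why better precision matters so much for cost-effectiveness.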
Adding to the complexity, many patients with chronic conditions face barriers to consistent primary care access and may not seek medical attention until their condition becomes severe enough to require emergency care. This pattern often leads to worse health outcomes and higher costs than if the condition had been managed earlier in an outpatient setting. This is where the problem meets socioeconomic risk factors, structural determinants of health, and health care costs and insurance coverage.
Furthermore, the current healthcare system's reactive nature means that providers often don't receive updated patient health information between visits, creating gaps in care that can lead to complications that may have been otherwise preventable. This lack of continuous monitoring and early warning systems makes it difficult to identify and address health issues before they escalate to emergency care or hospitalization. This is a major barrier to proactive prevention and the design of active intervention programs.
However, the recent growth of, and reimbursement for, remote patient monitoring (see also, a great book!) may present a new opportunity to combine traditional and novel risk prediction methodologies with real-time, patient-generated health data.
Traditional Approaches to Risk Assessment and Stratification
Historically, healthcare providers have relied on relatively simple methods to assess patient risk and adjust care accordingly. The most common approaches include:
Clinical Algorithms: Healthcare providers have traditionally used straightforward scoring systems developed through retrospective linear regression analysis. These tools, like the APGAR score for newborns, serve more as quick clinical decision-making guides than precise prediction instruments.
Claims-Based Systems: Insurance companies and healthcare organizations use systems that analyze medical claims data to group patients into broad risk categories. These systems primarily look at demographics and diagnosis codes to predict future healthcare needs and costs.
Electronic Health Record Analysis: With the widespread adoption of electronic health records (EHRs) in the past decade (now over 90%), providers began using systems like the Johns Hopkins ACG System, which analyzes patient age, sex, and medical diagnoses to create risk groups. While effective for general categorization, these systems have significant limitations:
They use relatively few predictor variables
They assume linear relationships between risk factors and outcomes
They often lack precision in identifying individual patient risk (that cost issue again)
They rely on historical data that may be months or years old
They miss important day-to-day changes in patient health status
For example, when using the ACG system to predict 30-day pediatric hospitalizations, the highest-risk 5% of patients accounted for 43% of hospitalizations [3]. However, this 5% meant identifying over 46,000 patients in the sample data as high-risk, making targeted intervention programs challenging and potentially cost-ineffective.
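A quick back-of-the-envelope calculation shows both sides of that result. The sketch below uses only the figures quoted above. The variable names are my own, and the cohort size is an implied lower bound, not a number taken from the study.

```python
top_percent = 5            # share of patients flagged as highest risk (%)
captured_percent = 43      # share of hospitalizations among those patients (%)
flagged_patients = 46_000  # lower bound quoted in the text ("over 46,000")

# Lift: how much more concentrated hospitalizations are in the flagged
# group than they would be if patients were flagged at random.
lift = captured_percent / top_percent

# Implied minimum cohort size if 46,000 patients make up 5% of the sample.
implied_cohort = flagged_patients * 100 // top_percent

print(f"Lift over random selection: {lift:.1f}x")
print(f"Implied cohort size: at least {implied_cohort:,} patients")
```

A lift of 8.6 is respectable statistically, yet a care team still cannot realistically call 46,000 families.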
The Rise of Remote Patient Monitoring
A promising development in healthcare delivery is remote patient monitoring (RPM), which has grown significantly since 2019 when Medicare began providing reimbursement for these services. RPM programs use cellular-enabled medical devices and wearables to collect real-time health data from patients in their homes. These devices can track various vital signs and health metrics, transmitting this information directly to healthcare providers.
This patient-generated health data (PGHD) provides a continuous stream of health information that was previously unavailable to health care providers. Instead of only seeing data from occasional office visits, providers can now monitor patients' health status daily. RPM has already shown promise in reducing acute care visits, even without advanced predictive algorithms.
The Power of Machine Learning in Health Care
The real transformation comes from combining RPM data with advanced machine learning (ML) and artificial intelligence algorithms. Traditional risk prediction models typically use only electronic health records (EHR) and insurance claims data, which have significant limitations. These records may be outdated or incomplete, and they only capture snapshots of patient health during medical visits.
Machine learning algorithms can analyze vast amounts of data from multiple sources, including:
Electronic health records
Insurance claims
Census data
Daily patient-generated health data from RPM devices
Patient-reported symptoms and experiences
These algorithms can identify subtle patterns and risk factors that might escape human notice, potentially predicting health deterioration before traditional clinical signs appear. Many studies [4] of these models, using a variety of techniques, show excellent predictive accuracy and precision. However, there is room to grow, and there are opportunities to advance the implementation of these models into clinical practice.
Creating a Proactive Clinical Model
The integration of ML algorithms with RPM data could transform healthcare delivery from reactive to proactive and cost-effectively allow care management programs, primary care clinics, and centralized clinical intervention programs to prevent patient deterioration before it happens. Here's how such a system might work:
1. Patients with certain chronic conditions and elevated starting risk levels use RPM devices to monitor their health metrics daily.
2. ML algorithms continuously analyze this data alongside other health records and update risk as new data is transmitted.
3. When the system identifies increasing risk patterns, it alerts healthcare providers or care management teams to reach out to the patient for intervention.
4. Providers can intervene early with preventive measures like medications, intensive outpatient therapies, lifestyle modifications, medically-tailored meals, and outpatient procedures.
5. Regular monitoring continues, both to track the effectiveness of interventions and to keep watch over patients at risk. As the program collects more data, the precision and accuracy of the predictions can improve.
This approach offers several advantages over traditional methods like waiting for the patient to come in for a visit, or outreach from an insurance company for a disease management program (patients generally don’t trust their insurance companies):
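The steps above can be sketched as a small loop. This is a toy illustration under stated assumptions, not a real clinical system: the `Patient` class, `update_risk`, `needs_outreach`, the blending weight, the alert threshold, and the daily values are all hypothetical, and a simple exponential moving average stands in for a trained ML model.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    patient_id: str
    baseline_risk: float          # starting risk from existing records (step 1)
    risk: float = field(init=False)

    def __post_init__(self) -> None:
        self.risk = self.baseline_risk


def update_risk(patient: Patient, reading_risk: float, weight: float = 0.3) -> float:
    """Blend the latest reading-derived risk into the running estimate (step 2).

    An exponential moving average is a placeholder for a real model here.
    """
    patient.risk = (1 - weight) * patient.risk + weight * reading_risk
    return patient.risk


def needs_outreach(patient: Patient, threshold: float = 0.5) -> bool:
    """Flag the patient for the care team once risk crosses the threshold (step 3)."""
    return patient.risk >= threshold


patient = Patient("demo-001", baseline_risk=0.2)
daily_reading_risks = [0.2, 0.4, 0.9, 0.9, 0.9]   # made-up daily RPM signals

alert_day = None
for day, reading_risk in enumerate(daily_reading_risks, start=1):
    update_risk(patient, reading_risk)
    if alert_day is None and needs_outreach(patient):
        alert_day = day   # steps 4 and 5 (intervention and follow-up) would start here

print(f"First alert on day {alert_day}; current risk {patient.risk:.2f}")
```

In this made-up series the alert fires on day 4 rather than on the first bad reading, because the smoothing trades a little speed for fewer false alarms. A real program would have to tune that trade-off against clinical evidence.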
More timely identification of rising health risks
Better precision in targeting interventions
Continuous feedback on intervention effectiveness
Reduced burden on healthcare providers through automated monitoring
More efficient use of healthcare resources
The trick here is to have a prediction methodology that flags only the patients at the highest levels of risk, those who will actually have an event, so that alerts do not overwhelm the resources available for intervention. This is a critical problem to solve, but it is possible with recent advances in computing and artificial intelligence.
Once prediction and identification are solved, this becomes an operational and financial puzzle, as the delivery of interventions has a price tag and is subservient to the reimbursement and financial environment of health care, which is slow to change and adapt. However, early programs like the one described here are already in place in many value-based care organizations and full-risk capitation models. Technological advances in this area will improve the cost-effectiveness of existing programs and allow new ones to launch.
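One simple way to keep alerts within what a team can handle is to cap them at the team's daily capacity and send only the highest-risk patients above a minimum score. The sketch below illustrates that idea. The function name, the capacity, the risk floor, and the scores are all hypothetical, and this is not a description of how any specific program works.

```python
import heapq


def select_for_outreach(
    risk_scores: dict[str, float], capacity: int, min_risk: float = 0.0
) -> list[str]:
    """Return up to `capacity` patient IDs, highest risk first, above `min_risk`."""
    eligible = [(score, pid) for pid, score in risk_scores.items() if score >= min_risk]
    # nlargest sorts the (score, id) pairs in descending order of score.
    return [pid for _, pid in heapq.nlargest(capacity, eligible)]


# Made-up scores for five patients; the team can make two calls today.
todays_scores = {"a": 0.91, "b": 0.35, "c": 0.78, "d": 0.62, "e": 0.80}
call_list = select_for_outreach(todays_scores, capacity=2, min_risk=0.6)

print(call_list)
```

With these made-up scores the call list is patients "a" and "e". Patients "c" and "d" also clear the risk floor but wait for another day, which is exactly the kind of trade-off the intervention program has to manage.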
Implementation Challenges
While the potential benefits are significant, several challenges must be addressed:
Data Privacy and Security: The system involves collecting and analyzing sensitive health data, requiring robust privacy protections and security measures.
Clinical Integration: Healthcare providers need user-friendly interfaces and clear guidelines for using the ML predictions effectively in their practice. The operations of intervention programs are a critical component of success here. Putting all of the onus to act on a provider during a typical office visit defeats the purpose of the program and overburdens that process. The care model described above must be viewed as an independent but closely integrated process. I like to call it “Octopus Primary Care,” where data streams and connected devices are the tentacles and the intervention program is the main body.
Algorithm Transparency: The "black box" nature of some ML algorithms can make it difficult for healthcare providers to trust and properly interpret their predictions. Regulatory guidance and processes can help build trust, as can well-designed systems that display information to clinicians in an interpretable manner.
Cost and Access: While RPM services are increasingly covered by insurance, patient cost-sharing could still create barriers to access for some patients. Often those most affected by cost-sharing policy are the ones who need the services the most.
Looking Ahead
The combination of RPM-generated patient data and ML algorithms represents a promising approach to reducing preventable hospital visits. As these technologies continue to evolve and become more widely adopted, they could help create a more proactive, efficient healthcare system that better serves patients while reducing costs.
Success in this area could transform how we manage chronic conditions, shifting focus from treating acute episodes to preventing them. This could lead to better health outcomes, improved quality of life for patients, and significant cost savings for the healthcare system.
The key will be ensuring these technologies are implemented thoughtfully, with appropriate attention to clinical validation, user experience, and equitable access. As healthcare continues to evolve, this data-driven, proactive approach could become a standard part of chronic disease management, helping millions of patients avoid unnecessary hospitalizations while receiving better care in their communities.
To learn more about RPM programs themselves, you can check out this book.
1. Centers for Medicare & Medicaid Services. National Health Expenditure Data. https://www.cms.gov/data-research/statistics-trends-and-reports/national-health-expenditure-data
2. McDermott KW, Jiang HJ. (2020). Statistical brief #259: characteristics and costs of potentially preventable inpatient stays, 2017. Healthcare Cost and Utilization Project, US Agency for Healthcare Research and Quality. https://www.hcup-us.ahrq.gov/reports/statbriefs/sb259-Potentially-Preventable-Hospitalizations-2017.pdf
3. Maltenfort MG, Chen Y, Forrest CB. (2019). Prediction of 30-day pediatric unplanned hospitalizations using the Johns Hopkins Adjusted Clinical Groups risk adjustment system. PLoS ONE 14(8): e0221233. https://doi.org/10.1371/journal.pone.022123
4. For the statistics and data science nerds out there: for ML models, the current standard is an AUROC greater than 0.80 for the prediction of acute care utilization, with 0.84 the highest in the literature. A 2021 study comparing ML approaches to traditional methods produced AUROCs of 0.778 for potentially preventable hospitalizations and 0.681 for potentially preventable ED visits.
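For readers who have not met AUROC before: it is the probability that a randomly chosen patient who had an event received a higher risk score than a randomly chosen patient who did not, with ties counted as half. The sketch below computes it directly from that definition on a handful of made-up labels and scores. The function name and the data are my own, and the pairwise loop is meant for clarity, not for large datasets.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as the share of (event, non-event) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # event patient ranked above non-event patient
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = had an acute care visit, 0 = did not.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]
example_auroc = auroc(labels, scores)

print(f"AUROC: {example_auroc:.3f}")
```

An AUROC of 0.5 means the model ranks patients no better than a coin flip, and 1.0 means every patient with an event is ranked above every patient without one.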