Healthcare Transparency Is Not Enough. Employers Need Dependability.
Business leaders are used to managing uncertainty. But healthcare is one of the few major costs where employers are often making decisions without something eve...
America Forever Bytes
The groups sued over a federal rule limiting student loan borrowing for some graduate degree programs that impact healthcare professionals.
The 12-lead ECG hasn't changed in a century. The algorithms reading it have. Three CEOs and one educator on whether doctors should trust the model.
To qualify, Connecticuters must either owe medical debt worth 5% or more of their annual income, or their income must...
Some concrete measures being discussed include having standard treatment protocols for common ailments to start with, which could help define pricing, says Dr S...
As more retirees live longer, greater access to human-intensive care is needed, while the supply of caregivers can’t keep up with demand.
Come January, pregnancy care physician billing codes will change from a bundled system to an à la carte one.
In 2019, there were about 150,000 people working in autism therapy. Six years later, there were 654,000—more than the number of people who work in mining and ...
More than 97,000 Connecticut residents will receive letters in the mail starting this week informing them that some or all of their medical debt has been
Abstract page for arXiv paper 2606.03876: From 'What' to 'How' and 'Why': Sharing LLM-Generated Retrospective Summaries of Older Adults' Passive Tracking Data w...
Abstract page for arXiv paper 2606.03543: D2MDT: Department-aware Multidisciplinary Team Consultation with Deliberation for Efficient Clinical Prediction
Chronic pain patients deserve care, understanding, and support—not monthly suspicion or probation.
Background: Large language models (LLMs) require specialized methodologies to quantify model confidence for safe deployment in health care systems; however, there is a lack of established methods for confidence assessment.

Objective: This study aimed to evaluate confidence metrics for multimodal LLMs interpreting ultrasound-based radiology cases and to compare self-reported, consistency-based, and hybrid methods.

Methods: From a total of 330 quizzes on the Korean Society of Ultrasound in Medicine digital platform, we selected 94 multiple-choice cases. Four multimodal LLMs were evaluated: 3 reasoning models (GPT-5, Claude-4.5-Sonnet, and Gemini-3-Pro) and 1 general model (GPT-4o). Temperature was fixed at 1.0. Multiple confidence metrics were assessed: (1) self-reported metrics generated by LLMs using prompts that elicited direct confidence percentages with answers, including first self-reported confidence and mean self-reported confidence; (2) consistency-based metrics derived from 20 repeated outputs per case, including relative entropy calculated as 1 − H/log k (H=Shannon entropy, k=number of answer choices) and majority-vote percentage; and (3) a Top Weighted Score combining response frequency with self-reported confidence. Receiver operating characteristic analysis for discrimination and Spearman correlation between accuracy and each confidence metric were conducted. Additionally, model calibration was assessed using expected calibration error and Brier score. Processing time and token consumption (input, output, and total) were recorded for each application programming interface call to evaluate resource use across models.

Results: Diagnostic accuracy varied across models, with Gemini-3-Pro achieving the highest accuracy (70/94, 74.47%), surpassing the median human accuracy (59%, IQR 40.3%-75%). Top Weighted Score, a hybrid metric combining response frequency and self-reported confidence, was the only metric achieving statistically significant correlations across all 4 models: Gemini-3-Pro (ρ=0.52), GPT-5 (ρ=0.43), Claude-4.5-Sonnet (ρ=0.30), and GPT-4o (ρ=0.22). Receiver operating characteristic analysis revealed that Top Weighted Score demonstrated the highest discriminative ability, with area under the curve values of 0.826 (95% CI 0.731-0.920) for Gemini-3-Pro and 0.767 (95% CI 0.668-0.866) for GPT-5. Top Weighted Score was the only metric achieving statistical significance in GPT-4o. Calibration analysis showed that Top Weighted Score achieved the lowest expected calibration error in GPT-5 (0.098) and Claude-4.5-Sonnet (0.192), while Gemini-3-Pro showed comparable calibration between relative entropy (0.119) and Top Weighted Score (0.122). Resource use analysis demonstrated that reasoning models required substantially longer processing times and higher token consumption compared to general models.

Conclusions: In multimodal LLMs applied to ultrasound-based radiology cases, hybrid methods (Top Weighted Score) demonstrated significant associations across all evaluated models and appear to serve as more reliable indicators of diagnostic confidence compared to self-reported or consistency-based metrics alone, although the strength of these associations varied across models, and external validation is warranted before broader clinical application. These findings support integrative confidence estimation approaches that incorporate response consistency while highlighting the need for resource-efficient sampling strategies to enable practical clinical deployment.
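The consistency-based and calibration metrics named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names, the equal-width 10-bin layout for expected calibration error, and the use of natural logarithms are all choices made here for illustration. The Top Weighted Score is left out because the abstract does not give its formula.

```python
import math
from collections import Counter


def relative_entropy_confidence(answers, k):
    """Return 1 - H/log(k), where H is the Shannon entropy of the
    empirical answer distribution and k is the number of answer choices.
    1.0 means every repeated output agreed; 0.0 means a uniform spread."""
    n = len(answers)
    counts = Counter(answers)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return 1.0 - h / math.log(k)


def majority_vote(answers):
    """Return the most frequent answer and the fraction of outputs choosing it."""
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / len(answers)


def brier_score(confidences, correct):
    """Mean squared difference between confidence (0..1) and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(confidences, correct)) / len(correct)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |mean confidence - accuracy| over equal-width bins.
    The bin count and equal-width layout are assumptions of this sketch."""
    n = len(correct)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(confidences, correct):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(p for p, _ in b) / len(b)
            accuracy = sum(y for _, y in b) / len(b)
            ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Hypothetical case: 20 repeated outputs on a 4-choice question.
    outputs = ["A"] * 14 + ["B"] * 4 + ["C"] * 2
    print(relative_entropy_confidence(outputs, k=4))
    print(majority_vote(outputs))
```

In this framing, each case yields one confidence value per metric from its 20 repeated outputs, and the Brier score and expected calibration error are then computed across cases against whether the final answer was correct.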
The human population is growing larger and older, which means cancer cases and deaths are increasing, too.
As higher interest rates reshape commercial real estate, healthcare assets continue to show resilience. Explore how rising costs, disciplined investment strateg...
Our family moved abroad twice to different countries. We were surprised the biggest challenges involved housing and healthcare, not making friends.
Healthcare supply chain disruptions require new planning approaches. Learn how health systems can build resilience amid volatility and infrastructure challenges...
The global health care sector is under increasing strain. Decades of chronic underinvestment and constraints in recruitment have coincided with a surge in dem...
PPC Strategies for Plastic Surgeons help clinics attract high-intent patients, increase consultations, and improve ROI.
A new meta-analysis of 57 psychological studies found that essentially identical patient interviews can lead to different diagnoses for the same exact patient.
The mental shortcuts doctors use in diagnosis aren't that different from how chatbots come up with answers to your health questions.
Two Growth Stocks reported over the past month
Inclusion of weight loss drugs on insurer formularies is a meaningful step towards expanding access. But having a product listed doesn’t necessarily mean it?...
Healthcare costs shouldn’t eclipse the cost of owning a home, but for millions of Americans, that's exactly what's happening.
Abstract page for arXiv paper 2605.31511: Bayesian Nonparametric Clustering to Support Medical Decision-Making: A Variational Inference Approach
The performance of the portfolio’s income stocks shows why they deserve their positions.
Learn about the effects of ICU overcrowding on healthcare systems, highlighting the struggles faced by medical professionals and patients.
This week, a House Judiciary subcommittee will hold a hearing to examine patents and prescription drug prices.
Hospitals promised far more affordable healthcare, and they did not fulfill that promise. In fact, the opposite happened. Here's why.
There are strategies to improve healthcare, but the US isn't trying them.