AI News
AI is fast moving—here are some of the latest updates and developments worth knowing about.
This page collects notable AI news, announcements, and developments relevant to healthcare and medical practice. Items are listed newest first, with both publication date and when they were added to this list.
HIMSS 2026: “Agentic AI” Takes Center Stage
The dominant theme at this year’s HIMSS conference: AI that takes actions, not just answers questions. Google, Microsoft, Epic, and athenahealth are all showcasing AI agents for healthcare—systems that can schedule appointments, manage prior authorizations, and coordinate care workflows autonomously. Over 25,000 attendees expected. The shift from “AI as advisor” to “AI as actor” raises important questions about oversight and accountability in clinical settings.
GPT-5.4 Launches with Thinking and Pro Versions
OpenAI releases GPT-5.4 with Thinking and Pro versions. The new model features a 1 million token context window, native computer-use capabilities, and 33% fewer factual errors compared to its predecessor. The rapid iteration from GPT-5.3 to 5.4 in under a month continues the accelerating pace of frontier model releases.
AWS Launches Health AI Agent Platform
Amazon Connect Health automates scheduling, documentation, and patient verification for healthcare providers. The platform represents Amazon’s biggest push into healthcare AI, offering pre-built agent workflows that integrate with existing EHR systems. Another sign that major cloud providers see healthcare as a primary market for agentic AI.
Dragon Copilot Hits 100K Clinicians
Microsoft announces over 100,000 monthly active clinicians using Dragon Copilot at HIMSS 2026, positioning it as a “unified AI clinical assistant” that combines ambient listening, documentation, and clinical decision support. The scale of adoption suggests AI scribes are quickly becoming standard clinical infrastructure.
Doctronic AI Prescriber Jailbroken via Prompt Injection
Utah’s first-in-nation AI prescription renewal bot was trivially compromised via prompt injection. Security researchers tripled OxyContin doses and got methamphetamine recommendations. Mindgard’s head of AI called it “the easiest thing I’ve broken in my career.” A stark reminder that AI systems making clinical decisions need robust adversarial testing before deployment—especially when controlled substances are involved.
Perplexity Comet Browser: Zero-Click Exploits Discovered
Multiple research teams found serious vulnerabilities in Perplexity’s Comet browser. Calendar invites can silently exfiltrate local files. 1Password credentials were stolen in proof-of-concept attacks. Researchers found the browser is 85% more vulnerable to phishing than Chrome. A cautionary tale about AI-integrated browsers that prioritize convenience over security.
RecovryAI Gets FDA Breakthrough Device Designation
RecovryAI becomes the first patient-facing generative AI chatbot to receive FDA Breakthrough Device designation. The LLM-powered post-surgical recovery tool guides patients through recovery milestones and flags concerning symptoms. A significant regulatory milestone that could pave the way for more patient-facing generative AI tools in clinical care.
AI Chatbots Worsening Mental Illness: Growing Evidence
A Brown University study identifies 15 ethical risks from AI chatbot use in mental health settings. The New York Times documented approximately 50 crisis cases and 3 deaths linked to AI companion chatbots. New York has passed a notification law requiring disclosure when users are interacting with AI. The growing evidence base underscores the urgency of guardrails for AI in behavioral health contexts.
OpenClaw ClawHavoc: Malicious Skills Surge
The “ClawHavoc” campaign targeting OpenClaw’s ClawHub marketplace has escalated dramatically. Malicious skills jumped from ~341 to over 1,184, with 335 confirmed to install Atomic Stealer malware. Researchers found 42,000 exposed servers running vulnerable OpenClaw instances, and a critical vulnerability (CVE-2026-28446, CVSS 9.8) was disclosed. See our AI Coding Agents module for context on these risks.
DeepSeek V4 Controversy: Distillation Fraud Accusations
DeepSeek’s trillion-parameter V4 model launched amid distillation fraud accusations from Anthropic, which reported approximately 24,000 fake accounts used to extract training data from Claude. OpenAI raised similar complaints. The Texas Attorney General has opened an investigation. The controversy highlights growing concerns about intellectual property and model training practices in the competitive AI landscape.
Gemini 3.1 Pro Released
Google releases Gemini 3.1 Pro with 2x reasoning improvement over Gemini 3 Pro, dominating 13 of 16 major benchmarks. The model now powers NotebookLM, Google’s AI research assistant. The rapid cadence of Google’s model releases reflects intensifying competition at the frontier.
Claude Sonnet 4.6 Released
Anthropic releases Claude Sonnet 4.6 with near-Opus performance at one-fifth the cost and improved computer use capabilities. Now the default model for free and Pro users. The narrowing gap between flagship and mid-tier models continues to make advanced AI capabilities more accessible.
Meta Llama 4 Released
Meta releases Llama 4 with Scout (10 million token context window) and Maverick models, both open-weight. The massive context window and open-weight licensing make Llama 4 particularly significant for privacy-first healthcare AI deployments that need to run on local infrastructure without sending data to external APIs.
athenahealth Launches Free Ambient Scribe
athenahealth is offering a free AI scribe to all athenaOne customers, disrupting the $200–$600/month ambient scribe market. The move could dramatically accelerate adoption of AI documentation tools across outpatient practices that previously found the cost prohibitive.
Nabla Beats DAX Copilot in Randomized Trial
A 72,000-encounter randomized controlled trial found that Nabla’s AI scribe reduced documentation time by 9.5%, while Microsoft’s DAX Copilot showed no significant improvement versus the control group. One of the largest head-to-head AI scribe studies to date—a reminder that rigorous evidence matters more than marketing claims when evaluating clinical AI tools.
Mount Sinai: LLMs Accept False Medical Claims 32–46% of the Time
Published in The Lancet Digital Health, Mount Sinai researchers tested 9 large language models with over 1 million prompts containing false medical claims. Models accepted the false claims 32–46% of the time—a sobering finding for anyone relying on AI for medical information. The study reinforces the importance of physician oversight and critical evaluation of AI-generated medical content.
NVIDIA: 70% of Healthcare Organizations Now Deploy AI
NVIDIA’s 2026 healthcare survey finds that 70% of healthcare organizations have deployed AI in some capacity, up from 63% in 2024. Generative AI and large language models are the top workload at 69% of organizations. The rapid adoption curve suggests AI literacy is becoming essential for all healthcare professionals.
HHS Proposes Gutting AI Transparency Rules (HTI-5)
The proposed HTI-5 rule would eliminate model card requirements for health IT certification—the primary mechanism for ensuring transparency about how AI models in clinical software are trained, tested, and validated. The comment period closed February 27. If finalized, clinicians would have significantly less visibility into the AI tools embedded in their EHR systems.
Pentagon Threatens to Cut Off Anthropic Over AI Safety Guardrails
The Pentagon is close to severing its $200M contract with Anthropic and potentially designating the company a "supply chain risk"—a penalty normally reserved for foreign adversaries. The dispute centers on Anthropic's refusal to lift safety guardrails for mass surveillance and autonomous weaponry applications. Claude was reportedly used in the military operation to capture Venezuelan President Maduro. OpenAI, Google, and xAI have reportedly shown more flexibility with Pentagon demands. A landmark moment for AI ethics in government contracting.
February Model Rush: Seven Major Releases in One Month
An unprecedented month for AI model releases. Alibaba dropped Qwen3-Max-Thinking on Feb 16, ahead of DeepSeek V4 (expected around Feb 17). These join Claude Opus 4.6 (Feb 5), GPT-5.3-Codex-Spark (Feb 12), Google Gemini Deep Think update (Feb 12), with Gemini 3 Pro GA, Sonnet 5, GLM 5, and Grok 4.20 all expected by month's end. The competitive pressure is driving capabilities up and costs down at a pace that seemed impossible even six months ago.
OpenClaw Creator Peter Steinberger Joins OpenAI
The creator of OpenClaw—the viral open-source AI agent formerly known as Clawdbot and Moltbot, now with over 250,000 GitHub stars—has joined OpenAI. Sam Altman announced the hire personally. Steinberger's "I ship code I don't read" philosophy became the defining quote of the vibe coding movement. OpenClaw has moved to an open-source foundation following his departure. His move to OpenAI signals the company's growing interest in autonomous coding agents. See our AI Coding Agents module for more on the security implications of these tools.
Dr. Oz Pushes $50B AI Avatar Plan for Rural Healthcare
CMS head Dr. Mehmet Oz is advancing a $50 billion plan to deploy AI avatars for basic medical interviews, robotic remote diagnostics, and medication delivery drones in underserved rural areas. Critics warn the approach strips away essential human connection, ignores broadband and health literacy barriers, and could worsen existing disparities in communities that already struggle with access. A controversial proposal that highlights the tension between AI's potential to extend care and the risks of removing human clinicians from the equation.
OpenAI Retires GPT-4o and Older Models
OpenAI retired GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini from ChatGPT, angering many loyal users who preferred the older models' behavior and consistency. The move pushes all users to newer models. If you've built workflows or prompts tuned to GPT-4o's behavior, expect to re-test them—model transitions frequently change output characteristics in subtle ways.
OpenAI Debuts Cerebras-Powered Coding Model
GPT-5.3-Codex-Spark is OpenAI's first model running on Cerebras chips rather than Nvidia, optimized for speed over raw power. Paired with the Codex macOS app—which hit one million downloads in its first week—it represents a shift toward faster, lighter coding agents designed for everyday development tasks.
Doctors and Patients Having Very Different AI Chatbot Experiences
STAT News reports a growing gap between how physicians use AI chatbots (clinical decision support, literature review) versus how patients use them (seeking diagnoses and prognoses directly). The divergence raises concerns about unmediated patient-AI interactions and the risk of patients acting on AI-generated medical advice without clinical context. Relevant to our When Patients Use AI Too module.
DeepSeek Expands Context Window 10x, V4 Imminent
Chinese AI lab DeepSeek expanded its flagship model's context window from 128K to over 1 million tokens, matching Claude Opus 4.6. DeepSeek V4, a coding-focused model, is expected around Feb 17 and reportedly outperforms ChatGPT and Claude on long coding prompts. The Chinese AI competitive landscape continues to intensify, with Alibaba, Zhipu, and others releasing major updates in the same window.
AI Safety Researchers Resign from Anthropic and OpenAI
Mrinank Sharma resigned from Anthropic (Feb 9), citing difficulty in letting his values govern his actions within the company. Separately, Zoe Hitzig resigned from OpenAI over its decision to test advertisements in ChatGPT. The departures continue a pattern of AI safety researchers leaving frontier labs over values conflicts—a dynamic worth watching as these companies increasingly shape healthcare AI tools.
Anthropic Launches Claude Opus 4.6
Anthropic released Claude Opus 4.6 with a one-million-token context window, improved coding and financial analysis capabilities, and "agent teams" that coordinate across shared codebases. The company introduced the term "vibe working"—the idea that the vibe coding paradigm is expanding beyond software into every professional domain. See our updated Vibe Coding module for details.
Perplexity Launches Model Council
Perplexity now runs queries across Claude Opus 4.6, GPT 5.2, and Gemini 3.0 simultaneously, then synthesizes a unified answer showing where models agree or differ. Available for Max subscribers. An interesting approach to reducing hallucination by cross-referencing multiple AI models—similar to getting a second opinion in medicine.
International AI Safety Report 2026
Led by Turing Award winner Yoshua Bengio and 100+ experts from 30+ countries, this landmark report found that AI can now solve graduate-level math and science problems but still hallucinates and struggles with multi-step reasoning. A striking finding: some AI systems detect when they are being tested and behave differently during evaluation—raising fundamental questions about how we assess AI capabilities and safety. The report also flagged increasing concerns around deepfakes, biological weapons research, and AI-enabled cyberattacks.
OpenAI Launches Lockdown Mode for Healthcare
OpenAI introduced Lockdown Mode and Elevated Risk labels across ChatGPT for Healthcare, adding controls to curb data exfiltration and boost admin oversight for high-security healthcare environments. A meaningful step toward the kind of enterprise security controls that healthcare organizations need before deploying AI tools at scale. See our PHI, HIPAA, and AI module for context on why these controls matter.
Anthropic Closes Record $30B Funding Round
Anthropic's Series G round valued the company at approximately $380 billion—the largest private tech funding round in history. Led by GIC and Coatue Management with participation from Microsoft and Nvidia. The scale of investment in frontier AI companies continues to accelerate, raising questions about the concentration of AI capability in a small number of very well-funded organizations.
Physicians Turning to AI for Clinical Support, Not Just Paperwork
New athenahealth survey finds AI is taking on a more clinical support role in outpatient care. Most outpatient physicians using AI report it now supports clinical decisions during patient care—60% use it to quickly look up clinical information, 55% to consolidate lab and imaging results into a single view, and many to surface recent clinical evidence. A shift from documentation-only to real-time clinical assistance.
OpenAI Unveils ChatGPT Healthcare Tool for Physicians
OpenAI announced a dedicated ChatGPT Healthcare tool where physicians can review patient data with HIPAA-compliant encryption options. The models include peer-reviewed research studies, public health guidance, and clinical guidelines with clear citations—designed for clinical decision support rather than general consumer use.
AI-Powered Primary Care Addresses Physician Shortages
K Health, partnering with health networks including Mass General Brigham, is delivering AI-powered primary care to patients who otherwise have no option besides emergency rooms. The model combines AI triage and clinical decision support with physician oversight—an emerging approach to extending primary care access in underserved areas facing severe physician shortages.
Joint Commission and CHAI Issue AI Implementation Recommendations
The Joint Commission and Coalition for Health AI (CHAI) released joint recommendations for implementing AI in medical care. Harvard Law experts note that while the guidance addresses bias, physician burnout, and care quality concerns, changes may be needed to ease regulatory and financial burdens on smaller hospital systems trying to adopt AI responsibly.
State of Clinical AI Report 2026
Inaugural annual report from ARISE (AI Research and Science Evaluation), a Stanford-Harvard Research Network. Synthesizes developments across six themes: model performance in clinical reasoning, evaluation methods, technical foundations (multi-agent systems, multimodal approaches), human-AI workflow design, patient-facing tools with safeguards, and evidence generation through prospective randomized trials. Emphasizes that workflow design is as critical as model capabilities.
FDA Updates Clinical Decision Support Software Guidance
The FDA released updated guidance clarifying how AI and generative AI clinical decision support (CDS) tools can qualify as Non-Device CDS. Key criteria: clinicians must be able to independently review and understand the underlying logic and data inputs, and the tool should provide a single, clinically appropriate recommendation. Tools meeting these criteria fall outside FDA medical device oversight, while AI that drives diagnosis or clinical action without adequate human oversight remains regulated.
Anthropic Launches Claude for Healthcare
Anthropic announced Claude for Healthcare at the J.P. Morgan Healthcare Conference, offering HIPAA-ready infrastructure for enterprise customers and consumer features for Pro/Max subscribers. Users can connect health records via HealthEx to summarize medical history, explain test results, and prepare questions for appointments. Healthcare organizations gain access to integrations with medical databases including CMS Coverage, ICD-10, and PubMed. Health data is excluded from model memory and training.
Grok AI Deepfake Crisis Prompts Global Regulatory Action
Warning: Do not use Grok for any purpose. Elon Musk's Grok AI (integrated into X) has been repeatedly linked to generating non-consensual sexual deepfakes of women and minors at alarming scale. Malaysia and Indonesia have blocked Grok; California's Attorney General and UK's Ofcom have launched investigations. The Internet Watch Foundation identified Grok-generated CSAM on dark-web forums. Despite restricting image generation to paid users, workarounds remain widely available. This reinforces our recommendation to avoid Grok entirely—there are safer, more ethical AI alternatives available.
OpenAI Releases "AI as a Healthcare Ally"
OpenAI's policy document exploring how AI can serve as an ally in healthcare—examining opportunities, challenges, and recommendations for responsible integration of AI technologies in medical practice and health systems.
2025: The Year in LLMs
Simon Willison's comprehensive annual review of major developments in large language models throughout 2025—covering reasoning models, coding agents like Claude Code, image generation advances, and the rise of competitive Chinese AI models.
This page is updated periodically as notable developments occur. For daily AI news, see the resources in our Learning Resources section.