When AI starts to see: the next shift in clinical documentation

Last updated on 20 March 2026

Authors: Bradley Menz, an academic pharmacist in Flinders University's College of Medicine and Public Health, and Associate Professor Ashley Hopkins

For years, AI scribes have promised to give clinicians something they rarely have enough of: time.

By listening to consultations and turning conversations into structured notes, these tools have begun to reduce the administrative burden that quietly consumes much of modern healthcare. Yet despite the progress, they have operated with a fundamental limitation. They can only capture what is said.

A new study from Flinders University suggests that is about to change, and the implications go well beyond efficiency.

More than words: why vision matters

Healthcare has never been purely verbal. It is visual, contextual and often dependent on details that are never spoken aloud.

Medication packaging, dosage labels, devices, even subtle cues in how a patient presents can all carry clinical meaning. When AI scribes rely on audio alone, those details are either missed or left for clinicians to manually fill in later.

The Flinders study explored what happens when AI is given both sight and sound, using smart glasses to capture video alongside audio during consultations. The difference was not incremental.

The vision-enabled system achieved 98% accuracy, compared with 81% for audio-only tools, with the most significant gains coming from a reduction in missing information rather than better transcription of what was already said.

That distinction matters, because in clinical documentation, what is omitted can be just as important as what is recorded.

The detail that gets missed

Medication management offers a clear example of the problem. Much of the critical information required for safe prescribing and administration is visual rather than verbal.

Strength, formulation and packaging details are often read directly from labels, not discussed in full during a consultation. As a result, audio-only systems struggle to capture them accurately.

In the study, audio-only AI captured medication strength and form just 28% of the time, compared with 97% when visual input was included.

That gap is not a technical curiosity. It represents a real-world risk, particularly in settings such as aged care where residents often have complex medication regimens and where documentation errors can have serious consequences.

From transcription to interpretation

What emerges from this shift is a subtle but important change in how AI functions within clinical workflows.

Rather than simply transcribing conversations, a vision-enabled system begins to interpret the environment. It can cross-check what is said against what is seen, identify inconsistencies and capture details that would otherwise require additional effort from clinicians.

In practical terms, this has the potential to reduce the time spent editing and verifying documentation, allowing clinicians to focus more on care.

At the same time, it reshapes the clinician’s role. The task moves from writing notes to reviewing and validating AI-generated outputs, which introduces a different kind of responsibility.

The risk no one can ignore

Even at 98% accuracy, the system is not infallible.

The study identified a small number of errors, including incorrect medication details and misinterpretation of clinical context. While these were relatively infrequent, they highlight a critical point: high accuracy does not eliminate risk; it changes where that risk sits.

If clinicians begin to rely too heavily on AI-generated documentation without rigorous review, the consequences of those errors could be amplified.

The researchers are clear that these tools are intended to support, not replace, clinical judgement. The challenge lies in ensuring that distinction remains meaningful in day-to-day practice.

What this means for aged care

For the aged care sector, the implications are significant.

Medication safety is already under intense scrutiny, and providers are navigating increasing expectations around documentation, compliance and clinical governance. A tool that can improve accuracy and reduce omissions has obvious appeal.

However, aged care also presents a more complex environment for implementation. Residents often have multiple conditions and medications, workforce capability varies, and digital systems are not always well integrated.

There are also legitimate concerns around privacy and consent, particularly when video recording is involved. Capturing visual data during care interactions introduces new governance considerations that providers will need to address carefully.

The operational reality

Beyond the clinical benefits, there is the practical question of how these systems fit into existing workflows.

In the study, recordings were processed after the consultation rather than in real time. While this approach is suitable for research, it is less aligned with the pace of everyday care delivery.

For vision-enabled AI scribes to be widely adopted, they will need to operate seamlessly within clinical environments, integrating with existing systems and delivering outputs quickly enough to be useful. Without that, even the most advanced technology risks becoming another underutilised tool.

The bigger shift

What this research ultimately points to is a broader evolution in healthcare AI.

The move towards multimodal systems, capable of combining speech, vision and contextual data, opens the door to more comprehensive and accurate clinical support. Medication histories are only one application.

Similar approaches could be applied to wound assessment, mobility monitoring, behavioural observation and other areas where visual information plays a central role.

This is not simply about improving documentation. It is about expanding how AI can support clinical decision-making and care delivery.

The bottom line

AI scribes that can see as well as hear are likely to become a standard part of healthcare workflows.

Their ability to reduce omissions and improve accuracy addresses a longstanding challenge in clinical documentation. However, the value of these tools will depend less on the technology itself and more on how they are implemented.

Clear workflows, strong governance and ongoing human oversight will be essential to ensure that the benefits are realised without introducing new risks.

As AI becomes more capable, the role of clinicians does not diminish. If anything, it becomes more critical, because the responsibility for interpreting, validating and acting on AI-generated information remains firmly human.

Tags:
research
aged care technology
AI in aged care
AI in health