UCSF Urology Builds AI Tools to Accelerate Clinical Research at Scale

Submitted on February 6, 2026

UCSF Urology faculty member Anobel Odisho, MD, MPH, recently co-authored two studies that demonstrate how artificial intelligence can be put to work in a practical, scalable way to support clinical research. By lowering technical barriers, the work enables faster discovery, supports more nuanced clinical guidelines, and lays the groundwork for improved patient outcomes. 

Together, the two studies show how generative AI can move from isolated proof-of-concept projects to a reusable capability that can be applied across many research questions. The feasibility study focused on infrastructure, demonstrating that data points could be extracted from prostate cancer MRI reports with high accuracy and low variability while requiring little upfront programming. The technical implementation study validated that generative AI models can accurately extract key clinical variables from free-text prostate MRI reports, converting narrative data into structured formats suitable for analysis. Dr. Odisho and his team developed a reusable pipeline and user-facing interface that allow researchers to apply these methods to new datasets and questions without custom engineering.
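Conceptually, the extraction step works along the lines of the sketch below. This is not the team's actual pipeline or schema: the call_llm function is a placeholder for whatever approved model endpoint an institution uses, and the field names are illustrative only.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for an approved, HIPAA-compliant model endpoint.
    Returns a canned response here so the sketch runs end to end."""
    return ('{"prostate_volume_cc": 42, "pirads_score": 4, '
            '"lesion_count": 1, "extraprostatic_extension": "absent"}')

# Illustrative extraction prompt; the study's actual variables may differ.
EXTRACTION_PROMPT = """\
Extract the following fields from the prostate MRI report below and
return them as JSON: prostate_volume_cc, pirads_score, lesion_count,
extraprostatic_extension. Use null when a field is not stated.

Report:
{report_text}
"""

def extract_variables(report_text: str) -> dict:
    """Convert one free-text MRI report into a structured record."""
    return json.loads(call_llm(EXTRACTION_PROMPT.format(report_text=report_text)))

record = extract_variables(
    "Prostate volume 42 cc. Single PI-RADS 4 lesion in the left "
    "peripheral zone. No extraprostatic extension."
)
print(record["pirads_score"])  # -> 4
```

The key idea is that the prompt and output schema, rather than custom code, define each new extraction task, which is what makes the approach reusable across datasets and questions.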

“We went from tinkering in the garage to building a factory,” Dr. Odisho said. “The goal was to make this usable for clinical researchers, not just people with deep technical training.” This work began well before the recent surge of interest in generative AI. The team started developing and testing natural language processing approaches in 2019, using transformer-based models and academic collaborations to lay the groundwork for today’s more mature tools. 

Unlocking Clinical Data at Scale 

By converting unstructured text such as radiology reports, pathology reports, and clinic notes into structured data, these tools enable research that would otherwise be slow or impractical. Clinically, this approach supports bulk identification of patients lost to follow-up, patients needing repeat imaging or escalation of care, and individuals reporting concerning symptoms. It enables large-scale risk stratification, development of clinical decision support tools, and automated screening for clinical trial eligibility.
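As a hypothetical illustration of that kind of bulk screening, the sketch below flags patients whose structured records suggest overdue surveillance imaging. The fields, threshold, and interval are invented for the example; they are not clinical guidance and not the study's actual logic.

```python
from datetime import date, timedelta

# Hypothetical structured records produced by an extraction step like the one above.
records = [
    {"mrn": "A001", "pirads_score": 4, "last_mri": date(2024, 1, 10)},
    {"mrn": "A002", "pirads_score": 2, "last_mri": date(2025, 6, 2)},
]

# Example rule: flag higher-risk patients whose last MRI is older than a
# chosen surveillance interval. Values here are illustrative only.
SURVEILLANCE_INTERVAL = timedelta(days=365)

def needs_follow_up(record: dict, today: date) -> bool:
    high_risk = (record["pirads_score"] or 0) >= 3
    overdue = today - record["last_mri"] > SURVEILLANCE_INTERVAL
    return high_risk and overdue

overdue_patients = [r["mrn"] for r in records if needs_follow_up(r, date.today())]
print(overdue_patients)
```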

These capabilities have been valuable in kidney cancer research, including surveillance of patients with small renal masses, where timely follow-up and longitudinal data are essential. 

While the use of AI and large language models (LLMs) in health care research is becoming increasingly common, Dr. Odisho’s work focuses on a critical but often overlooked challenge: building tools and infrastructure that allow researchers to apply these methods reliably across many studies, without reinventing the wheel each time. 

“When health care transitioned to electronic health records, there was this promise that we’d be able to use all this data to do amazing things,” Dr. Odisho said. 

That promise, however, has proven difficult to realize in day-to-day research and clinical practice. “A lot of that data is still stored as unstructured text notes,” Dr. Odisho said. “It’s not something on which you can easily calculate or compute.” 

Much of the information needed for research and patient-facing decisions, including risk stratification, cancer grade and stage, or the presence of a mass, is recorded in narrative reports rather than structured fields. As a result, extracting that information at scale has traditionally required time-consuming manual abstraction or highly specialized technical expertise.

With Dr. Odisho’s work, that is starting to change. 

Accuracy, Context and Responsible Use

Across tasks, the models achieved approximately 95% accuracy, though Dr. Odisho emphasizes that accuracy depends on the task and the clarity of the source data. The models perform best when extracting specific, clearly stated facts and struggle when the reports themselves are ambiguous, such as pathology findings with equivocal interpretations.

“In situations where the ground truth isn’t clear, you can’t expect the model to manufacture the right answer,” Dr. Odisho said. 

In reviewing cases where the model’s extraction differed from manually extracted data, the team often found that the manual, human-entered data were incorrect, which highlights both the limitations of traditional abstraction and the importance of keeping clinicians in the loop. Lower-risk tasks may tolerate lower accuracy, while high-stakes decisions always require human oversight. 
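A hypothetical version of that comparison might look like the sketch below, which computes per-field agreement between model-extracted and manually abstracted values and lists discordant cases for clinician review. The values are made up for illustration.

```python
def field_agreement(model_values: list, manual_values: list):
    """Return the agreement rate and the indices of discordant cases."""
    assert len(model_values) == len(manual_values)
    pairs = list(zip(model_values, manual_values))
    matches = sum(m == h for m, h in pairs)
    discordant = [i for i, (m, h) in enumerate(pairs) if m != h]
    return matches / len(pairs), discordant

accuracy, to_review = field_agreement(
    model_values=[4, 3, 5, 2],   # e.g., PI-RADS scores extracted by the model
    manual_values=[4, 3, 4, 2],  # e.g., PI-RADS scores from chart abstraction
)
print(f"Agreement: {accuracy:.0%}; cases needing human review: {to_review}")
```

In practice, the discordant cases are the ones worth a clinician's time, since review may reveal either a model error or, as the team found, an error in the manually abstracted data.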

Accessing AI Tools 

Looking ahead, Dr. Odisho sees the greatest impact not in any single study, but in how these tools enable others to work faster and more effectively. Over time, more structured clinical data will support new research questions, improved adherence to guidelines, and higher-quality care at scale. UCSF researchers interested in using AI in their own work can access Versa, a secure, UCSF-approved platform for working with large language models and patient data. Within the Department of Urology, investigators can also use UODB-LLM (Urologic Outcomes Database – Large Language Model), a user-friendly tool for extracting and storing structured clinical data. Access requires standard research compliance approvals but no specialized AI training.