
Accurate Identification of Colonoscopy Quality and Polyp Findings Using Natural Language Processing


Aim:

The aim of this study was to test the ability of a commercially available natural language processing (NLP) tool to accurately extract examination quality–related and large polyp information from colonoscopy reports with varying report formats.


Background:

Colonoscopy quality reporting often requires manual data abstraction. NLP is another option for extracting information; however, limited data exist on its ability to accurately extract examination quality and polyp findings from unstructured text in colonoscopy reports with different reporting formats.

Study Design:

NLP strategies were developed using 500 colonoscopy reports from Kaiser Permanente Northern California and then tested using 300 separate colonoscopy reports that underwent manual chart review. Using findings from manual review as the reference standard, we evaluated the NLP tool’s sensitivity, specificity, positive predictive value (PPV), and accuracy for identifying colonoscopy examination indication, cecal intubation, bowel preparation adequacy, and polyps ≥10 mm.
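
The evaluation described above compares each NLP-extracted value with the manual-review value for the same report. As a minimal sketch of how the four reported measures follow from that comparison, the Python below tallies a 2x2 table and derives sensitivity, specificity, PPV, and accuracy. The names `Counts`, `tally`, and `summarize` and the toy labels are hypothetical illustrations, not the study's code or data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """2x2 table with manual review as the reference standard."""
    tp: int  # NLP positive, manual review positive
    fp: int  # NLP positive, manual review negative
    fn: int  # NLP negative, manual review positive
    tn: int  # NLP negative, manual review negative


def tally(reference, predicted):
    """Count agreement between manual-review labels and NLP labels."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    c = Counts(0, 0, 0, 0)
    for ref, pred in zip(reference, predicted):
        if pred and ref:
            c.tp += 1
        elif pred and not ref:
            c.fp += 1
        elif not pred and ref:
            c.fn += 1
        else:
            c.tn += 1
    return c


def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def summarize(c):
    """Derive the four measures named in the study design."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


if __name__ == "__main__":
    # Toy labels for one variable (e.g. "polyp >= 10 mm present"); not study data.
    manual = [True, True, True, False, False, False, False, False]
    nlp = [True, True, False, True, False, False, False, False]
    print(summarize(tally(manual, nlp)))
```

In this sketch each variable (indication, cecal intubation, bowel preparation adequacy, large polyp) would get its own pair of label lists, one entry per report.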


Results:

The NLP tool was highly accurate in identifying examination quality–related variables from colonoscopy reports. Compared with manual review, sensitivity for screening indication was 100% (95% confidence interval: 95.3%–100%), PPV was 90.6% (82.3%–95.8%), and accuracy was 98.2% (97.0%–99.4%). For cecal intubation, sensitivity was 99.6% (98.0%–100%), PPV was 100% (98.5%–100%), and accuracy was 99.8% (99.5%–100%). For bowel preparation adequacy, sensitivity was 100% (98.5%–100%), PPV was 100% (98.5%–100%), and accuracy was 100% (100%–100%). For polyp(s) ≥10 mm, sensitivity was 90.5% (69.6%–98.8%), PPV was 100% (82.4%–100%), and accuracy was 95.2% (88.8%–100%).
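
The abstract does not state how the 95% confidence intervals were computed. One common choice for proportions such as sensitivity and PPV is the exact (Clopper-Pearson) binomial interval, and the sketch below shows how such an interval could be obtained with the standard library only. Treat the method, the function names, and the counts in the usage example as assumptions for illustration, not as the authors' procedure or data.

```python
from math import comb


def upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1))


def lower_tail(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, x + 1))


def _solve_increasing(g, iterations=100):
    """Bisection root of an increasing function g on [0, 1] with g(0) < 0 < g(1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes out of n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    half = alpha / 2
    # Lower bound: the p at which seeing x or more successes has probability alpha/2.
    lower = 0.0 if x == 0 else _solve_increasing(lambda p: upper_tail(x, n, p) - half)
    # Upper bound: the p at which seeing x or fewer successes has probability alpha/2.
    upper = 1.0 if x == n else _solve_increasing(lambda p: half - lower_tail(x, n, p))
    return lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 48 true positives among 50 reference-positive reports.
    low, high = clopper_pearson(48, 50)
    print(f"sensitivity = {48 / 50:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

When every case is classified correctly (x equals n), the upper bound is 100% and the lower bound reduces to (alpha/2)^(1/n), which is why estimates of 100% still carry a lower limit that depends on the sample size.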


Conclusions:

NLP yielded a high degree of accuracy for identifying examination quality–related and large polyp information from diverse types of colonoscopy reports.

Keywords: colonoscopy; natural language processing; quality

Document Type: Research Article

Affiliations: 1: Department of Medicine, Division of Gastroenterology, University of California San Francisco, San Francisco, CA 2: Division of Research, Kaiser Permanente Northern California, Oakland, CA 3: Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 4: Department of Family Medicine, University of Pennsylvania, Philadelphia, PA

Publication date: January 1, 2019
