Content-based Document Enhancement and Resizing
Abstract: Recent advances in information and communications technologies have increased the need for automated reading and processing of documents. Most of today's documents contain not only text and background, but also graphics, tables, and images. Common image enhancement and interpolation methods apply a single interpolation or enhancement function indiscriminately to the whole image, and the resulting document image suffers from objectionable moiré patterns, edge blurring, and aliasing. Scanned documents must therefore often be segmented before other document processing techniques, such as filtering, resizing, and compression, can be applied. In this paper, we present a new system that segments document images and labels the regions as text, halftone images, or background using feature extraction and unsupervised clustering. Once the segmentation is performed, a specific enhancement or interpolation kernel can be applied to each document component. Each pixel is assigned a feature pattern consisting of a scaled family of differential geometrical invariant features and texture features extracted from the co-occurrence matrix. The invariant feature pattern is then assigned to a specific region using a two-stage neural network system. The first stage is a self-organizing principal components analysis (SOPCA) network, which projects the feature vector onto its leading principal axes as found by principal components analysis (PCA). The second stage is a self-organizing feature-map (SOFM) network, which clusters the output of the SOPCA network into different regions. We demonstrate the power of the SOPCA-SOFM approach to segment document images into text, halftone, and background. The proposed filtering and interpolation method results in a noticeable improvement in the enhanced image.
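The two-stage idea described in the abstract (project per-pixel feature vectors onto leading principal axes, then cluster the projection with a self-organizing map) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names `pca_project`, `train_sofm`, and `assign_labels` are hypothetical, a batch SVD stands in for the self-organizing PCA network, the map is one-dimensional with three nodes and a linearly decaying learning rate and neighborhood width, and synthetic 6-dimensional vectors stand in for the differential invariant and co-occurrence texture features.

```python
import numpy as np

def pca_project(features, n_components):
    """Project row-wise feature vectors onto their leading principal axes.
    Assumption: batch SVD replaces the self-organizing PCA network."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal axes, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def train_sofm(data, n_nodes, n_epochs=20, lr0=0.3, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing feature map and return its node weights."""
    rng = np.random.default_rng(seed)
    # Linear initialization: spread nodes evenly across the data bounding box.
    weights = np.linspace(data.min(axis=0), data.max(axis=0), n_nodes)
    positions = np.arange(n_nodes)
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for idx in rng.permutation(len(data)):
            decay = 1.0 - step / n_steps       # goes linearly from 1 toward 0
            lr = lr0 * decay                   # learning rate shrinks over time
            sigma = sigma0 * decay + 1e-3      # neighborhood width shrinks too
            sample = data[idx]
            # Winner is the node whose weight vector is closest to the sample.
            winner = np.argmin(((weights - sample) ** 2).sum(axis=1))
            # Gaussian neighborhood on the map grid, centered on the winner.
            influence = np.exp(-((positions - winner) ** 2) / (2.0 * sigma ** 2))
            weights += lr * influence[:, None] * (sample - weights)
            step += 1
    return weights

def assign_labels(data, weights):
    """Label each sample with the index of its nearest map node."""
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Synthetic stand-in for per-pixel features: three well-separated groups
# (think background, text, halftone) in a 6-dimensional feature space.
rng = np.random.default_rng(0)
truth = np.repeat(np.arange(3), 100)
centers = np.array([-5.0, 0.0, 5.0])[:, None] * np.ones(6)
features = centers[truth] + 0.3 * rng.standard_normal((300, 6))

projected = pca_project(features, n_components=2)   # stage 1: reduce dimension
weights = train_sofm(projected, n_nodes=3)          # stage 2: cluster
labels = assign_labels(projected, weights)          # one region label per sample
```

In a real document image, `features` would hold one row per pixel and `labels` would be reshaped back to the image grid, so that a different enhancement or interpolation kernel could be chosen per region. Note that the node indices are arbitrary: which index corresponds to text, halftone, or background has to be decided after clustering.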
Document Type: Research Article
Publication date: 2000-01-01
For more than 25 years, NIP has been the leading forum for discussion of advances and new directions in non-impact and digital printing technologies. A comprehensive, industry-wide conference, this meeting includes all aspects of the hardware, materials, software, images, and applications associated with digital printing systems, including drop-on-demand ink jet, wide format ink jet, desktop and continuous ink jet, toner-based electrophotographic printers, production digital printing systems, and thermal printing systems, as well as the engineering capability, optimization, and science involved in these fields.
Since 2005, NIP has been held in conjunction with the Digital Fabrication Conference.