New Jersey Scanning and Indexing Company

Imaging Terminology

- A -


Additive Primary Colors: Red, Green, Blue; the 3 colors used to create all other colors when direct, or transmitted, light is used (as in a video monitor). They are called additive primaries, because when these three colors are superimposed they produce white.


ADF: Automatic Document Feeder. A device that holds pages and feeds them one after another into a scanner.


Alphanumeric: Set of characters composed of letters and numbers; may include punctuation marks or other symbols; excludes printer control characters such as Carriage Return and flow control characters such as XON and XOFF.


Anti-Aliasing: A method of filling in data which has been missed due to under-sampling. In imaging this usually means removing jagged edges by interpolating values between neighboring pixels of contrasting intensity. These methods are most often used to remove or reduce the stair-stepping artifact found in high-contrast digital images.
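
One simple way to smooth stair-stepped edges is to average small blocks of a high-resolution bitonal image into gray values. The sketch below is only an illustration of that idea, not a description of any particular product; the function name and the convention that 0 is black and 1 is white are assumptions.

```python
def downsample_2x(bitonal):
    """Average each 2x2 block of 0/1 pixels into one gray value (0..255).

    Assumes 0 = black, 1 = white, and that the image has an even
    number of rows and columns.
    """
    gray = []
    for y in range(0, len(bitonal) - 1, 2):
        row = []
        for x in range(0, len(bitonal[y]) - 1, 2):
            # Count the white pixels in the 2x2 block.
            total = (bitonal[y][x] + bitonal[y][x + 1]
                     + bitonal[y + 1][x] + bitonal[y + 1][x + 1])
            # Scale the count (0..4) to the 0..255 gray range.
            row.append(total * 255 // 4)
        gray.append(row)
    return gray
```

Blocks that straddle an edge come out as intermediate grays, which is what softens the jagged appearance at the cost of halving the resolution.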


ASP: Application Service Provider. ASPs deliver and manage applications and computer services from remote data centers to multiple users via the Internet or a private network. Obtaining these applications from an outside supplier can be a cost-effective answer to the demands of systems ownership: up-front capital expenses, implementation challenges, and a continuing need for maintenance, upgrades and customization. An ASP may be a commercial entity, which typically leases its services to customers, or a not-for-profit or government organization, which may provide them to end users at no fee.


Aspect Ratio: The proportion of an image's size given in terms of the horizontal length versus the vertical height. An aspect ratio of 4:3 indicates that the image is 4/3 times as wide as it is high.
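
As a small illustration, the aspect ratio can be found by reducing width over height to lowest terms. The function name and the sample dimensions below are chosen for the example only.

```python
from fractions import Fraction


def aspect_ratio(width_px: int, height_px: int) -> Fraction:
    """Return width/height reduced to lowest terms."""
    return Fraction(width_px, height_px)


ratio = aspect_ratio(640, 480)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints 4:3
```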


- B -


Barcode: A series of thin and thick black lines that, when placed in defined patterns, represent numeric or alphabetic characters. Different symbologies define the patterns. Barcodes can be one-dimensional, like the ones found on retail packages, or two-dimensional (known as 2D). 2D barcodes, which consist of a matrix of black and white blocks, can contain large amounts of information. The most popular is PDF-417, developed by Symbol Technologies.


Batching: Collecting multiple pages together and separating them with batch separators. Batches are either fixed quantities of single pages, which can be counted to identify double feeds (see ADF and Double Feed), or consist of multiple levels, often based on three levels of index (see Levels of Index). Recently there has been some interest in using color-coded bars scanned with a color scanner to identify batches.


Batch Control Sheets: Coded pages usually with barcodes or OCR'able characters that automatically separate pages within a batch or separate batches.


Bitmap: An image is called a bitmap if it contains a value for each of its pixels. This is the opposite of a vector image, where a small set of values can generate an object. BMP is a file format extension for bitmap images.


Bitonal: An image or file comprised of pixel or dot values of either black or white.


Book Scanning: Requires either a specialized scanner or that the spine be cut off. Flatbed scanners damage the spine and produce a fuzzy image at the edges.


- C -


CMY and CMYK: Cyan, Magenta, Yellow, and Black (K). Computer monitors are additive, but color printers are subtractive. Instead of combining light from monitor phosphors, printers coat paper with colored pigments that remove specific colors from the illuminating light. CMY is the subtractive color model that corresponds to the additive RGB model. Cyan, magenta, and yellow are the color complements of red, green, and blue. Because it is difficult to manufacture pigments that produce black when mixed together, a separate black ink is often used and is referred to as K ('B' is already used for blue).
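
The complement relationship between RGB and CMY can be sketched as follows. This is an idealized conversion for illustration only; real printers rely on device-specific color profiles, and the function names here are assumptions.

```python
def rgb_to_cmy(r, g, b):
    """Idealized complement: each 8-bit CMY value is 255 minus the RGB value."""
    return (255 - r, 255 - g, 255 - b)


def cmy_to_cmyk(c, m, y):
    """Move the gray component shared by C, M and Y into a separate K value."""
    k = min(c, m, y)
    return (c - k, m - k, y - k, k)


# Pure red is the complement of cyan, so it needs no cyan ink.
print(rgb_to_cmy(255, 0, 0))              # prints (0, 255, 255)
# Black (RGB 0, 0, 0) is carried entirely by the K ink.
print(cmy_to_cmyk(*rgb_to_cmy(0, 0, 0)))  # prints (0, 0, 0, 255)
```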


Compression: An image processing method of saving valuable disk and memory space by reducing the amount of space required to save a digital image. The graphics data is rewritten so that it is represented by a smaller set of data. Not to be confused with encoding. See also lossless and lossy compression.


Compression Ratio: The ratio of a file's uncompressed size over its compressed size.
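
A minimal sketch of the calculation follows. The file sizes in the example are hypothetical and serve only to show the arithmetic.

```python
def compression_ratio(uncompressed_bytes: int, compressed_bytes: int) -> float:
    """Return uncompressed size divided by compressed size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_bytes / compressed_bytes


# Hypothetical sizes: a 467,500-byte raw page image stored in 46,750 bytes.
print(compression_ratio(467_500, 46_750))  # prints 10.0, i.e. a 10:1 ratio
```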


Crop: An image processing method of removing the region near the edge of the image, but keeping a central area.


- D -


Data Color: Refers to the color of the data that must be extracted and converted. Carbonless paper can often produce a very faint image.


Data Rate: The speed of a data communications channel, measured in bits per second.


Data Prep: A term covering one or all of the following manual actions: the opening of envelopes, unfolding of paper, removal of staples, repair of tears.


Decompression: When an image or other digital data set is compressed and stored, it is not usable until it is decompressed into its original form.


Device Driver: A set of low-level software routines which work with and control a specific hardware device. The names and functions are often standardized across many similar devices. This allows higher-level software to use the hardware as a generic device, frees it from dealing with the particulars of specific devices, and allows devices to be interchanged easily.


Dithering: The method of using neighborhoods of actual display pixels to represent one image intensity or color. This method allows low intensity resolution display devices to simulate higher resolution images. For example, a binary laser printer can use block patterns to display gray-scale images.
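
One common form of dithering is ordered dithering, in which each pixel is compared against a threshold taken from a small repeating pattern. The sketch below uses a 2x2 Bayer pattern; the names and the convention that 1 means white are assumptions made for the example.

```python
# 2x2 Bayer pattern; larger patterns can represent more gray levels.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def ordered_dither(gray):
    """Convert rows of 0..255 gray values to rows of 0 (black) / 1 (white)."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Spread the four thresholds evenly across the 0..255 range.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


# A flat mid-gray area becomes a checkerboard that is half white.
print(ordered_dither([[128, 128], [128, 128]]))  # prints [[1, 0], [0, 1]]
```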


DLL (Dynamic Link Library): A compiled and linked collection of computer functions that are not directly bound to an executable the way regular libraries are. These libraries are linked at run-time by Windows. Since Windows is in charge of managing (loading, linking, and removing) the DLLs, they are available to all executables currently running. Thus, each executable can link to a commonly shared DLL, saving memory by keeping redundant copies of the same functions from co-existing. DLLs also allow a new level of modularity by providing a means to modify and update executables without re-linking. All that need be done is to copy a new version of the DLL to the correct disk directory.


Double Feed: The feeding of two sheets of paper at once. Sometimes on roller based scanners this can occur so cleanly that it cannot be detected.


DPI: Dots per inch. A measurement of scanner resolution. The number of pixels a scanner can physically distinguish in each vertical and horizontal inch of an original image. Documents are normally scanned at a resolution of between 200 dpi and 400 dpi.
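
The effect of resolution on image size can be worked out directly. The sketch below assumes a letter-size page and ignores any row padding or header that a real file format would add; the function names are illustrative.

```python
def scanned_pixels(width_in: float, height_in: float, dpi: int):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return (round(width_in * dpi), round(height_in * dpi))


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Raw image size in bytes, ignoring row padding and file headers."""
    width_px, height_px = scanned_pixels(width_in, height_in, dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# A letter-size page (8.5 x 11 inches) scanned bitonal at 200 dpi.
print(scanned_pixels(8.5, 11, 200))      # prints (1700, 2200)
print(uncompressed_bytes(8.5, 11, 200))  # prints 467500
```

Doubling the resolution quadruples the raw size, which is one reason document scanning usually stays within the 200 to 400 dpi range.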


Drop-Out Ink: Inks that are not visible to the light spectrum of the scanner. Can either be pastels, particularly in the yellow/green range or specific color inks that match the color of the light source. New color scanners often include the ability to remove, or drop-out specific colors. Users want to drop-out background colors in order to capture the foreground information so as to apply OCR or some other recognition to it.


Duplex: The ability of a scanner to scan both sides of a sheet simultaneously. Requires two scanner cameras and often two processing boards.


- E -


Encoding: The manner in which data is stored when uncompressed (binary, ASCII, etc.), how it's packed (e.g. 4-bit pixels may be packed at a rate of two pixels per byte), and the unique set of symbols used to represent the range of data items.
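
The packing mentioned above, two 4-bit pixels per byte, can be sketched as follows. Placing the first pixel in the high nibble is an assumption; formats differ on this point, and the function names are illustrative.

```python
def pack_4bit(pixels):
    """Pack 4-bit pixel values (0..15) two per byte, first pixel in the high nibble."""
    if any(not 0 <= p <= 15 for p in pixels):
        raise ValueError("pixel values must fit in 4 bits")
    # Pad with a zero pixel if the count is odd.
    padded = list(pixels) + [0] * (len(pixels) % 2)
    return bytes((padded[i] << 4) | padded[i + 1]
                 for i in range(0, len(padded), 2))


def unpack_4bit(data, count):
    """Recover the first `count` 4-bit pixels from packed bytes."""
    pixels = []
    for byte in data:
        pixels.append(byte >> 4)    # high nibble
        pixels.append(byte & 0x0F)  # low nibble
    return pixels[:count]
```

For example, pack_4bit([1, 2, 3, 4]) yields the two bytes 0x12 and 0x34, and unpack_4bit reverses the operation.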


EPS (file format extension): Encapsulated PostScript. Format originator: Adobe Systems, Inc., 1585 Charleston Road, Mountain View, CA 94039


- F -


File Format: A specification for holding computer data in a disk file. The format dictates what information is present in the file and how it is organized within it.


Flatbed Scanners: Scanners with a flat piece of glass on which the paper is placed and scanned; they may also contain an autofeeder. Can be useful for certain non-standard papers, but are slow and not good for production scanning (see Transport Speed).


- G -


GIF (file format extension): Graphics Interchange Format. Format originator: CompuServe Inc., 500 Arlington Center Blvd., Columbus, OH 43220. Uses the LZW compression created by Unisys, which requires special licensing. It is the same as the LZW compression used in the TIFF file format, except that the bytes are reversed and the string table is upside-down. All GIF files have a palette. GIF files can be interlaced, in which case the raster lines are stored in four passes: every 8th line, then the lines halfway between those, then every 4th line, then every other line. Interlacing exists because GIF files were usually received over a modem, and it lets a coarse version of the image appear before the whole file has arrived.
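
The GIF89a specification stores the rows of an interlaced image in four passes. The sketch below lists the order in which rows appear in the file; the function name is illustrative.

```python
def gif_interlaced_row_order(height: int):
    """Return image row numbers in the order an interlaced GIF stores them."""
    # (first row, step) for each of the four passes in GIF89a.
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(gif_interlaced_row_order(8))  # prints [0, 4, 2, 6, 1, 3, 5, 7]
```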


GUI (Graphical User Interface): A computer-user interface which uses graphical objects and a mouse for user interaction. Microsoft Windows is one such GUI. Each program that runs under Windows follows similar conventions.


- H -


Host: Computer in which an application or database resides.


Hz: Abbreviation for hertz; cycles per second. Often used with metric prefixes, as in kilohertz (kHz).


- I -


ICR: Literally Intelligent Character Recognition. Initially used as a term to differentiate Kurzweil's OCR from other vendors' products. It has more recently come to mean hand print recognition. Usually related to neural net technologies, it can also be used to identify marks such as check-off boxes or stylized pattern fonts such as OCR-A, OCR-B or MICR.


Image Compression Boards: Boards carrying one or more imaging-dedicated processors. They relieve the CPU (Central Processing Unit, the computer's main chip) of many imaging-specific tasks: compression, decompression, display, zooming, shrinking, and scale-to-gray. A dedicated board typically performs these tasks faster than the CPU can.


Image Format: Refers to the specification under which an image has been saved to disk or in which it resides in computer memory. Many digital image formats are in common use; some of the most used are TIFF, DIB, GIF, and JPEG. The image format specification dictates what image information is present and how it is organized in memory. Many formats support various sub-formats or 'flavors'.


Image Processing: Think of "data processing": it refers to the manipulation of raw data to solve some problem or enlighten the user in some way not possible without manipulation. Taken as the name of Image Processing Systems, Inc.


Interface: 1. A mechanical or electrical link connecting two or more pieces of equipment together. 2. A point of demarcation between two devices where the electrical signals, connectors, timing and handshaking are defined.


- J -


JPEG (image compression): Joint Photographic Experts Group. A collaborative specification by the CCITT and the ISO for image compression. JPEG is usually a lossy compression.


JPG (file format extension): Format originator: Joint Photographic Experts Group


- L -


Levels of Index: (see also batching). Documents may be filed by cabinet, file, and folder. This represents a 3 level index.


Look-Up-Table: A look-up-table or LUT is a contiguous block of computer memory that is initialized in such a fashion that it can be used to compute the values of a function of one variable. The LUT is set up so that the function's variable is used as an address or offset into the memory block. The value that resides at this memory location becomes the function's output: LUT[pixel_value] = f(pixel_value). Because the LUT values need only be initialized once, LUTs are fast, which makes them very useful for image processing. LUTs come in various widths, usually given in units of bits. An n x m bit LUT has 2^n addresses (256 stored values when n = 8), and each value is m bits wide. If the second dimension is left off, it can be assumed to be equal to the first. In gray-scale image processing LUTs are commonly 8x8, and the bit widths are usually assumed. A linear LUT, sometimes called a NOP LUT or passthrough, is a LUT that has been initialized to output the same values as the input: NOP_LUT[pixel_value] = pixel_value. See also Palette.
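
A minimal sketch of an 8x8 LUT follows. The helper names build_lut and apply_lut are illustrative, and clamping out-of-range results to the valid range is an assumption made for the example.

```python
def build_lut(func, bits=8):
    """Precompute func for every possible input value of the given bit width."""
    size = 1 << bits          # 2^n addresses, 256 when bits = 8
    max_value = size - 1
    # Clamp each result so it still fits in the output width.
    return [min(max(int(round(func(v))), 0), max_value) for v in range(size)]


def apply_lut(lut, pixels):
    """Replace each pixel value by the table entry stored at that address."""
    return [lut[p] for p in pixels]


nop_lut = build_lut(lambda v: v)           # passthrough: output equals input
invert_lut = build_lut(lambda v: 255 - v)  # photographic negative

print(apply_lut(invert_lut, [0, 17, 255]))  # prints [255, 238, 0]
```

The function is evaluated only 256 times when the table is built; after that, processing each pixel costs a single memory read no matter how expensive the function is.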


- M -


Mean Time Between Failure: A statistical measure of reliability, this is calculated to indicate the anticipated average time between failures of a device. The longer the better.
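
A common way to estimate this figure is to divide total operating time by the number of failures observed. The sketch below assumes that simple estimate; the numbers in the example are hypothetical.

```python
def mtbf_hours(total_operating_hours: float, failures: int) -> float:
    """Estimate mean time between failures from observed operating history."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for an estimate")
    return total_operating_hours / failures


# Hypothetical history: 10,000 operating hours with 4 failures.
print(mtbf_hours(10_000, 4))  # prints 2500.0
```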


- O -


OCR: Optical Character Recognition. A method of using pattern recognition on images of characters to create computer-readable data. Some OCR software works better than others on certain types of data.


OMR: Optical Mark Recognition. Sometimes called mark sense. Conversion of check-off marks to meaningful data. Simple and accurate way to capture survey type information automatically from people.


- P -


PDF: Stands for "portable document format." A document converted to PDF file format can be delivered with complete visual fidelity to a wide variety of devices and platforms and can be printed, received as an email attachment, downloaded from a server or even viewed on a mobile device. It has become a worldwide standard for reliable electronic document distribution and storage. PDF is a technology of Adobe Systems.


Pixel: The basic building block of all images -- a simple dot. In bitonal images, it is merely a black or white dot (see "Bitonal" definition above). In grey scale images, each dot has one of up to 256 possible values of grey (for an 8-bit grey scale image).


Portrait Orientation: An image registered so that it is taller than it is wide, with the narrow edge running along top and bottom. When scanning, orientation is determined by the leading edge of the document.


PPM: Pages per minute. A measurement of the throughput speed of a scanner - how many letter-size pages the scanner can scan in one minute. Beware: ppm can be misleading.


- R -


Reflectance: Refers to how much the ink and background paper reflect the light within the scanner. Affects the quality of the image.


Resolution: Indicates the number of dots, often measured in dpi, that make up an image on a screen or printer. The larger the number of dots, and thus the higher the resolution, the finer and smoother images can appear when displayed at a given size. Low resolution causes jagged characters. The ideal resolution is a trade-off between quality and the storage capacity and processing power required to use it.


RGB: Red, Green, Blue. A triplet of numeric values which are used to describe a color.


- S -


SCSI: Small Computer System Interface. Pronounced "scuzzy". An industry standard (of sorts) for connecting peripheral devices and their controllers to a microprocessor. SCSI defines both hardware and software standards for communication between a host computer and a peripheral. Computers and peripheral devices designed to meet SCSI specifications should work together.


Simplex: A document scanner that scans only one side of each sheet (compare Duplex).


Skew: The angling of the paper, which can cause OCR to fail. Some scanners will angle small paper badly.


- T -


TGA (file format extension): Format originator: Truevision, Inc., 7340 Shadeland Station, Indianapolis, IN 46255


Throughput: The actual amount of useful and non-redundant information which is transmitted or processed. The relationship of what went in one end and what came out the other is a measure of the efficiency of that communications link - a function of cleanliness, speed, etc.


TIFF (file format): Tagged Image File Format.


TIF (file format extension): Format originators: Aldus Corp, 411 First Ave South, Seattle, WA 98104, and Microsoft Corp, 16011 NE 36th Way, Redmond, WA 98073


Transport Speed: The speed at which the mechanical transport runs, measured in inches/centimeters per second (ips/cps).


- V -


VAR/VAD: Value Added Reseller/Dealer. VARs buy equipment from computer manufacturers, add some of their own software and possibly some peripheral hardware to it, then resell the system, with its newly added "value" to end users. A VAD is similar, but is generally less directly in touch with the end user.


Video Scanner Interface Board: An add-in board residing in the host computer which enables communication and control of the scanner device. The board provides device control and file or data compression. Also known as an accelerator or compression board.


- W -


WMF (file format extension): Format originator: Microsoft Corp, 16011 NE 36th Way, Redmond, WA 98073


WPG (file format extension): Format originator: Word Perfect Corp


- X -


XML: eXtensible Markup Language. Provides content and structure for B2B forms by allowing fields and structures to be tagged and layout to be enforced.


XSL: eXtensible Stylesheet Language. Defines the styles associated with XML files.


XSLT: eXtensible Stylesheet Language Transformations. Allows XML-formatted documents to be automatically transformed and reformatted.


The definitions in the glossary are adapted from Moore's Imaging Dictionary, normicro.com, hsassocis.com, and 1st-in-scanners.com.