AnyDoc Software, Inc.
P: 800.775.3222
F: 813.222.0018
E: info@anydocsoftware.com

 
Glossary of Terms

 
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A


AccuID

A method within AnyDoc Software’s data capture solutions for identifying master form templates. This method works at the form family level to build an identification table based on the unique topology of each master form template in the form family and compare it to incoming data images. Then AccuID sorts and identifies data images from the correct master form template for processing.

 

AccuZip

A database containing address information from the United States Postal Service (USPS). AnyDoc Software’s data capture solutions can optionally access the AccuZip database to validate addresses in the U.S., U.S. territories, and military installations abroad.

 

Address Extraction

The Address Extraction feature within AnyDoc Software’s data capture solutions extracts U.S. and Canadian address data from documents and lets you define how the data is displayed in the output. Address Extraction also gives you the option to validate address data using AccuZip.

 

AnyApp Technology

AnyApp locates data on template-resistant forms—regardless of where it resides on the document—by searching for defined data labels, such as “amount due” and “invoice number”; data format, such as dd/mm/yyyy; data type, such as alpha, numeric or both; and/or location, such as “just look in the top half of the document for this data.” It then remembers where it found the data when the document type is processed again. AnyApp is the technology behind the AnyDoc Software solutions AnyDocEOB and AnyDocINVOICE.

 

ASCII (American Standard Code for Information Interchange)

A file output format for captured data. It is a character-encoding scheme based on the ordering of the English alphabet.

 

Attachments

Document images to be archived along with the processed document.

 

Auditor

The Auditor module of OCR for AnyDoc allows you to review the work of your verification operators. You can audit specific tasks, do random checks, or review all of an operator’s work.

 

AutoFlow

An automated method used to check available workstations for batch processing jobs that need to be completed.


B

 

Bar Code Zone

A master form template zone that defines the location of bar code data in the document image. AnyDoc Software's data capture solutions recognize these bar codes and convert them into alphanumeric data.

 

Batch Separator Page

Printed for a form family to provide a fast and accurate method of entering control information for a given batch of forms.

 

BCR

Bar Code Recognition. The process of reading and extracting data from bar codes on a document. See also Bar Code Zone.

 

BPM

Business Process Management. A process-centric approach to managing business operations that evolved from earlier and existing technologies and practices, including business process re-engineering, workflow management, and document management.


C

 

Capture (Document and Data)

Document capture is the method of obtaining the electronic document image (either by scanning or importing) for data capture. Data capture is the automated extraction of data from that document image; the extracted data can then be transferred to a database or back-end system. Data capture eliminates manual data entry.

 

Capture Workflow

An emerging technology which encompasses automated document classification, data capture, and processing for entry into ECM, ERP, accounting, or other back-end systems. Capture workflow allows document processing to be automated from the moment paper enters an organization.

 

Character Constraint Boxes

With character constraint boxes, you restrict the amount of data that can be entered on a form by providing a specific number of boxes to be filled in by the user.

 

Check 21

The Check 21 Act allows the recipient of the original paper check to create a digital version of the original check, called a "substitute check," eliminating the need for further manual handling of the physical document. Both sides of the paper check are scanned to produce a digital image that can be virtually endorsed and deposited.

 

CMS 1500

The standard form, originally from the Health Care Financing Administration (now the Centers for Medicare & Medicaid Services), designated for submitting healthcare claims to insurance companies. Previously known as the HCFA 1500.

 

Commit Phase

The last phase of batch processing prior to data output, where output files (e.g., TXT, GTO, XML, PDF) and archive images are written to the appropriate directories.

 

Conceptual Classification

A technology based on Latent Semantic Indexing (LSI), designed to automatically identify and sort document images based on understanding textual relationships and concepts. The software “reads” the entire document and then calculates a “value.” These mathematical values assigned to each document are then compared to a master list for classification.

 

Conditional Procedure

A user-designed routine that features advanced character searches, recognition, and replacements. A conditional procedure retains or filters data, based on the specific condition.


D

 

Data Capture

The ability to capture digital data off scanned paper document images, without needing to perform manual data entry. This data then can be transmitted to an ECM, ERP, financial, or back-end system for entry into an ODBC-compliant database.

 

Date Extraction

A feature in AnyDoc Software's data capture solutions that automatically converts extracted date information from multiple input formats into a user-defined output format.

 

Delimiters

Special characters that separate data fields and/or records so the data can be parsed from the file by a program or a script.
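As an illustration only, the sketch below shows how a script might parse a delimited output file. The pipe delimiter, the sample records, and the function name `parse_delimited` are assumptions made for this example; they do not describe the actual output layout of any AnyDoc Software solution.

```python
import csv
import io


def parse_delimited(text, delimiter="|"):
    """Split delimited text into records, each a list of field strings."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip blank lines so a trailing newline does not yield an empty record.
    return [row for row in reader if row]


# Hypothetical sample: one record per line, fields separated by "|".
sample = "INV-1001|2024-03-15|250.00\nINV-1002|2024-03-16|75.50\n"
records = parse_delimited(sample)
for record in records:
    print(record)
```

Using the standard `csv` module rather than `str.split` means a field that contains the delimiter can still be parsed, provided it is wrapped in double quotes.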

 

Distributed Capture

A means by which organizations can scan documents remotely, either from branch offices around the world or simply downstairs. The scanned images are then transmitted via a secure Internet connection for data capture processing at a centralized location, such as corporate headquarters. See Remote Capture.

 

Document Set

A group of related documents that need to be processed together as a batch.


E

 

ECM (Enterprise Content Management)

Strategies, methods, and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes.  

 

EDI (Electronic Data Interchange)

The transfer of data from one business to another over a network in a specified format. Formats include: 810 (invoices), 835 (EOBs), and 837 (CMS 1500/UB04).  

 

Endorser

A mechanism found on some scanners that prints an incremental number on an image, which facilitates document indexing.

 

EOB

Explanation of Benefits. A statement from a health insurance company (payer) that itemizes how benefits were approved or denied for a claim.

 

ERP (Enterprise Resource Planning)

A company-wide computer software system used to manage and coordinate all the resources, information, and functions of a business, including accounting, manufacturing, supply chain management, and more.

 

External Table

A database table connected through an ODBC link.

 

Extract Process

A batch control process within AnyDoc Software’s data capture solutions that de-skews the image, performs form removal functions, enhances images, regenerates characters, applies pre- and post-processing rules that have been set, etc.


F

 

Form Family

One or more master form templates grouped together for batch processing.  Examples of form families include a batch of invoices that must be batch-balanced and a mortgage folder that has pages to be processed, containing information used to index other pages in the folder within an image retrieval system.

Form families can do the following:

  • Archive images
  • Batch balance controls
  • Create a header record
  • Enhance form identification
  • Name directory structure
  • Perform batch verification

G

 

GUI (Graphical User Interface)

A software design that facilitates interaction with the end user through images, drop-down menus, pop-up boxes, and more.


H

 

HCFA 1500

See CMS 1500.

 

High Speed Verification

An optional verification phase where operators see only the image’s questionable characters. Verification is “high speed” when the operators can correct at once all questionable characters in a batch, rather than tabbing to each questionable character on a data image.


I

 

ICR

Intelligent Character Recognition. The process of converting handwritten characters into ASCII text through the use of a recognition engine.

 

Identify Phase

The phase of processing within AnyDoc Software’s data capture solutions, during which AutoID automatically identifies each document type. AutoID uses static elements on each document, such as barcodes, literals or graphics, to identify the document.

 

Image Registration

The use of an image on a document (such as a square or a triangle), both contiguous and containable, as a registration point to help automatically identify a document type.

 

Import Phase

The process of bringing document images into AnyDoc Software’s data capture or classification solutions with or without the use of a scanner.

 

Indexing

A means of electronically identifying a scanned document image for archival and retrieval purposes.

 

Intelligent Extraction

During data capture, Intelligent Extraction recognizes a date, an address, or a currency type and converts that data zone into a user-specified format. For example, all dates can be output into the format MM/DD/YYYY, no matter how the date is written on the document.
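The date case described above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the list of accepted input formats and the function name `normalize_date` are assumptions chosen for the example.

```python
from datetime import datetime

# Assumed set of input formats, tried in order. A real deployment would
# need a rule for ambiguous inputs such as 03/04/2024 (March 4 or April 3).
INPUT_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%B %d, %Y", "%d %b %Y"]


def normalize_date(raw, output_format="%m/%d/%Y"):
    """Return the date rewritten in output_format, or None if unrecognized."""
    text = raw.strip()
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(output_format)
        except ValueError:
            continue  # Not this format; try the next one.
    return None


for raw in ["2024-03-15", "March 15, 2024", "15 Mar 2024", "unknown"]:
    print(raw, "->", normalize_date(raw))
```

Returning `None` for unrecognized input, rather than guessing, leaves room for the value to be routed to an operator for verification.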

 

Inverted Text

The placement of white text on a black background in a scanned document image. The text and background are inverted to optimize the recognition engine’s ability to capture data within AnyDoc Software’s data capture solutions.


J

 

Job Queue Directory

A temporary network location where processing files are stored. These files identify the status of each job/image/page within AnyDoc Software solutions.

 

Job Manager

A tool used to facilitate automated remote data capture, Job Manager is a server component, typically installed on a network server.


K

 

Key

A key (or primary key) is a field that uses a number or character sequence unique to each record in a table (e.g., social security number) for identification purposes.

 

Key-from-Image

The process by which data entry operators view and key data off electronic, rather than paper, versions of documents. The key-from-image approach to data entry is approximately 10% to 20% more efficient than traditional data entry methods.

In OCR for AnyDoc, the key-from-image verification GUI is the default verification method. With the program’s rope and expand capabilities, however, operators key significantly less data.


L

 

Latent Semantic Indexing (LSI)

A way to express conceptual text relationships as mathematics. See Conceptual Classification.

 

Literal

Synonymous with text. A literal can be machine print (OCR) or handprint (ICR). Static literals on a document can serve as registration points and help OCR for AnyDoc identify a form type.

 

Lookup Table

A table that AnyDoc Software’s data capture solutions, including OCR for AnyDoc®, AnyDoc®INVOICE™, and AnyDoc®REMIT™, access to validate specific data residing on a processed document image. For example, AnyDocINVOICE can access a P.O. number table to validate the vendor associated with the P.O. number on an invoice.
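The P.O. number example can be pictured as a simple keyed lookup. The sketch below is hypothetical: the table contents, the function name `validate_po`, and the use of an in-memory dictionary in place of a database table are all assumptions made for illustration.

```python
# Hypothetical lookup table mapping P.O. numbers to vendor names.
PO_TABLE = {
    "PO-1001": "Acme Supply",
    "PO-1002": "Globex",
}


def validate_po(po_number, captured_vendor, table=PO_TABLE):
    """Return (is_valid, expected_vendor) for a captured P.O./vendor pair."""
    expected = table.get(po_number.strip().upper())
    if expected is None:
        return False, None  # Unknown P.O. number.
    matches = expected.lower() == captured_vendor.strip().lower()
    return matches, expected


print(validate_po("po-1001", "ACME SUPPLY"))
print(validate_po("PO-1002", "Acme Supply"))
```

Returning the expected vendor alongside the result lets a caller show the operator what the table contains when the captured value does not match.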


M

 

Mark Sense

Data confined to one or more selections in a series, as in a survey. The data is selected by checking a box or filling in a bubble. For example, a survey may include gender information. A respondent fills in the bubble next to ‘M’ or ‘F’ on the survey to indicate his or her gender. AnyDoc Software's data capture solutions search that mark sense zone for the data and extract the selected response for that question on the form, based on the pixels present in the selected bubble, using OMR (optical mark recognition).

 

Mark Sense Mark

A mark on a form identifying a selection of mark sense data. The mark consists of the presence of pixels (such as a check mark, a filled-in box, a signature, etc.). The recognition engine searches for the presence (a “hit”) or absence (a “miss”) of a mark.
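One simple way to picture the hit-or-miss decision is a fill-ratio test on the pixels inside the zone. The sketch below is an assumption-laden illustration: the 20% threshold, the 0/1 pixel representation, and the function name `mark_present` are invented for the example and are not taken from any recognition engine.

```python
def mark_present(zone, threshold=0.2):
    """Return True (a "hit") if enough pixels in the zone are black.

    zone is a list of rows; each pixel is 1 (black) or 0 (white).
    threshold is the assumed minimum fraction of black pixels for a hit.
    """
    total = sum(len(row) for row in zone)
    if total == 0:
        return False
    black = sum(sum(row) for row in zone)
    return black / total >= threshold


filled_bubble = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
stray_speck = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(mark_present(filled_bubble), mark_present(stray_speck))
```

A threshold above zero keeps a single stray pixel from registering as a mark.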

 

Master Form Template

Scanned or imported document images used to define the zones and parameters for processing data from structured documents of the same type.

 

Metadata

Transaction, system, and document data captured during scanning and passed to an AnyDoc Software data capture solution for further processing, including the document set, batch number, operator ID, bar code(s), check MICR code, and more.

 

MICR (Magnetic Ink Character Recognition)

The specialized code or font found on checks, containing routing and account information. Also a type of recognition engine designed to capture MICR data from electronic check images.


N

 

Noise Filtering

Removes particles (black dots representing noise) from the document image.
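As a rough sketch of the idea, the function below clears black pixels that have no black neighbor. The isolated-pixel rule, the 0/1 image representation, and the name `remove_specks` are assumptions for illustration; production noise filters are typically more sophisticated.

```python
def remove_specks(image):
    """Return a copy of a 0/1 image with isolated black pixels cleared.

    A black pixel (1) is treated as noise when none of its up to eight
    surrounding pixels is black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                cleaned[y][x] = 0
    return cleaned


noisy = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
for row in remove_specks(noisy):
    print(row)
```

The neighbor test reads from the original image and writes to a copy, so clearing one pixel never affects the decision for another.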

 

Note Zone

Note zones define areas of the document image containing data that are not processed by an OCR or ICR engine. AnyDoc Software’s data capture solutions can optionally prompt the operator to enter the data or select from drop-down menu options during verification. A note zone is useful for obtaining data such as signatures or other unconstrained handprint.


O

 

OCR (Optical Character Recognition)

The electronic translation of text contained on an electronic document image into machine-editable text. This electronic or “automated data capture” eliminates the need to hand key data from paper documents into an ECM, ERP, or other back-end system.

 

ODBC (Open Database Connectivity)

Provides a standard software API (application programming interface) for accessing database management systems (DBMS).

 

Omit Zone

With omit zones, you define the areas of a document to be ignored during OCR or ICR evaluation.  Omit zones ensure that preprinted literals in a zone are not recognized as text.

 

OMR

Optical Mark Recognition. The process of data selection from a list of options on a document, based on the presence or absence of a mark next to item(s) on that list. See also Mark Sense.

 

Orientation

The way text is displayed on a page, either vertically (portrait) or horizontally (landscape). The orientation parameters in AnyDoc Software’s data capture solutions allow users to ensure that text in a page reads from left to right as it is being processed, regardless of the text orientation on the page when it was scanned.

 

Output

Output is the final phase of data capture processing. Once the data has been captured, validated and verified, both the data and the document images are then delivered to a company’s ECM, ERP, accounting, or other back-end system.

 

Output Parameters

Output parameters enable the configuration of both ASCII text and images output by AnyDoc Software’s data capture solutions. They can be configured at the form level or the zone level.

 

Overlay

An image that is superimposed on all data images during verification and/or is archived for a specific master form template.


P

 

Parameter

Parameters are a set of tools that help fine-tune form removal and recognition. They also help define rules and output specifications, according to unique business rules.

 

Pass 1 Verification

During this phase of data verification, operators view a data image’s questionable characters in the context of the zone and form or document in which they appear. Pass 1 Verification also allows the operator to correct any recognition rules implemented by rules parameters, mark sense parameters, table link parameters, etc.

 

Pass 2 Verification

An optional data verification phase that functions either as a method to verify data not examined by Pass 1 Verification, or as a follow-on supplement to Pass 1 Verification.

 

Patch Code

A parallel pattern of alternating black bars separated by spaces and placed near the leading edge of a paper document. Sometimes used to separate documents and batches or to perform identification.

 

Permissions

Security measures applied to objects (e.g., database tables, etc.), based on defined user rights.

 

Pixels

Picture (pix) elements (els). Filled-in dots in a grid that form text or a picture on a computer screen or on printed output.

 

Process

Process is the hardest working phase in AnyDoc Software’s data capture and classification solutions. During processing, the solution separates data from non-data form elements, such as character boxes, lines and background noise. Once the data is separated, the data is captured and validated against pre-defined business rules.


Q

 

Quality Assure (QA)

An additional batch processing phase (off by default) that allows the operator to check and improve the quality of scanned or imported images.

 

Questionable Character

A data character with a value undetermined by the recognition engine (where the confidence percent level is below the configured value).
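The confidence test in this definition can be sketched as a simple filter. The threshold of 80, the (character, confidence) input shape, and the function name `flag_questionable` are assumptions for this example, not values taken from any recognition engine.

```python
def flag_questionable(chars, min_confidence=80):
    """Return (position, character) for each low-confidence character.

    chars is a list of (character, confidence_percent) pairs, as a
    recognition engine might report them. A character is questionable
    when its confidence is below min_confidence.
    """
    return [
        (index, char)
        for index, (char, confidence) in enumerate(chars)
        if confidence < min_confidence
    ]


recognized = [("1", 99), ("O", 62), ("5", 80)]
print(flag_questionable(recognized))
```

Only the flagged positions would then need to be shown to an operator during verification.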

 

QuickApp®

With QuickApp® technology, users of AnyDoc Software data capture solutions can eliminate key-from-image processes when capturing data from exception or seldom-seen documents, without the need for a template.


R

 

Reader Response Zone

A type of mark sense zone, Reader Response zones define areas of the form to be evaluated for the presence or absence of a mark, which typically takes the form of a circle around a number.

 

Recognition Engine

A computer algorithm used to “read” and interpret data from document images. Different types of recognition engines are used to capture different types of data, for example, Optical Character Recognition (OCR) for machine print and Intelligent Character Recognition (ICR) for handprint.

 

Registration Zone

The defined area of a document image that allows the image’s length and width to be determined so the solution can effectively remove skew from the image and align it to the associated Master Form Template. The Registration Zone consists of two or more registration points on the document, defined by an image, a literal, a cross line and/or data.
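To illustrate how two registration points can reveal skew, the sketch below computes the angle of the line through them. It assumes the two points lie on the same horizontal line in the master form template; that assumption, the (x, y) pixel coordinates, and the function name `skew_angle_degrees` are hypothetical and chosen only for the example.

```python
import math


def skew_angle_degrees(left_point, right_point):
    """Angle, in degrees, of the line through two registration points.

    Points are (x, y) pixel coordinates. If the points are level in the
    template, a nonzero result is the skew to remove from the image.
    """
    dx = right_point[0] - left_point[0]
    dy = right_point[1] - left_point[1]
    return math.degrees(math.atan2(dy, dx))


print(skew_angle_degrees((100, 200), (900, 200)))  # level points
print(skew_angle_degrees((100, 200), (900, 228)))  # slightly skewed scan
```

Rotating the image by the negative of this angle would bring the two points back to level before the zones are applied.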

 

Remote Capture

A means by which an organization can scan documents remotely, either from branch offices around the world or simply from a different floor. The scanned images are transmitted via a secure Internet connection for data capture processing at a centralized location, such as corporate headquarters. Advantages include eliminating the delay and expense of postage and faxing. See Distributed Capture.

 

Remote Verification

The ability of a human operator to verify, from an off-site location, characters flagged by AnyDoc Software’s data capture solutions as questionable. Access to data verification from a remote location is provided via a LAN or an Internet connection.

 

Rope and Expand

Roping and expanding magnifies a selected area on a document image. The smaller the roped area, the greater the magnification. During template design, an area must be roped and expanded prior to adding a zone. During key-from-image processing, roping and expanding text on the document image automatically populates the associated data fields.


S

 

Semi-Structured Documents

Semi-structured documents contain common data elements, but the location of the data differs from document to document. For example, nearly every invoice contains data such as a P.O. number and an invoice total, but it is in a different location on each invoice depending on the vendor. Because of these location differences, it is not feasible to use templates to capture data from semi-structured documents. Instead, AnyDoc Software uses its own AnyApp® Technology to capture and remember the location of the data.

 

Sticky Note

A tool within the data capture solution that verification operators can use, without disrupting verification activities, to notify a supervisor of unexpected results in a particular row or line of data during processing.

 

String

A sequence of data characters.

 

Structured Documents

Standardized forms that come in the exact same format or layout every time. Examples of structured documents include credit applications, surveys, and order forms. The data to be captured is always located in the same place on the form. To eliminate the need to manually enter the data from structured forms, a template is created to define each of the individual data fields to capture, like name, address or Social Security Number. The software then captures that information at the same location each time.

T

 

Table

A file containing organized data (in rows and columns) on a specific topic, such as Vendor ID Number. During processing, AnyDoc Software’s data capture solutions can access lookup tables to automatically validate captured data and populate additional data fields.

 

Template

See Master Form Template.

 

TWAIN

A standard software protocol and API (applications programming interface) to standardize and regulate communication between hardware and software, such as a scanner and data capture software.


U

 

UB04

The UB04 claim form replaces the UB92 and is redesigned to accommodate reporting of the National Provider Identifier (NPI) number. The NPI number, a requirement of the HIPAA legislation, must be used by all HIPAA covered entities. The form is required by the CMS (Centers for Medicare & Medicaid Services).

 

Unstructured Documents

Forms and documents where the desired data can be located in varying positions on the page of the same document type. An example of unstructured documents is Explanation of Benefits (EOB) forms. AnyApp® Technology was developed to process semi- and unstructured documents and is found in AnyDoc®EOB™, AnyDoc®INVOICE™, AnyDoc®REMIT™, AnyDoc®NOTICE™, and more.

 

User Group

A group of software users with the same access rights. The solution administrator defines the user groups and grants rights to them.


V

 

Verify Phase

The phase of automated document processing where data characters flagged as questionable get verified, either by a separate recognition engine or by a human operator.

 

Virtual Multi-Stream®

Allows an unaltered color document image to be archived in a back-end repository in support of compliance requirements, but also creates a black-and-white document image which is optimized for use within AnyDoc Software data capture solutions to maximize data capture speed and recognition accuracy.


W

 

Work Flow Manager

The control panel for all production-level batch processing performed by AnyDoc Software’s data capture solutions.


X

 

X9.37

A file created during Check 21 processing which can be electronically transmitted for deposit and meets all Check 21 clearing requirements.

 

XML (Extensible Markup Language) Output

A set of rules for encoding documents electronically in a data file. This data (provided from an automated data capture solution such as OCR for AnyDoc®) can then be queried, exported, and serialized into the desired format.
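As a hedged illustration of captured data serialized as XML, the sketch below builds a small document from field names and values. The element names (`document`, `field`), the attribute names, and the function `fields_to_xml` are invented for this example and do not reflect the actual XML schema produced by any AnyDoc Software solution.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(doc_type, fields):
    """Serialize captured field values into a small XML string."""
    root = ET.Element("document", {"type": doc_type})
    for name, value in fields.items():
        field = ET.SubElement(root, "field", {"name": name})
        field.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = fields_to_xml(
    "invoice",
    {"invoice_number": "INV-1001", "amount_due": "250.00"},
)
print(xml_text)

# The same data can be queried back out of the XML.
parsed = ET.fromstring(xml_text)
print(parsed.find("field[@name='amount_due']").text)
```

Because each value is tagged with its field name, a downstream system can query for a specific field without relying on its position in the file.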


Z

 

Zone

An area in the Master Form Template defined as the location of a specific data type. Each zone type is designated by a separate zone boundary color.

 

To learn more, contact us today.
