The OCR device contains the set of functions and variables for controlling the optical character recognition (OCR) algorithm. OCR converts captured pictures of typewritten or printed text into computer-readable text. A typical use case of the OCR device is provided in the Using OCR section.
Device info
In order to initialize this algorithm properly, several parameters must be set. The following example presents the proper definition of an algorithm instance inside the test_env.ini (test environment description) configuration file:
[device]
alias = OCR
name = OCR
config = OCR.ini
The parameters that must be set to configure the OCR device correctly are:
alias– is a tag name which will be used in the test case script.
name– is a unique identifier for the algorithm and must be set as
config– is the name of the file which contains additional configuration (macro commands) related to the OCR device (this parameter is optional but desirable).
These parameters make up the device configuration and are described in the Configuration & Macros (Anatomy of Configuration file) section.
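As an illustration of the three parameters, the following Python sketch reads a [device] entry like the one above with the standard configparser module. It is not part of the test framework: the function name load_device_entry is hypothetical, and the sketch assumes the text contains a single [device] section.

```python
import configparser

# Sample content mirroring the [device] entry shown above.
SAMPLE = """\
[device]
alias = OCR
name = OCR
config = OCR.ini
"""


def load_device_entry(text):
    """Return the alias, name and optional config of a [device] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["device"]
    return {
        "alias": section["alias"],
        "name": section["name"],
        # config is optional, so fall back to None when it is missing
        "config": section.get("config"),
    }


entry = load_device_entry(SAMPLE)
print(entry)
```

A missing alias or name raises KeyError in this sketch, which matches the description of those two parameters as required.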
Device parameters
For each device to function properly, certain parameters should be set. Some of the basic parameters of the OCR device are:
ocr_lang_file– the expected language, e.g. ENG for English, FRA for French, etc. The full list of three-letter language codes is given in the ISO 639 standard.
image– the name of the test image to which the OCR algorithm will be applied. The algorithm supports BMP and TIFF images.
output– the name of the text file in which the results will be stored. Only the file name should be specified; the extension (.txt) is added automatically. The default value is text (.txt).
ref_file– the name of the reference file used for comparison.
subimage– if this variable is 0, the full image is analysed. If this variable is 1, only a part of the image is analysed. The region is set using the xstart, ystart, xend and yend variables. The default value is 0.
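The effect of subimage and the four coordinate variables can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name region_to_analyse is hypothetical, the coordinates are taken to be pixel positions measured from the top-left corner of the image, and the bounds check is an addition that the device itself may or may not perform.

```python
def region_to_analyse(params, image_width, image_height):
    """Return (left, top, right, bottom) of the area that would be analysed."""
    # subimage defaults to 0, which means the full image is used
    if int(params.get("subimage", 0)) == 0:
        return (0, 0, image_width, image_height)

    left, top = int(params["xstart"]), int(params["ystart"])
    right, bottom = int(params["xend"]), int(params["yend"])

    # Assumed sanity check: the region must be non-empty and inside the image
    if not (0 <= left < right <= image_width
            and 0 <= top < bottom <= image_height):
        raise ValueError("region lies outside the image or is empty")
    return (left, top, right, bottom)


params = {"subimage": "1", "xstart": "1609", "ystart": "981",
          "xend": "1752", "yend": "1034"}
# The 1920x1080 image size is an assumption made for this example.
print(region_to_analyse(params, 1920, 1080))  # (1609, 981, 1752, 1034)
```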
When devices are configured and controlled from the test itself, tests often become less readable, too large and hard to maintain. Macros are used to avoid this.
Device usage
The OCR algorithm can be controlled easily through a macro file. The user can use previously prepared macro definitions (sets of commands that will be sent to the OCR algorithm), which are stored in the OCR.ini file. Controlling the device this way makes test cases shorter and more readable, particularly when macros are intuitively named ([INFO_BAR_READING], [CHANNEL_NUMBER_READING]…). This approach also allows a large number of test cases to be adjusted without changing the test scripts: modifying a macro definition affects every test script in which that macro is used.
One example of the macro that can be used by the OCR device is given here:
[MY_TV_RELATED_ITEMS_NUM]
"ocr_lang_file ENG 0"
"subimage 1 0"
"xstart 1609"
"ystart 981"
"xend 1752"
"yend 1034"
"ref_file ref 0"
"colorsnum 1 0"
"operation1 CONVERT_TO_GRAYSCALE 0"
Note the following:
- English-language characters are expected to be recognized, which is defined with "ocr_lang_file ENG 0". The third parameter is the delay that should be introduced after executing this command on the OCR device; in this case, it is 0.
- Only a part of the input image will be analysed, which is defined with "subimage 1 0" "xstart 1609" "ystart 981" "xend 1752" "yend 1034".
- The text in “ref.txt” will be compared with the recognized text, which is defined with "ref_file ref 0".
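The structure of a macro command (a name, a value and an optional trailing delay) can be made concrete with a short Python sketch. It is not the framework's own parser: the function name parse_command is hypothetical, and it assumes that commands are whitespace-separated and that a command with only two fields, such as "xstart 1609", simply carries no delay.

```python
def parse_command(command):
    """Split 'name value [delay]' into a (name, value, delay) tuple."""
    # Drop surrounding whitespace and the enclosing double quotes
    parts = command.strip().strip('"').split()
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected command format: {command!r}")
    name, value = parts[0], parts[1]
    # The third field, when present, is the delay applied after the command
    delay = int(parts[2]) if len(parts) == 3 else None
    return name, value, delay


MACRO = [
    '"ocr_lang_file ENG 0"',
    '"subimage 1 0"',
    '"xstart 1609"',
    '"ref_file ref 0"',
]

for line in MACRO:
    print(parse_command(line))
```

The delay is returned as None rather than 0 when it is absent, so a caller can tell "no delay given" apart from an explicit zero.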