OcrOptions Properties
The OcrOptions type exposes the following members.
Name | Description
---|---
CanRedact | Gets a flag to indicate that existing or recognized text should be redacted.
Debug | Gets or sets the debugging level to capture intermediate results.
DetectOrientation | Gets or sets a flag to indicate whether the page orientation should be detected for better, but potentially slower, OCR. Depending on OCR confidence, the page may be rotated in orthogonal directions and the OCR results compared. The detected orientation is not applied to the underlying image when saved.
FullPageOcr | Gets a flag to indicate whether existing PDF text should be preserved.
Language | Gets or sets the Tesseract prefix of the selected OCR language(s). For multilingual documents, use a plus sign (+) to connect multiple language codes.
MaxTasks | Gets or sets the maximum number of tasks to run in parallel. Default is 0, indicating that the number of tasks will be automatically determined based on the number of processing cores.
Overwrite | Gets or sets a flag to indicate whether existing OCR data, if detected, should be overwritten.
PageSegMode | Gets or sets the page segmentation mode of the Tesseract engine.
PdfOptions | Gets or sets options that control how PDF files are created.
PixOptions | Gets or sets options that control how images are enhanced before OCR.
RedactOptions | Gets or sets the options that control automatic redaction of PDF files.
Rules | Gets or sets a list of available regular expression rules for correcting OCR words.
SpellDict | Gets or sets the language / culture identifier of the spell checker dictionary to use. Set to null to disable spell checking.
Timeout | Gets or sets a timeout in milliseconds for the OCR process. The timeout applies to each page in the document. If the timeout is set to zero, the OCR engine waits indefinitely for completion, cancellation, or failure.
ValidateOcrWord | Gets a flag to indicate whether OCR words should be validated by removing certain common OCR artifacts.
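
The conventions described for Language, MaxTasks, and Timeout can be illustrated with a short, language-neutral sketch. The Python below is not part of the library and does not call its API. The helper names (`join_languages`, `resolve_max_tasks`, `per_page_wait_seconds`) are hypothetical. Mapping a MaxTasks value of 0 directly to the core count is an assumption, since the library only states that the number is "based on" the processing cores. Rejecting negative values is also an assumption.

```python
import os
from typing import Iterable, Optional


def join_languages(codes: Iterable[str]) -> str:
    """Build a Language value by connecting language codes with a plus sign."""
    cleaned = [c.strip() for c in codes if c and c.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def resolve_max_tasks(max_tasks: int, cpu_count: Optional[int] = None) -> int:
    """Interpret a MaxTasks value; 0 means 'determine automatically'.

    Assumption: the automatic value is taken to be the core count itself.
    """
    if max_tasks < 0:
        raise ValueError("max_tasks must not be negative")  # assumed behavior
    if max_tasks > 0:
        return max_tasks
    cores = cpu_count if cpu_count is not None else os.cpu_count()
    return max(1, cores or 1)


def per_page_wait_seconds(timeout_ms: int) -> Optional[float]:
    """Interpret a per-page Timeout in milliseconds; 0 means wait indefinitely."""
    if timeout_ms < 0:
        raise ValueError("timeout_ms must not be negative")  # assumed behavior
    if timeout_ms == 0:
        return None  # None stands for "no time limit"
    return timeout_ms / 1000.0


if __name__ == "__main__":
    print(join_languages(["eng", "deu"]))       # eng+deu
    print(resolve_max_tasks(0, cpu_count=8))    # 8
    print(per_page_wait_seconds(0))             # None
```

Returning `None` for a zero timeout keeps "no limit" distinct from any numeric wait, which mirrors how the Timeout description treats zero as a special case rather than as an immediate expiry.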