SetTextExtractionOptions
Version history
This function was introduced in Quick PDF Library version 8.11.
Description
Sets various options that affect the text extraction functionality.
From version 8.13 onward, this function sets the text extraction options for the selected document only, and it affects only the results of the GetPageText function.
To adjust the text extraction for the ExtractFilePageText and DAExtractPageText functions, use the new DASetTextExtractionOptions function.
Syntax
Delphi
function TDebenuPDFLibrary1811.SetTextExtractionOptions(OptionID,
NewValue: Integer): Integer;
ActiveX
Function DebenuPDFLibrary1811.PDFLibrary::SetTextExtractionOptions(OptionID As Long,
NewValue As Long) As Long
DLL
int DPLSetTextExtractionOptions(int InstanceID, int OptionID, int NewValue)
Parameters
OptionID |
1 = Ignore Font changes to allow grouping different blocks together
2 = Ignore Color changes to allow grouping different blocks together
3 = Ignore Text Block changes to allow grouping different blocks together
4 = Output CMYK color values
5 = Sort text blocks based on top left position
6 = Descenders from font metrics
7 = Ignore overlaps
8 = Ignore duplicates
9 = Split on double space
10 = Trim characters outside area
11 = Alternative block matching
12 = Ignore rotated text blocks
13 = Trim leading and trailing whitespace from text blocks
14 = Output non ASCII characters below Space character (0x20)
15 = Remove certain character strings such as underscore lines (see below) |
NewValue |
OptionID = 1, 2, 3 and 6: 0 = Use, 1 = Ignore
OptionID = 4: 0 = Show as RGB (default), 1 = Show as CMYK
OptionID = 5: 0 = Do not sort blocks (default), 1 = Sort blocks
OptionID = 7, 8 and 12: 0 = Do not ignore, 1 = Ignore
OptionID = 9: 0 = Do not split on double space (default), 1 = Split on double space
OptionID = 10: 0 = Do not trim characters outside area (default), 1 = Trim characters outside area
OptionID = 11: 0 = Regular block matching, 1 = Alternative block matching
OptionID = 13: 0 = Do not trim leading or trailing whitespace, 1 = Trim leading and trailing whitespace
OptionID = 14: 0 = Remove non ASCII characters below space character from output (default), 1 = Output raw unfiltered ASCII characters
OptionID = 15: 0 = Output text lines made with Underscore characters (default), 1 = Remove text lines made with Underscore characters |
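The numeric OptionID values above are easy to mix up in calling code, so a set of named constants may help. The following is only a sketch: the enum, its constant names, and the helper text_option_pair_is_listed are hypothetical and are not part of the library. The helper assumes that the listed pairs (OptionID 1 to 15, NewValue 0 or 1) are the only ones worth sending; the library's own validation may differ.

```c
/* Hypothetical names for the documented OptionID values (not library symbols). */
enum text_extraction_option {
    TEXT_OPT_IGNORE_FONT_CHANGES     = 1,
    TEXT_OPT_IGNORE_COLOR_CHANGES    = 2,
    TEXT_OPT_IGNORE_BLOCK_CHANGES    = 3,
    TEXT_OPT_OUTPUT_CMYK             = 4,
    TEXT_OPT_SORT_BLOCKS             = 5,
    TEXT_OPT_FONT_METRIC_DESCENDERS  = 6,
    TEXT_OPT_IGNORE_OVERLAPS         = 7,
    TEXT_OPT_IGNORE_DUPLICATES       = 8,
    TEXT_OPT_SPLIT_ON_DOUBLE_SPACE   = 9,
    TEXT_OPT_TRIM_OUTSIDE_AREA       = 10,
    TEXT_OPT_ALT_BLOCK_MATCHING      = 11,
    TEXT_OPT_IGNORE_ROTATED_BLOCKS   = 12,
    TEXT_OPT_TRIM_WHITESPACE         = 13,
    TEXT_OPT_OUTPUT_BELOW_SPACE      = 14,
    TEXT_OPT_REMOVE_UNDERSCORE_LINES = 15
};

/* Local pre-check only: returns 1 when the pair appears in the parameter
   table (OptionID 1..15, NewValue 0 or 1), otherwise 0. */
int text_option_pair_is_listed(int option_id, int new_value)
{
    if (option_id < TEXT_OPT_IGNORE_FONT_CHANGES ||
        option_id > TEXT_OPT_REMOVE_UNDERSCORE_LINES) {
        return 0;
    }
    return new_value == 0 || new_value == 1;
}
```

A caller could run this check before passing the pair to DPLSetTextExtractionOptions, so that a mistyped constant is caught close to where it was written.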
Return values
0 | The OptionID or NewValue parameter was not valid |
1 | The text extraction option was set successfully |
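Because the function reports failure only through its return value, it may be worth checking every call when several options are set in a row. The sketch below shows one way to do that. The names set_option_fn, option_setting, apply_text_options and mock_set_text_extraction_option are hypothetical. The mock is a local stand-in so the example compiles without the library, and its validation rule (OptionID 1 to 15, NewValue 0 or 1) is an assumption drawn from the parameter table, not the library's confirmed behavior.

```c
#include <stddef.h>

/* Same shape as the DLL export shown in the Syntax section. */
typedef int (*set_option_fn)(int InstanceID, int OptionID, int NewValue);

/* One OptionID/NewValue pair to apply. */
struct option_setting {
    int option_id;
    int new_value;
};

/* Applies each setting in order and returns how many calls returned 1.
   If first_failure is not NULL it receives the index of the first call
   that did not return 1, or count when every call succeeded. */
size_t apply_text_options(set_option_fn set_option, int instance_id,
                          const struct option_setting *settings, size_t count,
                          size_t *first_failure)
{
    size_t applied = 0;

    if (first_failure != NULL) {
        *first_failure = count;
    }
    for (size_t i = 0; i < count; i++) {
        int result = set_option(instance_id,
                                settings[i].option_id,
                                settings[i].new_value);
        if (result == 1) {
            applied++;
        } else if (first_failure != NULL && *first_failure == count) {
            *first_failure = i;
        }
    }
    return applied;
}

/* Hypothetical stand-in for the library call, used only so this sketch can
   run on its own. Assumed rule: OptionID 1..15 and NewValue 0 or 1 succeed. */
int mock_set_text_extraction_option(int InstanceID, int OptionID, int NewValue)
{
    (void)InstanceID; /* unused by the stand-in */
    if (OptionID < 1 || OptionID > 15) {
        return 0;
    }
    if (NewValue != 0 && NewValue != 1) {
        return 0;
    }
    return 1;
}
```

With the real library linked, DPLSetTextExtractionOptions would be passed in place of the mock, together with a valid InstanceID. Taking the setter as a function pointer keeps the checking logic testable without the DLL.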