SetTextExtractionOptions

Text, Extraction

Version history

This function was introduced in Quick PDF Library version 8.11.

Description

Sets various options that affect the text extraction functionality.

From version 8.13 onward, this function sets the text extraction options for the selected document only, and it affects only the results of the GetPageText function.

To adjust the text extraction for the ExtractFilePageText and DAExtractPageText functions, use the new DASetTextExtractionOptions function.

Syntax

Delphi

function TDebenuPDFLibrary1811.SetTextExtractionOptions(OptionID, 
  NewValue: Integer): Integer;

ActiveX

Function DebenuPDFLibrary1811.PDFLibrary::SetTextExtractionOptions(OptionID As Long,
  NewValue As Long) As Long

DLL

int DPLSetTextExtractionOptions(int InstanceID, int OptionID, int NewValue)

Parameters

OptionID
1 = Ignore Font changes to allow grouping different blocks together
2 = Ignore Color changes to allow grouping different blocks together
3 = Ignore Text Block changes to allow grouping different blocks together
4 = Output CMYK color values
5 = Sort text blocks based on top left position
6 = Descenders from font metrics
7 = Ignore overlaps
8 = Ignore duplicates
9 = Split on double space
10 = Trim characters outside area
11 = Alternative block matching
12 = Ignore rotated text blocks
13 = Trim leading and trailing whitespace from text blocks
14 = Output non-ASCII characters below the Space character (0x20)
15 = Remove certain character strings such as underscore lines (see the NewValue entry for OptionID = 15)
NewValue
For OptionID = 1, 2, 3 and 6:
0 = Use, 1 = Ignore
For OptionID = 4:
0 = Show as RGB (default), 1 = Show as CMYK
For OptionID = 5:
0 = Do not sort blocks (default), 1 = Sort blocks
For OptionID = 7, 8 and 12:
0 = Do not ignore, 1 = Ignore
For OptionID = 9:
0 = Do not split on double space (default)
1 = Split on double space
For OptionID = 10:
0 = Do not trim characters outside area (default)
1 = Trim characters outside area
For OptionID = 11:
0 = Regular block matching
1 = Alternative block matching
For OptionID = 13:
0 = Do not trim leading or trailing whitespace
1 = Trim leading and trailing whitespace
For OptionID = 14:
0 = Remove non-ASCII characters below the Space character from output (default)
1 = Output raw unfiltered ASCII characters
For OptionID = 15:
0 = Output text lines made with Underscore characters (default)
1 = Remove text lines made with Underscore characters
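
Calling code may be easier to read if the numeric OptionID values are given names. The following is a minimal sketch in C, not part of the library: the enum and the function teo_pair_is_documented are hypothetical names chosen for illustration, and the check only mirrors the table above (OptionID from 1 to 15, NewValue 0 or 1).

```c
#include <stdbool.h>

/* Hypothetical names for the documented OptionID values.
   The library itself only defines the integers. */
enum TextExtractionOption {
    TEO_IGNORE_FONT_CHANGES       = 1,
    TEO_IGNORE_COLOR_CHANGES      = 2,
    TEO_IGNORE_TEXT_BLOCK_CHANGES = 3,
    TEO_OUTPUT_CMYK               = 4,
    TEO_SORT_BLOCKS               = 5,
    TEO_FONT_METRIC_DESCENDERS    = 6,
    TEO_IGNORE_OVERLAPS           = 7,
    TEO_IGNORE_DUPLICATES         = 8,
    TEO_SPLIT_ON_DOUBLE_SPACE     = 9,
    TEO_TRIM_OUTSIDE_AREA         = 10,
    TEO_ALT_BLOCK_MATCHING        = 11,
    TEO_IGNORE_ROTATED_BLOCKS     = 12,
    TEO_TRIM_WHITESPACE           = 13,
    TEO_OUTPUT_CHARS_BELOW_SPACE  = 14,
    TEO_REMOVE_UNDERSCORE_LINES   = 15
};

/* Returns true when the pair appears in the table above:
   a known OptionID combined with a NewValue of 0 or 1. */
static bool teo_pair_is_documented(int option_id, int new_value)
{
    bool known_option = option_id >= TEO_IGNORE_FONT_CHANGES &&
                        option_id <= TEO_REMOVE_UNDERSCORE_LINES;
    bool known_value = (new_value == 0 || new_value == 1);
    return known_option && known_value;
}
```

A check like this can catch a mistyped option before the call is made, although the function's own return value remains the authoritative result.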

Return values

0 The OptionID or NewValue parameter was not valid
1 The text extraction option was set successfully
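
Since failure is reported only through the return value, each call is worth checking. The sketch below shows one way to apply several options and stop at the first failure. It assumes the DLL signature shown under Syntax; apply_text_options, struct TextOption and fake_set_option are hypothetical helpers, and the fake stands in for DPLSetTextExtractionOptions so that the example runs without the library. Its validation is an assumption based on the tables on this page, not the library's actual implementation.

```c
#include <stddef.h>

/* Same shape as the DLL export:
   int DPLSetTextExtractionOptions(int InstanceID, int OptionID, int NewValue) */
typedef int (*SetTextOptionFn)(int InstanceID, int OptionID, int NewValue);

/* One OptionID/NewValue pair to apply (hypothetical helper type). */
struct TextOption {
    int option_id;
    int new_value;
};

/* Applies each pair in order. Returns -1 if every call returned 1,
   otherwise the index of the first pair whose call did not return 1. */
static int apply_text_options(SetTextOptionFn set_option, int instance_id,
                              const struct TextOption *options, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int result = set_option(instance_id,
                                options[i].option_id,
                                options[i].new_value);
        if (result != 1) {
            return (int)i;
        }
    }
    return -1;
}

/* Stand-in for the real export (assumption): returns 1 for the
   documented OptionID and NewValue ranges, 0 for anything else. */
static int fake_set_option(int InstanceID, int OptionID, int NewValue)
{
    (void)InstanceID; /* unused in the stand-in */
    if (OptionID < 1 || OptionID > 15) {
        return 0;
    }
    if (NewValue != 0 && NewValue != 1) {
        return 0;
    }
    return 1;
}
```

With the real library, the DPLSetTextExtractionOptions export would be passed in place of fake_set_option; depending on how the DLL is loaded, the function pointer type may need a calling-convention annotation.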

Copyright © 2020 Debenu. All rights reserved.