DASetTextExtractionOptions

Text, Extraction, Direct access functionality

Version history

This function was introduced in Quick PDF Library version 8.13.

Description

Sets various options that affect the text extraction functionality.

This function affects the results of the ExtractFilePageText and DAExtractPageText functions only.

Syntax

Delphi

function TDebenuPDFLibrary1511.DASetTextExtractionOptions(OptionID, 
  NewValue: Integer): Integer;

ActiveX

Function DebenuPDFLibrary1511.PDFLibrary::DASetTextExtractionOptions(
  OptionID As Long, NewValue As Long) As Long

DLL

int DPLDASetTextExtractionOptions(int InstanceID, int OptionID, int NewValue)

Parameters

OptionID
  1 = Ignore font changes to allow grouping different blocks together
  2 = Ignore color changes to allow grouping different blocks together
  3 = Ignore text block changes to allow grouping different blocks together
  4 = Output CMYK color values
  5 = Sort text blocks based on top left position
  6 = Descenders from font metrics
  7 = Ignore overlaps
  8 = Ignore duplicates
  9 = Split on double space
  10 = Trim characters outside area
  11 = Alternative block matching
  12 = Ignore rotated text blocks
  13 = Trim leading and trailing whitespace from text blocks
  14 = Output non-ASCII characters below the space character (0x20)
  15 = Remove certain character strings such as underscore lines (see NewValue below)

NewValue
  For OptionID = 1, 2, 3 and 6:
    0 = Use
    1 = Ignore
  For OptionID = 4:
    0 = Show as RGB (default)
    1 = Show as CMYK
  For OptionID = 5:
    0 = Do not sort blocks (default)
    1 = Sort blocks
  For OptionID = 7, 8 and 12:
    0 = Do not ignore
    1 = Ignore
  For OptionID = 9:
    0 = Do not split on double space (default)
    1 = Split on double space
  For OptionID = 10:
    0 = Do not trim characters outside area (default)
    1 = Trim characters outside area
  For OptionID = 11:
    0 = Regular block matching
    1 = Alternative block matching
  For OptionID = 13:
    0 = Do not trim leading or trailing whitespace
    1 = Trim leading and trailing whitespace
  For OptionID = 14:
    0 = Remove non-ASCII characters below the space character from output (default)
    1 = Output raw unfiltered ASCII characters
  For OptionID = 15:
    0 = Output text lines made with underscore characters (default)
    1 = Remove text lines made with underscore characters
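
Example

The following C sketch shows one possible way to apply several options in a row and stop at the first call that fails. It is illustrative only: the enum names, the OptionSetting struct, apply_text_extraction_options and mock_set_option are all invented for this example and are not part of the library. The function pointer type mirrors the DLL signature shown under Syntax, so the real DPLDASetTextExtractionOptions could be passed in place of the mock, assuming the DLL has been loaded and an instance ID obtained elsewhere.

```c
#include <stddef.h>

/* Option IDs taken from the parameter list above.
   The enum names are made up for this sketch. */
enum TextExtractionOption {
    TEO_IGNORE_FONT_CHANGES    = 1,
    TEO_IGNORE_COLOR_CHANGES   = 2,
    TEO_IGNORE_BLOCK_CHANGES   = 3,
    TEO_OUTPUT_CMYK            = 4,
    TEO_SORT_BLOCKS            = 5,
    TEO_FONT_METRIC_DESCENDERS = 6,
    TEO_IGNORE_OVERLAPS        = 7,
    TEO_IGNORE_DUPLICATES      = 8,
    TEO_SPLIT_ON_DOUBLE_SPACE  = 9,
    TEO_TRIM_OUTSIDE_AREA      = 10,
    TEO_ALT_BLOCK_MATCHING     = 11,
    TEO_IGNORE_ROTATED_BLOCKS  = 12,
    TEO_TRIM_WHITESPACE        = 13,
    TEO_OUTPUT_CONTROL_CHARS   = 14,
    TEO_REMOVE_UNDERSCORES     = 15
};

/* Same shape as the DLL export:
   int DPLDASetTextExtractionOptions(int InstanceID, int OptionID, int NewValue) */
typedef int (*SetOptionFn)(int InstanceID, int OptionID, int NewValue);

/* One (OptionID, NewValue) pair; hypothetical helper type. */
struct OptionSetting {
    int option_id;
    int new_value;
};

/* Applies the settings in order and stops at the first call that does not
   return 1. Returns the number of settings applied successfully, so a
   result smaller than count means settings[result] was rejected. */
size_t apply_text_extraction_options(SetOptionFn set_option, int instance_id,
                                     const struct OptionSetting *settings,
                                     size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (set_option(instance_id, settings[i].option_id,
                       settings[i].new_value) != 1) {
            break;
        }
    }
    return i;
}

/* Stand-in for the library call so the sketch builds without the DLL.
   It accepts OptionID 1..15 with NewValue 0 or 1; this is an assumption
   based on the lists above, and the real library may validate differently. */
int mock_set_option(int InstanceID, int OptionID, int NewValue)
{
    (void)InstanceID; /* unused in the mock */
    if (OptionID < 1 || OptionID > 15) {
        return 0;
    }
    return (NewValue == 0 || NewValue == 1) ? 1 : 0;
}
```

Because the options only affect ExtractFilePageText and DAExtractPageText, a helper like this would normally be called before either of those functions. Stopping at the first failure keeps the index of the rejected setting available to the caller, which may be easier to debug than ignoring the return value of each call.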

Return values

0 The OptionID or NewValue parameter was not valid
1 The text extraction option was set successfully

Copyright © 2014 Debenu. All rights reserved.