Search Results for 'Extract'

Extract text from PDFs as a text block list

Debenu Quick PDF Library provides an extensive API for programmatically extracting text from PDF files. Output options include plain text and a formatted CSV string with details about the font, size and style of the text. The API now includes additional text extraction functions for extracting […]

Extract paths from a PDF

Debenu Quick PDF Library does not currently support the extraction of path information. However, the GetContentStreamToString function will extract the content stream, which contains all of the drawing commands. You would need to parse the content stream to extract the paths and also process transformations, including rotation and scaling. Here is the contents of […]
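
As a rough illustration of the parsing step described above, the Python sketch below pulls path-construction operators out of a content stream that has already been retrieved as a string. It is not Debenu sample code: the function name extract_path_segments is hypothetical, the tokenizer simply splits on whitespace (so it will misread streams containing string literals, arrays or inline images), and it does not apply any transformations such as rotation or scaling.

```python
# Path-construction operators from the PDF specification and the number
# of numeric operands each one takes.
PATH_OPERATORS = {"m": 2, "l": 2, "c": 6, "v": 4, "y": 4, "re": 4, "h": 0}


def extract_path_segments(content_stream):
    """Return a list of (operator, operands) tuples for path operators.

    Hypothetical helper: a naive whitespace tokenizer, with no handling
    of the current transformation matrix (cm operator).
    """
    segments = []
    operands = []
    for token in content_stream.split():
        try:
            # Numeric tokens are operands for the next operator.
            operands.append(float(token))
            continue
        except ValueError:
            pass
        arity = PATH_OPERATORS.get(token)
        if arity is not None and len(operands) >= arity:
            # Keep only the operands that belong to this operator.
            segments.append((token, tuple(operands[len(operands) - arity:])))
        # Any operator, path-related or not, consumes the pending operands.
        operands = []
    return segments


if __name__ == "__main__":
    sample = "10 20 m 30 40 l h S 0 0 100 50 re f"
    for operator, args in extract_path_segments(sample):
        print(operator, args)
```

A real parser would also need to track the graphics state (q, Q and cm operators) so that coordinates can be mapped into page space.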

Extract text from a defined rectangular area on a page

Debenu Quick PDF Library includes a range of functionality for extracting text from PDF files, but usually it’s for extracting text from an entire page. The extract functions which include “area” in the name let you specify a rectangular area from which you wish to extract text. The key functions for this using regular memory […]

Extract images from PDF files as the appropriate image type

Sometimes it’s necessary to extract images from PDF files and save them to disk. When this happens you’ll most likely want to save the image data back into the image format that it was originally in before it was added to the PDF. This can be tricky at times because some image formats such as […]

Programmatically extract form field data from PDF files

As well as enabling you to generate form fields and fill form fields, Debenu Quick PDF Library makes it easy to extract form field data or information about form fields from PDF files. In the sample code below we demonstrate how to iterate through each page in a PDF to extract information about all of […]

Extract fonts from a PDF programmatically

Debenu Quick PDF Library lets you extract embedded TrueType fonts from PDF files to a font file on the local disk. All other font types and subsetted TrueType fonts are not supported by the SaveFontToFile function. Here is some C# code that demonstrates how to extract the embedded fonts. […]

When extracting an image with ARTS PDF Aerialist X Pro, the resolution of the image changes to 96 dpi. Why is this happening?

Extracted images are resized to the system resolution (on Windows this is 96 dpi, on Macintosh this is 72 dpi). The extracted image contains all of the image data, but if these images are placed/opened in a page layout or an external image editing application the size will appear larger than it was in the […]
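
The size change described above follows from simple arithmetic: the pixel data stays the same, but a lower resolution value spreads those pixels over more inches. The Python sketch below illustrates this with made-up numbers; the function name and the 600 x 300 pixel image are assumptions for the example, not values from the product.

```python
def displayed_size_inches(width_px, height_px, dpi):
    """Physical size (in inches) at which an image is laid out at a given dpi."""
    return (width_px / dpi, height_px / dpi)


if __name__ == "__main__":
    # Hypothetical image: 600 x 300 pixels, originally placed at 300 dpi.
    print(displayed_size_inches(600, 300, 300))  # original resolution
    print(displayed_size_inches(600, 300, 96))   # Windows system resolution
    print(displayed_size_inches(600, 300, 72))   # Macintosh system resolution
```

In this example the same pixels occupy 2 x 1 inches at 300 dpi but 6.25 x 3.125 inches at 96 dpi, which is why the image appears larger even though no data has been lost.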

Set up Android Studio and Debenu Quick PDF Library

This tutorial demonstrates how to use Debenu Quick PDF Library to create an Android app using Android Studio. If you haven’t already downloaded the Android trial, you can do that here. Set up an Android Studio project with Debenu Quick PDF Library: Open Android Studio. If the Quick Start window is displayed then click […]

Get embedded image coordinates from PDF files

Debenu Quick PDF Library lets you analyze, extract and replace embedded images in PDF files using the extensive image handling functions. The GetPageImageList function returns an ImageListID which you can use in the GetImageListItemDblProperty function. With this function you can get the coordinates for each image in the image list. The GetImageListItemIntProperty function is useful for […]

Memory optimization tips when processing large PDF files

When dealing with PDF files that are very large in file size (over 1 GB) or that have many pages (more than 1,000 to 10,000, depending on the document’s contents), it is desirable or sometimes necessary to write code that ensures memory usage does not climb too high. We will continue to enhance […]