practicalPDF

Areas of My Expertise

Adobe Acrobat Services REST API

Text extraction and post-processing
With Adobe PDF Extract API, you no longer need to use a PDF library tool to parse the content of a PDF file. PDF Extract API creates a JSON representation of the PDF content that can be transformed into other formats to support GenAI applications or leveraged to create unique and innovative PDF experiences with Embed API.

For your convenience, I’ve developed a set of post-processors for PDF Extract API that output plain text, Markdown, and HTML from the extracted JSON. You can find them in my Git repo; they are free to use and licensed under CC BY-SA 4.0.

However, these post-processors might not be appropriate for every use case. If you need something specific and aren’t quite sure how to approach the project, reach out to me, and we can discuss engaging my services to help you unlock the power of PDF Extract API.
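To give a feel for what a post-processor of this kind does, here is a minimal Python sketch that turns extracted JSON into Markdown. It is not the code from my repo. It assumes the extracted JSON has a top-level `elements` array whose entries carry a `Path` (such as `//Document/H1`) and, for text elements, a `Text` field; the `sample` data and the `to_markdown` function are hypothetical names made up for this illustration.

```python
import re

# Hypothetical, heavily trimmed sample of extracted JSON. Real output is
# assumed to carry many more fields per element (bounds, fonts, pages, ...).
sample = {
    "elements": [
        {"Path": "//Document/H1", "Text": "Quarterly Report "},
        {"Path": "//Document/P", "Text": "Revenue grew this quarter. "},
        {"Path": "//Document/H2", "Text": "Details "},
        {"Path": "//Document/L/LI/Lbl", "Text": "• "},
        {"Path": "//Document/L/LI/LBody", "Text": "First item "},
        {"Path": "//Document/Figure"},  # no Text field, so it is skipped
    ]
}

# Matches the last path segment of a heading: H, H1..H9, optionally H1[2].
HEADING = re.compile(r"^H(\d)?(\[\d+\])?$")


def to_markdown(extract: dict) -> str:
    """Convert the assumed elements array into simple Markdown blocks."""
    blocks = []
    for element in extract.get("elements", []):
        text = element.get("Text", "").strip()
        if not text:
            continue  # figures, tables, and other non-text elements
        tag = element.get("Path", "").rsplit("/", 1)[-1]
        if tag.startswith("Lbl"):
            continue  # drop the original bullet glyph; Markdown adds its own
        match = HEADING.match(tag)
        if match:
            level = int(match.group(1) or 1)
            blocks.append("#" * level + " " + text)
        elif tag.startswith("LBody"):
            blocks.append("- " + text)
        else:
            blocks.append(text)
    return "\n\n".join(blocks)


print(to_markdown(sample))
```

A real post-processor also has to deal with tables, nested lists, reading order, and text that is split across elements, which is where most of the effort goes.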

Data-driven document creation
Adobe Document Generation API is a powerful tool for data-driven document creation. It lets developers merge data into tagged Microsoft Word (.docx) templates to create customized documents at scale. Creating basic template logic is straightforward, and because the API uses JSONata, you can go further when a template needs more flexibility and functionality. JSONata is a lightweight query and transformation language for JSON data that provides sophisticated query expressions and built-in operators and functions for manipulating and combining data.
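As a rough illustration of what JSONata expressions do, the Python sketch below pairs three JSONata expressions (shown in comments) with plain Python that computes the same values. The order data and variable names are hypothetical, the sketch does not call the Document Generation API, and it does not show how expressions are written inside template tags.

```python
# Hypothetical order data of the kind you might merge into a template.
data = {
    "customer": {"name": "Ada Lovelace"},
    "items": [
        {"name": "Widget", "price": 4.50, "quantity": 2},
        {"name": "Gadget", "price": 10.00, "quantity": 1},
        {"name": "Gizmo", "price": 2.25, "quantity": 4},
    ],
}

# JSONata: customer.name
# A simple path expression that navigates into the JSON.
customer_name = data["customer"]["name"]

# JSONata: items[quantity > 1].name
# A predicate filters the array before the name of each match is selected.
bulk_item_names = [i["name"] for i in data["items"] if i["quantity"] > 1]

# JSONata: $sum(items.(price * quantity))
# Each item is mapped to its line total, then the totals are aggregated.
order_total = sum(i["price"] * i["quantity"] for i in data["items"])

print(customer_name, bulk_item_names, order_total)
```

The point of the comparison is that each JSONata expression is a single line that can live in the template, so filtering and totals do not have to be precomputed in application code.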

I can help you leverage the full capabilities of JSONata and the Document Generation API for your projects.

Dynamic HTML templates for PDF creation
Like the Document Generation API, the HTML to PDF API is a powerful tool for data-driven document creation. It also lets developers merge data into templates to create customized documents at scale, but its templates are built with HTML and JavaScript rather than tagged Word documents.

The key to creating a good-looking PDF from HTML is the CSS Paged Media module, which defines the properties that control how content is presented in print or in any other medium that splits content into discrete pages, such as PDF. It allows you to set page sizes and page breaks, style left and right pages differently, and control breaks inside page elements.

But paged-media CSS is not for the faint of heart. I’ve got years of experience creating good-looking PDFs from HTML. I can help you navigate the complexities of CSS to do the same.
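For a taste of the paged-media features mentioned above (page size, left and right page styling, and break control), here is a minimal Python sketch that assembles an HTML document with a small print stylesheet. The `build_html` helper and `PAGED_CSS` constant are hypothetical, the data merge is done in Python purely for illustration rather than through the API's own template mechanism, and how completely a given HTML-to-PDF renderer honors each rule is an assumption you should test.

```python
from html import escape

# A small paged-media stylesheet: page size and margins, a wider binding
# margin on left and right pages, and break control around elements.
PAGED_CSS = """
@page { size: A4; margin: 20mm; }
@page :left  { margin-right: 30mm; }
@page :right { margin-left: 30mm; }
h1 { break-before: page; }
h1:first-of-type { break-before: auto; }
table, figure { break-inside: avoid; }
"""


def build_html(title, sections):
    """Return a complete HTML document; sections is a list of
    (heading, paragraph) pairs, each starting on a new page."""
    body = "\n".join(
        f"<h1>{escape(heading)}</h1>\n<p>{escape(paragraph)}</p>"
        for heading, paragraph in sections
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f"<style>{PAGED_CSS}</style>\n"
        "</head>\n<body>\n" + body + "\n</body>\n</html>"
    )


html_doc = build_html(
    "Report",
    [("Introduction", "Costs & benefits"), ("Details", "Where x < y")],
)
print(html_doc)
```

Even this small stylesheet hints at the subtleties: the first heading has to opt out of the forced page break, and the binding margin flips between left and right pages.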