document_extraction

Projects with this topic


BBSR_IDA_public / base_projects / Prototype_Doc_Extractor

The Prototype Document Extractor is a lightweight, containerized service that extracts structured content from PDF files using the Unstructured IO library. It exposes a minimal HTTP API: users submit PDFs and receive the parsed content as JSON. The project includes:
- A backend service that handles PDF parsing using Unstructured IO.
- A Python client library for calling the API programmatically from your own code.
- Docker configurations to run the service in a portable, reproducible environment.
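
The description above does not document the API itself, so the following is only a minimal sketch of what a client call might look like, not the project's actual client library. The base URL `http://localhost:8000`, the `/extract` route, and the `file` form field are all assumptions made for illustration; check the repository for the real names. It uses only the Python standard library to build a multipart upload and decode a JSON response.

```python
import json
import os
import uuid
import urllib.request


def build_extract_request(pdf_bytes, filename="document.pdf",
                          base_url="http://localhost:8000"):
    """Build a multipart/form-data POST carrying one PDF.

    The "/extract" route and the "file" field name are hypothetical.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url}/extract",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def extract_pdf(path, base_url="http://localhost:8000", timeout=60):
    """Send the PDF at `path` to the service and return the decoded JSON body."""
    with open(path, "rb") as fh:
        request = build_extract_request(
            fh.read(), filename=os.path.basename(path), base_url=base_url
        )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `extract_pdf("report.pdf")` would return whatever JSON structure the service emits. Separating request construction from sending keeps the upload format easy to inspect without a live server.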

pdf document_ext... docker unstructured_io

Updated May 02, 2025
