This is an automated document processing system that works like a smart assistant for PDFs. You drop a PDF into a Google Drive folder, and the system takes it from there — it reads every page, pulls out all the text, saves any images it finds, writes a short summary of what each page is about, and neatly organizes everything into a spreadsheet. No manual copying and pasting.
It's designed for anyone dealing with large PDF documents — whether it's reports, scanned contracts, manuals, or research papers. The workflow handles up to 200 pages per document and processes everything in the background.
The system watches a designated Google Drive folder. As soon as a PDF is uploaded, the workflow starts automatically.
The PDF is sent to Mistral's OCR API, which scans each page and extracts both text and embedded images with high accuracy.
Every image found in the document is extracted, given a clear filename (page number + image number), and uploaded to a separate Google Drive folder. The image links are tracked in the spreadsheet.
Each page's text is passed to GPT-4.1-mini, which generates a short, one-line summary. This makes it easy to understand what a document contains at a glance.
All extracted data lands in a Google Sheet — page number, full extracted text, AI summary, image filenames, and image links. It's a clean, searchable record of your entire document.
There's nothing to click once the PDF is uploaded. The system watches, processes, and organizes everything from start to finish automatically.
Images aren't just dumped into a folder. Each one is named based on which document and page it came from, so you can always trace an image back to its source.
AI writes a short summary for every page. This is especially useful for long documents — you can browse the summaries to find the page you actually need without scrolling through hundreds of pages.
Everything lives in a Google Sheet with separate sections for text content and image data. It's easy to search, filter, export, or share with others.