by labeveryday
A Model Context Protocol (MCP) server built with FastMCP for PDF processing: text extraction, image extraction, and OCR for reading text within images.
You need to install Tesseract OCR on your system:
Ubuntu/Debian:
sudo apt update
sudo apt install tesseract-ocr tesseract-ocr-eng
macOS:
brew install tesseract
Windows:
conda install -c conda-forge tesseract
# For multiple languages
sudo apt install tesseract-ocr-fra tesseract-ocr-deu tesseract-ocr-spa
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
mkdir mcp-pdf-reader-server
cd mcp-pdf-reader-server
# Copy the files (pdf_reader_server.py and pyproject.toml)
# Then install dependencies
uv sync
uv run python -c "import pytesseract; print(pytesseract.get_tesseract_version())"
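If the check above fails, it helps to first confirm that the `tesseract` binary is on `PATH` at all. The following is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are hypothetical and not part of the server.

```python
import shutil
import subprocess


def find_tesseract(binary="tesseract"):
    """Return the full path to the Tesseract binary, or None if it is not on PATH."""
    return shutil.which(binary)


def tesseract_version(binary="tesseract"):
    """Return the first line of `<binary> --version`, or None if the binary is missing."""
    path = find_tesseract(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some Tesseract releases print the version to stderr, so check both streams.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None
```

Calling `tesseract_version()` returns `None` when the binary cannot be found, which distinguishes a missing system install from a missing Python package.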
If you prefer traditional setup:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install fastmcp PyMuPDF pytesseract Pillow
With UV:
uv run python pdf_reader_server.py
Or if you have the environment activated:
python pdf_reader_server.py
The server will start and listen for MCP requests on stdin/stdout.
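MCP messages are JSON-RPC 2.0 objects, and over stdio each message is sent as a single line of JSON. As a rough sketch of what one tool request looks like on the wire, the hypothetical helper below serializes a `tools/call` message. A real client must complete the MCP `initialize` handshake first, and the file path is only a placeholder.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Serialize a JSON-RPC 2.0 `tools/call` request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps without indent emits a single line; newlines inside strings are escaped.
    return json.dumps(message) + "\n"


# Placeholder path, for illustration only.
line = make_tool_call(1, "read_pdf_text", {"file_path": "/path/to/document.pdf"})
```

In practice an MCP client such as Claude Desktop builds and sends these messages for you.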
read_pdf_text
Extract text content from PDF pages.
Parameters:
file_path (string, required): Path to the PDF file
page_range (object, optional): Dict with start and end page numbers
Example:
*Configuration content*
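As an illustration of the parameters above, the arguments for `read_pdf_text` could be assembled as follows. This is a sketch, not the author's example: the helper name is hypothetical, and because the text does not say whether page numbers are 0-based or 1-based, the validation only rejects clearly invalid ranges.

```python
def build_read_pdf_text_args(file_path, start=None, end=None):
    """Build the arguments dict for read_pdf_text, adding page_range only when given."""
    args = {"file_path": file_path}
    if start is None and end is None:
        return args
    if start is None or end is None:
        raise ValueError("page_range needs both start and end")
    # Indexing base is not specified, so only reject negative or reversed ranges.
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    args["page_range"] = {"start": start, "end": end}
    return args


# Whole document, then an explicit range.
all_pages = build_read_pdf_text_args("report.pdf")
some_pages = build_read_pdf_text_args("report.pdf", start=1, end=3)
```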
extract_pdf_images
Extract all images from a PDF file.
Parameters:
file_path (string, required): Path to the PDF file
output_dir (string, optional): Directory to save images
page_range (object, optional): Page range to process
Example:
*Configuration content*
read_pdf_with_ocr
Extract text from both regular text and images using OCR.
Parameters:
file_path (string, required): Path to the PDF file
page_range (object, optional): Page range to process
ocr_language (string, optional): OCR language code (default: "eng")
Example:
*Configuration content*
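Tesseract accepts several languages in one string by joining their codes with `+`. A small sketch of building the `ocr_language` value that way is shown here; the helper `join_ocr_languages` is hypothetical and falls back to the documented default of `"eng"`.

```python
def join_ocr_languages(*codes):
    """Join Tesseract language codes with '+', dropping blanks and duplicates."""
    cleaned = []
    for code in codes:
        code = code.strip()
        if code and code not in cleaned:
            cleaned.append(code)
    # Fall back to the default language when nothing usable was passed.
    return "+".join(cleaned) if cleaned else "eng"


ocr_args = {
    "file_path": "scan.pdf",  # placeholder path
    "ocr_language": join_ocr_languages("eng", "fra"),
}
```

Each code in the string needs its matching Tesseract language pack installed on the system.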
Supported OCR Languages:
eng - English
fra - French
deu - German
spa - Spanish
eng+fra - Multiple languages
get_pdf_info
Get comprehensive metadata and statistics about a PDF.
Parameters:
file_path (string, required): Path to the PDF file
analyze_pdf_structure
Analyze the structure and content distribution of a PDF.
Parameters:
file_path (string, required): Path to the PDF file
Add this to your claude_desktop_config.json:
*Configuration content*
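The author's configuration is not reproduced here. As a hedged sketch of the general shape, Claude Desktop reads server entries from an `mcpServers` object, each with a `command` and `args`. The server name `pdf-reader` and the project path below are assumptions chosen for illustration.

```python
import json


def build_claude_config(project_dir, server_name="pdf-reader"):
    """Return a claude_desktop_config.json structure that launches the server with uv."""
    return {
        "mcpServers": {
            server_name: {  # hypothetical name; any unique key works
                "command": "uv",
                "args": ["--directory", project_dir, "run", "python", "pdf_reader_server.py"],
            }
        }
    }


# Placeholder path; use the absolute path to your own checkout.
config_text = json.dumps(build_claude_config("/absolute/path/to/mcp-pdf-reader-server"), indent=2)
```

Merge the resulting `mcpServers` entry into any existing configuration rather than overwriting the file.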
Tesseract not found:
TesseractNotFoundError: tesseract is not installed
Permission errors:
Poor OCR results:
Memory errors:
Run with unbuffered output (so log messages appear immediately) using UV:
PYTHONUNBUFFERED=1 uv run python pdf_reader_server.py
Or with regular Python:
PYTHONUNBUFFERED=1 python pdf_reader_server.py
Test Tesseract directly:
tesseract --list-langs
tesseract image.png output  # writes output.txt (Tesseract appends the .txt extension)
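To check programmatically whether the language packs you need are installed, the output of `tesseract --list-langs` can be parsed. This sketch assumes the usual layout of a header line followed by one language code per line; the exact header text varies between Tesseract versions, and the helper name is hypothetical.

```python
def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.append(line)
    return langs


# Sample output for illustration; real output depends on your installation.
sample = "List of available languages (3):\neng\nfra\nosd\n"
missing = {"eng", "deu"} - set(parse_list_langs(sample))
```

With the sample above, `missing` contains only `deu`, signalling that the German pack would still need to be installed.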
You can modify the OCR configuration in the code:
ocr_text = pytesseract.image_to_string(
    pil_image,
    lang=ocr_language,
    config='--psm 6 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz '
)
For better OCR results, consider adding image preprocessing:
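# Add to requirements: opencv-python, numpy
import cv2
import numpy as np
from PIL import Image

# Preprocessing example
def preprocess_image(image):
    # Normalize to RGB, then convert to a grayscale NumPy array
    gray = cv2.cvtColor(np.array(image.convert("RGB")), cv2.COLOR_RGB2GRAY)
    # Binarize with Otsu's method, which picks the threshold automatically
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    return Image.fromarray(thresh)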
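For intuition about what Otsu thresholding does in the preprocessing step, here is a pure-Python illustration of the idea: choose the threshold that maximizes the between-class variance of dark and bright pixels. It is a teaching sketch, not OpenCV's implementation, and the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for an iterable of 8-bit grayscale values (0-255)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]            # pixels at or below t form the background class
        sum_bg += t * hist[t]
        w_fg = total - w_bg        # remaining pixels form the foreground class
        if w_bg == 0:
            continue
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map values above the threshold to 255 and all others to 0."""
    return [255 if p > threshold else 0 for p in pixels]


# Two well-separated intensity clusters: dark text and bright paper.
pixels = [10] * 50 + [200] * 50
result = binarize(pixels, otsu_threshold(pixels))
```

Clean separation like this is what makes scanned text easier for Tesseract to read; noisy or unevenly lit scans may need additional steps.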
MIT License - see LICENSE file for details.