
ID-izlq-Github/PDF_Lite


📄 PDF Lite

Cross‑platform lightweight PDF toolkit



Split, merge, compress, extract pages, convert to/from images — all from CLI or GUI.
No need to remember qpdf flags; output files are named automatically.

v0.3.3 — CLI + Tkinter GUI | Packed size: ~90 MB (UPX)


Features

Command                 What it does                      Smart default
split                   Split PDF into pages              input_1.pdf, input_2.pdf
merge                   Merge multiple PDFs               merged.pdf
extract                 Extract specific pages            input_extracted.pdf
compress                Optimize / linearize              input_compressed.pdf
pdf2img                 PDF → PNG / JPEG                  input_images/ dir
img2pdf                 Images → PDF                      images.pdf
info                    Show metadata (pages, size, …)
extract … --compress    Extract + compress in one go

Batch? compress *.pdf · merge *.pdf · img2pdf *.png — wildcards work everywhere.
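
Shells such as cmd.exe do not expand wildcards themselves, so a tool that accepts `*.pdf` on every platform usually expands patterns on its own. A minimal sketch of that idea, assuming a hypothetical helper `expand_inputs` (not taken from the project's source):

```python
import glob


def expand_inputs(args: list[str]) -> list[str]:
    """Expand wildcard arguments; pass literal paths through unchanged.

    Hypothetical helper for illustration only.
    """
    paths: list[str] = []
    for arg in args:
        if any(ch in arg for ch in "*?["):
            # Sort matches so batch results come out in a stable order.
            paths.extend(sorted(glob.glob(arg)))
        else:
            paths.append(arg)
    return paths
```

Sorting the matches keeps `merge *.pdf` deterministic, since `glob.glob` makes no ordering guarantee.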


Requirements

  • Python 3.10+
  • qpdf (must be installed separately)
scoop install qpdf        # Windows (scoop)
brew install qpdf         # macOS (Homebrew)
sudo apt install qpdf     # Linux (Debian/Ubuntu)

Setup

conda create -n pdf_lite python=3.13
conda activate pdf_lite
pip install -r requirements.txt

Quick Start

# Split into pages
python src/main.py split doc.pdf
# → doc_1.pdf, doc_2.pdf, …

# Extract pages + compress in one command
python src/main.py extract doc.pdf 1,3,5-7 --compress
# → doc_extracted_compressed.pdf

# Batch compress every PDF in a folder
python src/main.py compress *.pdf

# Show metadata
python src/main.py info doc.pdf
#   Pages:    42
#   Size:     3.2 MB

All commands accept --timeout N (seconds) for large files.
Files larger than 50 MB trigger a warning and a live heartbeat message every 10 s.
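
One way a timeout and a periodic heartbeat can be combined around an external process is sketched below. This is an illustration under stated assumptions: `run_with_heartbeat`, its parameters, and the message text are hypothetical, and the 50 MB size check is left out.

```python
import subprocess
import sys
import time


def run_with_heartbeat(cmd, timeout=None, interval=10.0):
    """Run cmd, print a heartbeat every `interval` seconds, and kill the
    process if it runs longer than `timeout` seconds. Returns the exit code.

    Hypothetical helper; not taken from the project's source.
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    while True:
        wait = interval
        if timeout is not None:
            remaining = timeout - (time.monotonic() - start)
            if remaining <= 0:
                proc.kill()
                proc.wait()  # reap the killed process
                raise TimeoutError(f"timed out after {timeout} s")
            wait = min(interval, remaining)
        try:
            return proc.wait(timeout=wait)
        except subprocess.TimeoutExpired:
            elapsed = time.monotonic() - start
            print(f"... still working ({elapsed:.0f} s elapsed)", file=sys.stderr)
```

Waiting in slices of `interval` keeps the loop responsive to both the heartbeat and the overall deadline without a second thread.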


GUI

python src/main.py gui
# or from the packaged binary:
pdf-lite gui

7 tabs · Shared bottom bar with progress, cancel, timeout & coloured log

[Image: GUI mockup]

Tab         What you do
Split       Pick file → choose auto‑name or output dir → Run
Merge       Add files, reorder → pick output → Run
Extract     Pick file → enter "1,3,5-7" → optional ☑ Compress → Run
Compress    Pick file → optional ☑ Linearize → optional Timeout → Run
PDF→Img     Pick file → DPI, format (PNG/JPEG), page range → Run
Img→PDF     Add images, reorder → pick output → Run
Info        Pick file → Show Info → 📋 Copy to clipboard

💡 For batch operations, use the CLI: pdf-lite compress *.pdf


CLI Reference

split — Split PDF into pages

pdf-lite split input.pdf
pdf-lite split input.pdf -o ./pages/
pdf-lite split input.pdf -o page_%d.pdf    # explicit pattern

Auto‑names files in the same directory as the input.

merge — Merge PDFs

pdf-lite merge a.pdf b.pdf                 # → input dir / merged.pdf
pdf-lite merge *.pdf                         # all PDFs in CWD
pdf-lite merge a.pdf b.pdf -o result.pdf

If inputs are in different dirs, you must supply -o.

extract — Extract pages

pdf-lite extract doc.pdf 1,3,5-7            # → doc_extracted.pdf
pdf-lite extract doc.pdf 1-5 -o out.pdf
pdf-lite extract doc.pdf 1-3 --compress
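
A page specification such as `1,3,5-7` is a comma-separated list of single pages and inclusive ranges. A minimal parser sketch, assuming 1-based page numbers; `parse_pages` is a hypothetical name and the project's own parser may behave differently:

```python
def parse_pages(spec: str) -> list[int]:
    """Turn "1,3,5-7" into [1, 3, 5, 6, 7] (1-based, ranges inclusive)."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first < 1 or last < first:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(first, last + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.append(page)
    return pages
```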

compress — Optimize / linearize

pdf-lite compress doc.pdf                   # → doc_compressed.pdf
pdf-lite compress *.pdf                      # batch
pdf-lite compress doc.pdf --no-linearize

Prints compression ratio: 156KB → 89KB (-43%)
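
The printed ratio is the size change relative to the original file. A sketch that reproduces the format of the line above, with sizes given in bytes; `format_ratio` is a hypothetical name:

```python
def format_ratio(before: int, after: int) -> str:
    """Format a size change, e.g. 156KB → 89KB (-43%). Sizes are in bytes."""
    if before <= 0:
        raise ValueError("original size must be positive")
    change = (after - before) / before * 100
    return f"{before / 1024:.0f}KB → {after / 1024:.0f}KB ({change:+.0f}%)"
```

For example, (89 - 156) / 156 is about -0.43, which prints as -43%.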

pdf2img — PDF to images

pdf-lite pdf2img doc.pdf                     # → doc_images/  (PNG, 150 DPI)
pdf-lite pdf2img doc.pdf --fmt jpg --dpi 300
pdf-lite pdf2img doc.pdf --pages "1,3,5-7"

img2pdf — Images to PDF

pdf-lite img2pdf page1.png page2.png        # → images.pdf
pdf-lite img2pdf *.png

info — Metadata

pdf-lite info doc.pdf
pdf-lite info *.pdf

completion — Shell tab‑completion

pdf-lite completion bash >> ~/.bashrc        # bash
pdf-lite completion zsh  >> ~/.zshrc         # zsh

Project Structure

PDF_Lite/
├── src/
│   ├── core/               # One file per PDF operation
│   ├── services/           # Task worker, exceptions
│   ├── adapters/           # qpdf / fitz / weasyprint wrappers
│   ├── ui/                 # cli.py (argparse) + gui.py (Tkinter)
│   └── main.py             # Entry point
├── tests/                  # pytest suite
├── completions/            # Shell completion scripts
├── AGENTS.md               # AI‑assisted development guide
├── pdf-lite.spec           # PyInstaller spec (UPX)
└── requirements.txt

Architecture

CLI ──────────────────→ core/ → adapters/      (direct calls)
GUI → services/worker ─→ core/ → adapters/      (threaded, queue.Queue)
  • Core layer: stateless pure functions, zero UI dependencies
  • Adapter layer: wraps subprocess (qpdf) / C libraries (PyMuPDF)
  • Service layer: TaskWorker (threading) + poll_queue for thread‑safe progress
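
The threaded GUI path can be pictured as a worker thread that pushes events onto a `queue.Queue`, which the UI thread drains periodically. The sketch below reuses the names `TaskWorker` and `poll_queue` from the list above, but the signatures, event tuples, and attributes are assumptions, not the project's actual code:

```python
import queue
import threading


class TaskWorker(threading.Thread):
    """Run func(report, cancelled) off the UI thread (illustrative sketch)."""

    def __init__(self, func, events: queue.Queue):
        super().__init__(daemon=True)
        self._func = func
        self._events = events
        self.cancelled = threading.Event()  # the UI sets this to request a stop

    def _report(self, fraction: float) -> None:
        self._events.put(("progress", fraction))

    def run(self) -> None:
        try:
            result = self._func(self._report, self.cancelled)
            self._events.put(("done", result))
        except Exception as exc:  # surface errors to the UI instead of dying silently
            self._events.put(("error", exc))


def poll_queue(events: queue.Queue) -> list:
    """Drain all pending events without blocking; safe to call from the UI thread."""
    drained = []
    while True:
        try:
            drained.append(events.get_nowait())
        except queue.Empty:
            return drained
```

In a Tkinter app, `poll_queue` would typically be rescheduled with `root.after(...)` so that widgets are only ever touched from the main thread.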

Roadmap

Version            What                                                                             Status
v0.1               Core PDF ops + CLI
v0.1.1             Page extraction
v0.2               Smart CLI (auto‑naming, batch, chain, info)
v0.2.2             Engineering polish (unbundled qpdf, output‑follows‑input, result printing)
v0.2.3             UPX, completion, timeout, large‑file heartbeat
v0.3.0             Tkinter GUI (7 tabs, progress, cancel, log)
v0.3.1 / v0.3.2    GUI fixes (DPI, async, placeholders, placeholder bugfix)
v0.3.3             GUI operation logs, Info no longer slow                                          ✅ Current
v0.4               Document conversion (Markdown / HTML / Word → PDF)
v1.0               Stable release, Excel → PDF, batch folders

Built with PyMuPDF · img2pdf · qpdf · PyInstaller · Tkinter
MIT License

About

A lightweight, simple PDF tool (integrating several excellent open-source projects)
