PayrollPilot - LLM Parsing + Accountant's World Automation
PayrollPilot is a production-grade automation system that uses an LLM (Google Gemini) to parse messy payroll reports and Playwright-based bots to upload the results to Accountant's World. It ran in a live business, where it processed over 1,000 payrolls, saved $20K+ in labor costs, and generated $2K+ in new revenue.
Key Features
- LLM-powered Parsing: Converts messy payroll reports (PDF, RTF, Excel) into structured JSON.
- CSV Auto-Fill: Populates Accountant's World-compatible CSV templates per client.
- Portal Automation: Automates CSV uploads and tax payments to Accountant's World (AW).
- Smart Field Detection: Dynamically maps earnings, deductions, and tax categories.
- Batch Payroll Support: Processes multiple client folders in one click.
- Streamlit Interface: Simple UI to review extracted data and approve uploads.
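As an illustration of the LLM-powered parsing feature, the sketch below shows one way a model reply could be turned into structured JSON. It is not PayrollPilot's actual code: the function name `parse_llm_reply`, the `employees` key, and the sample reply are all hypothetical. It assumes the model may wrap its JSON in a Markdown code fence, which chat-style models often do.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker a model may wrap its JSON in


def parse_llm_reply(reply: str) -> dict:
    """Parse a model reply into a dict, tolerating a surrounding code fence.

    The 'employees' key is a hypothetical schema chosen for this example.
    """
    text = reply.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        # Drop the opening fence line (it may carry a "json" tag) and the closing fence.
        first_newline = text.find("\n")
        text = text[first_newline + 1 : -len(FENCE)] if first_newline != -1 else ""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on malformed output
    if not isinstance(data, dict) or "employees" not in data:
        raise ValueError("expected a JSON object with an 'employees' key")
    return data


# Hypothetical reply, shaped the way a chat model might return it.
reply = (
    FENCE + "json\n"
    + '{"employees": [{"name": "Jane Doe", "regular_hours": 40, "gross_pay": 1000.0}]}\n'
    + FENCE
)
parsed = parse_llm_reply(reply)
print(parsed["employees"][0]["name"])
```

Validating the shape right after parsing means a malformed reply fails before anything reaches the CSV or upload steps.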
Use Cases
- Bookkeeping firms handling 50+ client payrolls
- Automation-first accountants aiming to cut labor costs
- LLM startups showcasing GenAI for operations
- Any business tired of manual AW uploads
Tech Stack
- Python 3.10+
- Streamlit: UI
- Playwright: headless browser automation
- Google Gemini API: large language model parsing
- pandas, PyMuPDF, python-docx, striprtf: data and document processing
Project Structure
Payroll-LLM-Extractor/
│
├── streamlit_app.py                # Main Streamlit app
├── upload_runner.py                # Automates payroll CSV uploads to AW
├── upload_tax.py                   # Automates tax form filling on AW
│
├── src/
│   ├── send_chunk_llm.py           # Gemini API integration
│   ├── excel_raw_text_chunk.py     # Raw text extraction from Excel
│   └── populate_csv_template.py    # Populates CSV template
│
├── utils/
│   ├── gemni_parser.py             # Gemini prompt + parsing logic
│   ├── extract_rtf_pdf.py          # PDF/RTF chunker
│   ├── label_analysis.py           # Detects suspicious earnings
│   └── populate_csv.py             # Final CSV output generator
│
├── agent_project/
│   ├── agents.py                   # Playwright login + upload logic
│   └── opt.py                      # Browser headless setup
│
├── misc/                           # Experimental scripts + prompts
├── requirements.txt
└── README.md
How It Works
- Upload a client's payroll file (PDF, RTF, or Excel).
- Extract text chunks → src/excel_raw_text_chunk.py
- Send chunks to Gemini → src/send_chunk_llm.py
- Post-process earnings/deductions → utils/label_analysis.py
- Auto-fill the CSV from the template → src/populate_csv_template.py
- Upload the CSV and fill in taxes on AW → upload_runner.py, upload_tax.py
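The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `chunk_text`, `fill_template`, `run_pipeline`, the column names, and the `fake_llm` stand-in for the Gemini call are all hypothetical. The upload step is left out because it depends on the AW portal.

```python
import csv
import io
from typing import Callable, Dict, List

Row = Dict[str, object]


def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Split report text into chunks of roughly max_chars, breaking only on line boundaries.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def fill_template(columns: List[str], rows: List[Row]) -> str:
    """Render rows as CSV text in the template's column order; missing fields stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_pipeline(
    text: str,
    parse_chunk: Callable[[str], List[Row]],
    columns: List[str],
    max_chars: int = 2000,
) -> str:
    """Chunk the text, parse each chunk with the supplied callable, and fill the CSV."""
    rows: List[Row] = []
    for chunk in chunk_text(text, max_chars):
        rows.extend(parse_chunk(chunk))
    return fill_template(columns, rows)


def fake_llm(chunk: str) -> List[Row]:
    """Stand-in for the model call: reads 'name, hours' lines instead of calling Gemini."""
    rows: List[Row] = []
    for line in chunk.splitlines():
        name, hours = line.split(",")
        rows.append({"Employee": name.strip(), "Regular Hours": float(hours)})
    return rows


# Hypothetical two-line report and template columns.
report = "Jane Doe, 40\nJohn Roe, 32.5\n"
columns = ["Employee", "Regular Hours", "Overtime Hours"]
csv_text = run_pipeline(report, fake_llm, columns, max_chars=15)
print(csv_text)
```

Passing the parser in as a callable keeps the chunking and CSV logic testable without an API key; a real run would swap `fake_llm` for a function that sends each chunk to Gemini.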
Streamlit Demo
pip install -r requirements.txt
streamlit run streamlit_app.py
Make sure you have a .env file with your Gemini API key:
GEMINI_API_KEY=your_gemini_key_here
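One way the key could be read at startup is sketched below. It uses only the standard library; the helper names `load_env_file` and `get_gemini_key` are hypothetical, and the project itself may well rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env") -> None:
    """Copy KEY=VALUE lines from a .env file into os.environ.

    Values already set in the environment are not overwritten.
    Blank lines, comments, and lines without '=' are skipped.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))


def get_gemini_key() -> str:
    """Return GEMINI_API_KEY, failing early if it is unset or still the placeholder."""
    key = os.environ.get("GEMINI_API_KEY", "")
    if not key or key == "your_gemini_key_here":
        raise RuntimeError("GEMINI_API_KEY is missing; add it to your .env file")
    return key
```

Calling `load_env_file()` and then `get_gemini_key()` before the first model request surfaces a missing or placeholder key immediately, rather than partway through a batch.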
AW credentials are managed securely within the agent_project/ automation scripts or set manually before automation.
Created for a live payroll system in NY: 1,000+ payrolls processed, $20K+ labor saved, $2K+ new revenue.
Future Enhancements
- SharePoint sync for final reports
- AI-driven field mapping & validation
- Full audit trail + override logging
- Email notifications for payroll events
License
MIT: free to use and modify with credit.