Connect any data source — SharePoint, SQL databases, S3, Snowflake, local drives, PDFs, Word, Excel, JSON — and make it conversational with LLM-powered Q&A. One command to ingest. One prompt to find anything.
ingestAI uses a Retrieval-Augmented Generation (RAG) pipeline. Data is parsed, chunked, embedded, and stored locally — then retrieved semantically when you ask a question.
One-command ingestion from SharePoint, SQL databases, AWS S3, Snowflake, local disks, and mounted network shares — no custom ETL scripts needed.
Natively parses PDF, Word (.docx), Excel (.xlsx/.xls), JSON, and images (with EXIF metadata extraction). Tables, text, and properties all captured.
Ask plain-English questions in the browser UI or via REST API. The LLM answers using only your documents and always cites the source file.
All data lives on your machine. The vector store is a local ChromaDB instance. Only the query and relevant chunks leave your environment (to OpenAI).
The ingest.py CLI was built to run unattended — schedule it with
cron or a task scheduler for automatic nightly re-indexing of changing data.
Every function exposed via FastAPI: upload, list, delete, and query endpoints. Integrate ingestAI into any internal tool or workflow in minutes.
ingestAI ships connectors for the most common enterprise data locations. Optional dependencies are installed only when needed.
Install once, point at your data, then open the browser — or schedule the CLI for nightly re-indexing. No cloud infrastructure required.
# Local folder or any mounted drive $ python ingest.py local ~/Documents/reports --recursive $ python ingest.py local /Volumes/NAS/Finance --ext .pdf --ext .xlsx # SharePoint Online $ python ingest.py sharepoint \ --site-url https://company.sharepoint.com/sites/HR \ --library "Policy Documents" # Any SQL database $ python ingest.py database \ --url "postgresql://user:pass@host/mydb" \ --query "SELECT id, title, body FROM articles" \ --text-col body # AWS S3 $ python ingest.py s3 --bucket corp-docs --prefix legal/ # Snowflake data warehouse $ python ingest.py snowflake \ --account myorg.us-east-1 \ --database PROD_DB \ --query "SELECT TITLE, CONTENT FROM KNOWLEDGE_BASE" # List everything indexed $ python ingest.py list
Index SharePoint policy libraries and let employees ask plain-English questions about HR policies, IT security standards, or compliance requirements.
Ingest quarterly PDFs, Excel models, and analyst notes from a shared drive. Ask questions like "What was our EBITDA trend over the last 3 quarters?"
Pull SOPs, maintenance logs, and incident reports from a warehouse. Field technicians can query procedures without searching through folders.
Ingest research papers, internal wikis, and data exports. Surface cross-document insights that would take hours to find manually.
Load product specs, pricing tables, and customer feedback from a database or S3. Sales teams ask natural-language questions and get instant, cited answers.
Make handbooks, benefit guides, and training materials instantly searchable. New hires find answers without waiting for a colleague to respond.
ingestAI is open-source and runs entirely on your machine. For enterprise deployment, managed hosting, and support, talk to the doingnow team.