down-craft

A Node.js package that simplifies converting documents (PDF, DOCX, PPTX, and XLSX) into Markdown.

It uses tesseract.js, mammoth, pdf.js, and turndown under the hood, with optional vLLM-based OCR (OpenAI API) for PDFs. The downCraft(fileBuffer, fileType?, options?) function auto-detects the file type when fileType is not specified. For PDFs, pdfConverterType selects between the standard, llm, and ocr converters.
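
To make the auto-detection step more concrete, here is a minimal sketch of how a file type could be inferred from a buffer's leading bytes. It is an illustration only, not down-craft's actual implementation: the detectFileType helper and its return values ('pdf', 'docx', 'pptx', 'xlsx', or null) are hypothetical names chosen for this example. It assumes that PDFs start with the "%PDF" signature and that the three Office formats are ZIP containers holding their conventional main parts.

```javascript
// Hypothetical helper for illustration; NOT part of down-craft's API.
// Returns 'pdf', 'docx', 'pptx', 'xlsx', or null when the type is unknown.
function detectFileType(fileBuffer) {
  // PDF files begin with the ASCII signature "%PDF".
  if (fileBuffer.subarray(0, 4).toString('latin1') === '%PDF') {
    return 'pdf';
  }

  // DOCX, PPTX, and XLSX are ZIP containers, whose first local file
  // header starts with the bytes "PK\x03\x04".
  const isZip =
    fileBuffer.length >= 4 &&
    fileBuffer[0] === 0x50 &&
    fileBuffer[1] === 0x4b &&
    fileBuffer[2] === 0x03 &&
    fileBuffer[3] === 0x04;
  if (!isZip) {
    return null;
  }

  // ZIP entry names are stored uncompressed, so a plain byte search for
  // the conventional main part of each format is enough for a sketch.
  // A robust detector would parse the ZIP central directory instead.
  if (fileBuffer.includes('word/document.xml')) return 'docx';
  if (fileBuffer.includes('ppt/presentation.xml')) return 'pptx';
  if (fileBuffer.includes('xl/workbook.xml')) return 'xlsx';
  return null;
}
```

When the type is already known, passing it explicitly as the fileType argument of downCraft avoids relying on any detection heuristic of this kind.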