
Releases: eltsai/term_miner

🚀 TermMiner v1.0 – Automated Pipeline for Extracting & Analyzing Unfavorable Financial Terms from Shopping Website T&Cs

03 Feb 14:34

This release introduces TermMiner, an open-source pipeline for automated data collection and topic modeling of unfavorable financial terms on shopping websites. It provides:

✅ Shopping website collection from a given list of URLs
✅ Automated Terms & Conditions extraction
✅ Text sanitization & paragraph segmentation
✅ Topic modeling & clustering of financial terms
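
As a rough illustration of the text sanitization and paragraph segmentation step, here is a minimal Python sketch. It is not TermMiner's actual code: the function names `sanitize_text` and `segment_paragraphs`, the `min_words` threshold, and the regex-based tag stripping are all assumptions made for this example. See measurement/README.md for the real pipeline.

```python
import html
import re


def sanitize_text(raw: str) -> str:
    """Strip HTML tags, decode entities, and normalize whitespace.

    Hypothetical helper: a regex-based approximation, not a full HTML parser.
    """
    # Drop script/style elements together with their contents.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", raw)
    # Treat common block-level boundaries as paragraph breaks.
    text = re.sub(r"(?i)<\s*(br|/p|/div|/li|/h[1-6])\b[^>]*>", "\n\n", text)
    # Remove any remaining tags.
    text = re.sub(r"<[^>]+>", " ", text)
    # Decode entities such as &amp; and &nbsp;.
    text = html.unescape(text)
    # Collapse runs of horizontal whitespace (including non-breaking spaces).
    text = re.sub(r"[ \t\r\f\v\u00a0]+", " ", text)
    # Trim spaces that sit next to newlines.
    text = re.sub(r" ?\n ?", "\n", text)
    return text.strip()


def segment_paragraphs(text: str, min_words: int = 5) -> list[str]:
    """Split on blank lines and keep paragraphs with at least min_words words.

    The min_words default is an assumed value chosen for illustration.
    """
    paragraphs = re.split(r"\n{2,}", text)
    return [p.strip() for p in paragraphs if len(p.split()) >= min_words]


if __name__ == "__main__":
    page = (
        "<p>All sales are final and no refunds are issued.</p>"
        "<script>track()</script><p>Fees&nbsp;apply.</p>"
    )
    for paragraph in segment_paragraphs(sanitize_text(page)):
        print(paragraph)
```

Filtering out very short paragraphs is one simple way to discard navigation fragments before topic modeling. The right threshold depends on the data.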

🤗 Dataset collected: ShopTC-100K

What's Included?
📂 Full codebase for running the pipeline
📜 Configurable settings in configs/measurement.yaml
📊 Predefined prompts for classification & clustering
🔍 Step-by-step instructions in measurement/README.md