- 🔐 Automatic and manual login support
- 📋 Batch scraping from company list
- 🤖 Anti-detection measures with randomized delays
- 💾 CSV export with detailed company information
- 💱 Currency conversion
- 🌐 Proxy support via Selenium
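The randomized delays mentioned in the feature list can be sketched as below. This is a minimal illustration, not the scraper's actual code: the function name `random_delay`, the default bounds, and the injectable `sleep` and `rng` parameters are all assumptions made for the example.

```python
import random
import time


def random_delay(min_seconds=2.0, max_seconds=6.0, sleep=time.sleep, rng=random):
    """Pause for a uniformly random duration and return the seconds slept.

    The bounds are illustrative defaults, not values taken from the project.
    `sleep` and `rng` are injectable so the function can be tested without waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    duration = rng.uniform(min_seconds, max_seconds)
    sleep(duration)
    return duration
```

Calling `random_delay()` between page loads spaces requests unevenly, so the traffic looks less like a fixed-interval bot.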
- Company name and legal name
- About/Description
- Funding information
- Location
- Employee count
- Company type (Public/Private)
- Website
- Year founded
- Company ranking
- Acquisitions count
- Investments count
- Exits count
- Stock symbol
- Operating status
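One way to model the fields above and export them to CSV is sketched here. The class name `CompanyRecord`, the field names, and the `write_csv` helper are hypothetical; the project's real definitions in `src/models.py` may differ. Every field except the name is optional, so a data point missing from a profile becomes an empty CSV cell.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class CompanyRecord:
    """Hypothetical record mirroring the scraped fields; values kept as raw text."""
    name: str
    legal_name: Optional[str] = None
    about: Optional[str] = None
    funding: Optional[str] = None
    location: Optional[str] = None
    employee_count: Optional[str] = None
    company_type: Optional[str] = None
    website: Optional[str] = None
    year_founded: Optional[str] = None
    ranking: Optional[str] = None
    acquisitions_count: Optional[str] = None
    investments_count: Optional[str] = None
    exits_count: Optional[str] = None
    stock_symbol: Optional[str] = None
    operating_status: Optional[str] = None


def write_csv(records, stream):
    """Write a header row plus one row per record; None becomes an empty cell."""
    columns = [f.name for f in fields(CompanyRecord)]
    writer = csv.DictWriter(stream, fieldnames=columns)
    writer.writeheader()
    for record in records:
        row = {key: ("" if value is None else value) for key, value in asdict(record).items()}
        writer.writerow(row)


# Usage: write to an in-memory buffer (a real run would open a file with newline="").
buffer = io.StringIO()
write_csv([CompanyRecord(name="Example Co", website="https://example.com")], buffer)
```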
- Python 3.8+
- Chrome browser
- Crunchbase account
- Clone the repository:
git clone https://github.com/afk-procrastinator/crunchbase-scraper
cd crunchbase-scraper
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
cp .env.template .env
Edit `.env` with your Crunchbase credentials:
[email protected]
CRUNCHBASE_PASSWORD=your-password
- Create a list of companies to scrape in `company_list.txt`, one company name per line:
Company Name 1
Company Name 2
- Run the scraper:
python main.py
The script will:
- Log in to Crunchbase
- Process each company in the list
- Save results to `companies.csv`
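The two inputs the script relies on, the credentials and the company list, could be read roughly as follows. This is a sketch under assumptions: `load_credentials` and `parse_company_list` are hypothetical names, the email variable is assumed to be `CRUNCHBASE_EMAIL`, and the values from `.env` are assumed to have already been loaded into the process environment (how the project loads them is not stated here).

```python
import os


def load_credentials(env=None):
    """Return (email, password), or None when either value is missing.

    CRUNCHBASE_EMAIL is an assumed variable name; CRUNCHBASE_PASSWORD is the
    name shown in the setup steps. Returning None lets the caller fall back
    to manual login.
    """
    env = os.environ if env is None else env
    email = env.get("CRUNCHBASE_EMAIL")
    password = env.get("CRUNCHBASE_PASSWORD")
    if not email or not password:
        return None
    return email, password


def parse_company_list(text):
    """Split newline-separated company names, dropping blank lines and padding."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

With these helpers, the list file could be read with `parse_company_list(open("company_list.txt", encoding="utf-8").read())` before looping over the names.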
├── src/
│ ├── auth.py # Authentication handling
│ ├── models.py # Data models
│ ├── scraper.py # Core scraping logic
│ ├── selectors.py # CSS selectors
│ └── utils.py # Utility functions
├── main.py # Entry point
├── requirements.txt # Dependencies
├── .env.template # Environment template
└── company_list.txt # Input companies
- Automatic retry logic for failed requests
- Manual login fallback if automatic login fails
- Graceful handling of missing data points
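Retry logic of the kind listed above is often written as a small wrapper with exponential backoff. The sketch below is an assumption about the general technique, not the project's implementation: `with_retries`, the attempt count, and the delay schedule are illustrative.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` calls have failed.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ... between
    tries, and re-raises the last error once the attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose for the sketch
            last_error = error
            if attempt < attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    raise last_error
```

A real scraper would likely catch only the exceptions that signal a transient failure (timeouts, stale elements) rather than every `Exception`.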
Contributions are welcome! Please feel free to submit a Pull Request.
This tool is for educational purposes only. Please review and comply with Crunchbase's terms of service before use.