news-watch: Indonesia's top news websites scraper
news-watch scrapes structured news data from Indonesia's top news websites with keyword and date filtering.
Installation
pip install news-watch
playwright install chromium
Development setup: https://okky.dev/news-watch/getting-started/
Quick Start
Docs
Supported News Sources
| Source | Domain |
|---|---|
| Antara News | antaranews.com |
| Bisnis.com | bisnis.com |
| Bloomberg Technoz | www.bloombergtechnoz.com |
| CNBC Indonesia | www.cnbcindonesia.com |
| CNN Indonesia | www.cnnindonesia.com |
| Detik | detik.com |
| Jawa Pos | jawapos.com |
| Katadata | katadata.co.id |
| Kompas | kompas.com |
| Kontan | kontan.co.id |
| Liputan6 | www.liputan6.com |
| Media Indonesia | mediaindonesia.com |
| Metro TV News | metrotvnews.com |
| Okezone | okezone.com |
| Tempo | tempo.co |
| Tribunnews | www.tribunnews.com |
| Viva | viva.co.id |
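The table above can also be expressed as a lookup for checking whether a URL belongs to a supported source. This is a minimal sketch and is not part of the news-watch API: `SUPPORTED_SOURCES` and `source_for_url` are hypothetical names, and the `www.` prefixes from the table are dropped so that any subdomain matches.

```python
from typing import Optional
from urllib.parse import urlparse

# Domains copied from the table above, with any "www." prefix removed.
SUPPORTED_SOURCES = {
    "antaranews.com": "Antara News",
    "bisnis.com": "Bisnis.com",
    "bloombergtechnoz.com": "Bloomberg Technoz",
    "cnbcindonesia.com": "CNBC Indonesia",
    "cnnindonesia.com": "CNN Indonesia",
    "detik.com": "Detik",
    "jawapos.com": "Jawa Pos",
    "katadata.co.id": "Katadata",
    "kompas.com": "Kompas",
    "kontan.co.id": "Kontan",
    "liputan6.com": "Liputan6",
    "mediaindonesia.com": "Media Indonesia",
    "metrotvnews.com": "Metro TV News",
    "okezone.com": "Okezone",
    "tempo.co": "Tempo",
    "tribunnews.com": "Tribunnews",
    "viva.co.id": "Viva",
}


def source_for_url(url: str) -> Optional[str]:
    """Return the source name for a URL, or None if it is not in the table."""
    host = (urlparse(url).hostname or "").lower()
    for domain, name in SUPPORTED_SOURCES.items():
        # Accept the bare domain and any subdomain (www., finance., ...).
        if host == domain or host.endswith("." + domain):
            return name
    return None
```

For example, `source_for_url("https://finance.detik.com/a")` resolves to the Detik entry, while a URL on an unlisted domain returns `None`.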
Important Considerations
Ethical Use: Always respect website terms of service and implement appropriate delays between requests.
Performance: Works best when run locally. On cloud platforms, anti-bot measures on the news sites may reduce performance.
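One way to follow the Ethical Use note is to enforce a minimum interval between requests to the same domain. The sketch below is independent of news-watch and does not describe how the library itself throttles requests: the `PoliteDelay` class is a hypothetical helper, and the 2-second default is an assumed value, not a recommendation from the project or from any site's terms of service.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlparse


class PoliteDelay:
    """Enforce a minimum interval between requests to the same domain."""

    def __init__(
        self,
        min_interval: float = 2.0,  # assumed default, tune per site
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request: Dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting `url`; return the seconds slept."""
        domain = urlparse(url).netloc
        last = self._last_request.get(domain)
        delay = 0.0
        if last is not None:
            elapsed = self._clock() - last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        # Record the time at which the request is allowed to proceed.
        self._last_request[domain] = self._clock()
        return delay
```

Calling `limiter.wait(url)` immediately before each fetch keeps requests to one domain spaced out while leaving requests to different domains unaffected. The `clock` and `sleep` parameters exist only so the behavior can be checked without real waiting.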