news-watch: Indonesia's top news websites scraper

news-watch scrapes structured news data from Indonesia's top news websites with keyword and date filtering.

Installation

pip install news-watch
playwright install chromium

Development setup: https://okky.dev/news-watch/getting-started/

Quick Start

Command line:

newswatch --keywords ihsg --start_date 2025-01-01

Python:

import newswatch as nw

df = nw.scrape_to_dataframe("ihsg", "2025-01-01")
print(len(df))

Docs

Supported News Sources

Source              Domain
Antara News         antaranews.com
Bisnis.com          bisnis.com
Bloomberg Technoz   www.bloombergtechnoz.com
CNBC Indonesia      www.cnbcindonesia.com
CNN Indonesia       www.cnnindonesia.com
Detik               detik.com
Jawa Pos            jawapos.com
Katadata            katadata.co.id
Kompas              kompas.com
Kontan              kontan.co.id
Liputan6            www.liputan6.com
Media Indonesia     mediaindonesia.com
Metro TV News       metrotvnews.com
Okezone             okezone.com
Tempo               tempo.co
Tribunnews          www.tribunnews.com
Viva                viva.co.id

Important Considerations

Ethical Use: Always respect website terms of service and implement appropriate delays between requests.
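
One way to space out separate scraping runs is a small wrapper that pauses between calls. The sketch below is illustrative only: `run_with_delay` and `fake_scrape` are hypothetical names, not part of news-watch, and the 2-second default is an arbitrary assumption rather than a recommended value. news-watch may already pace its own requests internally; this only covers pauses between runs you start yourself.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_with_delay(
    task: Callable[[T], R],
    items: Iterable[T],
    delay_seconds: float = 2.0,  # assumed default, tune per site
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call task(item) for each item, pausing between consecutive calls."""
    results: List[R] = []
    for index, item in enumerate(items):
        if index > 0:
            # Pause before every call except the first one.
            sleep(delay_seconds)
        results.append(task(item))
    return results


def fake_scrape(keyword: str) -> str:
    """Hypothetical stand-in for a real scraping call."""
    return f"results for {keyword}"


if __name__ == "__main__":
    print(run_with_delay(fake_scrape, ["ihsg", "rupiah"], delay_seconds=0.1))
```

With news-watch installed, `task` could wrap the call from the Quick Start, for example `lambda kw: nw.scrape_to_dataframe(kw, "2025-01-01")`. The `sleep` parameter exists so the pause can be swapped out in tests.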

Performance: news-watch works best when run locally. On cloud platforms, performance may drop because of anti-bot measures.