Analysis Pipeline

This repository drives analysis.castromedia.org, a fully transparent data workflow.

Aggregation – The data/ folder stores every raw dataset. A notebook called update.ipynb pulls from each source on a schedule, versioning the files and opening a pull request with the changes.
Analysis – Jupyter notebooks inside analysis/ use the latest data snapshots to produce figures and markdown reports. When data changes, these notebooks re-run automatically and propose another pull request.
Publication – Both pull requests merge automatically when there are no conflicts. Once they are merged, GitHub Pages rebuilds the site using Jekyll. The rendered HTML lives in this repo so anyone can inspect the exact inputs and outputs.
Metadata – Every dataset directory now ships with a metadata.md file capturing the catalog fields and a short description. Analysis folders include a similar metadata.md listing the result columns. The homepage reads these files to populate its project and dataset lists.
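
As a rough illustration of the Metadata step, the sketch below walks a folder tree and collects every metadata.md it finds. The function name find_metadata and the returned dictionary keys are hypothetical, and the sketch assumes nothing about the layout inside each metadata.md; it only returns the raw text. It is not the code the homepage actually uses.

```python
from pathlib import Path


def find_metadata(root: Path) -> list[dict]:
    """Collect every metadata.md under root (hypothetical helper).

    Each entry records the folder holding the file, relative to root,
    and the file's raw text. Parsing the catalog fields is left out
    because the file format is not specified here.
    """
    entries = []
    for meta in sorted(root.rglob("metadata.md")):
        entries.append({
            "folder": meta.parent.relative_to(root).as_posix(),
            "text": meta.read_text(encoding="utf-8"),
        })
    return entries
```

Calling find_metadata(Path("data")) from the repository root would list the dataset folders, and find_metadata(Path("analysis")) would do the same for analysis folders.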

This setup aggregates data from many sources, analyzes it, and publishes the results as soon as each change merges. Because every step happens in the open through Git commits and pull requests, anyone can trace a published result back to the data and notebooks that produced it.

Pipelines

These analyses drive the content behind the public properties, which present the data in a format meant for mass consumption:

TopStoryReview: pulls the most recent output of the news topics analysis and builds a website around those headlines.

News feeds are organized by region under data/news/<region>/<source>/. Each of those directories has its own metadata.md describing the source so analyses and the homepage can find them easily.
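
The directory convention above makes feeds discoverable with a single glob. The sketch below is one way an analysis might enumerate them; list_news_feeds is a hypothetical name, and the choice to skip source folders that lack a metadata.md is an assumption, not documented behavior.

```python
from pathlib import Path


def list_news_feeds(news_root: Path) -> list[tuple[str, str]]:
    """Return sorted (region, source) pairs found under news_root.

    Hypothetical helper: a source folder counts as a feed only if it
    contains a metadata.md, matching data/news/<region>/<source>/.
    """
    feeds = []
    for meta in news_root.glob("*/*/metadata.md"):
        source_dir = meta.parent
        feeds.append((source_dir.parent.name, source_dir.name))
    return sorted(feeds)
```

From the repository root, list_news_feeds(Path("data/news")) would return one pair per feed, which a notebook could then loop over.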
