Portfolio-grade renewable energy platform

Wind energy forecasting from distributed Spark pipelines to live NOAA analysis.

This project combines large-scale NOAA weather processing, PySpark ETL, turbine-inspired wind modeling, ML forecasting, Airflow orchestration, Spark and DuckDB benchmarking, preserved website artifacts, and a deployable FastAPI backend for live analysis.

Historical Window

1995–2025

Verified Live Stations

1,981

Forecast Evaluation Rows

535,961

Live Wind Outlook

Explore live NOAA observations, turbine-inspired power-curve estimates, and the wind outlook analysis served by the deployable backend.
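As a rough illustration of what a turbine-inspired power-curve estimate can look like, here is a minimal sketch of a generic piecewise curve: zero output below cut-in, cubic growth up to rated speed, flat at rated power, and zero again at cut-out. The function name and every parameter value (3 m/s cut-in, 12 m/s rated, 25 m/s cut-out, 2,000 kW rated power) are illustrative assumptions, not the project's actual model or settings.

```python
def power_curve_kw(
    wind_speed_ms: float,
    cut_in: float = 3.0,            # assumed cut-in speed (m/s)
    rated_speed: float = 12.0,      # assumed rated speed (m/s)
    cut_out: float = 25.0,          # assumed cut-out speed (m/s)
    rated_power_kw: float = 2000.0, # assumed rated power (kW)
) -> float:
    """Estimate turbine output in kW from wind speed using a generic curve."""
    # Below cut-in the rotor does not generate; at or above cut-out it shuts down.
    if wind_speed_ms < cut_in or wind_speed_ms >= cut_out:
        return 0.0
    # Between rated speed and cut-out, output is capped at rated power.
    if wind_speed_ms >= rated_speed:
        return rated_power_kw
    # Between cut-in and rated speed, power scales with the cube of wind speed,
    # shifted so the curve starts at zero exactly at cut-in.
    fraction = (wind_speed_ms**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    return rated_power_kw * fraction


if __name__ == "__main__":
    for speed in (2.0, 6.0, 10.0, 15.0, 26.0):
        print(f"{speed:5.1f} m/s -> {power_curve_kw(speed):8.1f} kW")
```

The cubic segment reflects the fact that the power available in wind grows with the cube of its speed. A real turbine curve would normally come from manufacturer data rather than this closed form.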

Pipeline Architecture

Review the Spark ETL pipeline, Airflow orchestration, feature engineering, ML workflow, and artifact preservation design.

Historical Results

Analyze long-run wind potential trends, state summaries, regional outputs, and historical Spark analytics.

Forecasting Model

Inspect model metrics, holdout forecast evaluation, feature importance, and forecasting diagnostics.

Benchmarking

Compare Spark and DuckDB execution performance across analytical benchmark workloads.
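One way such a comparison could be timed is sketched below with a small engine-agnostic harness. It is a minimal sketch, not the project's benchmarking code: `time_workload` and the stand-in workload are hypothetical. In practice each callable would wrap a Spark or DuckDB query that fully materializes its result.

```python
import statistics
import time
from typing import Callable, Dict


def time_workload(
    run: Callable[[], object], repeats: int = 5, warmup: int = 1
) -> Dict[str, float]:
    """Time a zero-argument workload and summarize wall-clock seconds."""
    # Warm-up runs absorb one-time costs (caches, lazy startup) and are not recorded.
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with a callable that runs an engine query.
    stats = time_workload(lambda: sum(range(100_000)))
    print(stats)
```

Reporting the median alongside the minimum and maximum makes a single slow run less likely to skew the comparison.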

End-to-end forecasting workflow

NOAA ISD → Spark ETL → Gold Tables → ML Training → Artifact Exports → Website + API

ingestion → cleaning → feature engineering → forecasting → preserved artifacts → live analysis