EDC support
Can climate data reveal how meteorological shifts affect economies? PhD researchers Vinh Phan and Fiorella Parra Mujica set out to measure the economic impacts of thermal inversions using multiple economic indicators across countries and years. Thermal inversions trap pollutants near the ground, creating smog and health hazards—conditions with potential economic consequences. The team sourced five years of geospatial data from the Climate Data Store (CDS) but faced barriers: limited infrastructure, slow code and a lack of technical resources.
EDC’s data lab re-engineered the pipeline, rewriting the initial R workflow in Python with optimised libraries and a robust multiprocessing approach. We also corrected the distance-weighting method: the original planar (flat-earth) calculation was replaced with a matrix-based Haversine formula that accounts for Earth’s curvature, improving statistical accuracy. Computations ran on high-RAM EDC workstations, which avoided external supercomputers and reduced costs.
- Optimised Python rewrite of an R-based workflow
- Multiprocessing for major runtime reductions
- Accurate geodesic distances via Haversine matrices
- Execution on EDC high-RAM machines for cost efficiency
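To illustrate the matrix-based Haversine step, here is a minimal NumPy sketch. It is not the project's actual code: the function name `haversine_matrix`, the argument layout and the mean Earth radius constant are assumptions chosen for illustration. It computes great-circle distances between every pair of points in two coordinate sets in one vectorised pass, instead of looping over pairs.

```python
import numpy as np

# Assumption: mean Earth radius in kilometres (the pipeline's constant is not stated).
EARTH_RADIUS_KM = 6371.0088


def haversine_matrix(lat_a, lon_a, lat_b, lon_b):
    """Return an (n_a, n_b) matrix of great-circle distances in km.

    Inputs are 1-D sequences of latitudes and longitudes in decimal degrees.
    """
    # Convert to radians; shape A as a column and B as a row so that
    # broadcasting produces every pairwise combination.
    lat_a = np.radians(np.asarray(lat_a, dtype=float))[:, None]
    lon_a = np.radians(np.asarray(lon_a, dtype=float))[:, None]
    lat_b = np.radians(np.asarray(lat_b, dtype=float))[None, :]
    lon_b = np.radians(np.asarray(lon_b, dtype=float))[None, :]

    dlat = lat_b - lat_a
    dlon = lon_b - lon_a

    # Haversine term; clipped to [0, 1] to guard against rounding error.
    h = np.sin(dlat / 2.0) ** 2 + np.cos(lat_a) * np.cos(lat_b) * np.sin(dlon / 2.0) ** 2
    h = np.clip(h, 0.0, 1.0)

    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


# Example: distances from two hypothetical grid cells to three hypothetical sites.
distances = haversine_matrix([0.0, 52.0], [0.0, 4.0], [0.0, 0.0, 52.0], [0.0, 90.0, 4.0])
```

A planar calculation treats a degree of longitude as the same length at every latitude, which overstates east-west distances away from the equator. The Haversine form corrects this through the cosine terms, and the matrix layout keeps the whole computation inside NumPy.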
Impact
- Data processing times reduced by more than 80%
- Lower energy consumption and improved data quality
- Analysis completed at a fraction of expected cost and time
- Tailored technical support blending sustainable computing, smart optimisation and domain consulting
Testimonial
Vinh Phan, PhD Researcher
“The collaboration with the EDC was very helpful for us in optimising our process. EDC scientists quickly understood our problem and what we were trying to achieve, creating a seamless transition to a faster pipeline. We saved significant computing resources due to faster processing time and, equally important, we can learn from and build upon the new code thanks to the detailed documentation provided.”
Further reading
- Climate Data Store (CDS)
- Haversine distance (geodesic)
- Polars & multiprocessing patterns
