Appendix A: Data Sources

1. Introduction

The success of our AI-driven heatwave prediction system relies on high-quality, high-resolution meteorological, climate, water, and environmental data. The Meteorological Service of Canada (MSC) Open Data ecosystem provides an extensive array of datasets through open, interoperable web services and APIs. This appendix details the available data sources, explains the associated API protocols and standards, and outlines best practices for accessing and integrating these data into our models.

Our system uses both free and cost-recovered MSC services, covering real-time and historical data, to support the training and operational phases of our forecasting model. MSC data is also published through international systems such as WIS2, which keeps our approach aligned with global standards.


2. Overview of MSC Open Data Ecosystem

2.1 MSC Open Data Portal

The MSC Open Data portal is the authoritative source for weather, climate, water, ice, and environmental information for Canada and the world. It offers:

  • Real-Time Data: Weather alerts, current conditions, and public forecasts.

  • Observational Data: High-resolution weather radar imagery, in situ observations, lightning density, and satellite imagery.

  • Numerical Weather Prediction (NWP) Forecasts: Deterministic and ensemble forecasts from systems such as GDPS, RDPS, and HRDPS, along with various precipitation and air quality analyses.

  • Climate Data: Historical observations, adjusted and homogenized climate data (AHCCD), gridded datasets (CANGRD), and future climate model projections (CMIP5, CMIP6).

  • Other Resources: Bulletins, Meteocode, MetNotes, and spatial polygons for forecast regions.

These datasets are accessible via a variety of methods, including standard web services and programmatic interfaces, ensuring that our system has the data required for both near-real-time predictions and long-term climate trend analysis.


3. Data Sources and Their Descriptions

3.1 MSC GeoMet

MSC GeoMet is a suite of web services provided by MSC and Environment and Climate Change Canada (ECCC) that offers public access to weather, climate, and environmental data. Key features include:

  • Interoperability: Data is delivered through open standards such as the Web Map Service (WMS), Web Coverage Service (WCS), and the emerging OGC API standards. This ensures that data is easy to discover, access, visualize, and integrate into various applications.

  • Service Layers:

    • GeoMet-Weather: Provides real-time weather and environmental data, including current conditions and forecasts.

    • GeoMet-Climate: Focuses on climate data, delivering historical records and long-term climate information.

    • GeoMet-OGC-API: Offers access to MSC and ECCC data using standardized APIs, making integration with custom applications straightforward.

  • Data Customization: Users can perform on-demand data clipping, reprojection, and format conversion to suit the needs of specific applications, such as focusing on the Toronto area for heatwave prediction.

  • Access Methods: MSC GeoMet services are accessed anonymously and free of charge, provided that the client software supports geospatial web services.

3.2 MSC Datamart

MSC Datamart is the raw data server of MSC, intended for specialized users with robust IT and meteorological expertise. It provides:

  • Raw Data Access: Offers access to high-resolution datasets including weather observations, NWP model outputs, and archived environmental data.

  • Real-Time Push Notifications: Utilizes the Advanced Message Queuing Protocol (AMQP) to notify users when new data is available, enabling automated, near-real-time data retrieval.

  • Alternative Access: An alternative HPFX server is available to provide higher bandwidth during peak usage times. This server supports rapid data access, which is critical during high-demand periods.

  • Data Formats: Data on MSC Datamart is available in formats such as GRIB2, NetCDF, CSV, XML, and GeoJSON, ensuring compatibility with various processing pipelines and ML models.

3.3 WIS2 (WMO Information System 2.0)

WIS2 is the global data-sharing framework managed by the World Meteorological Organization (WMO). MSC contributes to it by operating a WIS2 Global Discovery Catalogue (GDC). Its role includes:

  • International Data Exchange: Supports free and unrestricted access to meteorological data on an international scale, using open standards and APIs (e.g., OGC API - Records).

  • Data Discovery: Users can search, browse, and retrieve metadata and datasets related to Canadian and global weather and climate conditions.

  • Interoperability: WIS2’s adherence to FAIR data principles ensures that data is findable, accessible, interoperable, and reusable.


4. Detailed API Documentation

4.1 MSC GeoMet APIs

4.1.1 Web Map Service (WMS)

  • Description: WMS provides geospatial map images based on user requests. It allows the integration of layered weather and climate maps into visualization tools.

  • Usage Example: A typical WMS request URL might look like:

    https://geo.weather.gc.ca/geomet?SERVICE=WMS&VERSION=1.3.0&REQUEST=GetMap&LAYERS=layer_name&CRS=EPSG:4326&BBOX=43.5810,-79.6393,43.8555,-79.1169&WIDTH=800&HEIGHT=600&FORMAT=image/png

  • Key Parameters:

    • LAYERS: Specifies the data layer (e.g., temperature, precipitation).

    • CRS: Coordinate reference system (e.g., EPSG:4326).

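    • BBOX: Bounding box for spatial extent, given in the axis order of the CRS. In WMS 1.3.0, EPSG:4326 is latitude first (minLat,minLon,maxLat,maxLon).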

    • WIDTH/HEIGHT: Dimensions of the output image.
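
The parameters above can be assembled programmatically. The following is a minimal sketch, not an official client: the function name wms_getmap_url is hypothetical, layer_name is the same placeholder used in the example URL, and the bounding box is the Toronto-area extent from that example. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

GEOMET_WMS = "https://geo.weather.gc.ca/geomet"  # endpoint from the example URL


def wms_getmap_url(layer, south, west, north, east, width=800, height=600):
    """Build a WMS 1.3.0 GetMap URL for a latitude/longitude bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "CRS": "EPSG:4326",
        # WMS 1.3.0 with EPSG:4326 expects latitude first:
        # minLat,minLon,maxLat,maxLon
        "BBOX": f"{south},{west},{north},{east}",
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    # Keep ':', ',' and '/' readable instead of percent-encoding them
    return f"{GEOMET_WMS}?{urlencode(params, safe=':,/')}"


# "layer_name" is a placeholder; real layer names come from GetCapabilities
url = wms_getmap_url("layer_name", 43.5810, -79.6393, 43.8555, -79.1169)
print(url)
```

Keeping the corner coordinates as named arguments makes the axis order explicit at the call site, which is the most common source of empty or misplaced map images.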

4.1.2 Web Coverage Service (WCS)

  • Description: WCS allows users to retrieve geospatial data in coverage format (e.g., gridded datasets). It is particularly useful for obtaining raw NWP or climate model outputs.

  • Usage Example: A WCS request URL might appear as follows:

    https://geo.weather.gc.ca/geomet?SERVICE=WCS&VERSION=2.0.1&REQUEST=GetCoverage&COVERAGEID=coverage_name&FORMAT=application/netcdf

  • Key Parameters:

    • COVERAGEID: Identifier for the dataset.

    • FORMAT: Output format (e.g., NetCDF).
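
A WCS request can also clip the coverage to an area of interest. The sketch below assumes the server accepts the WCS 2.0.1 SUBSET and SUBSETTINGCRS key-value parameters with axes named x (longitude) and y (latitude); the axis names, the function name wcs_getcoverage_url, and the coverage_name placeholder are assumptions to check against the coverage description returned by the service.

```python
from urllib.parse import urlencode

GEOMET_WCS = "https://geo.weather.gc.ca/geomet"  # endpoint from the example URL


def wcs_getcoverage_url(coverage_id, west, east, south, north,
                        fmt="application/netcdf"):
    """Build a WCS 2.0.1 GetCoverage URL clipped to a lon/lat box."""
    # A list of pairs is used because SUBSET appears once per axis
    params = [
        ("SERVICE", "WCS"),
        ("VERSION", "2.0.1"),
        ("REQUEST", "GetCoverage"),
        ("COVERAGEID", coverage_id),
        ("SUBSETTINGCRS", "EPSG:4326"),       # assumption: CRS of the subset values
        ("SUBSET", f"x({west},{east})"),      # assumption: longitude axis is "x"
        ("SUBSET", f"y({south},{north})"),    # assumption: latitude axis is "y"
        ("FORMAT", fmt),
    ]
    return f"{GEOMET_WCS}?{urlencode(params, safe=':,/()')}"


# "coverage_name" is a placeholder, as in the example URL
url = wcs_getcoverage_url("coverage_name", -79.6393, -79.1169, 43.5810, 43.8555)
print(url)
```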

4.1.3 OGC API

  • Description: The OGC API family is a set of modern RESTful standards that provide uniform access to geospatial data over plain HTTP and JSON. It supports both features and coverages.

  • Usage Example: To retrieve a JSON list of available datasets, a GET request may be made to:

    https://api.weather.gc.ca/collections?f=json

  • Key Endpoints:

    • Collections Endpoint: Lists available datasets.

    • Items Endpoint: Retrieves individual data items with metadata.
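
As an illustration of the Items endpoint, the sketch below builds a query URL using the bbox, datetime, and limit parameters defined by OGC API - Features. The base URL, the function name items_url, and the collection identifier example-collection are assumptions; substitute the root and collection ids listed in the current GeoMet documentation.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.weather.gc.ca"  # assumption: GeoMet-OGC-API base URL


def items_url(collection_id, bbox=None, start=None, end=None,
              limit=100, api_root=API_ROOT):
    """Build an OGC API - Features items query URL."""
    params = {"f": "json", "limit": limit}
    if bbox is not None:
        # OGC API bbox order is minLon,minLat,maxLon,maxLat
        params["bbox"] = ",".join(str(value) for value in bbox)
    if start or end:
        # ".." marks an open-ended side of the interval
        params["datetime"] = f"{start or '..'}/{end or '..'}"
    query = urlencode(params, safe=",/:")
    return f"{api_root}/collections/{collection_id}/items?{query}"


# "example-collection" is a hypothetical collection id
url = items_url(
    "example-collection",
    bbox=(-79.6393, 43.5810, -79.1169, 43.8555),
    start="2024-07-01",
    end="2024-07-31",
)
print(url)
```

Note that the bbox order here is longitude first, unlike the WMS 1.3.0 EPSG:4326 convention.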

4.1.4 SpatioTemporal Asset Catalog (STAC)

  • Description: STAC is an experimental community standard for cataloging spatiotemporal assets. It enables efficient discovery and retrieval of geospatial data.

  • Usage Example: Accessing a STAC collection via an API might involve querying:

    https://geo.weather.gc.ca/geomet/stac/collections?f=json

  • Key Parameters:

    • Temporal Range: Define start and end dates for filtering datasets.

    • Spatial Query: Specify geographic boundaries.
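
Where a catalogue does not offer server-side search, the same temporal and spatial filters can be applied locally to the returned items. The sketch below assumes each item follows the STAC Item layout, with a bbox of [west, south, east, north] and a properties.datetime timestamp; the sample items and helper names are hypothetical, and items that carry only start_datetime/end_datetime are not handled.

```python
from datetime import datetime, timezone


def parse_utc(text):
    """Parse an RFC 3339 timestamp, accepting a trailing 'Z'."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def bbox_intersects(a, b):
    """True when two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_items(items, bbox, start, end):
    """Keep items inside the time range whose bbox overlaps the query box."""
    selected = []
    for item in items:
        when = parse_utc(item["properties"]["datetime"])
        if start <= when <= end and bbox_intersects(item["bbox"], bbox):
            selected.append(item)
    return selected


# Hypothetical items, for illustration only
items = [
    {"id": "a", "bbox": [-80.0, 43.0, -79.0, 44.0],
     "properties": {"datetime": "2024-07-15T12:00:00Z"}},
    {"id": "b", "bbox": [-80.0, 43.0, -79.0, 44.0],
     "properties": {"datetime": "2024-01-15T12:00:00Z"}},
    {"id": "c", "bbox": [-124.0, 48.0, -122.0, 50.0],
     "properties": {"datetime": "2024-07-15T12:00:00Z"}},
]
toronto = [-79.6393, 43.5810, -79.1169, 43.8555]
july = (datetime(2024, 7, 1, tzinfo=timezone.utc),
        datetime(2024, 7, 31, 23, 59, 59, tzinfo=timezone.utc))
matches = filter_items(items, toronto, *july)
print([item["id"] for item in matches])
```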

4.2 MSC Datamart Access

  • Data Retrieval: The MSC Datamart is accessed via HTTPS at https://dd.weather.gc.ca/. Data is organized in a directory structure based on dates, enabling users to download data by specifying the desired date (e.g., YYYYMMDD).

  • AMQP Integration: Users can subscribe to an AMQP data wire to receive push notifications when new data files are published. This protocol supports filtering based on data type (e.g., weather warnings, NWP model outputs).

  • Alternative Access (HPFX Server): For high-demand periods, the HPFX server (http://hpfx.collab.science.gc.ca/YYYYMMDD/WXO-DD/) provides significantly increased bandwidth. Clients should fall back to the primary MSC Datamart when HPFX is unavailable.
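
AMQP filtering by data type relies on topic patterns, where words are separated by dots, "*" matches exactly one word, and "#" matches zero or more words. The sketch below illustrates only that matching rule so that subscription patterns can be checked offline; the topic strings are hypothetical and do not reflect the actual Datamart topic hierarchy, and a production subscriber would rely on an AMQP client rather than this function.

```python
def topic_matches(pattern, topic):
    """Return True when an AMQP topic pattern matches a dotted topic."""
    p_words = pattern.split(".")
    t_words = topic.split(".")

    def match(i, j):
        if i == len(p_words):
            # Pattern exhausted: succeed only if the topic is too
            return j == len(t_words)
        if p_words[i] == "#":
            # '#' may absorb zero or more topic words
            return any(match(i + 1, k) for k in range(j, len(t_words) + 1))
        if j < len(t_words) and p_words[i] in ("*", t_words[j]):
            # '*' or a literal word consumes exactly one topic word
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# Hypothetical topics, for illustration only
print(topic_matches("post.observations.#", "post.observations.swob.toronto"))
print(topic_matches("post.*.grib2", "post.model.hrdps.grib2"))
```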

4.3 Cost-Recovered Data Services

  • Dedicated Data Feeds: For applications requiring high-resolution, low-latency data (e.g., our heatwave prediction model), MSC offers dedicated data feeds. The starting price is typically $500/month plus a $500 setup fee for up to 1 GB/day, with additional charges of $500/month for each extra GB/day.

  • Radar Data Services: Raw radar data and derived digital products are available on a cost-recovered basis:

    • Raw Radar Data (IRIS/ODIM_H5): Pricing ranges from $1,600/month for 1–5 radars to $2,000/month for 11+ radars, with a $2,000 one-time setup fee.

    • Digital Radar Products: Range from $600/month for 1–5 radars to $800/month for 11+ radars, with an $800 setup fee.

    • Other Products: GeoTIFF and GeoJSON products cost around $800/month each, plus corresponding setup fees.
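
The dedicated-feed pricing above can be turned into a quick budget estimate. The sketch below uses only the figures quoted for dedicated data feeds and assumes that any partial GB/day beyond the first is billed as a full extra GB/day; that rounding rule and the function name are assumptions, so confirm the billing terms with MSC before relying on the result.

```python
import math

BASE_MONTHLY = 500         # $/month, covers up to 1 GB/day
SETUP_FEE = 500            # one-time
EXTRA_GB_MONTHLY = 500     # $/month for each extra GB/day
INCLUDED_GB_PER_DAY = 1


def dedicated_feed_cost(gb_per_day, months=12):
    """Estimate monthly and total cost of a dedicated data feed."""
    # Assumption: partial extra GB/day is rounded up to a whole GB/day
    extra_gb = max(0, math.ceil(gb_per_day - INCLUDED_GB_PER_DAY))
    monthly = BASE_MONTHLY + extra_gb * EXTRA_GB_MONTHLY
    return {"monthly": monthly, "total": SETUP_FEE + monthly * months}


estimate = dedicated_feed_cost(3)
print(estimate)
```

For a 3 GB/day feed this gives $1,500/month and $18,500 over the first twelve months including setup.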


5. Best Practices for Data Access and Integration

5.1 Utilizing MSC GeoMet

  • API Documentation: Always refer to the latest API documentation and usage tutorials provided on the MSC GeoMet portal. This ensures that you are using the most current endpoints, query parameters, and data formats.

  • Open Standards Compliance: Ensure that your client applications support OGC standards (WMS, WCS, OGC API) to facilitate seamless integration and data interoperability.

  • Data Clipping and Reprojection: Use on-demand services to clip and reproject data to your area of interest (e.g., Toronto). This reduces unnecessary data volume and ensures that the spatial resolution is tailored to your modeling needs.

5.2 MSC Datamart Strategies

  • AMQP Subscription: Set up an AMQP client to subscribe to data notifications for the most critical datasets. This enables near-real-time data ingestion, which is essential for operational forecasting.

  • Redundancy with HPFX: Develop failover mechanisms that switch to the HPFX server during peak times to maintain high data throughput, and that fall back to the primary MSC Datamart if HPFX is unavailable.

  • Data Retention Policies: Be aware of the dynamic nature of the MSC Datamart’s directory structure, which organizes data by date. Establish a local archival strategy to retain historical data beyond the default retention period.
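
One way to structure the failover between the two servers is to try an ordered list of candidate URLs and return the first that succeeds. The sketch below is written under stated assumptions: the helper names are hypothetical, the relative path is a placeholder, and it assumes the same relative path is valid under the HPFX dated directory and under the Datamart root, which should be verified against the live directory listings.

```python
from datetime import date
from urllib.request import urlopen

HPFX_ROOT = "http://hpfx.collab.science.gc.ca"
DATAMART_ROOT = "https://dd.weather.gc.ca"


def candidate_urls(day, relative_path):
    """Return the HPFX URL first, then the Datamart URL for the same file."""
    stamp = day.strftime("%Y%m%d")
    return [
        f"{HPFX_ROOT}/{stamp}/WXO-DD/{relative_path}",
        # Assumption: the Datamart exposes the same relative path at its root
        f"{DATAMART_ROOT}/{relative_path}",
    ]


def http_get(url, timeout=30):
    """Download a URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(urls, fetch=http_get):
    """Try each URL in order; return (url, data) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# "some/file.grib2" is a placeholder path
urls = candidate_urls(date(2024, 7, 15), "some/file.grib2")
print(urls)
```

Passing the fetch function as an argument keeps the failover logic testable without network access; the downloaded bytes can then be written to the local archive described above.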

5.3 Cost Management

  • Assessing Data Needs: Carefully evaluate whether your application requires dedicated high-resolution feeds or if free API access suffices. This decision can significantly impact monthly operating costs.

  • Scaling Strategies: Optimize data volume by aggregating or filtering datasets where possible. Implement data compression and efficient storage formats (e.g., Parquet) to reduce storage and bandwidth costs.

5.4 Tutorials and Community Resources

  • MSC AniMet Tutorials: Leverage the MSC AniMet tool and its associated tutorials to gain familiarity with data visualization and to experiment with custom animations for various weather events.

  • Official Documentation: Regularly consult the MSC GeoMet “Usage Overview”, “Tutorials and Guides”, and “Available OGC Standards” pages for updated information.

  • Support and Mailing Lists: Subscribe to the GeoMet-Info announcement mailing list to receive notifications on API changes, new data products, and system enhancements. Reach out via the “Contact us” page for technical support if needed.


6. Conclusion

This appendix provides a comprehensive overview of the data sources and API documentation critical to our heatwave prediction system. By leveraging the MSC Open Data ecosystem, our teams can access a wide array of high-resolution weather, climate, and environmental datasets, ensuring that our models are informed by the best available information.

Key takeaways include:

  • MSC GeoMet offers real-time access to weather, climate, and environmental data via OGC-compliant web services, supporting both visualization and automated data ingestion.

  • MSC Datamart provides raw, high-resolution datasets, with AMQP protocols enabling near-real-time notifications. Its alternative HPFX server ensures high bandwidth during peak usage.

  • WIS2 enhances data discovery and international interoperability, supporting global best practices and standards.

  • Cost-Recovered Data Services offer dedicated, high-resolution feeds essential for operational forecasting, albeit at a financial cost that must be managed carefully.

  • Best Practices involve leveraging open standards, optimizing data pipelines, and integrating community resources to ensure robust, scalable, and cost-effective data access.

This appendix is intended as the reference for our engineering, data science, and operations teams when integrating MSC data into the heatwave prediction models.
