The EnergizAIR project collects and publishes public-friendly interpretations of energy production statistics for various renewable energy sources (photovoltaic, thermal, wind) across several European countries.
Enthought’s role was to create the data-management framework for this project. This framework was required to:
- retrieve raw energy data and meteorological data from various online sources at regular intervals
- process raw data: in particular, aggregating over time (e.g., what was the mean daily energy production for January 2012) and space (e.g., what was the total energy production for the Lyon region of France)
- store raw and processed data in a reusable and easily searchable format
- provide interpretations of processed data in ‘real-world’ terms (e.g., “in Belgium in March 2012, the average household equipped with photovoltaic solar panels produced enough energy to run a refrigerator and 2 televisions”).
- create and distribute reports in various formats on a regular basis (e.g., a daily email sent to a particular email address, a weekly summary in XML format posted to an FTP server, monthly and yearly reports to various channels, etc.)
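The two kinds of aggregation named in the requirements can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the readings, the region names, the kWh unit, and the function names are hypothetical and are not taken from the EnergizAIR framework.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical raw readings: (day, region, energy produced in kWh).
READINGS = [
    (date(2012, 1, 1), "Lyon", 10.0),
    (date(2012, 1, 1), "Paris", 4.0),
    (date(2012, 1, 2), "Lyon", 14.0),
    (date(2012, 1, 2), "Paris", 6.0),
    (date(2012, 2, 1), "Lyon", 20.0),
]


def mean_daily_production(readings, year, month):
    """Aggregate over time: mean of the daily totals within one month."""
    daily_totals = defaultdict(float)
    for day, _region, kwh in readings:
        if (day.year, day.month) == (year, month):
            daily_totals[day] += kwh
    return mean(daily_totals.values()) if daily_totals else 0.0


def total_production_by_region(readings):
    """Aggregate over space: total energy produced in each region."""
    totals = defaultdict(float)
    for _day, region, kwh in readings:
        totals[region] += kwh
    return dict(totals)


january_mean = mean_daily_production(READINGS, 2012, 1)
region_totals = total_production_by_region(READINGS)
```

A real deployment would read these values from the storage layer rather than from an in-memory list.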
The output from the framework and the energy input sources could vary, so the task was to create a *flexible*, *extensible* framework that provided all of the infrastructure for the above requirements.
Some of the challenges involved in this project:
- Creating the framework from scratch meant putting significant thought and energy into the design, and iterating several times until we were satisfied that we had something that was clean, robust and flexible.
- The deliverables needed to be usable and extensible by non-expert Python programmers: that is, our end users were programmers rather than people interacting with a UI. This meant investing a lot of time and energy in making sure that the API was well thought out and meticulously documented, and providing several example uses of the framework for those end-users to build on.
- The data retrieval and report publishing components needed to be robust and give sane results in the face of external server problems.
- Similarly, the framework needed to allow for various report distribution mechanisms (e.g., by email, HTTP, FTP).
- The data storage backend needed to be potentially accessible by several processes and threads at once; we thus needed a ‘broker’ architecture to serialize read and write requests to the underlying storage.
- Many of the tasks needed to happen at particular times and dates: e.g., we needed to check a particular FTP server for new files after 7am on every weekday, or send an email report with the previous week’s statistics every Monday morning. For this we developed a scheduler in Python, which became a core part of the solution.
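The timing rules in the last challenge (a check after 7am on every weekday, a report every Monday morning) come down to computing the next run time from the current time. The following is a rough sketch of that calculation using only the standard library. The function names are hypothetical and this is not the scheduler that was actually written for the project.

```python
import datetime

ONE_DAY = datetime.timedelta(days=1)


def next_weekday_run(now, at):
    """Return the first Monday-to-Friday datetime at time `at` after `now`."""
    candidate = datetime.datetime.combine(now.date(), at)
    # Step forward while the candidate is in the past or falls on a weekend.
    while candidate <= now or candidate.weekday() >= 5:
        candidate += ONE_DAY
    return candidate


def next_weekly_run(now, weekday, at):
    """Return the first datetime after `now` on `weekday` (Monday is 0) at `at`."""
    candidate = datetime.datetime.combine(now.date(), at)
    while candidate <= now or candidate.weekday() != weekday:
        candidate += ONE_DAY
    return candidate


seven_am = datetime.time(hour=7)
friday_morning = datetime.datetime(2012, 3, 2, 8, 0)  # a Friday, after 7am
next_check = next_weekday_run(friday_morning, seven_am)
next_report = next_weekly_run(friday_morning, 0, datetime.time(hour=8))
```

The sketch uses naive datetimes; a production scheduler would also need to decide how to handle time zones and missed runs.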
As with almost every Enthought project, we used an agile approach to crafting the eventual deliverables:
- Rapid iterations, together with continuous feedback from the customer, allowed us to converge on a working solution fast. The customer had full read and write access to the development repository and issue tracker.
- We used test-driven development for a large portion of the project, resulting in code with a high level of test coverage and a high degree of confidence in its reliability. This proved of great value to the EnergizAIR team once the project was handed off to them: the test suite let them extend the framework without fear of breaking the system.
- Much of the code was developed using pair programming. This was especially true during the early stages, where we were iterating over the design; it resulted in a well thought out, consistent, easy-to-use API.
HDF5 and ZeroMQ
Using standard and well-tested solutions for data storage and inter-process communication allowed us to build a working solution rapidly. We chose to use HDF5 for storage of the raw and processed time-series data, via the excellent PyTables Python package. ZeroMQ, with its existing Python bindings, provided us with a lightweight and flexible solution for communication between the HDF5 broker process (serializing concurrent reads and writes to the HDF5 backend) and its clients.
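The broker idea can be shown in miniature: a single owner of the storage handles requests one at a time, so concurrent clients never touch the storage directly. To keep the sketch dependency-free it deliberately swaps in stand-ins: a `queue.Queue` takes the place of the ZeroMQ socket, a thread takes the place of the broker process, and a dictionary takes the place of the HDF5 file. The class and method names are hypothetical.

```python
import queue
import threading


class StorageBroker:
    """Owns the storage and handles one request at a time, in arrival order."""

    def __init__(self):
        self._store = {}                # stand-in for the HDF5 file
        self._requests = queue.Queue()  # stand-in for the ZeroMQ socket
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # Only this thread ever reads or writes self._store.
        while True:
            request = self._requests.get()
            if request is None:         # shutdown sentinel
                break
            op, key, values, reply = request
            if op == "write":
                self._store.setdefault(key, []).extend(values)
                reply.put(len(self._store[key]))
            else:                       # any other op is treated as a read
                reply.put(list(self._store.get(key, [])))

    def request(self, op, key, values=None):
        """Client side: send a request and block until the broker replies."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((op, key, values, reply))
        return reply.get()

    def close(self):
        self._requests.put(None)
        self._thread.join()


broker = StorageBroker()
writers = [
    threading.Thread(target=broker.request, args=("write", "wind", [i]))
    for i in range(8)
]
for writer in writers:
    writer.start()
for writer in writers:
    writer.join()
stored = broker.request("read", "wind")
broker.close()
```

Eight threads write concurrently, yet every write is applied by the single broker thread, which is the property the real broker process provided for the HDF5 backend.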
A Clean, User-Oriented OOP Design
The solution revolved around a number of Traits-based Python class hierarchies, designed to be easily accessible and comprehensible to the programmers who would have to maintain and use the code. Key base classes:
- the Provider class represented a raw data source, encapsulating the information necessary to retrieve data from that data source.
- the Report class represented the generated report.
- the Publisher class was responsible for publishing reports once created.
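As a rough illustration of how three such base classes might fit together, here is a sketch written with plain abstract base classes rather than Traits, so that it runs without extra dependencies. The method names (`fetch`, `render`, `publish`) and the concrete subclasses are assumptions made for this example and are not the framework's actual API.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """A raw data source that knows how to retrieve its own data."""

    @abstractmethod
    def fetch(self):
        """Return a list of (timestamp, value) pairs."""


class Report(ABC):
    """Turns a time series into report text."""

    @abstractmethod
    def render(self, series):
        """Return the report body as a string."""


class Publisher(ABC):
    """Delivers a finished report to some channel."""

    @abstractmethod
    def publish(self, text):
        """Send the report text to its destination."""


class InMemoryProvider(Provider):
    # Hypothetical provider that serves a fixed series instead of a remote source.
    def __init__(self, series):
        self.series = series

    def fetch(self):
        return list(self.series)


class TotalReport(Report):
    # Hypothetical report that sums the values in the series.
    def render(self, series):
        total = sum(value for _timestamp, value in series)
        return "Total production: {:.1f} kWh".format(total)


class ListPublisher(Publisher):
    # Hypothetical publisher that records reports instead of emailing them.
    def __init__(self):
        self.sent = []

    def publish(self, text):
        self.sent.append(text)


provider = InMemoryProvider([("2012-03-01", 3.5), ("2012-03-02", 4.0)])
publisher = ListPublisher()
publisher.publish(TotalReport().render(provider.fetch()))
```

Keeping the three roles separate means a new data source, report, or delivery channel can be added by subclassing one base class without touching the other two.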
A Python-Based DSL
A major challenge was to produce a solution that was flexible and extensible, while remaining readable and requiring minimal Python knowledge to extend and modify. Part of our solution was to create a Python-based Domain Specific Language (DSL). Here’s some example code that sets up an application retrieving photovoltaic, meteorological and wind data from various French and Belgian sources, and publishing daily and weekly reports by email. This code also serves to highlight the OOP design described above.
    dms = DMS(
        logging_level = logging.DEBUG,
        storage = HDF5TimeSeriesStorageClient(),
        schedule = [
            # Actions to import time series...
            Daily(
                at = datetime.time(hour=8, minute=30),
                action = ImportTimeSeries(
                    providers = [
                        EpicePvyieldProvider( country=Belgium ),
                        EpiceWeatherProvider(),
                        FranceWindProvider(),
                        IndexisWindProvider(),
                    ]
                ),
            ),

            # Actions to publish reports...
            Daily(
                at = datetime.time(hour=11),
                action = PublishReport(
                    publisher = EmailPublisher(
                        recipients = RECIPIENTS,
                        subject = u'Daily France Wind Report'
                    ),
                    report = DailyFranceWindReport(),
                )
            ),
            Daily(
                first_date = datetime.date.today(),
                at = datetime.time(hour=11),
                action = PublishReport(
                    publisher = EmailPublisher(
                        recipients = RECIPIENTS,
                        subject = u'Weekly Belgium Report'
                    ),
                    report = DailyWebReport(
                        # Wind model.
                        speed_db = GEOGRAPHIC_DB,
                        geographic_db = REGIONS_DB,
                        wind_farm_db = WIND_FARM_DB,
                        wind_turbine_db = WIND_TURBINE_DB,
                        # Photovoltaic model.
                        appliances_db = APPLIANCES_DB,
                        equivalent_appliances_db = EQUIVALENT_APPLIANCES_DB,
                        systems_db = SYSTEMS_DB,
                        # Thermal model
                        solar_point_names = SOLAR_POINTS_DB,
                        consumption_level = 140,
                        country = Belgium
                    )
                )
            ),
        ]
    )
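To show why a declarative configuration of this shape needs so little Python knowledge, here is a toy model of how schedule entries built from keyword arguments could be consumed. The `Daily` and `DMS` classes below are simplified, hypothetical stand-ins written for this sketch. They borrow the names from the example but say nothing about how the real framework implements them.

```python
import datetime
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Daily:
    """Hypothetical stand-in: run `action` once per day, at or after `at`."""
    at: datetime.time
    action: Callable[[], None]
    last_run: Optional[datetime.date] = None

    def is_due(self, now):
        return now.time() >= self.at and self.last_run != now.date()


@dataclass
class DMS:
    """Hypothetical stand-in: holds the schedule and runs whatever is due."""
    schedule: List[Daily] = field(default_factory=list)

    def tick(self, now):
        """Run every due item once and return how many ran."""
        ran = 0
        for item in self.schedule:
            if item.is_due(now):
                item.action()
                item.last_run = now.date()
                ran += 1
        return ran


log = []
dms = DMS(
    schedule=[
        Daily(at=datetime.time(hour=8, minute=30),
              action=lambda: log.append("import")),
        Daily(at=datetime.time(hour=11),
              action=lambda: log.append("publish")),
    ]
)
first = dms.tick(datetime.datetime(2012, 3, 5, 9, 0))     # only the import is due
second = dms.tick(datetime.datetime(2012, 3, 5, 11, 30))  # only the publish is due
third = dms.tick(datetime.datetime(2012, 3, 5, 12, 0))    # nothing is left for today
```

The configuration is nothing more than nested constructor calls, so a user can add a schedule entry by copying an existing one and changing its arguments.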
More About EnergizAIR
The basic principle of the project was to include renewable energy indicators (in this case, photovoltaic, solar thermal, and wind energy) in an everyday weather forecast. Put simply: how can wind and sun meet our energy requirements? Belgium, France, Italy, Portugal and Slovenia are all participating in the project and now provide renewable energy indicators online and in local weather reports.