A Python-based Framework for the EnergizAIR Program
The EnergizAIR project collects and publishes public-friendly interpretations of energy production statistics for various renewable energy sources (photovoltaic, thermal, wind) across several European countries.
Enthought’s role was to create the data-management framework for this project. This framework was required to:
- retrieve raw energy data and meteorological data from various online sources at regular intervals
- process raw data: in particular, aggregating over time (e.g., what was the mean daily energy production for January 2012) and space (e.g., what was the total energy production for the Lyon region of France)
- store raw and processed data in a reusable and easily searchable format
- provide interpretations of processed data in ‘real-world’ terms (e.g., “in Belgium in March 2012, the average household equipped with photovoltaic solar panels produced enough energy to run a refrigerator and two televisions”)
- create and distribute reports in various formats on a regular basis (e.g., a daily email sent to a particular email address, a weekly summary in XML format posted to an FTP server, monthly and yearly reports to various channels, etc.)
Because both the energy input sources and the required outputs could vary, the task was to create a *flexible*, *extensible* framework that provided all of the infrastructure for the above requirements.
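To make the time and space aggregation requirement concrete, here is a minimal sketch in plain Python. It is not the framework's actual API: the record layout, the sample values, and the function names `mean_daily_production` and `total_production_by_region` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical raw readings: (day, region, energy produced in kWh).
# The values are made up for this example.
readings = [
    (date(2012, 1, 1), "Lyon", 10.0),
    (date(2012, 1, 1), "Paris", 4.0),
    (date(2012, 1, 2), "Lyon", 14.0),
    (date(2012, 2, 1), "Lyon", 20.0),
]


def mean_daily_production(readings, year, month):
    """Aggregate over time: mean of the daily totals for one month."""
    daily_totals = defaultdict(float)
    for day, _region, kwh in readings:
        if (day.year, day.month) == (year, month):
            daily_totals[day] += kwh
    if not daily_totals:
        raise ValueError("no readings for the requested month")
    return mean(list(daily_totals.values()))


def total_production_by_region(readings):
    """Aggregate over space: total production per region."""
    totals = defaultdict(float)
    for _day, region, kwh in readings:
        totals[region] += kwh
    return dict(totals)


january_mean = mean_daily_production(readings, 2012, 1)
region_totals = total_production_by_region(readings)
```

With the sample data above, both January days total 14 kWh, so the January mean is 14.0, and the Lyon total across all readings is 44.0 kWh.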
Challenges
Some of the challenges involved in this project:
- Creating the framework from scratch meant putting significant thought and energy into the design, and iterating several times until we were satisfied that we had something that was clean, robust and flexible.
- The deliverables needed to be usable and extensible by non-expert Python programmers: that is, our end users were programmers rather than people interacting with a UI. This meant investing a lot of time and energy in making sure that the API was well thought out and meticulously documented, and providing several example uses of the framework for those end users to build on.
- The data retrieval and report publishing components needed to be robust and give sane results in the face of external server problems.
- We needed to be able to deal with inputs from various sources: e.g., one set of energy inputs was available by downloading selected files from an FTP server; another was available in JSON form through a RESTful web interface; yet another had to be retrieved directly from a JavaScript interface on an existing web page. Moreover, the framework needed to be able to cope with the addition of new types of data sources at a later time.
- Similarly, the framework needed to allow for various report distribution mechanisms (e.g., by email, HTTP, FTP).
- The data storage backend needed to be potentially accessible by several processes and threads at once; we thus needed a ‘broker’ architecture to serialize read and write requests to the underlying storage.
- Many of the tasks needed to happen at particular times and dates: e.g., we needed to check a particular FTP server for new files after 7am on every weekday, or send an email report with the previous week’s statistics every Monday morning. For this we developed a scheduler in Python, which became a core part of the solution.
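The scheduling rules mentioned in the last item can be illustrated with a small sketch. This is not the scheduler Enthought built; the function name `next_run`, its parameters, and the 8:00 time chosen for "Monday morning" are assumptions. It uses naive local datetimes and ignores time zones and daylight saving changes.

```python
from datetime import datetime, time, timedelta


def next_run(now, weekdays, at):
    """Return the first datetime strictly after `now` that falls on one of
    `weekdays` (Monday=0 ... Sunday=6) at wall-clock time `at`.

    Hypothetical helper for illustration only.
    """
    # Offsets 0..7 cover the case where today is the only allowed weekday
    # and today's slot has already passed.
    for offset in range(8):
        candidate = datetime.combine(now.date() + timedelta(days=offset), at)
        if candidate > now and candidate.weekday() in weekdays:
            return candidate
    raise ValueError("weekdays must contain at least one value in 0..6")


# Friday 2 March 2012, 09:30: today's 07:00 slot has already passed.
now = datetime(2012, 3, 2, 9, 30)

# "Check the FTP server after 7am on every weekday."
ftp_check = next_run(now, weekdays={0, 1, 2, 3, 4}, at=time(7, 0))

# "Send the weekly report every Monday morning" (08:00 is an assumed time).
weekly_report = next_run(now, weekdays={0}, at=time(8, 0))
```

A real scheduler would wrap a function like this in a loop that sleeps until the next due time and then dispatches the task. Keeping the "when is the next run" calculation a pure function makes the rules easy to test without waiting for the clock.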
Using Agile
As with almost every Enthought project, we used an agile approach to crafting the eventual deliverables:
- Rapid iterations, together with continuous feedback from the customer, allowed us to converge quickly on a working solution. The customer had full read and write access to the development repository and issue tracker.
- We used test-driven development for a large portion of the project, resulting in code with a high level of test coverage and a high degree of confidence in its reliability. This proved to be of great value to the EnergizAIR team when the project was handed off to them, as it helped them extend the framework without fear of breaking the system.
