Carbon footprint, the (not so) hidden cost of high performance computing



There’s something a little surreal about having dozens of processors and gigabytes of memory at your fingertips. Yet this is the essence of High Performance Computing (HPC): by centralizing servers in data centers and opening secure communication channels with users, HPC installations have dramatically reduced the burden of purchasing and maintaining supercomputers. For users, accessing these resources has become easier than ever; any laptop, tablet or even smartphone can do it painlessly. Cloud computing platforms such as Amazon’s AWS, Google’s GCP or Microsoft’s Azure are the most popular options, but most research institutes and businesses have their own data centers, which follow the same principles.

On the bright side, these advances in computing have enabled discoveries such as the first direct image of a black hole 55 million light-years away, more accurate weather forecasts than ever before, and the identification of thousands of genetic variants linked to disease. To gauge the extent of HPC usage, we can look at the statistics for XSEDE, the Extreme Science and Engineering Discovery Environment, a network of data centers at US universities used for scientific research. In 2020, about one million hours of computing were performed on the network every hour, for a total of roughly nine billion hours of computing over the year.
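As a quick sanity check, the two XSEDE figures agree with each other. The sketch below multiplies the hourly rate by the number of hours in 2020; it assumes the "one million" figure is a rounded average over the whole year, and the variable names are illustrative only.

```python
import calendar

# Rounded rate quoted in the text: computing hours performed per wall-clock hour.
COMPUTE_HOURS_PER_HOUR = 1_000_000

# 2020 was a leap year, so it had 366 days.
days_in_2020 = 366 if calendar.isleap(2020) else 365
hours_in_2020 = days_in_2020 * 24

total_compute_hours = COMPUTE_HOURS_PER_HOUR * hours_in_2020
print(f"{total_compute_hours / 1e9:.1f} billion computing hours")
```

The result is a little under nine billion, which matches the rounded yearly total.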

So what’s the problem? After all, this leads to groundbreaking discoveries. But it also requires mountains of hardware and the electricity to power it, which comes at a significant, but largely overlooked, environmental cost. Data centers are estimated to have an annual carbon footprint of 100 megatonnes of CO2e from power generation alone, similar to that of the entire US commercial aviation industry. Unsurprisingly, this is only expected to grow: by a factor of 2 to 9 over the next decade, according to some studies.

A variety of environmental impacts

Large IT facilities impact the environment in many ways: power generation, IT hardware manufacturing, long-term storage management, cooling, maintenance, etc.

Energy consumption is perhaps the aspect most discussed in the media. The environmental cost of powering data centers depends directly on the carbon footprint of energy production, which varies greatly with the energy mix of the country where the data center is located. For example, producing 1 kWh of electricity emits 12 gCO2e on average in Switzerland (powered mainly by hydropower), but 253 gCO2e in the United Kingdom and 880 gCO2e in Australia (where coal is the main source of energy). This means that exactly the same task will have a carbon footprint about 73 times larger in Australia than in Switzerland.
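The comparison comes down to multiplying the energy a task draws by the carbon intensity of the local grid. Here is a minimal sketch using the three intensities quoted above; the 100 kWh task and the function name are hypothetical, chosen only for illustration.

```python
# Carbon intensity of electricity, in gCO2e per kWh (figures quoted in the text).
CARBON_INTENSITY_G_PER_KWH = {
    "Switzerland": 12,
    "United Kingdom": 253,
    "Australia": 880,
}


def footprint_g(energy_kwh: float, country: str) -> float:
    """Return emissions in gCO2e for a task drawing `energy_kwh` in `country`."""
    return energy_kwh * CARBON_INTENSITY_G_PER_KWH[country]


# Hypothetical task drawing 100 kWh (illustrative value, not from the article).
TASK_KWH = 100.0

for country in CARBON_INTENSITY_G_PER_KWH:
    kg = footprint_g(TASK_KWH, country) / 1000
    print(f"{country}: {kg:.1f} kgCO2e")

# The ratio does not depend on the size of the task, only on the two grids.
ratio = footprint_g(TASK_KWH, "Australia") / footprint_g(TASK_KWH, "Switzerland")
print(f"Australia vs Switzerland: about {ratio:.0f} times larger")
```

Because the energy term cancels out, the ratio of about 73 holds for any task, large or small.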

However, this is far from the only downside. The manufacture of computer hardware is notoriously bad for the environment because of the mining of precious metals. In the case of consumer devices, 70% to 80% of the total carbon footprint comes from manufacturing, with use and disposal responsible for only the remaining 20% to 30%. This shows how important it is to keep, repair and reuse our devices for as long as possible. In data centers, the environmental impact of the equipment renewal cycle also needs to be considered.

Artificial intelligence: big models, huge carbon footprint

“Training a single AI model can emit as much carbon as five cars in their lifetimes.” This headline appeared across tech publications in the summer of 2019, from MIT Technology Review to New Scientist. It followed an article studying the carbon footprint of algorithms that try to understand natural language, an extremely difficult task for which it is common for algorithms to run for days or even weeks. Similar concerns have been raised in articles such as “Green AI” and “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”, in which the researchers discuss the issue of accessibility (what happens to research if only a handful of tech companies can afford to develop such models?) and the risks posed by these technologies. They stress in particular that the populations who suffer the most from the environmental cost of AI are also those who benefit the least from innovations such as Apple’s Siri or Amazon’s Alexa. That is before considering the ethical issues that arise if these tech companies are the only ones overseeing the language models that underpin many aspects of society. Almost as if to illustrate this point, Google dissolved the Ethical AI team, whose members helped write the stochastic parrots article, shortly after its release.

These studies have led to the development of several tools aimed specifically at estimating the carbon footprint of machine learning models, such as the ML CO2 Impact calculator and the Experiment Impact Tracker.

AI can be the tree that hides the forest

The ever-growing computing needs of artificial intelligence are a real cause for concern, but we should not forget all the other areas of science that also rely heavily on computation. As mentioned at the beginning, complex algorithms are everywhere, from genomics to physics and astronomy. The realization that the carbon footprint must be taken into account by all scientists using algorithms, not just those in AI, is what led to the development of a broader initiative: the Green Algorithms project. It is a theoretical framework (another one!) for estimating the carbon footprint of any computation but, above all, an online calculator that makes such estimates easy. It has shed light on the carbon footprint of bioinformatics, for example, where genomics tools or molecular simulations emit kilograms of CO2e with each use.
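The article does not spell out how such an estimate is computed. As a rough sketch of the general idea, one plausible approach is to estimate the power drawn by the processors and memory, scale it by the data center's overhead (its power usage effectiveness, or PUE), and multiply the resulting energy by the carbon intensity of the local grid. The function name, the parameters and every number in the example job are assumptions made for illustration, not the project's actual model or values; only the UK carbon intensity comes from the text above.

```python
def estimate_footprint_g(
    runtime_h: float,
    n_cores: int,
    watts_per_core: float,
    core_usage: float,
    memory_gb: float,
    watts_per_gb: float,
    pue: float,
    carbon_intensity_g_per_kwh: float,
) -> float:
    """Rough carbon footprint of a computation, in gCO2e (illustrative model)."""
    # Power drawn by the compute cores and by the memory, in watts.
    power_w = n_cores * watts_per_core * core_usage + memory_gb * watts_per_gb
    # Energy in kWh, including data-center overhead (cooling etc.) via the PUE.
    energy_kwh = runtime_h * power_w * pue / 1000
    return energy_kwh * carbon_intensity_g_per_kwh


# Hypothetical job: 12 hours on 16 fully used cores with 64 GB of memory,
# run in the United Kingdom (253 gCO2e/kWh, the figure quoted earlier).
job_g = estimate_footprint_g(
    runtime_h=12,
    n_cores=16,
    watts_per_core=10.0,   # assumed value
    core_usage=1.0,        # assumed: cores busy 100% of the time
    memory_gb=64,
    watts_per_gb=0.4,      # assumed value
    pue=1.5,               # assumed value
    carbon_intensity_g_per_kwh=253,
)
print(f"Estimated footprint: {job_g / 1000:.2f} kgCO2e")
```

Even with these made-up hardware numbers, a single half-day job lands in the range of hundreds of grams to kilograms of CO2e, and the final multiplication shows why the same job would emit far less on a low-carbon grid.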


