Welcome to Thomas Insights — every day, we publish the latest news and analysis to keep our readers up to date on what’s happening in industry. Artificial intelligence (AI) now helps us do everything from driving cars to ordering food, but there’s no such thing as a free lunch; we’re only now realizing the staggering hidden costs behind AI.
Emma Strubell, a PhD student at the University of Massachusetts Amherst, and her colleagues Ananya Ganesh and Andrew McCallum addressed AI’s dirty secret in their recent study on natural language processing (NLP). They found that training a single neural network model can produce significant carbon emissions: roughly five times the emissions of an average car over the course of its lifetime.
Strubell and her team examined four NLP models: BERT, ELMo, GPT-2, and the Transformer. In their study, they trained each model using a single GPU for a day and measured how much power each model required. Based on the hardware’s reported power draw, they calculated the total amount of energy consumed and then estimated the amount of carbon dioxide emitted based on the average energy mix in the U.S.
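The conversion the team performed — from measured power draw to an emissions estimate — can be sketched as a back-of-the-envelope calculation. The figures below (GPU wattage, a commonly cited data-center PUE of 1.58, and an EPA-style U.S. grid average of roughly 0.43 kg of CO2 per kWh) are illustrative assumptions, not values taken from the study itself.

```python
def training_co2_kg(gpu_watts, hours, pue=1.58, kg_co2_per_kwh=0.433):
    """Rough estimate of kg of CO2 emitted by a training run.

    gpu_watts      -- average power draw of the GPU(s), in watts (assumed)
    hours          -- total training time in hours
    pue            -- data-center Power Usage Effectiveness (assumed average)
    kg_co2_per_kwh -- carbon intensity of the average U.S. energy mix (assumed)
    """
    # Total electricity drawn from the grid, including data-center overhead.
    energy_kwh = gpu_watts * hours / 1000 * pue
    # Convert energy to emissions using the grid's average carbon intensity.
    return energy_kwh * kg_co2_per_kwh

# Example: one GPU drawing 250 W for a 24-hour training run.
print(round(training_co2_kg(250, 24), 2))
```

Under these assumptions, a single day of training on one GPU emits only a few kilograms of CO2 — the paper’s headline figures come from the far larger number of runs required in practice, as the next paragraph explains.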
It should be noted that training just one model is the bare minimum of work required and is unlikely to be sufficient in practice. To explore this idea, the researchers added further tuning and refinement steps to the models. As expected, each additional step improved a model’s accuracy, but with diminishing improvements and skyrocketing energy consumption.
Examining the Implications of AI’s Carbon Impact
With AI’s increasing prevalence in virtually every industry from medicine to defense to consumer goods, these findings about AI’s high emissions and high financial costs could have significant implications.
First, we’re currently on pace to miss the “best chance” to avoid uncontrollable climate change. As we continue to innovate AI, we will also continue to emit significant amounts of carbon.
Second, the energy required to power these systems is enormously costly, and these costs can price educational institutions (the traditional centers of AI-related innovations) out of the research altogether.
So, what can we do about this? Strubell and her colleagues do offer some solutions, including increasing awareness of how models are structured and the associated costs. “Authors should report training time and sensitivity to hyperparameters,” they said. Including this information alongside a model, especially one that will be retrained for downstream use, will help future researchers compare models in a more apples-to-apples manner.
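The kind of reporting the authors call for can be as simple as publishing a small metadata record alongside a trained model. The field names and values below are purely illustrative, not a standard schema:

```python
import json
import time

# Hypothetical hyperparameters for a training run (illustrative only).
hyperparams = {"learning_rate": 0.01, "batch_size": 32, "epochs": 3}

start = time.time()
# ... the actual training loop would run here ...
elapsed_hours = (time.time() - start) / 3600

# A record like this, shipped with the model, lets future researchers
# compare training costs in an apples-to-apples manner.
report = {
    "hyperparameters": hyperparams,
    "training_hours": round(elapsed_hours, 4),
    "hardware": "1x GPU (example)",
}
print(json.dumps(report, indent=2))
```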
Another point they make is that academic researchers need fair access to these costly computation resources. Limiting research to labs funded by corporations intrinsically harms the AI community. It limits creativity as researchers with capped resources are unable to conduct their experiments due to the high cost of computation hardware.
It also virtually eliminates certain types of research that are not supported by high-worth corporations. Building a government-supported academic compute cloud would free researchers from relying on costly private cloud services like Amazon Web Services (AWS) and Google Cloud and enable equitable access for all.
Finally, the team recommends researchers in both academia and industry “prioritize computationally efficient hardware and algorithms.” In NLP software, this type of effort has already gained momentum, and it’s time for that same line of thinking to migrate to the hardware that supports NLP. They suggest researchers create and use APIs for hyperparameter tuning rather than energy-intensive brute force methods.
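The difference between brute force and a more compute-efficient strategy can be sketched in a few lines. Below, an exhaustive grid search is contrasted with random search, one of the cheaper tuning strategies such APIs typically offer; the objective function is a hypothetical stand-in for an expensive training run, not anything from the study.

```python
import itertools
import random

def objective(lr, dropout):
    # Hypothetical stand-in for an expensive training run that returns
    # validation accuracy; peaks at lr=0.01, dropout=0.3.
    return 1.0 - abs(lr - 0.01) * 10 - abs(dropout - 0.3)

# Brute force: every combination costs a full training run (9 runs here).
grid = list(itertools.product([0.001, 0.01, 0.1], [0.1, 0.3, 0.5]))
best_grid = max(grid, key=lambda p: objective(*p))

# Random search: a fixed, smaller budget of sampled configurations (4 runs),
# which scales far better as the number of hyperparameters grows.
random.seed(0)
samples = [(10 ** random.uniform(-3, -1), random.uniform(0.1, 0.5))
           for _ in range(4)]
best_random = max(samples, key=lambda p: objective(*p))
```

The energy savings come from the run budget: random search caps the number of training runs regardless of how many hyperparameters are being tuned, while a grid grows multiplicatively with each new dimension.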
Strubell will present her findings this month at the Association for Computational Linguistics (ACL) annual meeting in Florence, Italy. In an email to VICE, she said, “I’m not against energy use in the name of advancing science, obviously, but I think we could do better in terms of considering the trade-off between required energy and resulting model improvement.”
Image Credit: Gwoeii / Shutterstock.com