Since the first article studying the impact of this technology on the environment was published three years ago, a movement has grown among researchers to self-report the energy consumed and the emissions generated by their work. Having accurate numbers is an important step in making changes, but gathering those numbers can be a challenge.
“You can’t improve what you can’t measure,” says Jesse Dodge, a research scientist at the Allen Institute for AI in Seattle. “The first step for us, if we want to make progress in reducing emissions, is that we have to get a good measure.”
To that end, the Allen Institute recently collaborated with Microsoft, the AI company Hugging Face, and three universities to create a tool that measures the electricity usage of any machine learning program that runs on Azure, Microsoft’s cloud service. With it, Azure users who create new models can see the total electricity consumed by the graphics processing units (GPUs), computer chips specialized for performing calculations in parallel, during every phase of their project, from selecting a model to training it and putting it to use. Azure is the first major cloud provider to give users access to information about the energy impact of their machine learning programs.
Although tools that measure the power usage and emissions of machine learning algorithms running on local servers already exist, they do not work when researchers use cloud services provided by companies such as Microsoft, Amazon, and Google. These services do not give users direct visibility into the GPU, CPU, and memory resources that their activities consume, and existing tools such as Carbontracker, Experiment Tracker, EnergyVis, and CodeCarbon need these values to provide accurate estimates.
The new Azure tool, which debuted in October, currently reports energy usage, not emissions. So Dodge and other researchers figured out how to map energy use to emissions, and presented an accompanying paper on this work at FAccT, a major computer science conference, in late June. The researchers used a service called WattTime to estimate emissions based on the zip codes of cloud servers running 11 machine learning models.
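The mapping from energy use to emissions can be pictured as multiplying the energy consumed in each time interval by the carbon intensity of the grid during that interval, then adding up the results. The Python sketch below illustrates only that general idea; it is not the researchers' method, and the function name, units, and all numbers are hypothetical assumptions chosen for illustration.

```python
def estimate_emissions_g(energy_kwh, intensity_g_per_kwh):
    """Sum per-interval energy (kWh) times grid carbon intensity (gCO2/kWh).

    Both lists are assumed to cover the same consecutive time intervals.
    """
    if len(energy_kwh) != len(intensity_g_per_kwh):
        raise ValueError("both series must cover the same intervals")
    return sum(e * c for e, c in zip(energy_kwh, intensity_g_per_kwh))


# Hypothetical values: a three-hour job using 2 kWh in each hour,
# on a grid whose carbon intensity changes from hour to hour.
energy = [2.0, 2.0, 2.0]
intensity = [400.0, 300.0, 500.0]
total = estimate_emissions_g(energy, intensity)
print(f"Estimated emissions: {total} gCO2")
```

In practice the intensity values would come from a grid-data service for the region where the servers run, and the energy values from a measurement tool.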
They found that emissions can be significantly reduced if researchers use servers in specific geographic locations and at certain times of day. Emissions from training small machine learning models can be reduced by up to 80% if the training begins at times when more renewable electricity is available on the grid, while emissions from large models can be reduced by more than 20% if the training work is paused when renewable electricity is scarce and restarted when it is more plentiful.
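The two strategies described here, choosing a cleaner start time and pausing when the grid is dirtier, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the paper: the hourly forecast, the constant power draw, the threshold, and both function names are hypothetical.

```python
def best_start_hour(intensity, duration_h, power_kw):
    """Return (start_index, emissions_g) for the lowest-emission contiguous window."""
    best = None
    for start in range(len(intensity) - duration_h + 1):
        emissions = power_kw * sum(intensity[start:start + duration_h])
        if best is None or emissions < best[1]:
            best = (start, emissions)
    return best


def pause_and_resume(intensity, duration_h, power_kw, threshold):
    """Run only in hours at or below the threshold until the job has run duration_h hours."""
    hours_used, emissions = [], 0.0
    for hour, value in enumerate(intensity):
        if len(hours_used) == duration_h:
            break
        if value <= threshold:
            hours_used.append(hour)
            emissions += power_kw * value
    if len(hours_used) < duration_h:
        raise ValueError("not enough low-intensity hours in the forecast")
    return hours_used, emissions


# Hypothetical hourly carbon-intensity forecast in gCO2/kWh.
forecast = [500.0, 200.0, 450.0, 150.0, 600.0, 250.0]

start, window_emissions = best_start_hour(forecast, duration_h=2, power_kw=1.0)
hours, paused_emissions = pause_and_resume(forecast, duration_h=3, power_kw=1.0, threshold=300.0)
```

Shifting the start suits short jobs that fit inside one clean window, while pausing and resuming suits long jobs that span many hours, at the cost of a later finish time.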