Dr. Baharan Mirzasoleiman, Assistant Professor in the Department of Computer Science at UCLA, has been awarded the prestigious Okawa Research Foundation Grant for her groundbreaking work in sustainable training of foundation machine learning models. This competitive grant is presented to leading researchers who are making significant contributions to the fields of information and telecommunications.
Recent advances in foundation machine learning models, such as GPT-4 (a language model) and CLIP (a vision-language model), have enabled a wide range of new applications, from personal assistants to breakthroughs in personalized medicine. However, training these foundation models demands vast amounts of data to ensure high performance, resulting in significant financial and environmental costs. For instance, training a model like GPT-3 involves multiple iterations, with a single training run consuming approximately 1,287 MWh of electricity and generating 502 metric tons of CO2. This is comparable to the annual emissions of 112 gasoline-powered vehicles. This substantial carbon footprint poses serious challenges for the climate. Furthermore, projections suggest that by 2032, a single training cycle for large-scale foundation models could account for more than 1% of the US GDP, further exacerbating barriers to AI democratization and limiting the ability of smaller organizations to adopt these technologies.
Mirzasoleiman’s research proposal aims to address these challenges by developing a rigorous theoretical framework to significantly enhance the sustainability and efficiency of training foundation models. The focus will be on identifying small, high-quality subsets of training data that maintain model performance while drastically reducing the energy consumption and environmental impact of training.