ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change
Authors:
David Thulke,
Yingbo Gao,
Petrus Pelser,
Rein Brune,
Rricha Jalota,
Floris Fok,
Michael Ramos,
Ian van Wyk,
Abdallah Nasir,
Hayden Goldstein,
Taylor Tragemann,
Katie Nguyen,
Ariana Fowler,
Andrew Stanco,
Jon Gabriel,
Jordan Taylor,
Dean Moro,
Evgenii Tsymbalov,
Juliette de Waal,
Evgeny Matusov,
Mudar Yaghi,
Mohammad Shihadah,
Hermann Ney,
Christian Dugast,
Jonathan Dotan
, et al. (1 additional authors not shown)
Abstract:
This paper introduces ClimateGPT, a model family of domain-specific large language models that synthesize interdisciplinary research on climate change. We trained two 7B models from scratch on a science-oriented dataset of 300B tokens. For the first model, the 4.2B domain-specific tokens were included during pre-training and the second was adapted to the climate domain after pre-training. Addition…
▽ More
This paper introduces ClimateGPT, a model family of domain-specific large language models that synthesize interdisciplinary research on climate change. We trained two 7B models from scratch on a science-oriented dataset of 300B tokens. For the first model, the 4.2B domain-specific tokens were included during pre-training and the second was adapted to the climate domain after pre-training. Additionally, ClimateGPT-7B, 13B and 70B are continuously pre-trained from Llama~2 on a domain-specific dataset of 4.2B tokens. Each model is instruction fine-tuned on a high-quality and human-generated domain-specific dataset that has been created in close cooperation with climate scientists. To reduce the number of hallucinations, we optimize the model for retrieval augmentation and propose a hierarchical retrieval strategy. To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages. Further, to address the intrinsic interdisciplinary aspect of climate change we consider different research perspectives. Therefore, the model can produce in-depth answers focusing on different perspectives in addition to an overall answer. We propose a suite of automatic climate-specific benchmarks to evaluate LLMs. On these benchmarks, ClimateGPT-7B performs on par with the ten times larger Llama-2-70B Chat model while not degrading results on general domain benchmarks. Our human evaluation confirms the trends we saw in our benchmarks. All models were trained and evaluated using renewable energy and are released publicly.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
Open Problems in DAOs
Authors:
Joshua Tan,
Tara Merk,
Sarah Hubbard,
Eliza R. Oak,
Helena Rong,
Joni Pirovich,
Ellie Rennie,
Rolf Hoefer,
Michael Zargham,
Jason Potts,
Chris Berg,
Reuben Youngblom,
Primavera De Filippi,
Seth Frey,
Jeff Strnad,
Morshed Mannan,
Kelsie Nabben,
Silke Noa Elrifai,
Jake Hartnell,
Benjamin Mako Hill,
Tobin South,
Ryan L. Thomas,
Jonathan Dotan,
Ariana Spring,
Alexia Maddox
, et al. (4 additional authors not shown)
Abstract:
Decentralized autonomous organizations (DAOs) are a new, rapidly-growing class of organizations governed by smart contracts. Here we describe how researchers can contribute to the emerging science of DAOs and other digitally-constituted organizations. From granular privacy primitives to mechanism designs to model laws, we identify high-impact problems in the DAO ecosystem where existing gaps might…
▽ More
Decentralized autonomous organizations (DAOs) are a new, rapidly-growing class of organizations governed by smart contracts. Here we describe how researchers can contribute to the emerging science of DAOs and other digitally-constituted organizations. From granular privacy primitives to mechanism designs to model laws, we identify high-impact problems in the DAO ecosystem where existing gaps might be tackled through a new data set or by applying tools and ideas from existing research fields such as political science, computer science, economics, law, and organizational science. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the wider research community to join the global effort to invent the next generation of organizations.
△ Less
Submitted 12 June, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.