Statistics > Applications
[Submitted on 11 May 2023]
Title:How to out-perform default random forest regression: choosing hyperparameters for applications in large-sample hydrology
View PDFAbstract:Predictions are a central part of water resources research. Historically, physically-based models have been preferred; however, they have largely failed at modeling hydrological processes at a catchment scale and there are some important prediction problems that cannot be modeled physically. As such, machine learning (ML) models have been seen as a valid alternative in recent years. In spite of their availability, well-optimized state-of-the-art ML strategies are not being widely used in water resources research. This is because using state-of-the-art ML models and optimizing hyperparameters requires expert mathematical and statistical knowledge. Further, some analyses require many model trainings, so sometimes even expert statisticians cannot properly optimize hyperparameters. To leverage data and use it effectively to drive scientific advances in the field, it is essential to make ML models accessible to subject matter experts by improving automated machine learning resources. ML models such as XGBoost have been recently shown to outperform random forest (RF) models which are traditionally used in water resources research. In this study, based on over 150 water-related datasets, we extensively compare XGBoost and RF. This study provides water scientists with access to quick user-friendly RF and XGBoost model optimization.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.