-
Unicorns Do Not Exist: Employing and Appreciating Community Managers in Open Source
Authors:
Raphael Sonabend,
Anna Carnegie,
Anne Lee Steele,
Marie Nugent,
Malvika Sharan
Abstract:
Open-source software is released under an open-source licence, which means the software can be shared, adapted, and reshared without prejudice. In the context of open-source software, community managers manage the communities that contribute to the development and upkeep of open-source tools. Despite playing a crucial role in maintaining open-source software, community managers are often overlooked. In this paper we look at why this happens and at the troubling future we are heading towards if this trend continues: one in which community managers are driven to focus on corporate needs and come into conflict with the communities they are meant to be managing. We suggest methods to overcome this by stressing the need for the specialisation of roles and by advocating for transparent metrics that highlight the real work of the community manager. Following these guidelines can allow this vital role to be treated with the transparency and respect that it deserves, alongside more traditional roles including software developers and engineers.
Submitted 29 June, 2024;
originally announced July 2024.
-
Taming the Long Tail of Deep Probabilistic Forecasting
Authors:
Jedrzej Kozerawski,
Mayank Sharan,
Rose Yu
Abstract:
Deep probabilistic forecasting is gaining attention in numerous applications, ranging from weather prognosis, through electricity consumption estimation, to autonomous vehicle trajectory prediction. However, existing approaches focus on improving the most common scenarios without addressing performance on rare and difficult cases. In this work, we identify long-tail behavior in the performance of state-of-the-art deep learning methods on probabilistic forecasting. We present two moment-based tailedness measurement concepts to improve performance on the difficult tail examples: Pareto Loss and Kurtosis Loss. Kurtosis Loss is a symmetric measure based on the fourth moment about the mean of the loss distribution. Pareto Loss is an asymmetric measure of right-tailedness that models the loss using a generalized Pareto distribution (GPD). We demonstrate the performance of our approach on several real-world datasets, including time series and spatiotemporal trajectories, achieving significant improvements on the tail examples.
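To make the "fourth moment about the mean of the loss distribution" concrete, here is a minimal sketch of how tailedness could be measured over a batch of per-example losses. It is not the authors' Kurtosis Loss: the function names `fourth_central_moment` and `kurtosis` are hypothetical, and dividing by the squared variance is the standard textbook normalisation, assumed here rather than taken from the abstract.

```python
from statistics import fmean


def fourth_central_moment(losses):
    """Fourth moment of the losses about their mean."""
    mu = fmean(losses)
    return fmean([(x - mu) ** 4 for x in losses])


def kurtosis(losses):
    """Fourth central moment divided by the squared variance.

    Assumption: this normalisation is the textbook definition of
    kurtosis, not necessarily the form used in the paper.
    """
    mu = fmean(losses)
    variance = fmean([(x - mu) ** 2 for x in losses])
    if variance == 0.0:
        return 0.0  # all losses identical: no tail to measure
    return fourth_central_moment(losses) / variance ** 2


# Two hypothetical loss batches with the same mean (1.0).
even_losses = [0.0, 2.0, 0.0, 2.0]    # errors spread evenly
tailed_losses = [0.0, 0.0, 0.0, 4.0]  # one rare, difficult example

print(kurtosis(even_losses))
print(kurtosis(tailed_losses))
```

The batch with a single large loss yields the higher value, which is the sense in which such a measure is sensitive to tail examples. Because deviations are raised to an even power, the measure is symmetric and does not distinguish unusually low losses from unusually high ones.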
Submitted 2 March, 2022; v1 submitted 27 February, 2022;
originally announced February 2022.
-
Scalable Pooled Time Series of Big Video Data from the Deep Web
Authors:
Chris Mattmann,
Madhav Sharan
Abstract:
We contribute a scalable implementation of Ryoo et al.'s Pooled Time Series algorithm from CVPR 2015. The updated algorithm has been evaluated on a large and diverse dataset of approximately 6800 videos collected from a crawl of the deep web related to human trafficking on DARPA's MEMEX effort. We describe the properties of Pooled Time Series and the motivation for using it to relate videos collected from the deep web. We highlight issues that we found while running Pooled Time Series on larger datasets and discuss solutions for those issues. Our solution centers on re-imagining Pooled Time Series as a Hadoop-based algorithm in which we compute portions of the eventual solution in parallel on large commodity clusters. We demonstrate that our new Hadoop-based algorithm works well on the 6800-video dataset and shares all of the properties described in the CVPR 2015 paper. We suggest avenues of future work in the project.
Submitted 21 October, 2016;
originally announced October 2016.