Showing 1–2 of 2 results for author: Jas, D S

Search v0.5.6 released 2020-02-24

arXiv:2207.13771

cs.CL

CompText: Visualizing, Comparing & Understanding Text Corpus

Authors: Suvi Varshney, Divjeet Singh Jas

Abstract: A common practice in Natural Language Processing (NLP) is to visualize the text corpus so that, without reading through the entire literature, one can still grasp the central idea and key points described. For a long time, researchers focused on extracting topics from the text and visualizing them based on their relative significance in the corpus. However, recently, researchers started coming up with more complex systems that not only expose the topics of the corpus but also words closely related to the topic to give users a holistic view. These detailed visualizations spawned research on comparing text corpora based on their visualization. Topics are often compared to idealize the difference between corpora. However, to capture greater semantics from different corpora, researchers have started to compare texts based on the sentiment of the topics related to the text. By comparing the words carrying the most weight, we can get an idea of the important topics for the corpus. Multiple existing text-comparison methods compare topics rather than sentiments, but we feel that focusing on sentiment-carrying words would better compare the two corpora. Since only sentiments can explain the real feeling of the text and not just the topic, topics without sentiments are just nouns. We aim to differentiate the corpus with a focus on sentiment, as opposed to comparing all the words appearing in the two corpora.
The rationale behind this is that the two corpora do not have many identical words for side-by-side comparison, so comparing the sentiment words gives us an idea of how the corpora are appealing to the emotions of the reader. We can argue that the entropy, or the unexpectedness and divergence of topics, should also be of importance and help us to identify key pivot points and the importance of certain topics in the corpus alongside relative sentiment.

Submitted 27 July, 2022; originally announced July 2022.

arXiv:2112.14464

cs.SE

Forking Around: Correlation of forking practices with the success of a project

Authors: Anurag Dhasmana, Arindaam Roy, Divjeet Singh Jas, Kiranpreet Kaur, Pinn Prugsanapan

Abstract: Forking-based development has made it easier and more straightforward for developers to contribute to open-source software (OSS). Developers can fork an existing project and add changes in their local version without interrupting the development process in the main project. Despite the efficiency of OSS, more than 80% of the projects are not sustainable. Identifying the elements related to OSS success can enlighten developers regarding the sustainability of a project. In our study, we explore whether the inefficiencies that arise from forking-based development, such as redundant development, fragmented communities, and lack of modularity, have any relation to the outcome of a project in terms of sustainability. We formulate eight metrics to quantify attributes for projects in the ASFI dataset. To find the correlation between the metrics and the success of a project, we built a logistic regression model to identify metrics with significant p-values and performed backward stepwise regression analysis, using the stepAIC function in R, to cross-check our findings. The findings show that modularity, centralized management index, and hard forks are consequential for the success of a project. Developers can use the outcomes of our research to plan and structure their projects to increase the probability of their success.

Submitted 29 December, 2021; originally announced December 2021.
