-
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Authors:
Elliot Bolton,
Abhinav Venigalla,
Michihiro Yasunaga,
David Hall,
Betty Xiong,
Tony Lee,
Roxana Daneshjou,
Jonathan Frankle,
Percy Liang,
Michael Carbin,
Christopher D. Manning
Abstract:
Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks. However, these models have hundreds of billions of parameters, are computationally expensive to run, require users to send their input data over the internet, and are trained on unknown data sources. Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles. When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with much larger models, such as achieving a score of 57.3% on MedMCQA (dev) and 69.0% on the MMLU Medical Genetics exam. BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics. This demonstrates that smaller models can potentially serve as transparent, privacy-preserving, economical and environmentally friendly foundations for particular NLP applications, such as in biomedicine. The model is available on the Hugging Face Hub: https://huggingface.co/stanford-crfm/BioMedLM.
Submitted 27 March, 2024;
originally announced March 2024.
-
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Authors:
Jacob Portes,
Alex Trott,
Sam Havens,
Daniel King,
Abhinav Venigalla,
Moin Nadeem,
Nikhil Sardana,
Daya Khudia,
Jonathan Frankle
Abstract:
Although BERT-style encoder models are heavily used in NLP research, many researchers do not pretrain their own BERTs from scratch due to the high cost of training. In the past half-decade since BERT first rose to prominence, many advances have been made with other transformer architectures and training configurations that have yet to be systematically incorporated into BERT. Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. This efficient architecture incorporates FlashAttention, Attention with Linear Biases (ALiBi), Gated Linear Units (GLU), a module to dynamically remove padded tokens, and low precision LayerNorm into the classic transformer encoder block. The training recipe includes a 30% masking ratio for the Masked Language Modeling (MLM) objective, bfloat16 precision, and vocabulary size optimized for GPU throughput, in addition to best-practices from RoBERTa and other encoder models. When pretrained from scratch on the C4 dataset, this base model achieves a downstream average GLUE (dev) score of 79.6 in 1.13 hours on 8 A100 80 GB GPUs at a cost of roughly $20. We plot extensive accuracy vs. pretraining speed Pareto curves and show that MosaicBERT base and large are consistently Pareto optimal when compared to a competitive BERT base and large. This empirical speed up in pretraining enables researchers and engineers to pretrain custom BERT-style models at low cost instead of finetune on existing generic models. We open source our model weights and code.
Submitted 16 January, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
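The 30% masking ratio mentioned in the MosaicBERT abstract can be illustrated with a minimal Python sketch. It assumes a simplified scheme in which every selected token is replaced by a `[MASK]` placeholder. The function name `mask_tokens`, the seed handling, and the use of `None` as the "ignore" label are illustrative choices, not taken from the paper or its released code.

```python
import random

MASK = "[MASK]"  # placeholder mask token (assumed name)


def mask_tokens(tokens, ratio=0.30, seed=0):
    """Replace roughly `ratio` of the tokens with MASK.

    Returns (inputs, labels): labels hold the original token at masked
    positions and None elsewhere, so a loss would only be computed on
    the masked positions.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    k = max(1, round(ratio * len(tokens)))          # number of positions to mask
    positions = set(rng.sample(range(len(tokens)), k))
    inputs = [MASK if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return inputs, labels


tokens = "the quick brown fox jumps over the lazy dog today".split()
inputs, labels = mask_tokens(tokens)
print(inputs)
print(labels)
```

With ten input tokens, three positions are masked. Which three depends on the seed.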
-
An Empirical Study On Correlation between Readme Content and Project Popularity
Authors:
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
Readme in GitHub repositories serves as a preliminary source of information, and thus helps developers in understanding about the projects, for reuse or extension. Different types of contextual and structural content, which we refer to as categories of the content and features in the content respectively, are present in readme files, and could determine the extent of comprehension about project. Consequently, the structural and contextual aspects of the content could impact the project popularity. Studying the correlation between the content and project popularity could help in focusing on the aspects that could improve popularity, while designing the readme files. However, existing studies explore the categories of content and types of features in readme files, and do not explore their usefulness towards project popularity. Hence, we present an empirical study to understand correlation between readme file content and project popularity. We perform the study on 1950 readme files of public GitHub projects, spanning across ten programming languages, and observe that readme files in majority of the popular projects are well organised using lists and images, and comprise links to external sources. Also, repositories with readme files containing contribution guidelines and references were observed to be associated with higher popularity.
Submitted 21 June, 2022;
originally announced June 2022.
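The structural features named in the abstract above (lists, images, links to external sources) could be counted in a Markdown readme roughly as follows. This is a minimal sketch under stated assumptions: the regular expressions cover only common Markdown syntax, and the function name `readme_features` and the feature keys are hypothetical, not the authors' extraction pipeline.

```python
import re

# Markdown image: ![alt](target)
IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]+\)")
# Markdown link to an http(s) URL, not preceded by "!" (which would make it an image)
LINK_RE = re.compile(r"(?<!!)\[[^\]]+\]\((https?://[^)\s]+)\)")
# Bulleted ("-", "*", "+") or numbered ("1.") list item at the start of a line
LIST_RE = re.compile(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+\S", re.MULTILINE)


def readme_features(text: str) -> dict:
    """Count a few structural features in Markdown readme text."""
    return {
        "list_items": len(LIST_RE.findall(text)),
        "images": len(IMAGE_RE.findall(text)),
        "external_links": len(LINK_RE.findall(text)),
    }


sample = "# Demo\n\n- install\n- run\n\n![logo](logo.png)\n\nSee [docs](https://example.com/docs).\n"
print(readme_features(sample))
```

Counts like these could then be compared against a popularity measure such as star count. The abstract does not specify the measure or the statistical test, so that step is left out here.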
-
VedicViz: Towards Visualizing Vedic Principles in Mental Arithmetic
Authors:
Noble Saji Mathews,
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
Augmenting teaching with visualization can help students understand concepts better. Researchers have leveraged visualization to teach conventional mathematics, some examples being spatial and origami visualizations. Apart from conventional mathematics, systems such as mental arithmetic involve techniques for rapid calculation without the use of any computing tools and hence have been used in developing computational competence among students. Vedic Mathematics is one such set of techniques for mental computation. However, there is a lack of technical tools which tackle mental arithmetic concepts and provide aid in the teaching of these topics to school students. Therefore, we propose VedicViz, a web portal that provides dynamic visualization of mathematical operations such as addition, multiplication and square root calculation, based on techniques in Vedic Mathematics. The web portal also provides visualization that enables learners to compare and contrast the mental mathematics based approach with the traditional methods for various inputs and operations. We evaluated VedicViz with 20 volunteers, who were in their high school education level. They found our web portal to be useful in practicing and learning to use the methods to perform various mathematical operations.
Submitted 18 May, 2022;
originally announced May 2022.
-
WAccess -- A Web Accessibility Tool based on WCAG 2.2, 2.1 and 2.0 Guidelines
Authors:
Kowndinya Boyalakuntla,
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
The vision of providing access to all web content equally for all users makes web accessibility a fundamental goal of today's internet. Web accessibility is the practice of removing barriers from websites that could hinder functionality for users with various disabilities. Web accessibility is measured against the accessibility guidelines such as WCAG, GIGW, and so on. WCAG 2.2 is the latest set of guidelines for web accessibility that helps in making websites accessible. The web accessibility tools available in the World Wide Web Consortium (W3C), only conform up to WCAG 2.1 guidelines, while no tools exist for the latest set of guidelines. Despite the availability of several tools to check the conformity of websites with WCAG 2.1 guidelines, there is a scarcity of tools that are both open source and scalable. To support automated accessibility evaluation of numerous websites against WCAG 2.2, 2.1, and 2.0 we present a tool, WAccess. WAccess highlights violations of 13 guidelines from WCAG 2.0, 9 guidelines from WCAG 2.1, and 7 guidelines from WCAG 2.2 of a specific web page on the web console and suggests the fix for violations while specifying violating code snippet simultaneously. We evaluated WAccess against 2227 government websites of India and observed a total of about 6.1 million violations.
Submitted 20 September, 2021; v1 submitted 14 July, 2021;
originally announced July 2021.
-
GitQ- Towards Using Badges as Visual Cues for GitHub Projects
Authors:
Akhila Sri Manasa Venigalla,
Kowndinya Boyalakunta,
Sridhar Chimalakonda
Abstract:
GitHub hosts millions of software repositories, facilitating developers to contribute to many projects in multiple ways. Most of the information about the repositories is text-based in the form of stars, forks, commits, and so on. However, developers willing to contribute to projects on GitHub often find it challenging to select appropriate projects to contribute to or reuse due to the large number of repositories present on GitHub. Further, obtaining this required information often becomes a tedious process, as one has to carefully mine information hidden inside the repository. To alleviate the effort intensive mining procedures, researchers have proposed npm-badges to outline information relating to build status of a project. However, these badges are static and limit their usage to package dependency and build details. Adding visual cues such as badges to the repositories might reduce the search space for developers. Hence, we present GitQ, to automatically augment GitHub repositories with badges representing information about source code and project maintenance. Presenting GitQ as a browser plugin to GitHub could make it easily accessible to developers using GitHub. GitQ is evaluated with 15 developers based on the UTAUT model to understand developer perception towards its usefulness. We observed that 11 out of 15 developers perceived GitQ to be useful in identifying the right set of repositories using visual cues such as generated by GitQ. The source code and tool are available for download on GitHub at https://github.com/gitq-for-github/plugin, and the demo can be found at https://youtu.be/c0yohmIat3A.
Submitted 2 May, 2022; v1 submitted 8 July, 2021;
originally announced July 2021.
-
MuseumViz -- Towards Visualizing Online Museum Collections
Authors:
Dheeraj Vagavolu,
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
Despite the growth of online museums for India's cultural heritage data, there is limited increase in terms of visitors. Over the years, online museums adopted many techniques to improve the overall user experience. However, many Indian online museums display artifacts as lists and grids with basic search functionality, making them less visually appealing and difficult to comprehend. Our work aims to enhance the user experience of accessing Indian online museums by utilizing advancements in information visualization. Hence, we propose MuseumViz, a framework which processes data from online museums and visualizes it using four different interactive visualizations: the Network Graph, TreeMap, Polygon Chart and SunBurst Chart. We demonstrate MuseumViz on a total of 723 cultural heritage artifacts present in the Archaeological Survey of India, Goa. Based on our evaluation with 25 users, about 83% of them find it easier and more comprehensible to browse cultural heritage artifacts through MuseumViz.
Submitted 22 June, 2021;
originally announced June 2021.
-
SurviveCovid-19++ : A collaborative healthcare game towards educating people about safety measures and vaccination for Covid-19
Authors:
Akhila Sri Manasa Venigalla,
Dheeraj Vagavolu,
Sridhar Chimalakonda
Abstract:
Covid-19 has been affecting population across the world for more than an year, with diverse strains of this virus being identified in many countries. Vaccines to help in curbing the virus are being developed and administered. Preventing the spread of the disease requires collaborative efforts from everyone. People with varied professional backgrounds have varied responsibilities in controlling the pandemic. It is important that everyone is aware of their respective responsibilities and also empathize with efforts and duties of other individuals. It is here, we wish to leverage the potential of games in healthcare domain, towards educating about Covid-19. With an aim to educate the population about vaccination against Covid-19, responsibilities of citizens with varied professional backgrounds, and emphasize on the need for collaboration to fight against the pandemic, by following safety measures, we present SurviveCovid-19++, a collaborative multiplayer desktop based game. The game essentially revolves around four roles - doctor, sanitation worker, citizen and law enforcer, delivering their duties, following safety measures and collaboratively clearing multiple stages in the game. We have performed a preliminary evaluation of the game through a qualitative and quantitative user survey. The results of the user survey were encouraging, with volunteers expressing their increased empathy towards efforts of individuals with varied professional backgrounds, and better understanding of the importance of safety measures against Covid-19.
Submitted 8 July, 2021; v1 submitted 17 April, 2021;
originally announced April 2021.
-
Representation range needs for 16-bit neural network training
Authors:
Valentina Popescu,
Abhinav Venigalla,
Di Wu,
Robert Schreiber
Abstract:
Deep learning has grown rapidly thanks to its state-of-the-art performance across a wide range of real-world applications. While neural networks have been trained using IEEE-754 binary32 arithmetic, the rapid growth of computational demands in deep learning has boosted interest in faster, low precision training. Mixed-precision training that combines IEEE-754 binary16 with IEEE-754 binary32 has been tried, and other $16$-bit formats, for example Google's bfloat16, have become popular. In floating-point arithmetic there is a tradeoff between precision and representation range as the number of exponent bits changes; denormal numbers extend the representation range. This raises questions of how much exponent range is needed, of whether there is a format between binary16 (5 exponent bits) and bfloat16 (8 exponent bits) that works better than either of them, and whether or not denormals are necessary.
In the current paper we study the need for denormal numbers for mixed-precision training, and we propose a 1/6/9 format, i.e., 6-bit exponent and 9-bit explicit mantissa, that offers a better range-precision tradeoff. We show that 1/6/9 mixed-precision training is able to speed up training on hardware that incurs a performance slowdown on denormal operations or eliminates the need for denormal numbers altogether. And, for a number of fully connected and convolutional neural networks in computer vision and natural language processing, 1/6/9 achieves numerical parity to standard mixed-precision.
Submitted 6 April, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
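The range-precision tradeoff described in the abstract above can be made concrete with a short calculation. The sketch below assumes IEEE-754-style conventions for all three layouts: a bias of 2^(e-1) - 1, the all-ones exponent code reserved for infinities and NaNs, and denormals available. The paper's 1/6/9 format may define its special values differently, so treat the 1/6/9 numbers as an illustration under these assumptions. The class name `FloatFormat` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    """A sign/exponent/mantissa layout, assuming IEEE-754-style encoding rules."""
    name: str
    exp_bits: int
    man_bits: int  # explicit (stored) mantissa bits

    @property
    def bias(self) -> int:
        return 2 ** (self.exp_bits - 1) - 1

    @property
    def max_normal(self) -> float:
        # Assumes the all-ones exponent code is reserved for inf/NaN.
        return (2.0 - 2.0 ** -self.man_bits) * 2.0 ** self.bias

    @property
    def min_normal(self) -> float:
        return 2.0 ** (1 - self.bias)

    @property
    def min_subnormal(self) -> float:
        # Smallest positive value when denormals are supported.
        return 2.0 ** (1 - self.bias - self.man_bits)


FORMATS = [
    FloatFormat("binary16", exp_bits=5, man_bits=10),
    FloatFormat("1/6/9", exp_bits=6, man_bits=9),
    FloatFormat("bfloat16", exp_bits=8, man_bits=7),
]

for f in FORMATS:
    print(f"{f.name:>8}: max={f.max_normal:.4g} "
          f"min_normal={f.min_normal:.4g} min_subnormal={f.min_subnormal:.4g}")
```

Under these assumptions, the extra exponent bit moves the smallest normal value from 2^-14 (binary16) to 2^-30 (1/6/9), at the cost of one mantissa bit of precision.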
-
Understanding Emotions of Developer Community Towards Software Documentation
Authors:
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
The availability of open-source projects facilitates developers to contribute and collaborate on a wide range of projects. As a result, the developer community contributing to such open-source projects is also increasing. Many of the projects involve frequent updates and extensive reuses. A well-updated documentation helps in a better understanding of the software project and also facilitates efficient contribution and reuse. Though software documentation plays an important role in the development and maintenance of software, it also suffers from various issues that include insufficiency, inconsistency, ill-maintainability, and so on. Exploring the perception of developers towards documentation could help in understanding the reasons behind prevalent issues in software documentation. It could further aid in deciding on training that could be given to the developer community towards building more sustainable projects for society. Analyzing sentiments of contributors to a project could provide insights on understanding developer perceptions. Hence, as the first step towards this direction, we analyze sentiments of commit messages specific to the documentation of a software project. To this end, we considered the commit history of 998 GitHub projects from the GHTorrent dataset and identified 10,996 commits that correspond to the documentation of repositories. Further, we apply sentiment analysis techniques to obtain insights on the type of sentiment being expressed in commit messages of the selected commits. We observe that around 45% of the identified commit messages express trust emotion.
Submitted 1 March, 2021;
originally announced March 2021.
-
What's in a GitHub Repository? -- A Software Documentation Perspective
Authors:
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
Developers use and contribute to repositories on GitHub. Documentation present in the repositories serves as an important source by helping developers to understand, maintain and contribute to the project. Currently, documentation in a repository is diversified, among various files, with most of it present in ReadMe files. However, other software artifacts in the repository, such as issue reports and pull requests could also contribute to documentation, without documentation being explicitly specified. Hence, in this paper, we propose a taxonomy of documentation sources by analyzing different software artifacts, developer interviews and card-sorting approach. We inspected multiple artifacts of 950 public GitHub repositories, written in four different programming languages, C++, C#, Python and Java, and analyzed the type and amount of documentation that could be extracted from these artifacts. To this end, we observe that, about 25.93% of information extracted from all sources proposed in the taxonomy contains error-related documentation, and that pull requests contribute to around 18.21% of extracted information.
Submitted 1 March, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.
-
EmoG- Towards Emojifying Gmail Conversations
Authors:
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
Emails are one of the most frequently used medium of communication in the present day across multiple domains including industry and educational institutions. Understanding sentiments being expressed in an email could have a considerable impact on the recipients' action or response to the email. However, it is difficult to interpret emotions of the sender from pure text in which emotions are not explicitly present. Researchers have tried to predict customer attrition by integrating emails in client-company environment with emotions. However, most of the existing works deal with static assessment of email emotions. Presenting sentiments of emails dynamically to the reader could help in understanding senders' emotion and as well have an impact on readers' action. Hence, in this paper, we present EmoG as a Google Chrome Extension which is intended to support university students. It augments emails with emojis based on the sentiment being conveyed in the email, which might also offer faster overview of email sentiments and act as tags that could help in automatic sorting and processing of emails. Currently, EmoG has been developed to support Gmail inbox on a Google Chrome browser, and could be extended to other inboxes and browsers with ease. We have conducted a user survey with 15 university students to understand the usefulness of EmoG and received positive feedback.
Submitted 14 October, 2020; v1 submitted 13 October, 2020;
originally announced October 2020.
-
Adaptive Braking for Mitigating Gradient Delay
Authors:
Abhinav Venigalla,
Atli Kosson,
Vitaliy Chiley,
Urs Köster
Abstract:
Neural network training is commonly accelerated by using multiple synchronized workers to compute gradient updates in parallel. Asynchronous methods remove synchronization overheads and improve hardware utilization at the cost of introducing gradient delay, which impedes optimization and can lead to lower final model performance. We introduce Adaptive Braking (AB), a modification for momentum-based optimizers that mitigates the effects of gradient delay. AB dynamically scales the gradient based on the alignment of the gradient and the velocity. This can dampen oscillations along high curvature directions of the loss surface, stabilizing and accelerating asynchronous training. We show that applying AB on top of SGD with momentum enables training ResNets on CIFAR-10 and ImageNet-1k with delays $D \geq$ 32 update steps with minimal drop in final test accuracy.
Submitted 10 July, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
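The idea of scaling the gradient by its alignment with the velocity can be sketched as follows. The abstract does not give the formula, so the rule used here, `1 - rho * cos(grad, velocity)`, is an assumption chosen for illustration and may differ from the paper's rule in form and sign convention. The function names and the default `rho=0.5` are likewise hypothetical. Here the velocity accumulates gradients and the weights move against it.

```python
import math


def braking_scale(grad, velocity, rho=0.5, eps=1e-12):
    """Illustrative scale factor 1 - rho * cos(grad, velocity) (an assumed rule)."""
    dot = sum(g * v for g, v in zip(grad, velocity))
    norms = math.sqrt(sum(g * g for g in grad)) * math.sqrt(sum(v * v for v in velocity))
    cosine = dot / max(norms, eps)  # evaluates to 0 when either vector is all zeros
    return 1.0 - rho * cosine


def sgdm_step_with_braking(weights, velocity, grad, lr=0.1, momentum=0.9, rho=0.5):
    """One SGD-with-momentum step in which the gradient is rescaled before use."""
    scale = braking_scale(grad, velocity, rho)
    new_velocity = [momentum * v + scale * g for v, g in zip(velocity, grad)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# A gradient opposing the accumulated velocity is amplified (the "brake"),
# while a gradient aligned with it is shrunk.
print(braking_scale([-1.0, 0.0], [1.0, 0.0]))
print(braking_scale([1.0, 0.0], [1.0, 0.0]))
```

In this sketch, a gradient that points against the accumulated velocity, as happens when an update overshoots along a high-curvature direction, gets a scale above 1 and cancels the momentum faster.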
-
Mood of India During Covid-19 -- An Interactive Web Portal Based on Emotion Analysis of Twitter Data
Authors:
Akhila Sri Manasa Venigalla,
Dheeraj Vagavolu,
Sridhar Chimalakonda
Abstract:
The severe outbreak of Covid-19 pandemic has affected many countries across the world, and disrupted the day to day activities of many people. During such outbreaks, understanding the emotional state of citizens of a country could be of interest to various organizations to carry out tasks and to take necessary measures. Several studies have been performed on data available on various social media platforms and websites to understand the emotions of people against many events, inclusive of Covid-19, across the world. Twitter and other social media platforms have been bridging the gap between the citizens and government in various countries and are of more prominence in India. Sentiment Analysis of posts on twitter is observed to accurately reveal the sentiments. Analysing real time posts on twitter in India during Covid-19, could help in identifying the mood of the nation. However, most of the existing studies related to Covid-19, on twitter and other social media platforms are performed on data posted during a specific interval. We are not aware of any research that identifies emotional state of India on a daily basis. Hence, we present a web portal that aims to display mood of India during Covid-19, based on real time twitter data. This portal also enables users to select date range, specific date and state in India to display mood of people belonging to the specified region, on the specified date or during the specified date range. Also, the number of Covid-19 cases and mood of people at specific cities and states on specific dates is visualized on the country map. As of May 6 2020, the web portal has about 194370 tweets, and each of these tweets are classified into seven categories that include six basic emotions and a neutral category. A list of Trigger Events are also specified, to allow users to view the mood of India on specific events happening in the country during Covid-19.
Submitted 6 May, 2020;
originally announced May 2020.
-
SurviveCovid-19 -- An Educational Game to Facilitate Habituation of Social Distancing and Other Health Measures for Covid-19 Pandemic
Authors:
Akhila Sri Manasa Venigalla,
Dheeraj Vagavolu,
Sridhar Chimalakonda
Abstract:
Covid-19 has been causing severe loss to the human race. Considering the mode of spread and severity, it is essential to make it a habit to follow various safety precautions such as using sanitizers and masks and maintaining social distancing to prevent the spread of Covid-19. Individuals are widely educated about the safety measures against the disease through various modes such as announcements through online or physical awareness campaigns, advertisements in the media and so on. The younger generations today spend considerably more time on mobile phones and games. However, there are very few applications or games aimed to help in practicing safety measures against a pandemic, which is much lesser in the case of Covid-19. Hence, we propose a 2D survival-based game, SurviveCovid-19, aimed to educate people about safety precautions to be taken for Covid-19 outside their homes by incorporating social distancing and usage of masks and sanitizers in the game. SurviveCovid-19 has been designed as an Android-based mobile game, along with a desktop (browser) version, and has been evaluated through a remote quantitative user survey, with 30 volunteers using the questionnaire based on the MEEGA+ model. The survey results are promising, with all the survey questions having a mean value greater than 3.5. The game's quality factor was 69.3, indicating that the game could be classified as excellent quality, according to the MEEGA+ model.
Submitted 3 May, 2021; v1 submitted 21 April, 2020;
originally announced April 2020.
-
Pipelined Backpropagation at Scale: Training Large Models without Batches
Authors:
Atli Kosson,
Vitaliy Chiley,
Abhinav Venigalla,
Joel Hestness,
Urs Köster
Abstract:
New hardware can substantially increase the speed and efficiency of deep neural network training. To guide the development of future hardware architectures, it is pertinent to explore the hardware and machine learning properties of alternative training algorithms. In this work we evaluate the use of small batch, fine-grained Pipelined Backpropagation, an asynchronous pipeline parallel training algorithm that has significant hardware advantages. We introduce two methods, Spike Compensation and Linear Weight Prediction, that effectively mitigate the downsides caused by the asynchronicity of Pipelined Backpropagation and outperform existing techniques in our setting. We show that appropriate normalization and small batch sizes can also aid training. With our methods, fine-grained Pipelined Backpropagation using a batch size of one can match the accuracy of SGD for multiple networks trained on CIFAR-10 and ImageNet. Simple scaling rules allow the use of existing hyperparameters for traditional training without additional tuning.
Submitted 9 April, 2021; v1 submitted 25 March, 2020;
originally announced March 2020.
-
StackEmo-Towards Enhancing User Experience by Augmenting Stack Overflow with Emojis
Authors:
Akhila Sri Manasa Venigalla,
Sridhar Chimalakonda
Abstract:
With the increase in acceptance of open source platforms for knowledge sharing, Question and Answer (Q\&A) websites such as Stack Overflow have become increasingly popular in the programming domain. Many novice programmers visit Stack Overflow for reasons that include posing questions, finding answers for issues they come across in the process of programming. Practitioners voluntarily answer questions on Stack Overflow based on their experience or prior knowledge. Most of these answers are also accompanied by comments from users of Stack Overflow. Questions, answers and comments on Stack Overflow also include sentiments of users, which when analysed and presented could motivate users in reading and contributing to the posts. However, the sentiment of these posts is not being depicted in the current Stack Overflow platform. There is extensive research on analysing sentiments on social networking platforms such as twitter. Representing sentiment of a post might motivate users to follow or answer certain posts. While there exist several tools that augment or annotate Stack Overflow platform for developers, we are not aware of tools that deal with sentiment of the posts. In this paper, we propose StackEmo as a Google Chrome plugin to augment comments on Stack Overflow with emojis, based on the sentiment of the comments posted, with the aim to provide users with visual cues that could motivate the users to review and contribute to available comments. We evaluated StackEmo through an in-user likert scale based survey with 30 university students. The results of the survey provided us insights on improving StackEmo, with 83% participants having recommended the plugin to their peers.
Submitted 17 June, 2020; v1 submitted 30 January, 2020;
originally announced January 2020.