SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks

Zijie J. Wang (ORCID 0000-0003-4360-1423), David Munechika (ORCID 0000-0002-3643-6899), Seongmin Lee (ORCID 0000-0002-1950-5004), and Duen Horng Chau (ORCID 0000-0001-9824-3323)
Georgia Tech, Atlanta, Georgia, USA

(2024)

Abstract.

Computational notebooks, such as Jupyter Notebook, have become data scientists’ de facto programming environments. Many visualization researchers and practitioners have developed interactive visualization tools that support notebooks, yet little is known about the appropriate design of these tools. To address this critical research gap, we investigate the design strategies in this space by analyzing 163 notebook visualization tools. Our analysis encompasses 64 systems from academic papers and 105 systems sourced from a pool of 55k notebooks containing interactive visualizations that we obtain via scraping 8.6 million notebooks on GitHub. Through this study, we identify key design implications and trade-offs, such as leveraging multimodal data in notebooks as well as balancing the degree of visualization-notebook integration. Furthermore, we provide empirical evidence that tools compatible with more notebook platforms have a greater impact. Finally, we develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools. SuperNOVA is publicly accessible at: https://poloclub.github.io/supernova/.

Computational Notebook, Interactive Visualization, Systematic Review, Data Science, Design, Cross-Platform Visualization

Published in: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA ’24), May 11–16, 2024, Honolulu, HI, USA. Copyright: rights retained. DOI: 10.1145/3613905.3650848. ISBN: 979-8-4007-0331-7/24/05.

CCS Concepts: Human-centered computing: Visualization; Interactive systems and tools; Visualization systems and tools; Visualization design and evaluation methods.

Fig. 1. SuperNOVA is a browser for exploring 163 notebook interactive visualization tools. Users can filter and search for tools with specific properties in the left panel. Clicking on a tool reveals details including paper metadata and GitHub repository.

1. Introduction

Computational notebooks, such as Jupyter Notebook (Kluyver and others, 2016) and Colab, are the most popular programming environments among data scientists (Kaggle, 2022). These notebooks seamlessly combine text, code, and visual outputs in a document that consists of an arbitrary number of cells—small text and code editors. Users can execute a code cell, and its output (e.g., text and visualizations) will be displayed below the cell. By providing a literate programming environment, notebooks enable users to perform exploratory data analysis, document their work, and share insights with collaborators (Rule et al., 2018).

To create easy-to-adopt tools, there is a trend in the VIS community to develop interactive visualization systems that can be used in notebooks (e.g., Ono et al., 2021; Xenopoulos et al., 2023; Wang et al., 2022e). Designing visualizations for notebook environments presents unique opportunities and considerations. On the one hand, notebook visualization tools allow direct modification of data through user interactions (Uber, 2016), and users can mix-and-match different visualization tools to create dashboards (Wang et al., 2022a). However, notebook users often write fragmentary code and execute it nonlinearly (Mcnutt et al., 2023; Weinman et al., 2021), which differs from traditional workflows for using interactive visualization systems (Chen and Golan, 2016).

Therefore, if researchers do not consider notebooks’ unique characteristics, their notebook visualization tools may not fully realize the potential of notebooks and, at worst, may impede the ability of notebook users to effectively use these tools. To shed light on the existing landscape of notebook visualization tools and help visualization researchers and practitioners harness the potential of notebook environments, we contribute:

• The first systematic review of 163 notebook visualization tools including 64 systems introduced in academic papers and 105 tools sourced from a pool of 55k notebooks containing interactive visualizations that we obtain via scraping 8.6 million notebooks on GitHub (Fig. 2). To inform the design of future tools, we discuss unique design implications (§ 4) and trade-offs (§ 5).

• Organizational framework to characterize notebook visualization tools in terms of their motivation for supporting notebooks (§ 4), targeted users (§ 4.1), and a four-dimensional design space based on user needs (§ 5). This framework facilitates a more comprehensive understanding of the landscape of notebook visualization tools. Based on this framework, we further analyze the effects of design factors on the impact of notebook visualization tools. We find tools supporting more notebook platforms have significantly more GitHub stars and paper citations (§ 6).

To broaden the public’s access to our collection, we develop SuperNOVA (Fig. 1), an interactive tool that helps researchers and designers explore existing notebook visualization tools and search for design inspiration and implementation references. Anyone can easily add new tools to this open-source explorer (SuperNOVA code: https://github.com/poloclub/supernova). SuperNOVA is publicly accessible at: https://poloclub.github.io/supernova/.

2. Related Work

Our work joins the body of research on interactive tools for notebooks. To understand notebook users’ behaviors, researchers conduct interview studies (Kery et al., 2018) and analyze 1 million notebooks scraped from GitHub (Rule et al., 2018). Others present methods to help researchers develop notebook-compatible visualization tools (Piazentin Ono et al., 2021; Wang et al., 2022d). More recently, a design space analysis is conducted for AI-powered code assistants for notebooks (Mcnutt et al., 2023). In contrast, our work focuses on the design of visualization tools for notebooks by analyzing 163 tools identified from academic papers and 8.6 million notebooks. Additionally, inspired by the popular interactive survey browsers for text visualization (Kucher and Kerren, 2015), biological data visualization (Kerren et al., 2017), visualizations for trust in machine learning (Chatzimparmpas et al., 2024), and embedding visualization (Huang et al., 2023), we develop SuperNOVA, the first interactive explorer for notebook visualization tools.

3. Methodology

Systematic Review. To study how researchers and practitioners design notebook visualization tools, we collected and analyzed 64 academic papers and 105 tools in the wild. In this study, we define notebook visualization tools as systems that can display interactive visualizations in Python computational notebooks. (1) Literature collection: we searched Google Scholar for notebook visualization tools and performed forward and backward reference searches to snowball the results. (2) In-the-wild tool collection: we scraped 8.6 million notebooks from GitHub and filtered 55k notebooks containing interactive visualizations by matching notebook cell output types. We extracted 984 potential visualization packages by matching variable names and imported modules using abstract syntax trees, and we manually examined each package to keep 105 that were indeed notebook visualization tools (see ‡ B for details). (3) Coding: we conducted a multi-phase coding process to analyze the collected papers, documentation, and demo notebooks. First, three authors independently open coded (Braun and Clarke, 2006) the same 30 random tools regarding the motivations for using notebooks and design strategies using Google Sheets. After discussing the codebook and resolving disagreements, the three coders independently conducted open coding on the remaining tools, allocating an equal number of tools to each author. Following the analysis of the final codebook and themes, one author applied deductive coding (Merriam et al., 2002) to assign identified design patterns to each tool. We share all scraping code, codebook, and metadata of 163 tools in SuperNOVA’s repository.
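The in-the-wild collection step (filtering notebooks by cell output type, then extracting imported modules with abstract syntax trees) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' scraping code: the set of MIME types treated as "interactive" and the function names are hypothetical, and `text/html` in particular over-approximates because static HTML tables use it too.

```python
import ast

# Assumption: MIME types taken to signal an interactive output. The actual
# filter used in the study is not listed in this section.
INTERACTIVE_MIME_TYPES = {
    "text/html",
    "application/javascript",
    "application/vnd.jupyter.widget-view+json",
}


def code_cells(notebook: dict) -> list:
    """Return the code cells of a notebook parsed from .ipynb JSON."""
    return [c for c in notebook.get("cells", []) if c.get("cell_type") == "code"]


def has_interactive_output(notebook: dict) -> bool:
    """True if any code cell output carries one of the assumed MIME types."""
    for cell in code_cells(notebook):
        for output in cell.get("outputs", []):
            if INTERACTIVE_MIME_TYPES & set(output.get("data", {})):
                return True
    return False


def imported_modules(notebook: dict) -> set:
    """Collect top-level module names imported in the notebook's code cells."""
    modules = set()
    for cell in code_cells(notebook):
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # skip cells that are not plain Python, e.g. IPython magics
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


# Made-up notebook used only to exercise the two functions.
EXAMPLE_NOTEBOOK = {
    "cells": [
        {"cell_type": "markdown", "source": "# Demo"},
        {
            "cell_type": "code",
            "source": ["import pandas as pd\n", "from pydeck import Deck\n"],
            "outputs": [
                {"output_type": "display_data", "data": {"text/html": "<div></div>"}}
            ],
        },
        {"cell_type": "code", "source": "%matplotlib inline", "outputs": []},
    ]
}

is_candidate = has_interactive_output(EXAMPLE_NOTEBOOK)
candidate_packages = imported_modules(EXAMPLE_NOTEBOOK)
```

Module names collected this way are only candidates; as described above, each package still needs manual examination to confirm it is a notebook visualization tool.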

Organizational Framework. Our large-scale systematic review resulted in an organizational framework characterizing notebook visualization tools in terms of motivations for supporting notebooks (§ 4), targeted users (§ 4.1), and design patterns based on user needs (§ 5). Using this framework, we develop SuperNOVA (Fig. 1), an interactive explorer that allows for easy filtering and searching for notebook visualization tools with desired properties. Based on our review, we distill 4 design implications and 4 design trade-offs to help future researchers design notebook visualization tools. Finally, we conduct a correlation analysis and two regression analyses to examine the effects of design patterns on the impacts of notebook interactive visualization tools (§ 6).

4. Why Notebook Visualization Tools

This section discusses the motivation for developing interactive visualization tools for computational notebooks. We organize these motivations into four non-mutually exclusive groups.

4.1. Seamless Workflow Integration

Our study reveals that most of the surveyed visualization tools support notebooks as a means of aligning with the workflows of end-users. We observe that different user groups have distinct notebook usage patterns. Therefore, to ground our discussion on the notebook workflows of end-users, we categorize end-users into three user groups: data scientists, scientists, and educators and students.

Data Scientists. Notebooks are the most popular programming environment among data scientists (Kaggle, 2022). Consequently, many researchers have developed notebook visualization tools to promote adoption among data scientists. Data scientists use notebooks for conducting rapid experiments, collaborating with other stakeholders, and directly deploying notebooks within production pipelines (Chattopadhyay et al., 2020). Notebook visualization tools have covered almost every stage of data scientists’ workflow, from annotating data (Zhang et al., 2023b) and exploring data (Li et al., 2023b), to developing ML models (Ono et al., 2021), documenting models (Bhat et al., 2023), evaluating models (Munechika et al., 2022), and communicating findings to stakeholders (Wang et al., 2023a).

Scientists. Notebooks are also popular among scientists, including biologists and physicists. Scientists use them as an interface for accessing remote clusters (Sbailò et al., 2022), and publishing notebooks with academic papers is considered good practice for reproducible research (Herwig et al., 2018). Thus, many notebook visualization tools are developed to facilitate scientific research workflows, such as designing experiments (Guo et al., 2021), simulating physical environments (Freeman et al., 2021), and analyzing molecules (Nguyen et al., 2018) and astronomical data (Araya et al., 2018).

Educators and Students. Notebooks are increasingly being used as interactive textbooks in computing education, as they enable students to easily interact with code and test their ideas (Smith et al., 2021). Educators also use notebooks for assigning and grading programming assignments (Hull et al., 2023). In this use case, notebooks serve as worksheets where students write and run their code in specific cells. We observe a growing trend of notebook visualization tools that are specifically developed for educators and students. For example, GILP (Robbins et al., 2023) visualizes simplex algorithms in notebooks, allowing educators to design interactive textbooks and assignments (Fig. 3). VizProg (Zhang et al., 2023a) helps instructors monitor students’ coding progress during in-class exercises through interactive visualizations.

Our findings highlight that computational notebooks are a popular medium among diverse user groups. In addition to data scientists, scientists, educators, and students also use notebooks in their workflows. This provides visualization researchers and designers with exciting opportunities to develop tools that can be easily adopted. However, we find different user groups have distinct notebook workflows. For example, scientists use notebooks for collaboration and reproducible research, while educators use them as textbooks and worksheets. Therefore, researchers should engage with targeted user groups in the early design process (Sedlmair et al., 2012) to investigate users’ notebook workflows and ground their designs.

Implication on domain-specific design: Designing notebook visualization tools requires researchers to engage with targeted user groups to develop tailored tools, as different user groups have distinct notebook usage patterns.

4.2. Easy Access to Read and Refine Artifacts

Notebook visualization tools not only benefit from easy adoption but also access to programming artifacts, including code, raw data, and models. These tools can be categorized into two groups based on their uses of artifacts.

Artifacts $\rightarrow$ Visualization Generation. To create visualizations in non-notebook environments, data scientists often need to manually specify chart types and input data. However, notebook tools have access to all artifacts needed to create visualizations. For example, B2 (Wu et al., 2020) uses dataframes and code queries in notebooks to automatically synthesize interactive visualizations. Similarly, Lux (Lee et al., 2021) and Solas (Epperson et al., 2022) provide automatic visualization recommendations based on a user’s dataframe and analysis history (Fig. 4A). Through accessing ML models that are being trained in notebooks, TensorBoard (Abadi et al., 2016) can visualize the model’s performance in real time.

Visualizations $\rightarrow$ Artifact Refinement. After gaining insights from visualizations, data scientists often manually refine their code, data, and models outside of notebooks. Notebooks can accelerate this process by directly updating artifacts. For example, Mage (Kery et al., 2020) automatically generates code to reflect the change caused by a user’s interaction with visualizations (e.g., deleting a column from a table). Similarly, GAM Changer (Wang et al., 2022b) enables users to modify ML model weights by direct manipulation on visualizations (Fig. 4B).

When designing notebook visualization tools, it is crucial to consider integrating the input and output in the visualization workflow (e.g., Chen and Golan, 2016; Cashman et al., 2019; Upson et al., 1989) into the notebook environment. Taking Keim et al. (2008)’s visual analytics pipeline as an example, the input data can be notebook runtime artifacts, text, and usage logs (§ 5.2), and the output knowledge can be directly operationalized to synthesize code, transform data, and update ML models in the notebook (§ 5.1).

Implication on new opportunities enabled by easy artifact access: Computational notebooks provide unique opportunities for researchers to integrate the input of a visualization pipeline (e.g., notebook runtime artifacts and text) and operationalize its output (e.g., transforming data and updating ML models) within the users’ existing workflow.

4.3. Portability and Shareability

The notebook community has developed a vibrant ecosystem to convert notebooks into a wide range of mediums. This includes the ability for users to publish notebooks containing interactive visualizations as slides (Wang et al., 2023a), interactive books (Community, 2020), and dashboards (Bäuerle et al., 2022). Therefore, given the portability of notebooks, notebook visualization tools have the potential to reach a more diverse audience. For instance, InterpretML (Nori et al., 2019) leverages Jupyter Book (Community, 2020) to incorporate in-notebook visualizations into its documentation, providing readers with an engaging way to learn about ML model explanations (Fig. 5). However, different visualization modalities may present unique design challenges, such as potential accessibility concerns for interactive visualizations in presentation slides (Yip et al., 2021) and the need to consider social contexts for dashboard design (Sarikaya et al., 2019). Thus, it is crucial for researchers to carefully consider specific design constraints associated with different modalities if they decide to use notebooks as a bridge to other visualization mediums.

Implication on cross-modality design: The notebook ecosystem offers various options for distributing and sharing notebook visualization tools with diverse stakeholders through various modalities (e.g., interactive books, slides, dashboards). However, researchers need to consider unique design challenges associated with the targeted modalities.

4.4. Ease of Implementation

There exist multiple methods, varying in difficulty, for implementing notebook visualization tools. Some methods are simple and attract researchers to add notebook support for existing visualizations. For example, the ML library CatBoost (Prokhorenkova et al., 2019) uses Jupyter Notebook’s native ipywidgets to add checkboxes and sliders to help users customize simple loss function plots. Recent researchers have introduced NOVA workflow (Wang et al., 2022d), which enables easy conversion of web-based visualization apps into notebook widgets (e.g., Wang et al., 2022e; Munechika et al., 2022; Wang et al., 2022b). Moreover, we observe that some developers use notebooks as a platform for rapidly prototyping and deploying GUI applications. For instance, Pigeon (Germanidis, 2017) leverages ipywidgets to implement a simple visualization tool that allows annotators to label text and image data. Computational notebooks are web-based systems, and the low barrier to authoring notebook visualization tools reflects and contributes to the trend of web-based interactive visualizations (Battle et al., 2018, 2022). With the increasing ease of developing notebook visualization tools, we anticipate a growing number of such tools catering to various notebook user groups (§ 4.1).

Implication on growing trend of notebook visualization tools: As the implementation is becoming increasingly accessible, the trend of using computational notebooks as a flexible platform for deploying and developing web-based interactive visualization tools will continue.

5. How to Design Notebook Vis Tools

This section discusses the design patterns of existing notebook visualization tools. To organize these patterns, we construct a four-dimensional design space based on the tool users’ needs.

5.1. Notebook-Visualization Integration

The level of integration between notebook environments and visualization tools can vary widely. We characterize this integration continuum by the data communication channels between these two parties, where loosely integrated visualization tools have fewer communication channels than more tightly integrated tools.

No Direct Communication. A few notebook visualization tools do not directly receive data from the notebook environment, as their data source is not available within users’ notebooks. Nevertheless, notebooks allow these tools to retrieve data from external sources (§ C.2.1), thereby allowing users to enjoy these tools in their workflows. For example, TensorBoard reads log files from the file system, and StatCast (Lage et al., 2016) reads data from a separate database server. Argo Lite (Li et al., 2020) allows notebook users to view graph visualizations that are created from a separate website (Fig. 6A).

One-way Communication. Most notebook visualization tools have one-way communication with the notebook environment: they receive input from the notebook but do not send data back to the notebook (§ C.2.2). (1) Users can explicitly specify the input. For example, users can write code to feed an ML model and data into Visual Auditor (Munechika et al., 2022), which generates interactive visualizations for auditing model biases (Fig. 6B). (2) Some tools also leverage implicit input. For instance, Solas provides situated visualization recommendations by analyzing a user’s historical analysis code. With one-way communication, users can follow the familiar input-output notebook pattern (Kluyver and others, 2016) to customize visualization tools.

Bidirectional Communication. Tools with high notebook integration not only receive input from the notebook but also update its content (§ C.2.3). (1) These tools can add new code or text to the notebook. For example, B2 (Wu et al., 2020) adds a user’s interaction history to the notebook cells, and Mage (Kery et al., 2020) generates code that can lead to the same consequence as user interactions. (2) Some tools directly modify the runtime states in a notebook. For instance, the spatial visualization tool pydeck (Uber, 2016) stores the user’s selected data from the visualization in a runtime variable, which users can access in other code cells (Fig. 6C). Bidirectional communication in notebooks can be a powerful and unique feature that helps interactive visualization users operationalize visualization insights (§ 4.2).

Notebooks enable researchers to integrate both the input (one-way communication) and the output (bidirectional communication) of a visualization pipeline into the users’ existing workflow (§ 4.2). However, designing bidirectional communication requires caution. Chattopadhyay et al. (2020) find that notebook users often struggle to keep track of the states in different cells. Therefore, automatically modifying notebook states through a visualization tool could cause further confusion. Similarly, in Wu et al. (2020)’s study, some participants found it “annoying” when notebook content was populated from a visualization tool. Thus, it is crucial to offer users clear feedback and allow users to configure state-updating behaviors.
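To make the recommendation about feedback and configurable updates concrete, here is a minimal sketch of a widget-like object that writes a selection back to the notebook's runtime only when the user opts in, and that records a message either way. `SelectionWidget`, its fields, and the plain dictionary standing in for the notebook namespace are all hypothetical; no surveyed tool's API is being reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionWidget:
    """Hypothetical stand-in for a notebook visualization widget."""

    data: list                     # one-way input passed in from a code cell
    namespace: dict                # stand-in for the notebook's runtime variables
    output_name: str = "selected"  # variable the widget may write back to
    auto_update: bool = False      # user-configurable state-updating policy
    log: list = field(default_factory=list)  # feedback messages for the user

    def on_select(self, indices: list) -> None:
        """Simulate the user selecting items in the visualization."""
        selection = [self.data[i] for i in indices]
        if self.auto_update:
            # Bidirectional path: modify runtime state and say so explicitly.
            self.namespace[self.output_name] = selection
            self.log.append(
                f"Updated variable '{self.output_name}' with {len(selection)} items"
            )
        else:
            # One-way path: leave the runtime untouched.
            self.log.append("Selection not written back (auto_update is off)")


# Made-up usage: one widget with write-back enabled, one without.
runtime = {}
eager = SelectionWidget(data=[10, 20, 30, 40], namespace=runtime, auto_update=True)
eager.on_select([1, 3])

untouched_runtime = {}
passive = SelectionWidget(data=[10, 20, 30, 40], namespace=untouched_runtime)
passive.on_select([0])
```

Defaulting `auto_update` to off is one possible design choice here: the tool then behaves as a one-way tool until the user explicitly asks for state updates.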

Trade-off on data communication: Designing data communication channels (e.g., one-way vs. bidirectional communication) in notebook visualization tools requires careful balance: while bidirectional communication enriches user workflow, it also risks confusion, highlighting the need for clear user feedback and configurable content update policies.

5.2. Data Source and Type

Notebook environments offer rich and multimodal data sources that a visualization tool can use to meet user needs.

Runtime Artifacts. The most common visualization data source is a notebook’s runtime artifacts. Visualization tools have access to any data specified by notebook users; existing notebook visualization tools support many data modalities, such as tables (Brugman, 2019), spatial data (Uber, 2016), and 3D images (Abraham et al., 2014). Some tools also leverage ML models in a notebook runtime, helping users interpret transformers (Vig, 2019), curate decision trees (Wang et al., 2022e), calibrate generalized additive models (Xenopoulos et al., 2023), and explore counterfactual explanations (Wexler et al., 2019).

Code and Text. Notebooks combine code and text documentation, which visualization tools can exploit to enhance visualizations (§ C.3). For example, Anteater (Faust et al., 2022) leverages trace-based visualization to help notebook users debug their analysis code. Jigsaw (Kluyver and others, 2016) uses variable names in a notebook to validate and correct code generated by AI models. More recently, researchers also use code and text in notebooks to create interactive slides to communicate data insights (Wang et al., 2023a; Li et al., 2023a). Moreover, to help users write high-quality ML model documentation, DocML (Bhat et al., 2023) links a model card visualization to both code and text cells in a notebook.

External Data. Moreover, notebook visualization tools can access data beyond the notebook environment, such as the file system, networks, and hardware information. For example, TensorBoard and StatCast visualize data from a local directory and a database server, respectively. NVDashboard (NVIDIA, 2021) provides notebook users with an interactive dashboard to monitor real-time GPU usage.

Although notebooks provide unique and valuable data for designing interactive visualization tools, accessing various data types requires different implementation strategies. While it is relatively easy to read runtime artifacts (§ 4.4), it requires more engineering effort to read code and text or implement bidirectional communication (see ‡ C for detailed discussion on implementation strategies). Certain strategies are only compatible with specific notebook platforms; for example, tools implemented with Jupyter extensions cannot be used in Google Colab. Thus, there is a trade-off between accessing powerful notebook features and ensuring compatibility with diverse notebook platforms.

Trade-off on compatibility: Notebooks provide access to unique data types, including runtime artifacts, code, text and external data. However, there is a trade-off between leveraging powerful, yet platform-specific features like reading code and text and bidirectional communication, and ensuring broader compatibility across various notebook platforms.

5.3. Display Style & Sensemaking Context

Notebook visualization tools’ display styles can vary based on the user’s sensemaking context (Liu and Stasko, 2010). On-demand displays can be used for situational contexts, while always-on displays are suitable for continuous contexts.

On-demand display. Most visualization tools show visualizations below a code cell (e.g., Wexler et al., 2019; Wang et al., 2022b; Xenopoulos et al., 2023). These visualizations are part of the cell flow—they move vertically with the cells when a user scrolls through the notebook. With this layout, users can easily create multiple instances of the same visualization tool with different input data. For example, users can create multiple instances of TimberTrek (Wang et al., 2022e) in different cells with different collections of decision trees and compare across these collections (Fig. 7A).

Always-on display. Notebook tools can also display visualizations outside of notebook cells (§ C.4), leading to an always-on display detached from the cell flow. For instance, AutoProfiler (Wu et al., 2020) continuously updates data distribution visualizations in a resizable dashboard pane to the right of the notebook UI, allowing users to view persistent data profiling information while exploring their datasets (Fig. 7B). Similarly, NVDashboard (NVIDIA, 2021) displays multiple charts outside of the notebook UI so that users can monitor their GPU usage in real time while interacting with the notebook.

The design choice of visualization display style in the notebook depends on the users’ needs and the sensemaking context. Based on Liu and Stasko (2010)’s sensemaking model, visualizations provide external anchoring, cognitive offloading and information foraging. They suggest that visualization designers should minimize the “semantic distance” (Hutchins et al., 1985) between the tasks users want to perform and the physical form of visualizations. In computational notebooks, an on-demand display can assist users with situational sensemaking and temporary anchoring for comparisons. On the other hand, an always-on display can be beneficial for ongoing monitoring and tasks that require continuous cognitive offloading.

Trade-off on display style: Researchers need to consider the trade-off between on-demand and always-on displays of interactive visualizations in notebooks based on the users’ needs. On-demand displays aid situational sensemaking and comparisons, while always-on displays support continuous monitoring and cognitive offloading.

5.4. Modularity

Modularity in notebook visualization tools is a critical consideration when catering to different analysis needs, such as exploratory and exploitative analysis (Batch and Elmqvist, 2018), and to users’ programming proficiency. The appropriate level of modularity balances how much users do through code and how much through the graphical user interface.

Monolithic System. Most notebook visualization tools are monolithic, presenting the entire system all at once. For example, when a user calls ydata-profiling (Brugman, 2019) in a notebook cell, the tool displays a panel beneath the cell that contains all exploratory data analysis visualizations (Fig. 8A). These visualizations are organized into multiple tabs based on their tasks, such as variable interactions, correlations, and missing values.

Modular Components. Modular visualization tools accommodate the fragmentary nature of notebook code and allow users to easily customize and compose visualizations. For example, Aequitas (Saleiro et al., 2019), an ML fairness auditing toolkit, provides different interactive visualizations for different fairness metrics. These visualizations are modularized into separate functions, enabling users to write code to generate and compose visualizations that meet specific needs (Fig. 8B). For instance, with Aequitas, a user can create and inspect a fairness overview in a notebook cell and delve into specific fairness metrics in other separate cells.
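The contrast between the two styles can be sketched with plain functions. In this hypothetical example, each modular function returns a small chart description that a user could call in its own notebook cell, while `full_report` plays the role of a monolithic entry point that bundles every view into tabs. The function names and the dictionary-based chart descriptions are illustrative assumptions and do not correspond to the API of Aequitas or ydata-profiling.

```python
from collections import Counter


def missing_values_chart(rows: list) -> dict:
    """Modular component: count missing (None) values per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            counts[column] += value is None  # adds 1 when missing, 0 otherwise
    return {"chart": "bar", "title": "Missing values", "values": dict(counts)}


def category_counts_chart(rows: list, column: str) -> dict:
    """Modular component: frequency of each non-missing value in one column."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    return {"chart": "bar", "title": f"Counts of {column}", "values": dict(counts)}


def full_report(rows: list) -> dict:
    """Monolithic entry point: every view at once, organized into tabs."""
    tabs = {"missing": missing_values_chart(rows)}
    for column in rows[0]:
        tabs[column] = category_counts_chart(rows, column)
    return {"tabs": tabs}


# Made-up records used only for illustration.
ROWS = [
    {"group": "a", "label": 1},
    {"group": None, "label": 0},
    {"group": "a", "label": 1},
]

overview = missing_values_chart(ROWS)            # one cell: quick overview
detail = category_counts_chart(ROWS, "group")    # another cell: a specific view
everything = full_report(ROWS)                   # monolithic alternative
```

The modular calls require the user to know which view they want and how to request it, whereas the monolithic call needs no such knowledge but offers less control over what is shown where.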

Monolithic and modular architectures have been extensively discussed in the software engineering literature for decades (Aoyama, 1998). Within the visual analytics research community, there is a recent trend towards shifting from designing “over-complicated” monolithic systems to simpler and reusable modular components (Wu et al., 2022; Bertini, 2022). The use of modular components aligns well with computational notebooks, as notebook users can easily display and customize different components in separate notebook cells. Additionally, users can take advantage of dashboard authoring tools (Wang et al., 2022a; Bäuerle et al., 2022) to compose different visualization components into a dashboard directly in their notebooks (§ 4.3). However, modular components require the users to know their visualization goals (i.e., exploitative analysis) and know how to write code to display the appropriate components. In contrast, a monolithic system is more friendly to beginner users and suitable for exploratory analysis, where it can guide users to uncover data patterns and insights.

Trade-off on modularity: Modular visualization tools are composable and reusable, particularly in notebooks where users can easily display and customize them. While modular components offer flexibility for users with clear analysis goals and coding skills, monolithic systems remain more beginner-friendly and ideal for exploratory analysis.

6. Analysis

Leveraging our organizational framework as a lens, we conduct a quantitative analysis to study the relationship between the design of notebook visualization tools and the impacts of these tools (e.g., GitHub star count and publication citation count). Our analysis offers additional insights into future design decisions.

Data Collection. We characterize all 163 notebook visualization tools using our framework (Table 1). Then, we collect the GitHub star count, first commit date, publication year, and citation count of all available tools via the GitHub API (GitHub, 2023) and Semantic Scholar API (Kinney et al., 2023). Among all 163 tools, 135 have GitHub repositories and 76 have Semantic Scholar entries.

Correlation Analysis. We analyze the correlations across different design dimensions by conducting pair-wise $\chi^{2}$ independence tests (Fig. 10). Unsurprisingly, our results highlight that implementation strategies correlate with many other design dimensions. For example, tools that support always-on display are more likely to be implemented using extensions. Interestingly, data source is correlated with both communication and display styles. In particular, tools that access code and text from the notebook are more likely to support bidirectional communication and always-on display. We hypothesize that this is because designers often use text and code from a notebook for generative tasks (e.g., automatic visualization generation), and they prefer always-on displays to provide notebook users with continuous feedback (Display Style & Sensemaking Context). For example, visualization recommendation tools B2 (Wu et al., 2020) and PI2 (Chen and Wu, 2022) leverage existing code and text in a notebook to generate new visualization code in the notebook and display synthesized visualizations on an always-on panel. Finally, we observe that tools with bidirectional communication support far fewer notebook platforms than tools with one-way communication. This empirical finding reflects the trade-off between notebook integration and platform dependency (Data Source and Type).
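To make the pair-wise independence test concrete, the following is a minimal sketch of a Pearson $\chi^{2}$ test for two binary design attributes, using only the Python standard library. The function name and the counts in the example table are hypothetical and are not taken from our data. The sketch applies no continuity correction, so its result can differ from libraries that apply one by default for 2×2 tables (e.g., SciPy's chi2_contingency).

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    `table` is [[a, b], [c, d]], where rows are the levels of one binary
    design attribute and columns are the levels of another.
    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the independence hypothesis.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which the chi-square
    # survival function has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = accesses notebook code/text (yes, no),
# columns = display style (always-on, on-demand).
example_table = [[30, 10], [20, 40]]
statistic, p_value = chi2_independence_2x2(example_table)
print(f"chi2 = {statistic:.3f}, p = {p_value:.2e}")
```

A small p-value here would suggest that the two attributes are not independent. When many attribute pairs are tested, as in a pair-wise analysis, a correction for multiple comparisons may also be appropriate.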

Regression Analysis. We conduct two regression analyses to examine the effects of design factors on the impact of notebook visualization tools, as measured by GitHub star counts and paper citation counts (Fig. 9). Since implementation strategies correlate with many other design dimensions, we exclude them from both regression models. We include time as an independent variable and use dummy variables to encode categorical variables. The results highlight that tools supporting more notebook platforms have significantly more GitHub stars and paper citations. Other design dimensions do not significantly affect the popularity and recognition of notebook visualization tools. This result implies that future researchers and developers should prioritize notebook platform compatibility to maximize the impact of their tools.
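As an illustration of the dummy-variable encoding mentioned above, the sketch below builds a regression design matrix from a few hypothetical tool records. The column names, the example values, and the choice of the alphabetically first level as the reference category are assumptions made for illustration; they do not describe our exact model specification.

```python
def dummy_encode(rows, categorical, numeric):
    """Build a design matrix with an intercept, numeric columns,
    and k-1 dummy columns for each categorical column with k levels."""
    # Collect the sorted levels of every categorical column.
    levels = {col: sorted({row[col] for row in rows}) for col in categorical}

    header = ["intercept"] + list(numeric)
    for col in categorical:
        # Drop the first level as the reference category so the dummies
        # are not collinear with the intercept.
        header += [f"{col}={level}" for level in levels[col][1:]]

    matrix = []
    for row in rows:
        values = [1.0] + [float(row[col]) for col in numeric]
        for col in categorical:
            values += [1.0 if row[col] == level else 0.0
                       for level in levels[col][1:]]
        matrix.append(values)
    return header, matrix


# Hypothetical tool records (not taken from the collected data).
tools = [
    {"display": "on-demand", "communication": "one-way",
     "platforms": 4, "age_years": 5},
    {"display": "always-on", "communication": "bidirectional",
     "platforms": 1, "age_years": 2},
    {"display": "on-demand", "communication": "bidirectional",
     "platforms": 2, "age_years": 3},
]

header, matrix = dummy_encode(
    tools,
    categorical=["display", "communication"],
    numeric=["platforms", "age_years"],
)
print(header)
for values in matrix:
    print(values)
```

The resulting matrix can be passed to any regression routine, with star count or citation count as the dependent variable; each dummy coefficient is then interpreted relative to the dropped reference level.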

7. Discussion and Future Work

By analyzing 163 interactive notebook visualization tools identified from 8.6 million public notebooks and 64 academic papers (§ 3), we present an organizational framework to characterize these tools (§ 4, § 5). We provide practical design implications and trade-offs as well as insights from statistical analyses (§ 6). Based on our findings, we discuss future research opportunities and limitations of our study.

Democratizing Notebook Visualization Tool Creation. We discover a spectrum of methods, varying in difficulty, for authoring notebook visualization tools (§ 4.4). In particular, accessing code and text and supporting bidirectional communication require significant engineering effort (§ 5.2). Furthermore, some implementation strategies are only compatible with specific notebook platforms (Data Source and Type). Therefore, we see research opportunities to lower the barrier to authoring interactive notebook visualization tools that harness the full potential of notebook platforms. First, practitioners often use libraries such as D3 (Bostock et al., 2011) and Vega-Lite (Satyanarayan et al., 2017) to develop web-based interactive visualizations. It would be valuable if these libraries integrated native support for notebook platforms, or if new libraries specifically targeted the authoring of notebook visualizations. Second, researchers can enhance notebook platforms to better support interactive visualizations. For example, similar to browser vendors sharing the same web standard, researchers can develop a universal notebook protocol that enables developers to access and communicate data using a standardized method across notebook platforms.

Enriching Fluid Notebook-Vis Integration. The design trade-offs regarding visualization display styles (Display Style & Sensemaking Context) and modularity (Modularity) partially arise from the rigid layout of the popular cell-based notebooks (Lau et al., 2020). For example, most notebook platforms present cells in a linear manner, thereby requiring designers to decide whether to display their visualization tools within the flow of the cells (on-demand display) or detach them from the flow (always-on display). To address this trade-off, researchers can explore alternative notebook layouts. For example, researchers have introduced sticky cells (Wang et al., 2022a) to break the linear presentation of notebook cells. These sticky cells provide visualization designers with the flexibility to seamlessly switch between on-demand and always-on displays (Fig. 11). Similarly, regarding the modularity of visualization tools, future researchers could develop intelligent notebook interfaces that automatically adapt a visualization tool between modular and monolithic modes based on the users’ current tasks and requirements.

Promoting Responsible AI through Notebook Workflows. We observe an interesting trend that researchers exploit notebooks as a means to promote responsible AI practices (e.g., Aequitas (Saleiro et al., 2019), Fairlearn (Dudík et al., 2020), Farsight (Wang et al., 2024), and MLDoc (Bhat et al., 2023)). We identify two motivations for this emerging trend. First, AI practitioners often lack incentives to adopt responsible AI practices (Rakova et al., 2021; Schiff et al., 2020), such as fairness assessment and model documentation. By integrating responsible AI practices directly into practitioners’ existing notebook workflows (§ 4.1), researchers aim to minimize adoption friction and “nudge” (Bhat et al., 2023) practitioners to follow these practices. For example, Farsight alerts users to potential harms of their large language model-powered apps while they are developing prompts in a notebook (Fig. 11). Similarly, MLDoc automatically creates and shows an AI “model card” (Mitchell et al., 2019) using content from a notebook.

Second, responsible AI requires collaboration across disciplines and teams within an organization (Rakova et al., 2021; Wang et al., 2023b). Because AI practitioners have already been using notebooks to collaborate with diverse stakeholders (e.g., designers and managers) (Zhang et al., 2020), researchers leverage notebooks as a boundary object to facilitate responsible AI practices across teams. For example, in Deng et al. (2022)’s study on ML fairness toolkits, a participant highlighted that “a simple notebook format and compelling visualizations are needed for [organizational] leadership to adopt the toolkits.” Thus, as the mitigation of AI harms has become increasingly crucial, we see exciting opportunities for researchers to design, develop, and evaluate notebook visualization tools to promote responsible AI.

Limitations. To keep our review manageable, we restrict this study to computational notebooks designed for Python, the most commonly used programming language among data scientists (Kaggle, 2022). Future work can explore notebooks designed for other languages, such as R Markdown (Studio, 2016) for R and Observable (Observable, 2021) for JavaScript. As notebook visualization tools are still nascent, there are limited user studies evaluating the effectiveness of these tools. In addition, although there are many different notebook user groups (§ 4.1), the existing HCI notebook research focuses on data scientists (Lau et al., 2020). To broaden the understanding of notebook visualization tools, future research can engage with diverse user groups, including scientists, educators, students, and users with accessibility needs.

8. Conclusion

We collect a total of 163 notebook visualization tools, including 64 from academic papers and 105 sourced from a pool of 55k notebooks containing interactive visualizations that we obtain by scraping 8.6 million notebooks on GitHub. Based on our review, we introduce a framework for characterizing these tools in terms of their motivation for supporting notebooks, targeted users, and design patterns. We further discuss key design implications and trade-offs as well as research opportunities for notebook visualization. Finally, we present SuperNOVA to help researchers and developers easily explore existing notebook visualization tools. We hope that our work contributes to a more comprehensive understanding of notebook visualization tools and helps researchers design and develop visualization tools that are easy to use and adopt.

Acknowledgements.

This work was supported by a J.P. Morgan PhD Fellowship, an Apple Scholars in AI/ML PhD fellowship, and gifts from Bosch and Cisco. We thank the anonymous reviewers for their valuable feedback.

References

AaltoGIS (2020) AaltoGIS. 2020. Spatial Data Science for Sustainable Development. AaltoGIS. https://github.com/AaltoGIS/Sustainability-GIS
Abadi et al. (2016) Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In OSDI. https://dl.acm.org/doi/10.5555/3026877.3026899
Abraham et al. (2014) Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Mueller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, and Gaël Varoquaux. 2014. Machine Learning for Neuroimaging with Scikit-Learn. Front. Neuroinform (2014). https://doi.org/10.3389/fninf.2014.00014
AI (2022) Evidently AI. 2022. Evidently: Evaluate and Monitor ML Models from Validation to Production. Evidently AI. https://github.com/evidentlyai/evidently
Angriman et al. (2022) Eugenio Angriman, Fabian Brandt-Tumescheit, Leon Franke, Alexander van der Grinten, and Henning Meyerhenke. 2022. Interactive Visualization of Protein RINs Using NetworKit in the Cloud. In 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). https://doi.org/10.1109/IPDPSW55747.2022.00055
Aoyama (1998) M. Aoyama. 1998. Agile Software Process and Its Experience. In Proceedings of the 20th International Conference on Software Engineering. https://doi.org/10.1109/ICSE.1998.671097
Apache (2019) Apache. 2019. Apache Beam: Unified Programming Model for Batch and Streaming Data Processing. The Apache Software Foundation. https://github.com/apache/beam
Araya et al. (2018) M. Araya, M. Osorio, M. Díaz, C. Ponce, M. Villanueva, C. Valenzuela, and M. Solar. 2018. JOVIAL: Notebook-based Astronomical Data Analysis in the Cloud. Astronomy and Computing 25 (2018). https://doi.org/10.1016/j.ascom.2018.09.001
Aroussi (2019) Ran Aroussi. 2019. Quantstats: Portfolio Analytics for Quants, Written in Python. https://github.com/ranaroussi/quantstats
Autodesk (2016) Autodesk. 2016. Notebook Molecular Visualization. https://github.com/Autodesk/notebook-molecular-visualization
AutoViML (2020) AutoViML. 2020. AutoViz: Automatically Visualize Any Dataset, Any Size with a Single Line of Code. https://github.com/AutoViML/AutoViz
Batch and Elmqvist (2018) Andrea Batch and Niklas Elmqvist. 2018. The Interactive Visualization Gap in Initial Exploratory Data Analysis. IEEE Transactions on Visualization and Computer Graphics 24 (2018). https://doi.org/10.1109/TVCG.2017.2743990
Battle et al. (2018) Leilani Battle, Peitong Duan, Zachery Miranda, Dana Mukusheva, Remco Chang, and Michael Stonebraker. 2018. Beagle: Automated Extraction and Interpretation of Visualizations from the Web. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3173574.3174168
Battle et al. (2022) Leilani Battle, Danni Feng, and Kelli Webber. 2022. Exploring D3 Implementation Challenges on Stack Overflow. In 2022 IEEE Visualization Conference (VIS). arXiv 2108.02299. http://arxiv.org/abs/2108.02299
Bäuerle et al. (2022) Alex Bäuerle, Ángel Alexander Cabrera, Fred Hohman, Megan Maher, David Koski, Xavier Suau, Titus Barik, and Dominik Moritz. 2022. Symphony: Composing Interactive Interfaces for Machine Learning. In CHI. https://doi.org/10.1145/3491102.3502102
Baum (2020) Antoni Baum. 2020. PyCaret: An Open-Source, Low-Code Machine Learning Library in Python. PyCaret. https://github.com/pycaret/pycaret
Bavishi et al. (2021) Rohan Bavishi, Shadaj Laddad, Hiroaki Yoshida, Mukul R. Prasad, and Koushik Sen. 2021. VizSmith: Automated Visualization Synthesis by Mining Data-Science Notebooks. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). https://doi.org/10.1109/ASE51524.2021.9678696
Bertini (2022) Enrico Bertini. 2022. Building (Easy-To-Adopt) Software While Doing Visualization Research. https://filwd.substack.com/p/building-easy-to-adopt-software-while
Bertrand (2020) Francois Bertrand. 2020. SweetViz: In-depth EDA in Two Lines of Code. https://github.com/fbdesignpro/sweetviz
Bhat et al. (2023) Avinash Bhat, Austin Coursey, Grace Hu, Sixian Li, Nadia Nahar, Shurui Zhou, Christian Kästner, and Jin L. C. Guo. 2023. Aspirations and Practice of Model Documentation: Moving the Needle with Nudging and Traceability. In CHI. arXiv 2204.06425. https://doi.org/10.1145/3544548.3581518
Bloomberg (2019) Bloomberg. 2019. Ipydatagrid: Fast Datagrid Widget for the Jupyter Notebook and JupyterLab. Bloomberg. https://github.com/bloomberg/ipydatagrid
Bokeh Development Team (2014) Bokeh Development Team. 2014. Bokeh: Python Library for Interactive Visualization. http://www.bokeh.pydata.org
Borelli (2019) Centre Borelli. 2019. Pypotree: Potree for Jupyter Notebooks and Colab. https://github.com/centreborelli/pypotree
Bostock et al. (2011) Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011. D³ Data-Driven Documents. IEEE TVCG 17 (2011). https://doi.org/10.1109/TVCG.2011.185
Boucas (2015) Jorge Boucas. 2015. Py2cytoscape: Python Utilities for Cytoscape and Cytoscape.Js. Cytoscape Consortium. https://github.com/cytoscape/py2cytoscape
Bouysset (2021) Cédric Bouysset. 2021. Mols2grid - Interactive Molecule Viewer for 2D Structures. https://doi.org/10.5281/zenodo.6591473
Bqplot (2016) Bqplot. 2016. Bqplot: Plotting Library for IPython/Jupyter Notebooks. https://github.com/bqplot/bqplot
Braun and Clarke (2006) Virginia Braun and Victoria Clarke. 2006. Using Thematic Analysis in Psychology. Qualitative Research in Psychology 3 (2006). https://doi.org/10.1191/1478088706qp063oa
Breddels (2016) M. A. Breddels. 2016. Interactive (Statistical) Visualisation and Exploration of a Billion Objects with Vaex. Proceedings of the International Astronomical Union 12 (2016). https://doi.org/10.1017/S1743921316012795
Brugman (2019) Simon Brugman. 2019. Pandas-Profiling: Exploratory Data Analysis. https://github.com/pandas-profiling/pandas-profiling
Cashman et al. (2019) Dylan Cashman, Shah Rukh Humayoun, Florian Heimerl, Kendall Park, Subhajit Das, John Thompson, Bahador Saket, Abigail Mosca, John Stasko, Alex Endert, Michael Gleicher, and Remco Chang. 2019. A User-based Visual Analytics Workflow for Exploratory Model Analysis. Computer Graphics Forum 38 (2019). https://doi.org/10.1111/cgf.13681
Chattopadhyay et al. (2020) Souti Chattopadhyay, Ishita Prasad, Austin Z. Henley, Anita Sarma, and Titus Barik. 2020. What’s Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities. In CHI. https://doi.org/10.1145/3313831.3376729
Chatzimparmpas et al. (2024) Angelos Chatzimparmpas, Kostiantyn Kucher, and Andreas Kerren. 2024. Visualization for Trust in Machine Learning Revisited: The State of the Field in 2023. IEEE Computer Graphics and Applications (2024). https://doi.org/10.1109/MCG.2024.3360881
Chegini et al. (2021) Taher Chegini, Hong-Yi Li, and L. Ruby Leung. 2021. HyRiver: Hydroclimate Data Retriever. Journal of Open Source Software 6 (2021). https://doi.org/10.21105/joss.03175
Chen and Golan (2016) Min Chen and Amos Golan. 2016. What May Visualization Processes Optimize? IEEE Transactions on Visualization and Computer Graphics 22 (2016). https://doi.org/10.1109/TVCG.2015.2513410
Chen and Wu (2022) Yiru Chen and Eugene Wu. 2022. PI2: End-to-end Interactive Visualization Interface Generation from Queries. In Proceedings of the 2022 International Conference on Management of Data. https://doi.org/10.1145/3514221.3526166
Chollet (2015) François Chollet. 2015. Keras. (2015). https://keras.io
Community (2020) Executable Books Community. 2020. Jupyter Book. Zenodo. https://doi.org/10.5281/ZENODO.4539666
Crockett (2021) Damon Crockett. 2021. Ivpy: Iconographic Visualization Inside Computational Notebooks. International Journal for Digital Art History (2021). https://doi.org/10.11588/DAH.2019.4.66401
Cuemacro (2016) Cuemacro. 2016. Chartpy: Easy to Use Python API Wrapper to Plot Charts with Matplotlib, Plotly, Bokeh and More. https://github.com/cuemacro/chartpy
Datapane (2023) Datapane. 2023. Datapane: Build Full-Stack Data Apps in 100% Python. Datapane. https://github.com/datapane/datapane
Dawson-Haggerty et al. (2019) Dawson-Haggerty et al. 2019. Trimesh. https://github.com/mikedh/trimesh
Deng et al. (2022) Wesley Hanwen Deng, Manish Nagireddy, Michelle Seng Ah Lee, Jatinder Singh, Zhiwei Steven Wu, Kenneth Holstein, and Haiyi Zhu. 2022. Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits. In 2022 ACM Conference on Fairness, Accountability, and Transparency. https://doi.org/10.1145/3531146.3533113
Drosos et al. (2020) Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, and Sumit Gulwani. 2020. Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists. In CHI. https://doi.org/10.1145/3313831.3376442
Dudík et al. (2020) Miro Dudík, Sarah Bird, Hanna Wallach, and Kathleen Walker. 2020. Fairlearn: A Toolkit for Assessing and Improving Fairness in AI. (2020). https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/
Dupré (2016) Xavier Dupré. 2016. Jyquickhelper: Helpers for Jupyter Notebooks around Javascript. https://github.com/sdpython/jyquickhelper
Durant (2018) Martin Durant. 2018. Intake: A General Interface for Loading Data. Intake. https://github.com/intake/intake
Enthought (2015) Enthought. 2015. Mayavi: 3D Visualization of Scientific Data in Python. Enthought, Inc.. https://github.com/enthought/mayavi
Epperson et al. (2023) Will Epperson, Vaishnavi Gorantla, Dominik Moritz, and Adam Perer. 2023. Dead or Alive: Continuous Data Profiling for Interactive Data Science. IEEE Transactions on Visualization and Computer Graphics (2023). https://doi.org/10.1109/TVCG.2023.3327367
Epperson et al. (2022) Will Epperson, Doris Jung-Lin Lee, Leijie Wang, Kunal Agarwal, Aditya G. Parameswaran, Dominik Moritz, and Adam Perer. 2022. Leveraging Analysis History for Improved In Situ Visualization Recommendation. Computer Graphics Forum 41 (2022). https://doi.org/10.1111/cgf.14529
Facebook (2019) Facebook. 2019. Ax: Adaptive Experimentation Platform. Meta. https://github.com/facebook/Ax
Facebook (2020) Facebook. 2020. HiPlot Makes Understanding High Dimensional Data Easy. https://github.com/facebookresearch/hiplot
Faust et al. (2022) Rebecca Faust, Carlos Scheidegger, Katherine Isaacs, William Z. Bernstein, Michael Sharp, and Chris North. 2022. Interactive Visualization for Data Science Scripts. In 2022 IEEE Visualization in Data Science (VDS). https://doi.org/10.1109/VDS57266.2022.00009
Fernandes (2019) Filipe Fernandes. 2019. Folium: Python Data. Leaflet.Js Maps. https://github.com/python-visualization/folium
Fernandez et al. (2017) Nicolas F. Fernandez, Gregory W. Gundersen, Adeeb Rahman, Mark L. Grimes, Klarisa Rikova, Peter Hornbeck, and Avi Ma’ayan. 2017. Clustergrammer, a Web-Based Heatmap Visualization and Analysis Tool for High-Dimensional Biological Data. Scientific Data 4 (2017). https://doi.org/10.1038/sdata.2017.151
Franz et al. (2022) Max Franz, Manfred Cheung, Onur Sumer, Gerardo Huck, Dylan Fong, R-Ba, Josejulio Martínez, Jan Žák, Tony Mullen, Bogdan Chadkin, Ayhun, Metincansiper, Chris, Jan Hartmann, Joseph Stahl, Paolo Parlapiano, Eli Sherer, Mélanie Gauthier, Rich Trott, Yaroslav Sidlovsky, Bumbu, Alexander Li, Christian Lopes, TexKiller, Mike Beynon, Gui Meira, Janit Mehta, and Mike Dias. 2022. Cytoscape/Cytoscape.Js. Zenodo. https://doi.org/10.5281/ZENODO.6828253
Freeman et al. (2021) C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, and Olivier Bachem. 2021. Brax: A Differentiable Physics Engine. http://github.com/google/brax
Fujiwara et al. (2022) Takanori Fujiwara, Xinhai Wei, Jian Zhao, and Kwan-Liu Ma. 2022. Interactive Dimensionality Reduction for Comparative Analysis. IEEE Transactions on Visualization and Computer Graphics 28 (2022). https://doi.org/10.1109/TVCG.2021.3114807
Fuller (2013a) Patrick Fuller. 2013a. Imolecule: An Embeddable webGL Molecule Viewer and File Format Converter. https://github.com/patrickfuller/imolecule
Fuller (2013b) Patrick Fuller. 2013b. Jgraph: An Embeddable webGL Graph Visualization Library. https://github.com/patrickfuller/jgraph
Furmanova et al. (2020) Katarina Furmanova, Samuel Gratzl, Holger Stitz, Thomas Zichner, Miroslava Jaresova, Alexander Lex, and Marc Streit. 2020. Taggle: Combining Overview and Details in Tabular Data Visualizations. Information Visualization 19 (2020). https://doi.org/10.1177/1473871619878085
Germanidis (2017) Anastasis Germanidis. 2017. Pigeon: Quickly Annotate Data on Jupyter. https://github.com/agermanidis/pigeon
GitHub (2023) GitHub. 2023. GitHub GraphQL API Documentation. https://ghdocs-prod.azurewebsites.net/en/graphql
Gonzalez (2019) Carlos Gonzalez. 2019. Hciplot: Library for Visualizing High-Contrast Imaging Multidimensional Datacubes on JupyterLab. https://github.com/carlos-gg/hciplot
Google (2018) Google. 2018. TensorFlow Model Analysis. https://github.com/tensorflow/model-analysis
Google (2021) Google. 2021. Brax: Massively Parallel Rigidbody Physics Simulation on Accelerator Hardware. https://github.com/google/brax
Graphistry (2016) Graphistry. 2016. PyGraphistry: Explore Relationships. https://github.com/graphistry/pygraphistry
Graser and Dragaschnig (2020) Anita Graser and Melitta Dragaschnig. 2020. Exploring Movement Data in Notebook Environments. In IEEE VIS 2020 Workshop on Information Visualization of Geospatial Networks, Flows and Movement (MoVis). http://move.geog.ucsb.edu/wp-content/uploads/2020/10/MoVIS20_paper_4.pdf
Grootendorst (2022) Maarten Grootendorst. 2022. BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. arXiv preprint arXiv:2203.05794 (2022). https://doi.org/10.48550/arXiv.2203.05794
Guo et al. (2021) Grace Guo, Maria Glenski, ZhuanYi Shaw, Emily Saldanha, Alex Endert, Svitlana Volkova, and Dustin Arendt. 2021. VAINE: Visualization and AI for Natural Experiments. In 2021 IEEE Visualization Conference (VIS). https://doi.org/10.1109/VIS49827.2021.9623285
Guo et al. (2023) Grace Guo, Ehud Karavani, Alex Endert, and Bum Chul Kwon. 2023. Causalvis: Visualizations for Causal Inference. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3544548.3581236
Gupta (2021) Abhishek Gupta. 2021. Data-Purifier: A Python Library for Automated Exploratory Data Analysis. https://github.com/Elysian01/Data-Purifier
Gurvich and Geller (2023) Alexander B. Gurvich and Aaron M. Geller. 2023. Firefly: A Browser-based Interactive 3D Data Visualization Tool for Millions of Data Points. The Astrophysical Journal Supplement Series 265 (2023). https://doi.org/10.3847/1538-4365/acb59f
Haas (2021) Robert Haas. 2021. Gravis: Interactive Graph Visualizations with Python and HTML/CSS/JS. https://github.com/robert-haas/gravis
Hackl (2019) Jürgen Hackl. 2019. Pathpy: An OpenSource Python Package for the Analysis of Time Series Data on Networks Using Higher-Order and Multi-Order Graphical Models. https://github.com/pathpy/pathpy
Herwig et al. (2018) Falk Herwig, Robert Andrassy, Nic Annau, Ondrea Clarkson, Benoit Côté, Aaron D’Sa, Sam Jones, Belaid Moa, Jericho O’Connell, David Porter, Christian Ritter, and Paul Woodward. 2018. Cyberhubs: Virtual Research Environments for Astronomy. The Astrophysical Journal Supplement Series 236 (2018). https://doi.org/10.3847/1538-4365/aab777
Hlobil (2018) Patrik Hlobil. 2018. Pandas-Bokeh: Bokeh Plotting Backend for Pandas and GeoPandas. https://github.com/PatrikHlobil/Pandas-Bokeh
Huang et al. (2023) Z. Huang, D. Witschard, K. Kucher, and A. Kerren. 2023. VA + Embeddings STAR: A State-of-the-Art Report on the Use of Embeddings in Visual Analytics. Computer Graphics Forum 42 (2023). https://doi.org/10.1111/cgf.14859
Hull et al. (2023) Matthew Hull, Vivian Pednekar, Hannah Murray, Nimisha Roy, Emmanuel Tung, Susanta Routray, Connor Guerin, Justin Chen, Zijie J. Wang, Seongmin Lee, Mahdi Roozbahani, and Duen Horng Chau. 2023. VISGRADER: Automatic Grading of D3 Visualizations. IEEE Transactions on Visualization and Computer Graphics (2023). https://doi.org/10.1109/TVCG.2023.3327181
Hunter (2007) J. D. Hunter. 2007. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering 9 (2007). https://doi.org/10.1109/MCSE.2007.55
Hutchins et al. (1985) Edwin L Hutchins, James D Hollan, and Donald A Norman. 1985. Direct Manipulation Interfaces. (1985).
Jain et al. (2022) Naman Jain, Skanda Vaidyanath, Arun Iyer, Nagarajan Natarajan, Suresh Parthasarathy, Sriram Rajamani, and Rahul Sharma. 2022. Jigsaw: Large Language Models Meet Program Synthesis. In Proceedings of the 44th International Conference on Software Engineering. https://doi.org/10.1145/3510003.3510203
Jordahl et al. (2022) Kelsey Jordahl, Joris Van Den Bossche, Martin Fleischmann, James McBride, Jacob Wasserman, Matt Richards, Adrian Garcia Badaracco, Alan D. Snow, Jeffrey Gerard, Jeff Tratner, Matthew Perry, Brendan Ward, Carson Farmer, Geir Arne Hjelle, Mike Taves, Ewout Ter Hoeven, Micah Cochran, Rraymondgh, Sean Gillies, Giacomo Caria, Lucas Culbertson, Matt Bartos, Nick Eubank, Ray Bell, Sangarshanan, John Flavin, Sergio Rey, Maxalbert, Aleksey Bilogur, and Christopher Ren. 2022. Geopandas/Geopandas: V0.12.2. Zenodo. https://doi.org/10.5281/ZENODO.7422493
Kaggle (2022) Kaggle. 2022. State of Machine Learning and Data Science 2022. https://www.kaggle.com/kaggle-survey-2022
Ke et al. (2017) Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems 30 (NIPS 2017). https://www.microsoft.com/en-us/research/publication/lightgbm-a-highly-efficient-gradient-boosting-decision-tree/
Keim et al. (2008) Daniel Keim, Gennady Andrienko, Jean-Daniel Fekete, Carsten Görg, Jörn Kohlhammer, and Guy Melançon. 2008. Visual Analytics: Definition, Process, and Challenges.
Keplergl (2019) Keplergl. 2019. Kepler.Gl: A Powerful Open Source Geospatial Analysis Tool for Large-Scale Data Sets. https://github.com/keplergl/kepler.gl
Kerren et al. (2017) Andreas Kerren, Kostiantyn Kucher, Yuan-Fang Li, and Falk Schreiber. 2017. BioVis Explorer: A Visual Guide for Biological Data Visualization Techniques. PLOS ONE 12 (2017). https://doi.org/10.1371/journal.pone.0187341
Kery et al. (2018) Mary Beth Kery, Marissa Radensky, Mahima Arya, Bonnie E. John, and Brad A. Myers. 2018. The Story in the Notebook: Exploratory Data Science Using a Literate Programming Tool. In CHI. https://doi.org/10.1145/3173574.3173748
Kery et al. (2020) Mary Beth Kery, Donghao Ren, Fred Hohman, Dominik Moritz, Kanit Wongsuphasawat, and Kayur Patel. 2020. Mage: Fluid Moves Between Code and Graphical Work in Computational Notebooks. In CHI. https://doi.org/10.1145/3379337.3415842
Kerzel et al. (2023) Dominik Kerzel, Birgitta König-Ries, and Samuel Sheeba. 2023. MLProvLab: Provenance Management for Data Science Notebooks. (2023). https://doi.org/10.18420/BTW2023-66
King (2016) Zak King. 2016. Escher: Build, Share, and Embed Visualizations of Metabolic Pathways. https://github.com/zakandrewking/escher
Kinney et al. (2023) Rodney Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, Alexandra Buraczynski, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Arman Cohan, Miles Crawford, Doug Downey, Jason Dunkelberger, Oren Etzioni, Rob Evans, Sergey Feldman, Joseph Gorney, David Graham, Fangzhou Hu, Regan Huff, Daniel King, Sebastian Kohlmeier, Bailey Kuehl, Michael Langan, Daniel Lin, Haokun Liu, Kyle Lo, Jaron Lochner, Kelsey MacMillan, Tyler Murray, Chris Newell, Smita Rao, Shaurya Rohatgi, Paul Sayre, Zejiang Shen, Amanpreet Singh, Luca Soldaini, Shivashankar Subramanian, Amber Tanaka, Alex D. Wade, Linda Wagner, Lucy Lu Wang, Chris Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Madeleine Van Zuylen, and Daniel S. Weld. 2023. The Semantic Scholar Open Data Platform. arXiv 2301.10140 (2023). http://arxiv.org/abs/2301.10140
Kissinger and van de Wetering (2020) Aleks Kissinger and John van de Wetering. 2020. PyZX: Large Scale Automated Diagrammatic Reasoning. In Proceedings 16th International Conference on Quantum Physics and Logic, Chapman University, Orange, CA, USA., 10-14 June 2019 (Electronic Proceedings in Theoretical Computer Science, Vol. 318). https://doi.org/10.4204/EPTCS.318.14
Klein (2016) Almar Klein. 2016. Flexx: Write Desktop and Web Apps in Pure Python. https://github.com/flexxui/flexx
Kluyver and others (2016) Thomas Kluyver and others. 2016. Jupyter Notebooks - a Publishing Format for Reproducible Computational Workflows. ELPUB (2016). https://doi.org/10.3233/978-1-61499-649-1-87
Korobov (2016) Mikhail Korobov. 2016. ELI5: A Library for Debugging/Inspecting Machine Learning Classifiers and Explaining Their Predictions. eli5-org. https://github.com/eli5-org/eli5
Krabel (2019) Tobias Krabel. 2019. Bamboolib: GUI for Pandas DataFrames. https://github.com/tkrabel/bamboolib
Krause et al. (2021) Claire Krause, Bex Dunn, Robbi Bishop-Taylor, Caitlin Adams, Chad Burton, Matthew Alger, Sean Chua, Claire Phillips, Vanessa Newey, Kirill Kouzoubov, Alex Leith, Damien Ayers, Andrew Hicks, and DEA Notebooks contributors. 2021. Digital Earth Australia Notebooks and Tools Repository. https://doi.org/10.26186/145234
Kucher and Kerren (2015) Kostiantyn Kucher and Andreas Kerren. 2015. Text Visualization Techniques: Taxonomy, Visual Survey, and Community Insights. In PacificVis. https://doi.org/10.1109/PACIFICVIS.2015.7156366
Kukushkin (2018) Alexander Kukushkin. 2018. Ipyannotate: Jupyter Widget for Data Annotation. https://github.com/ipyannotate/ipyannotate
Kwon et al. (2023) Nahyun Kwon, Hannah Kim, Sajjadur Rahman, Dan Zhang, and Estevam Hruschka. 2023. Weedle: Composable Dashboard for Data-Centric NLP in Computational Notebooks. In Companion Proceedings of the ACM Web Conference 2023. https://doi.org/10.1145/3543873.3587330
Lab (2020) Jupyter Physical Science Lab. 2020. JupyterPiDAQ: Interactive Analog Data Acquisition and Analysis within Jupyter Notebooks Using GUI Tools. Jupyter Physical Science Lab. https://github.com/JupyterPhysSciLab/JupyterPiDAQ
Laboratories (2022) Sandia National Laboratories. 2022. Toyplot: Interactive Plotting for Python. Sandia National Laboratories. https://github.com/sandialabs/toyplot
Lage et al. (2016) Marcos Lage, Jorge Piazentin Ono, Daniel Cervone, Justin Chiang, Carlos Dietrich, and Claudio T. Silva. 2016. StatCast Dashboard: Exploration of Spatiotemporal Baseball Data. IEEE Computer Graphics and Applications 36 (2016). https://doi.org/10.1109/MCG.2016.101
Lam et al. (2023) Michelle S. Lam, Zixian Ma, Anne Li, Izequiel Freitas, Dakuo Wang, James A. Landay, and Michael S. Bernstein. 2023. Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design. arXiv 2303.02884 (2023). https://doi.org/10.1145/3544548.3581290
Lau et al. (2020) Sam Lau, Ian Drosos, Julia M. Markel, and Philip J. Guo. 2020. The Design Space of Computational Notebooks: An Analysis of 60 Systems in Academia and Industry. In 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). https://doi.org/10.1109/VL/HCC50065.2020.9127201
Lau and Hug (2018) Samuel Lau and Joshua Hug. 2018. Nbinteract: Generate Interactive Web Pages from Jupyter Notebooks. Master’s thesis. University of California at Berkeley. https://www.nbinteract.com/#
Lee et al. (2021) Doris Jung-Lin Lee, Dixin Tang, Kunal Agarwal, Thyne Boonmark, Caitlyn Chen, Jake Kang, Ujjaini Mukhopadhyay, Jerry Song, Micah Yong, Marti A. Hearst, and Aditya G. Parameswaran. 2021. Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows. VLDB Endowment 15 (2021). https://doi.org/10.14778/3494124.3494151
Li et al. (2023a) Haotian Li, Lu Ying, Haidong Zhang, Yingcai Wu, Huamin Qu, and Yun Wang. 2023a. Notable: On-the-fly Assistant for Data Storytelling in Computational Notebooks. In CHI. https://doi.org/10.1145/3544548.3580965
Li et al. (2020) Siwei Li, Zhiyan Zhou, Anish Upadhayay, Omar Shaikh, Scott Freitas, Haekyu Park, Zijie J. Wang, Susanta Routray, Matthew Hull, and Duen Horng Chau. 2020. Argo Lite: Open-Source Interactive Graph Exploration and Visualization in Browsers. In CIKM. https://doi.org/10.1145/3340531.3412877
Li et al. (2023b) Xingjun Li, Yizhi Zhang, Justin Leung, Chengnian Sun, and Jian Zhao. 2023b. EDAssistant: Supporting Exploratory Data Analysis in Computational Notebooks with In Situ Code Search and Recommendation. ACM TiiS 13 (2023). https://doi.org/10.1145/3545995
Lightkurve Collaboration et al. (2018) Lightkurve Collaboration, J. V. d. M. Cardoso, C. Hedges, M. Gully-Santiago, N. Saunders, A. M. Cody, T. Barclay, O. Hall, S. Sagear, E. Turtelboom, J. Zhang, A. Tzanidakis, K. Mighell, J. Coughlin, K. Bell, Z. Berta-Thompson, P. Williams, J. Dotson, and G. Barentsen. 2018. Lightkurve: Kepler and TESS Time Series Analysis in Python. Astrophysics Source Code Library. http://adsabs.harvard.edu/abs/2018ascl.soft12013L
Lin et al. (2023) Yanna Lin, Haotian Li, Leni Yang, Aoyu Wu, and Huamin Qu. 2023. InkSight: Leveraging Sketch Interaction for Documenting Chart Findings in Computational Notebooks. IEEE Transactions on Visualization and Computer Graphics (2023). https://doi.org/10.1109/TVCG.2023.3327170
Liu and Stasko (2010) Zhicheng Liu and J T Stasko. 2010. Mental Models, Visual Reasoning and Interaction in Information Visualization: A Top-down Perspective. IEEE Transactions on Visualization and Computer Graphics 16 (2010). https://doi.org/10.1109/TVCG.2010.177
Logan (2023) Logan. 2023. Nbtutor: Visualize Python Code Execution (Line-by-Line) in Jupyter Notebook Cells. https://github.com/lgpage/nbtutor
Lundberg and Lee (2017) Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). https://doi.org/10.48550/arXiv.1705.07874
Maeztu (2016) Gabi Maeztu. 2016. Neo4jupyter: A Quick Visualization Tool for Jupyter and Neo4J. https://github.com/merqurio/neo4jupyter
Mauricio (2017) Juan Manuel Mauricio. 2017. Pydgrid: Python Distribution Grid Simulator. https://github.com/pydgrid/pydgrid
McCormick et al. (2022) Matt McCormick, Brianna Major, Laryssa Abdala, Paul Elliott, and Stephen R. Aylward. 2022. InsightSoftwareConsortium/Itkwidgets: Itkwidgets 0.32.5. Zenodo. https://doi.org/10.5281/ZENODO.7489693
Mcnutt et al. (2023) Andrew M Mcnutt, Chenglong Wang, Robert A Deline, and Steven M. Drucker. 2023. On the Design of AI-powered Code Assistants for Notebooks. In CHI. https://doi.org/10.1145/3544548.3580940
Merriam et al. (2002) Sharan B Merriam et al. 2002. Introduction to Qualitative Research. Qualitative research in practice: Examples for discussion and analysis 1 (2002).
Microsoft (2019) Microsoft. 2019. Interpret Community SDK. https://github.com/interpretml/interpret-community
Microsoft (2020) Microsoft. 2020. Responsible AI Toolbox. Microsoft. https://github.com/microsoft/responsible-ai-toolbox
Mining (2019) Intuitive Text Mining. 2019. D3fdgraph: D3 Interactive Animated Force-Directed Graphs in a Jupyter Notebook. https://github.com/intuitivetextmining/d3fdgraph
Mitchell et al. (2019) Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency. https://doi.org/10.1145/3287560.3287596
Moi and Patry (2023) Anthony Moi and Nicolas Patry. 2023. HuggingFace’s Tokenizers. https://github.com/huggingface/tokenizers
Munechika et al. (2022) David Munechika, Zijie J. Wang, Jack Reidy, Josh Rubin, Krishna Gade, Krishnaram Kenthapadi, and Duen Horng Chau. 2022. Visual Auditor: Interactive Visualization for Detection and Summarization of Model Biases. In VIS. https://doi.org/10.1109/VIS54862.2022.00018
Narechania et al. (2021) Arpit Narechania, Arjun Srinivasan, and John Stasko. 2021. NL4DV: A Toolkit for Generating Analytic Specifications for Data Visualization from Natural Language Queries. IEEE Transactions on Visualization and Computer Graphics 27 (2021). https://doi.org/10.1109/TVCG.2020.3030378
Nengo (2019) Nengo. 2019. Nengo: A Python Library for Creating and Simulating Large-Scale Brain Models. Nengo. https://github.com/nengo/nengo
Nguyen et al. (2018) Hai Nguyen, David A Case, and Alexander S Rose. 2018. NGLview–Interactive Molecular Graphics for Jupyter Notebooks. Bioinformatics 34 (2018). https://doi.org/10.1093/bioinformatics/btx789
Nori et al. (2019) Harsha Nori, Samuel Jenkins, Paul Koch, and Rich Caruana. 2019. InterpretML: A Unified Framework for Machine Learning Interpretability. arXiv (2019). http://arxiv.org/abs/1909.09223
NVIDIA (2021) NVIDIA. 2021. NVDashboard: A JupyterLab Extension for Displaying Dashboards of GPU Usage. RAPIDS. https://github.com/rapidsai/jupyterlab-nvdashboard
Observable (2021) Observable. 2021. Observable: Data Visualization Platform. https://observablehq.com/
Ono et al. (2021) Jorge Piazentin Ono, Sonia Castelo, Roque Lopez, Enrico Bertini, Juliana Freire, and Claudio Silva. 2021. PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines. TVCG 27 (2021). https://doi.org/10.1109/TVCG.2020.3030361
Org (2019) Intelligent Systems Lab Org. 2019. Open3D: A Modern Library for 3D Data Processing. https://github.com/isl-org/Open3D
Palmeiro et al. (2022) João Palmeiro, Beatriz Malveiro, Rita Costa, David Polido, Ricardo Moreira, and Pedro Bizarro. 2022. Data+Shift: Supporting Visual Investigation of Data Distribution Shifts by Data Scientists. (2022). https://doi.org/10.2312/EVS.20221097
Parmer (2020) Chris Parmer. 2020. Dash: Data Apps & Dashboards for Python. Plotly. https://github.com/plotly/dash
Peng et al. (2021) Jinglin Peng, Weiyuan Wu, Brandon Lockhart, Song Bian, Jing Nathan Yan, Linghao Xu, Zhixuan Chi, Jeffrey M. Rzeszotarski, and Jiannan Wang. 2021. DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python. In Proceedings of the 2021 International Conference on Management of Data. https://doi.org/10.1145/3448016.3457330
Perrone et al. (2020) Giancarlo Perrone, Jose Unpingco, and Haw-minn Lu. 2020. Network Visualizations with Pyvis and VisJS. arXiv 2006.04951 (2020). http://arxiv.org/abs/2006.04951
Petrak (2020) Johann Petrak. 2020. Python-Gatenlp: Python Text Processing, Pattern Matching, and NLP Framework. GateNLP. https://github.com/GateNLP/python-gatenlp
Piazentin Ono et al. (2021) Jorge Piazentin Ono, Juliana Freire, and Claudio T. Silva. 2021. Interactive Data Visualization in Jupyter Notebooks. Comput Sci Eng 23 (2021). https://doi.org/10.1109/MCSE.2021.3052619
Pielawski et al. (2022) Nicolas Pielawski, Axel Andersson, Christophe Avenel, Andrea Behanova, Eduard Chelebian, Anna Klemm, Fredrik Nysjö, Leslie Solorzano, and Carolina Wählby. 2022. TissUUmaps 3: Improvements in Interactive Visualization, Exploration, and Quality Assessment of Large-Scale Spatial Omics Data. Preprint. Bioinformatics. https://doi.org/10.1101/2022.01.28.478131
PixieDust (2016) PixieDust. 2016. PixieDust: Python Helper Library for Jupyter Notebooks. Pixiedust development. https://github.com/pixiedust/pixiedust
Poliastro (2019) Poliastro. 2019. Czml3: Python 3 Library to Write CZML. https://github.com/poliastro/czml3
Prokhorenkova et al. (2019) Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. 2019. CatBoost: Unbiased Boosting with Categorical Features. arXiv (2019). http://arxiv.org/abs/1706.09516
PyPathway (2022) PyPathway. 2022. PyPathway: A Python Package for Pathway Visualization. https://github.com/iseekwonderful/PyPathway
QuantStack (2017) QuantStack. 2017. Ipysheet: Jupyter Handsontable Integration. QuantStack. https://github.com/QuantStack/ipysheet
QuantStack (2022) QuantStack. 2022. Ipytree: A Tree Widget Using Jupyter-widgets Protocol and jsTree. QuantStack. https://github.com/QuantStack/ipytree
QuSTaR (2019) QuSTaR. 2019. Kaleidoscope: Visualizations for Quantum Computing. https://github.com/QuSTaR/kaleidoscope
Rakova et al. (2021) Bogdana Rakova, Jingying Yang, Henriette Cramer, and Rumman Chowdhury. 2021. Where Responsible AI Meets Reality: Practitioner Perspectives on Enablers for Shifting Organizational Practices. Proceedings of the ACM on Human-Computer Interaction 5 (2021). https://doi.org/10.1145/3449081
Robbins et al. (2023) Henry W. Robbins, Samuel C. Gutekunst, David B. Shmoys, and David P. Williamson. 2023. GILP: An Interactive Tool for Visualizing the Simplex Algorithm. In SIGCSE. https://doi.org/10.1145/3545945.3569815
Robinson (2022) Jim Robinson. 2022. Module for Embedding Igv.Js in an IPython Notebook. https://github.com/igvteam/igv-notebook
Rose (2020) Adam Rose. 2020. PandasGUI: A GUI for Pandas DataFrames. https://github.com/adamerose/PandasGUI
Rosenthal et al. (2018) Sara Brin Rosenthal, Julia Len, Mikayla Webster, Aaron Gary, Amanda Birmingham, and Kathleen M Fisch. 2018. Interactive Network Visualization in Jupyter Notebooks: visJS2jupyter. Bioinformatics 34 (2018). https://doi.org/10.1093/bioinformatics/btx581
Rudiger (2016) Philipp Rudiger. 2016. Geoviews: Simple, Concise Geographical Visualization in Python. HoloViz. https://github.com/holoviz/geoviews
Rudiger (2021) Philipp Rudiger. 2021. Panel: A High-Level App and Dashboarding Solution for Python. HoloViz. https://github.com/holoviz/panel
Rule et al. (2018) Adam Rule, Aurélien Tabard, and James D. Hollan. 2018. Exploration and Explanation in Computational Notebooks. In CHI. https://doi.org/10.1145/3173574.3173606
Saleiro et al. (2019) Pedro Saleiro, Benedict Kuester, Loren Hinkson, Jesse London, Abby Stevens, Ari Anisfeld, Kit T. Rodolfa, and Rayid Ghani. 2019. Aequitas: A Bias and Fairness Audit Toolkit. arXiv 1811.05577 (2019). http://arxiv.org/abs/1811.05577
Sampaio (2018) Matheus Xavier Sampaio. 2018. PyMove: Python Library to Simplify Queries and Visualization of Trajectories and Other Spatial-Temporal Data. Insight Data Science Lab. https://github.com/InsightLab/PyMove
Sarikaya et al. (2019) Alper Sarikaya, Michael Correll, Lyn Bartram, Melanie Tory, and Danyel Fisher. 2019. What Do We Talk About When We Talk About Dashboards? IEEE Transactions on Visualization and Computer Graphics 25 (2019). https://doi.org/10.1109/TVCG.2018.2864903
Satyanarayan et al. (2017) Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer. 2017. Vega-Lite: A Grammar of Interactive Graphics. IEEE Transactions on Visualization & Computer Graphics (Proc. InfoVis) (2017). https://doi.org/10.1109/tvcg.2016.2599030
Sbailò et al. (2022) Luigi Sbailò, Ádám Fekete, Luca M. Ghiringhelli, and Matthias Scheffler. 2022. The NOMAD Artificial-Intelligence Toolkit: Turning Materials-Science Data into Knowledge and Understanding. Computational Materials (2022). https://doi.org/10.1038/s41524-022-00935-z
Schiff et al. (2020) Daniel Schiff, Bogdana Rakova, Aladdin Ayesh, Anat Fanti, and Michael Lennon. 2020. Principles to Practices for Responsible AI: Closing the Gap. arXiv 2006.04707 (2020). http://arxiv.org/abs/2006.04707
Scully-Allison et al. (2022) Connor Scully-Allison, Ian Lumsden, Katy Williams, Jesse Bartels, Michela Taufer, Stephanie Brink, Abhinav Bhatele, Olga Pearce, and Katherine E. Isaacs. 2022. Designing an Interactive, Notebook-Embedded, Tree Visualization to Support Exploratory Performance Analysis. arXiv 2205.04557 (2022). http://arxiv.org/abs/2205.04557
Sedlmair et al. (2012) Michael Sedlmair, Miriah Meyer, and Tamara Munzner. 2012. Design Study Methodology: Reflections from the Trenches and the Stacks. IEEE Transactions on Visualization and Computer Graphics 18 (2012). https://doi.org/10.1109/TVCG.2012.213
Shawver (2017) Tim Shawver. 2017. Qgrid: An Interactive Grid for Sorting, Filtering, and Editing DataFrames in Jupyter Notebooks. https://github.com/quantopian/qgrid
Sievert et al. (2017) Carson Sievert, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2017. Plotly: Create Interactive Web Graphics via ‘Plotly.Js’. 4 (2017). https://github.com/plotly/plotly.py
Sievert and Shirley (2014) Carson Sievert and Kenneth Shirley. 2014. LDAvis: A Method for Visualizing and Interpreting Topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces. https://doi.org/10.3115/v1/W14-3110
Simonne et al. (2022) David Simonne, Jérôme Carnis, Clément Atlan, Corentin Chatelier, Vincent Favre-Nicolin, Maxime Dupraz, Steven J. Leake, Edoardo Zatterin, Andrea Resta, Alessandro Coati, and Marie-Ingrid Richard. 2022. Gwaihir: Jupyter Notebook Graphical User Interface for Bragg Coherent Diffraction Imaging. Journal of Applied Crystallography 55 (2022). https://doi.org/10.1107/S1600576722005854
Sivarajah et al. (2020) Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington, and Ross Duncan. 2020. TKET: A Retargetable Compiler for NISQ Devices. Quantum Science and Technology 6 (2020). https://doi.org/10.1088/2058-9565/ab8e92
Sivaraman et al. (2022) Venkatesh Sivaraman, Yiwei Wu, and Adam Perer. 2022. Emblaze: Illuminating Machine Learning Representations through Interactive Comparison of Embedding Spaces. In ACM IUI. https://doi.org/10.1145/3490099.3511137
Smith et al. (2021) David H. Smith, Qiang Hao, Christopher D. Hundhausen, Filip Jagodzinski, Josh Myers-Dean, and Kira Jaeger. 2021. Towards Modeling Student Engagement with Interactive Computing Textbooks: An Empirical Study. In SIGCSE. https://doi.org/10.1145/3408877.3432361
Sohns et al. (2022) Jan-Tobias Sohns, Michaela Schmitt, Fabian Jirasek, Hans Hasse, and Heike Leitte. 2022. Attribute-Based Explanation of Non-Linear Embeddings of High-Dimensional Data. IEEE Transactions on Visualization and Computer Graphics 28 (2022). https://doi.org/10.1109/TVCG.2021.3114870
Stein (2022) Andrew Stein. 2022. Perspective: Interactive Analytics and Data Visualization Component. https://github.com/finos/perspective
Studio (2016) R Studio. 2016. R Markdown. https://rmarkdown.rstudio.com/
Tenney et al. (2020) Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, and Ann Yuan. 2020. The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models. In EMNLP Demo. https://doi.org/10.18653/v1/2020.emnlp-demos.15
Tritsarolis et al. (2021) Andreas Tritsarolis, Christos Doulkeridis, Nikos Pelekis, and Yannis Theodoridis. 2021. ST_VISIONS: A Python Library for Interactive Visualization of Spatio-temporal Data. In 2021 22nd IEEE International Conference on Mobile Data Management (MDM). https://doi.org/10.1109/MDM52706.2021.00048
Uber (2016) Uber. 2016. Deck.Gl: WebGL2 Powered Geospatial Visualization Layers. https://deck.gl
Upson et al. (1989) C. Upson, T.A. Faulhaber, D. Kamins, D. Laidlaw, D. Schlegel, J. Vroom, R. Gurwitz, and A. Van Dam. 1989. The Application Visualization System: A Computational Environment for Scientific Visualization. IEEE Computer Graphics and Applications 9 (1989). https://doi.org/10.1109/38.31462
Van Der Donckt et al. (2022) Jonas Van Der Donckt, Jeroen Van der Donckt, Emiel Deprost, and Sofie Van Hoecke. 2022. Plotly-Resampler: Effective Visual Analytics for Large Time Series. In 2022 IEEE Visualization and Visual Analytics (VIS). https://doi.org/10.1109/VIS54862.2022.00013
VanderPlas et al. (2018) Jacob VanderPlas, Brian Granger, Jeffrey Heer, Dominik Moritz, Kanit Wongsuphasawat, Arvind Satyanarayan, Eitan Lees, Ilia Timofeev, Ben Welsh, and Scott Sievert. 2018. Altair: Interactive Statistical Visualizations for Python. Journal of Open Source Software 3 (2018). https://doi.org/10.21105/joss.01057
Venkatachalapathi (2020) Sidheswar Venkatachalapathi. 2020. Quick-EDA: Simple & Easy-to-use Python Modules to Perform Quick Exploratory Data Analysis for Any Structured Dataset. https://github.com/sid-the-coder/QuickDA
Verano Merino et al. (2020) Mauricio Verano Merino, Jurgen Vinju, and Tijs van der Storm. 2020. Bacatá: Notebooks for DSLs, Almost for Free. The Art, Science, and Engineering of Programming 4 (2020). https://doi.org/10.22152/programming-journal.org/2020/4/11
Vig (2019) Jesse Vig. 2019. A Multiscale Visualization of Attention in the Transformer Model. In ACL: System Demonstrations. https://doi.org/10.18653/v1/P19-3007
Vizzu (2022) Vizzu. 2022. Ipyvizzu: Build Animated Charts in Jupyter Notebook and Similar Environments with a Simple Python Syntax. Vizzu. https://github.com/vizzuhq/ipyvizzu
Voxel51 (2020) Voxel51. 2020. Fiftyone: Building High-Quality Datasets and Computer Vision Models. Voxel51. https://github.com/voxel51/fiftyone
Wang et al. (2019) Changhan Wang, Anirudh Jain, Danlu Chen, and Jiatao Gu. 2019. VizSeq: A Visual Analysis Toolkit for Text Generation Tasks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations. https://doi.org/10.18653/v1/D19-3043
Wang et al. (2023a) Fengjie Wang, Xuye Liu, Oujing Liu, Ali Neshati, Tengfei Ma, Min Zhu, and Jian Zhao. 2023a. Slide4N: Creating Presentation Slides from Computational Notebooks with Human-AI Collaboration. In CHI. https://doi.org/10.1145/3544548.3580753
Wang et al. (2023b) Qiaosi Wang, Michael Madaio, Shaun Kane, Shivani Kapania, Michael Terry, and Lauren Wilcox. 2023b. Designing Responsible AI: Adaptations of UX Practice to Meet Responsible AI Challenges. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3544548.3581278
Wang et al. (2021) Yihan Wang, Yutong Shao, and Ndapa Nakashole. 2021. Interactive Plot Manipulation Using Natural Language. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations. https://doi.org/10.18653/v1/2021.naacl-demos.11
Wang et al. (2022a) Zijie J. Wang, Katie Dai, and W. Keith Edwards. 2022a. StickyLand: Breaking the Linear Presentation of Computational Notebooks. CHI EA (2022). https://doi.org/10.1145/3491101.3519653
Wang et al. (2022b) Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, and Rich Caruana. 2022b. Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values. In KDD. https://doi.org/10.1145/3534678.3539074
Wang et al. (2022c) Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, and Rich Caruana. 2022c. Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’22). https://doi.org/10.1145/3534678.3539074
Wang et al. (2024) Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, and Michael Madaio. 2024. Farsight: Fostering Responsible AI Awareness During AI Application Prototyping. In CHI Conference on Human Factors in Computing Systems.
Wang et al. (2022d) Zijie J. Wang, David Munechika, Seongmin Lee, and Duen Horng Chau. 2022d. NOVA: A Practical Method for Creating Notebook-Ready Visual Analytics. arXiv (2022). http://arxiv.org/abs/2205.03963
Wang et al. (2022e) Zijie J. Wang, Chudi Zhong, Rui Xin, Takuya Takagi, Zhi Chen, Duen Horng Chau, Cynthia Rudin, and Margo Seltzer. 2022e. TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization. In VIS. https://doi.org/10.1109/VIS54862.2022.00021
Warmerdam et al. (2020) Vincent Warmerdam, Thomas Kober, and Rachael Tatman. 2020. Going beyond T-SNE: Exposing Whatlies in Text Embeddings. In Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS). https://doi.org/10.18653/v1/2020.nlposs-1.8
Weights and Biases (2021) Weights and Biases. 2021. Weights & Biases: A Tool for Visualizing and Tracking Your Machine Learning Experiments. Weights & Biases. https://github.com/wandb/wandb
Weinman et al. (2021) Nathaniel Weinman, Steven M. Drucker, Titus Barik, and Robert DeLine. 2021. Fork It: Supporting Stateful Alternatives in Computational Notebooks. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3411764.3445527
Wexler et al. (2019) James Wexler, Mahima Pushkarna, Tolga Bolukbasi, Martin Wattenberg, Fernanda Viegas, and Jimbo Wilson. 2019. The What-If Tool: Interactive Probing of Machine Learning Models. TVCG 26 (2019). https://doi.org/10.1109/TVCG.2019.2934619
Williams et al. (2019) Katy Williams, Alex Bigelow, and Kate Isaacs. 2019. Visualizing a Moving Target: A Design Study on Task Parallel Programs in the Presence of Evolving Data and Concerns. IEEE Transactions on Visualization and Computer Graphics (2019). https://doi.org/10.1109/TVCG.2019.2934285
Wouts (2019) Marc Wouts. 2019. Itables: Pandas DataFrames as Interactive DataTables. https://github.com/mwouts/itables
Wu et al. (2022) Aoyu Wu, Dazhen Deng, Furui Cheng, Yingcai Wu, Shixia Liu, and Huamin Qu. 2022. In Defence of Visual Analytics Systems: Replies to Critics. IEEE Transactions on Visualization and Computer Graphics (2022). https://doi.org/10.1109/TVCG.2022.3209360
Wu et al. (2019) Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, and Daniel Weld. 2019. Errudite: Scalable, Reproducible, and Testable Error Analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1073
Wu et al. (2020) Yifan Wu, Joseph M. Hellerstein, and Arvind Satyanarayan. 2020. B2: Bridging Code and Interactive Visualization in Computational Notebooks. In UIST. https://doi.org/10.1145/3379337.3415851
Xenopoulos et al. (2023) Peter Xenopoulos, Joao Rulff, Luis Gustavo Nonato, Brian Barr, and Claudio Silva. 2023. Calibrate: Interactive Analysis of Probabilistic Model Output. TVCG 29 (2023). https://doi.org/10.1109/TVCG.2022.3209489
Yip et al. (2021) Carmen Yip, Jie Mi Chong, Sin Yee Kwek, Yong Wang, and Kotaro Hara. 2021. Visionary Caption: Improving the Accessibility of Presentation Slides Through Highlighting Visualization. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. https://doi.org/10.1145/3441852.3476539
Yu et al. (2017) W. Yu, M. Carrasco Kind, and R.J. Brunner. 2017. Vizic: A Jupyter-based Interactive Visualization Tool for Astronomical Catalogs. Astronomy and Computing 20 (2017). https://doi.org/10.1016/j.ascom.2017.06.004
Zhang et al. (2023a) Ashley Zhang, Yan Chen, and Steve Oney. 2023a. VizProg: Identifying Misunderstandings By Visualizing Students’ Coding Progress. In CHI. https://doi.org/10.1145/3544548.3581516
Zhang et al. (2020) Amy X. Zhang, Michael Muller, and Dakuo Wang. 2020. How Do Data Science Workers Collaborate? Roles, Workflows, and Tools. Proceedings of the ACM on Human-Computer Interaction 4 (2020). https://doi.org/10.1145/3392826
Zhang et al. (2023b) Dan Zhang, Hannah Kim, Rafael Li Chen, Eser Kandogan, and Estevam Hruschka. 2023b. MEGAnno: Exploratory Labeling for NLP in Computational Notebooks. arXiv 2301.03095 (2023). http://arxiv.org/abs/2301.03095
Zhao et al. (2022) Zhiming Zhao, Spiros Koulouzis, Riccardo Bianchi, Siamak Farshidi, Zeshun Shi, Ruyue Xin, Yuandou Wang, Na Li, Yifang Shi, Joris Timmermans, and W. Daniel Kissling. 2022. Notebook-as-a-VRE (NaaVRE): From Private Notebooks to a Collaborative Cloud Virtual Research Environment. Software: Practice and Experience 52 (2022). https://doi.org/10.1002/spe.3098

Appendix A Characterizing Notebook Visualization Tools

Below we characterize 163 collected notebook visualization tools using our organizational framework (targeted users and a four-dimensional design space described in § 3), as well as their supported notebook platforms and implementation methods. See SuperNOVA for an interactive version with more details about each entry.

Table 1. This table describes the characterization of 163 notebook visualization tools using our organizational framework. The columns include notebook visualization tools’ names; intended users (data scientists, scientists, educators and students); visualization-notebook communication styles (no direct communication, one-way, bidirectional); data source (runtime, text and code, external); display style (on-demand, always-on); modularity (monolithic, modular); their supported notebook platforms (Jupyter Notebook only, JupyterLab only, Jupyter Notebook + JupyterLab, all popular platforms); and their implementation methods (NOVA, HTML display, ipywidget, Lab Extension, custom servers).

Appendix B Data Collection Details

To study how researchers and practitioners design interactive visualization tools for computational notebooks, we collected and analyzed 64 academic papers and 105 systems in the wild. We define notebook visualization tools as systems that can display interactive visualizations in Python computational notebooks.

Literature Collection. We searched Google Scholar for notebook visualization tools and performed forward and backward reference searches to snowball the results. The venues of collected papers range from scientific journals (e.g., Bioinformatics and Frontiers in Neuroinformatics) to human-computer interaction and machine learning conferences (e.g., VIS, CHI, and NeurIPS).

Visualization Package Collection. We first scraped 8.6 million notebooks with the .ipynb extension from GitHub. Each notebook file is a JSON file containing metadata about the notebook and the notebook cells. The notebook cells contain information about the cell type, the source code or text of the cell, and any output generated by the cell. We pruned the scraped notebooks to only those containing interactive components by searching for script tags in cell outputs of type text/html. If a cell was deemed a potential candidate, we extracted the associated source code for that cell. Next, for each candidate interactive notebook, we identified all modules in the notebook by parsing it as an abstract syntax tree and looking for import statements. Finally, we split the last line of the source code for the candidate cell into its individual variable components and checked if these matched any of the imported modules or their aliases.

Automating this procedure across 8.6 million notebooks, we built a comprehensive list of 984 Python packages that were potential visualization tools. Since this list of packages contained false positives (not all identified packages were interactive visualization tools), we manually examined each package to verify if it was an interactive visualization tool by looking at the source code and documentation for the package and its usage in notebooks. In total, we confirmed that 105 of these packages were interactive visualization tools.
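The candidate-detection steps described above can be sketched as follows. This is a minimal illustration and not the scripts used in the study: it assumes notebooks in the nbformat v4 JSON layout, and the helper names (`has_script_output`, `imported_names`, `candidate_packages`) and the handling of magic commands are our own assumptions.

```python
import ast
import json
import re


def cell_source(cell):
    """Return a cell's source as one string (nbformat allows str or list of str)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def has_script_output(cell):
    """True if any text/html output of the cell contains a <script> tag."""
    for output in cell.get("outputs", []):
        html = output.get("data", {}).get("text/html", "")
        if isinstance(html, list):
            html = "".join(html)
        if "<script" in html.lower():
            return True
    return False


def imported_names(code):
    """Map each imported name or alias to its top-level module."""
    # Assumption: drop IPython magics and shell escapes so ast.parse can succeed.
    lines = [l for l in code.splitlines() if not l.lstrip().startswith(("%", "!"))]
    try:
        tree = ast.parse("\n".join(lines))
    except SyntaxError:
        return {}
    names = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top = alias.name.split(".")[0]
                names[alias.asname or top] = top
        elif isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                names[alias.asname or alias.name] = node.module.split(".")[0]
    return names


def candidate_packages(notebook_text):
    """Return packages referenced on the last line of cells with script outputs."""
    notebook = json.loads(notebook_text)
    code_cells = [c for c in notebook.get("cells", []) if c.get("cell_type") == "code"]

    imports = {}
    for cell in code_cells:
        imports.update(imported_names(cell_source(cell)))

    found = set()
    for cell in code_cells:
        if not has_script_output(cell):
            continue
        lines = [l for l in cell_source(cell).splitlines() if l.strip()]
        if not lines:
            continue
        # Split the last line into identifiers and match them against imports.
        for token in re.findall(r"[A-Za-z_]\w*", lines[-1]):
            if token in imports:
                found.add(imports[token])
    return found
```

Because the match only inspects the last line of a candidate cell, a cell ending in `chart.show()` where `chart` was built earlier would be missed, and unrelated packages can be picked up. This is consistent with the need for the manual verification pass.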

Appendix C Implementation Details

Notebook visualization tools can be implemented with several methods of varying difficulty. The appropriate method depends on whether the tool needs a backend server, its style of visualization-notebook communication, the data types it needs, and its display style. Note that some methods are only compatible with specific notebook platforms (e.g., JupyterLab, Colab, VSCode, and Kaggle Notebook).

C.1. With Backend Servers

To implement notebook visualization tools that require a backend server, the developer needs to configure the server to support notebooks and establish callback functions to share states with the notebook. The server can either be run directly from the notebook environment or externally. The front-end of the tool can then be displayed in the notebook using the notebook’s native HTML display. It is important to separate the server from the main thread if it is run directly from the notebook to avoid blocking the Python kernel. For example, Jupyter-Dash (Parmer, 2020) and LIT (Tenney et al., 2020) use this method with a Flask backend server and a direct WSGI server, respectively.
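As a minimal sketch of running a server off the main thread, the example below starts a standard-library HTTP server in a daemon thread so that the Python kernel stays responsive. It is not how Jupyter-Dash or LIT are implemented; the handler, the page content, and the `start_background_server` helper are hypothetical and stand in for a real backend.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical front-end page served by the backend.
PAGE = b"<!doctype html><title>demo</title><p>Hello from the notebook server</p>"


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the same page for every path.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # silence per-request logging so cell output stays clean


def start_background_server(host="127.0.0.1", port=0):
    """Start the server in a daemon thread; port=0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, f"http://{host}:{server.server_address[1]}/"
```

In a notebook cell, one could then call `server, url = start_background_server()` and show the front-end with `IPython.display.IFrame(url, width="100%", height=300)`. Calling `server.shutdown()` stops the server. On hosted platforms the kernel's localhost may not be reachable from the browser, so this sketch assumes a locally running notebook.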

C.2. Without Backend Servers

If the tool does not require a server, the appropriate implementation method depends on the style of visualization-notebook communication.

C.2.1. No Direct Communication.

If a web-based visualization tool does not communicate with the notebook environment, the developer can simply use the notebook’s native HTML display to show the tool in a notebook cell. The HTML display internally uses an iframe to embed any web document.

C.2.2. One-way Communication.

To pass data from the notebook Python kernel to the visualization tool, one can use the Web standard’s postMessage method to send serialized Python objects as JSON text to the visualization tool’s iframe. See NOVA (Wang et al., 2022d) for more details and examples about this approach. Example tools include GAM Changer (Wang et al., 2022b) and TimberTrek (Wang et al., 2022e).
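The following sketch illustrates the general idea under simplifying assumptions; it is not NOVA’s implementation. A Python helper serializes an object to JSON and emits HTML that embeds a visualization page in an iframe and posts the data to it once the iframe has loaded. The page content, the `one_way_html` helper, and the element IDs are hypothetical.

```python
import html
import json

# Hypothetical visualization page: it listens for a message and shows the data.
VIS_PAGE = """<!doctype html>
<body>
<pre id="out">waiting for data...</pre>
<script>
  window.addEventListener("message", (event) => {
    document.getElementById("out").textContent = JSON.stringify(event.data, null, 2);
  });
</script>
</body>"""


def one_way_html(data, iframe_id="vis-frame", height=200):
    """Return HTML that embeds VIS_PAGE and sends `data` to it via postMessage."""
    # Serialize the Python object; escape "</" so the JSON cannot close the script tag.
    payload = json.dumps(data).replace("</", "<\\/")
    # Escape the page so it can be placed inside the srcdoc attribute.
    srcdoc = html.escape(VIS_PAGE, quote=True)
    return f"""<iframe id="{iframe_id}" srcdoc="{srcdoc}" width="100%" height="{height}"></iframe>
<script>
  (() => {{
    const frame = document.getElementById("{iframe_id}");
    const data = {payload};
    frame.addEventListener("load", () => frame.contentWindow.postMessage(data, "*"));
  }})();
</script>"""
```

In a notebook, the returned string could be rendered with `IPython.display.HTML(one_way_html({"scores": [1, 2, 3]}))`. The wildcard target origin `"*"` keeps the sketch short; a production tool would likely restrict it. Data flows only from the kernel to the visualization in this setup.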

Alternatively, developers can use existing interactive visualization packages such as Plotly (Sievert et al., 2017), Bokeh (Bokeh Development Team, 2014), Altair (VanderPlas et al., 2018), and Panel (Rudiger, 2021) as building blocks to implement their visualization tools. Then, the developer can use these packages’ APIs to pass data from notebooks to the visualization tools. However, this approach is less customizable, and it is best suited for simpler tools. Example tools include InterpretML (Nori et al., 2019) and Nilearn (Abraham et al., 2014).

C.2.3. Bidirectional Communication.

To send data back from the visualization tool to the Python kernel, the developer needs to use platform-specific solutions, which vary because notebook platforms have different security protocols. For Jupyter Notebook and JupyterLab, one can use ipywidget with the comm protocol to synchronize states between the visualization tool and the notebook. Example tools include Mage (Kery et al., 2020) and pydeck (Uber, 2016).

C.3. Access and Modify Code and Text

To access and modify notebook content outside of the Python kernel, such as raw code and text (§ 5.2), visualization tool developers need to use platform-specific APIs. For Jupyter notebooks, the developer can use Jupyter Notebook extension and JupyterLab extension APIs to read and write the notebook content. Visualization tools using this method include B2 (Wu et al., 2020) and Wrex (Drosos et al., 2020).

C.4. Always-on Display

If a developer intends to implement an always-on display (§ 5.3) for their notebook visualization tool, they can use platform-specific APIs. For JupyterLab, the developer can implement the tool as a JupyterLab extension, which enables the display on persistent panels outside of the notebook’s main UI. Examples of such implementations include NVDashboard (NVIDIA, 2021) and AutoProfiler (Epperson et al., 2023). If the visualization tool does not require extensive visualization customization, the developer can also use existing visualization packages that support persistent display (e.g., Jupyter-Dash (Parmer, 2020)) to implement the tool. Alternatively, the developer can develop their visualization tool using a traditional on-demand display and instruct users to use StickyLand (Wang et al., 2022a) to enable persistent display. StickyLand allows users to easily create persistent “sticky” cells and dashboards by dragging any notebook cell to the edge of the notebook’s UI (Fig. 11).

Notebook Vis Tool	User	Com.	Data	Disp.	Mod.	Plat.	Imp.			Notebook Vis Tool	User	Com.	Data	Disp.	Mod.	Plat.	Imp.
Aequitas (Saleiro et al., 2019)										Altair (VanderPlas et al., 2018)
Anteater (Faust et al., 2022)										Apache Beam (Apache, 2019)
Argo Lite (Li et al., 2020)										Atria (Williams et al., 2019)
AutoProfiler (Epperson et al., 2023)										AutoViz (AutoViML, 2020)
Ax (Facebook, 2019)										B2 (Wu et al., 2020)
Bacata (Verano Merino et al., 2020)										Bamboolib (Krabel, 2019)
BERTopic (Grootendorst, 2022)										BertViz (Vig, 2019)
Bokeh (Bokeh Development Team, 2014)										bqplot (Bqplot, 2016)
Brax (Google, 2021)										Calibrate (Xenopoulos et al., 2023)
Calling Context Tree (Scully-Allison et al., 2022)										CatBoost (Prokhorenkova et al., 2019)
CausalVis (Guo et al., 2023)										ChartPy (Cuemacro, 2016)
Clustergrammar (Fernandez et al., 2017)										Cyberhubs (Herwig et al., 2018)
Cytoscapejs (Franz et al., 2022)										CZML3 (Poliastro, 2019)
d3fdgraph (Mining, 2019)										Data+Shift (Palmeiro et al., 2022)
Data-Purifier (Gupta, 2021)										datapane (Datapane, 2023)
DataPrep (Peng et al., 2021)										DEA Tools (Krause et al., 2021)
DocML (Bhat et al., 2023)										EDAssistant (Li et al., 2023b)
ELI5 (Korobov, 2016)										Emblaze (Sivaraman et al., 2022)
Errudite (Wu et al., 2019)										Escher (King, 2016)
Evidently (AI, 2022)										Farsight (Wang et al., 2024)
FiftyOne (Voxel51, 2020)										Firefly (Gurvich and Geller, 2023)
Flexx (Klein, 2016)										Folium (Fernandes, 2019)
GAM Changer (Wang et al., 2022b)										GateNLP (Petrak, 2020)
GeoPandas (Jordahl et al., 2022)										GeoViews (Rudiger, 2016)
GILP (Robbins et al., 2023)										Graphistry (Graphistry, 2016)
gravis (Haas, 2021)										Gwaihir (Simonne et al., 2022)
HCIplot (Gonzalez, 2019)										HiPlot (Facebook, 2020)
igv (Robinson, 2022)										imolecule (Fuller, 2013a)
InkSight (Lin et al., 2023)										Intake (Durant, 2018)
Interpret-Community (Microsoft, 2019)										InterpretML (Nori et al., 2019)
ipyannotate (Kukushkin, 2018)										ipydatagrid (Bloomberg, 2019)
ipysheet (QuantStack, 2017)										IPython Vega (Satyanarayan et al., 2017)
ipytree (QuantStack, 2022)										ipyvizzu (Vizzu, 2022)
itables (Wouts, 2019)										itkwidgets (McCormick et al., 2022)
ivpy (Crockett, 2021)										jgraph (Fuller, 2013b)
Jigsaw (Jain et al., 2022)										Jupyter Dash (Parmer, 2020)
JupyterPiDAQ (Lab, 2020)										jyquickhelper (dupré, 2016)
Kaleidoscope (QuSTaR, 2019)										KeplerGL (Keplergl, 2019)
Keras (Chollet, 2015)										LightGBM (Ke et al., 2017)
Lightkurve (Lightkurve Collaboration et al., 2018)										LIT Tool (Tenney et al., 2020)
Lux (Lee et al., 2021)										mage (Kery et al., 2020)
Matplotlib (Hunter, 2007)										Mayavi (Enthought, 2015)
MEGAnno (Zhang et al., 2023b)										MLProvLab (Kerzel et al., 2023)
ModelSketchBook (Lam et al., 2023)										mols2grid (Bouysset, 2021)
Moving Pandas (Graser and Dragaschnig, 2020)										NaaVRE (Zhao et al., 2022)
nbinteract (Lau and Hug, 2018)										Nbtutor (Logan, 2023)
Nengo (Nengo, 2019)										neo4jupyter (Maeztu, 2016)
Networkit (Angriman et al., 2022)										NGLview (Nguyen et al., 2018)
Nilearn (Abraham et al., 2014)										NL4DV (Narechania et al., 2021)
NoLiES (Sohns et al., 2022)										NOMAD (Sbailò et al., 2022)
Notable (Li et al., 2023a)										NVDashboard (NVIDIA, 2021)
Open3d (Org, 2019)										Pandas-Bokeh (Hlobil, 2018)
PandasGUI (Rose, 2020)										Panel (Rudiger, 2021)
Pathpy (Hackl, 2019)										Perspective (Stein, 2022)
PI2 (Chen and Wu, 2022)										Pigeon (Germanidis, 2017)
PipelineProfiler (Ono et al., 2021)										PixieDust (PixieDust, 2016)
Plotly (Sievert et al., 2017)										Plotly-Resampler (Van Der Donckt et al., 2022)
Plotting Agent (Wang et al., 2021)										Py2cytoscape (Boucas, 2015)
py3Dmol (Autodesk, 2016)										PyCaret (Baum, 2020)
pydeck (Uber, 2016)										pydgrid (Mauricio, 2017)
PyGeoHydro (Chegini et al., 2021)										pyLDAvis (Sievert and Shirley, 2014)
PyMove (Sampaio, 2018)										PyPathway (PyPathway, 2022)
PyPotree (Borelli, 2019)										Pytket (Sivarajah et al., 2020)
Pyvis (Perrone et al., 2020)										PyZX (Kissinger and van de Wetering, 2020)
Qgrid (Shawver, 2017)										Quantstats (Aroussi, 2019)
Quick-EDA (Venkatachalapathi, 2020)										RAI Widgets (Microsoft, 2020)
SHAP (Lundberg and Lee, 2017)										Slide4N (Wang et al., 2023a)
Smoothy (Araya et al., 2018)										Solas (Epperson et al., 2022)
Spatialtis (AaltoGIS, 2020)										ST-VISIONS (Tritsarolis et al., 2021)
StatCast Dashboard (Lage et al., 2016)										SweetViz (Bertrand, 2020)
Symphony (Bäuerle et al., 2022)										Taggle (Furmanova et al., 2020)
TensorBoard (Abadi et al., 2016)										TF Model Analysis (Google, 2018)
TimberTrek (Wang et al., 2022e)										TissUUmaps (Pielawski et al., 2022)
Toyplot (Laboratories, 2022)										Trimesh (Dawson-Haggerty et al., 2019)
ULCA (Fujiwara et al., 2022)										VAEX (Breddels, 2016)
VAINE (Guo et al., 2021)										visJS2jupyter (Rosenthal et al., 2018)
Visual Auditor (Munechika et al., 2022)										Vizic (Yu et al., 2017)
VizProg (Zhang et al., 2023a)										VizSeq (Wang et al., 2019)
VizSmith (Bavishi et al., 2021)										Wandb (Weights and Biases, 2021)
Weedle (Kwon et al., 2023)										What-if Tool (Wexler et al., 2019)
Whatlies (Warmerdam et al., 2020)										Wrex (Drosos et al., 2020)
ydata-profiling (Brugman, 2019)