Computer Science > Digital Libraries
[Submitted on 11 Mar 2021]
Title: Document Towers: A MATLAB software implementing a three-dimensional architectural paradigm for the visual exploration of digital documents and libraries
Abstract: This article introduces the generic Document Towers paradigm, visualization, and software for visualizing the structure of paginated documents, based on the metaphor of documents-as-architecture. The Document Towers visualizations resemble three-dimensional building models and represent the physical boundaries of logical (e.g., titles, images), semantic (e.g., topics, named entities), graphical (e.g., typefaces, colors), and other types of information with spatial extent as a stack of rooms and floors. The software takes as input user-supplied JSON-formatted coordinates and labels of document entities, or extracts them itself from ALTO and InDesign IDML files. The Document Towers paradigm and visualization enable information systems to support information behaviors other than goal-oriented searches. Visualization encourages exploration by generating panoramic overviews and fostering serendipitous insights, while the use of metaphors assists with comprehension of the representations through the application of a familiar cognitive model. Document Towers visualizations also provide access to types of information other than textual content, specifically by means of their physical structure, which corresponds to the material, logical, semantic, and contextual aspects of documents. Visualization renders documents transparent, making the invisible visible and facilitating analysis at a glance and without the need for physical manipulation. Keyword searches and other language-based interactions with documents must be clearly expressed and will return only answers to questions asked; by contrast, visual observation is well suited to fuzzy goals and uncovering unexpected aspects of the data.
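To make the "stack of rooms and floors" idea concrete, the following is a minimal Python sketch of how JSON-formatted entity coordinates and labels might be mapped to three-dimensional boxes, with each page as a floor and each entity as a room. The JSON schema (`page`, `label`, `x`, `y`, `width`, `height`), the `Room` type, the `build_tower` function, and the `FLOOR_HEIGHT` constant are all assumptions made for illustration; they are not the input format or the API of the Document Towers MATLAB software.

```python
import json
from dataclasses import dataclass

# Hypothetical input: one record per document entity. This schema is an
# assumption for illustration, not the format the actual software expects.
SAMPLE = """
[
  {"page": 1, "label": "title", "x": 10, "y": 10, "width": 80, "height": 12},
  {"page": 1, "label": "image", "x": 10, "y": 30, "width": 40, "height": 40},
  {"page": 2, "label": "paragraph", "x": 10, "y": 10, "width": 80, "height": 60}
]
"""

FLOOR_HEIGHT = 1.0  # assumed vertical extent of one page ("floor")


@dataclass(frozen=True)
class Room:
    label: str
    origin: tuple  # (x, y, z) of the box's lower corner
    size: tuple    # (width, depth, height) of the box


def build_tower(entities):
    """Map each page entity to a box; the page number selects the floor."""
    rooms = []
    for e in entities:
        z = (e["page"] - 1) * FLOOR_HEIGHT  # page 1 sits on the ground floor
        rooms.append(
            Room(
                label=e["label"],
                origin=(e["x"], e["y"], z),
                size=(e["width"], e["height"], FLOOR_HEIGHT),
            )
        )
    return rooms


tower = build_tower(json.loads(SAMPLE))
for room in tower:
    print(room)
```

The resulting boxes could then be passed to any 3D plotting routine; rendering is omitted here because the abstract does not describe how the software draws them.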