-
Deep Learning for Technical Document Classification
Authors:
Shuo Jiang,
Jie Hu,
Christopher L. Magee,
Jianxi Luo
Abstract:
In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have only focused on processing text for classification, whereas technical documents often contain multimodal information. To leverage multimodal information and improve classification performance, this paper presents a novel multimodal deep learning architecture, TechDoc, which utilizes three types of information: natural language texts and descriptive images within documents, and the associations among the documents. The architecture synthesizes the convolutional neural network, recurrent neural network, and graph neural network through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model for classifying documents based on the hierarchical International Patent Classification system. Our results show that TechDoc achieves greater classification accuracy than the unimodal methods and other state-of-the-art benchmarks. The trained model can potentially be scaled to millions of real-world multimodal technical documents, which is useful for data and knowledge management in large technology companies and organizations.
Submitted 19 February, 2022; v1 submitted 27 June, 2021;
originally announced June 2021.
-
Technological improvement rate estimates for all technologies: Use of patent data and an extended domain description
Authors:
Anuraag Singh,
Giorgio Triulzi,
Christopher L. Magee
Abstract:
In this work, we attempt to provide a comprehensive granular account of the pace of technological change. More specifically, we survey estimated yearly performance improvement rates for nearly all definable technologies for the first time. We do this by creating a correspondence of all patents within the US patent system to a set of technology domains. A technology domain is a body of patented inventions achieving the same technological function using the same knowledge and scientific principles. We obtain a set of 1757 domains using an extension of the previously defined classification overlap method (COM). These domains contain 97.14% of all patents within the entire US patent system. From the identified patent sets, we calculated the average centrality of the patents in each domain to estimate their improvement rates, following a methodology tested in prior work. The estimated improvement rates vary from a low of 1.9% per year for the Mechanical Skin treatment - Hair Removal and wrinkles domain to a high of 228.8% per year for the Network management - client-server applications domain. We developed a one-line descriptor identifying the technological function achieved and the underlying knowledge base for the largest 50, fastest 20 as well as slowest 20 of these domains, which cover more than forty percent of the patent system. In general, the rates of improvement were not a strong function of the patent set size and the fastest improving domains are predominantly software-based. We make available an online system that allows for automated searching for domains and improvement rates corresponding to any technology of interest to researchers, strategists and policy formulators.
Submitted 28 April, 2020;
originally announced April 2020.
-
A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation
Authors:
Shuo Jiang,
Jianxi Luo,
Guillermo Ruiz Pava,
Jie Hu,
Christopher L. Magee
Abstract:
The patent database is often used in searches of inspirational stimuli for innovative design opportunities because of its large size, extensive variety and rich design information in patent documents. However, most patent mining research only focuses on textual information and ignores visual information. Herein, we propose a convolutional neural network (CNN)-based patent image retrieval method. The core of this approach is a novel neural network architecture named Dual-VGG that aims to accomplish two tasks: visual material type prediction and international patent classification (IPC) class label prediction. In turn, the trained neural network provides the deep features in the image embedding vectors that can be utilized for patent image retrieval and visual mapping. The accuracy of both training tasks and the patent image embedding space are evaluated to show the performance of our model. This approach is also illustrated in a case study of robot arm design retrieval. Compared to traditional keyword-based searching and Google image searching, the proposed method discovers more useful visual information for engineering design.
Submitted 19 May, 2020; v1 submitted 10 March, 2020;
originally announced March 2020.
-
Forecasting the value of battery electric vehicles compared to internal combustion engine vehicles: the influence of driving range and battery technology
Authors:
JongRoul Woo,
Christopher L. Magee
Abstract:
Battery electric vehicles (BEVs) are now clearly a promising candidate in addressing the environmental problems associated with conventional internal combustion engine vehicles (ICEVs). However, BEVs, unlike ICEVs, are still not widely accepted in the automobile market, but continuing technological change could overcome this barrier. The aim of this study is to assess and forecast whether and when design changes and technological improvements related to major challenges in driving range and battery cost will make the user value of BEVs greater than the user value of ICEVs. Specifically, we estimate the relative user value of BEVs and ICEVs resulting after design modifications to achieve different driving ranges by considering the engineering trade-offs based on a vehicle simulation. Then, we analyze when the relative user value of BEVs is expected to exceed that of ICEVs as the energy density and cost of batteries improve because of ongoing technological change. Our analysis demonstrates that the relative value of BEVs is lower than that of ICEVs because BEVs have high battery cost and high cost of time spent recharging despite high torque, high fuel efficiency, and low fuel cost. Moreover, we found that the relative value differences between BEVs and ICEVs are smaller in high performance large cars than in low performance compact cars because BEVs can achieve high acceleration performance more easily than ICEVs. In addition, this study predicts that in approximately 2050, high performance large BEVs could have higher relative value than high performance large ICEVs because of technological improvements in batteries; however, low performance compact BEVs are still very likely to have significantly lower user value than comparable ICEVs until well beyond 2050.
Submitted 31 May, 2018;
originally announced June 2018.
-
Dynamic patterns of knowledge flows across technological domains: empirical results and link prediction
Authors:
Jieun Kim,
Christopher L. Magee
Abstract:
The purpose of this study is to investigate the structure and evolution of knowledge spillovers across technological domains. Specifically, dynamic patterns of knowledge flow among 29 technological domains, measured by patent citations for eight distinct periods, are identified, and link prediction is tested for its capability to forecast the evolution of these cross-domain patent networks. The overall success of the predictions using the Katz metric implies that there is a tendency to generate increased knowledge flows mostly within the set of previously linked technological domains. This study contributes to innovation studies by characterizing the structural change and evolutionary behaviors in dynamic technology networks and by offering the basis for predicting the emergence of future technological knowledge flows.
Submitted 21 June, 2017;
originally announced June 2017.
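As a rough illustration of the Katz metric mentioned in the abstract above, the sketch below scores node pairs in a small directed network by summing over all paths between them, with longer paths damped by a factor beta per step. The function names, the toy adjacency matrix, and the value of beta are assumptions made for illustration; they are not the study's data or settings.

```python
import numpy as np

def katz_scores(adjacency, beta=0.05):
    """Katz index: sum over path lengths l >= 1 of beta**l * A**l.

    Computed in closed form as (I - beta*A)^-1 - I, which is valid only
    when beta is smaller than 1 / spectral_radius(A).
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if beta * spectral_radius >= 1:
        raise ValueError("beta too large: the Katz series does not converge")
    identity = np.eye(n)
    return np.linalg.inv(identity - beta * A) - identity

def rank_missing_links(adjacency, beta=0.05):
    """Rank currently unlinked ordered pairs (i, j) by Katz score, highest first."""
    A = np.asarray(adjacency, dtype=float)
    S = katz_scores(A, beta)
    n = A.shape[0]
    candidates = [
        (S[i, j], i, j)
        for i in range(n)
        for j in range(n)
        if i != j and A[i, j] == 0
    ]
    return sorted(candidates, reverse=True)

# Hypothetical toy network: a citation chain 0 -> 1 -> 2 -> 3.
toy = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
ranking = rank_missing_links(toy, beta=0.5)
```

In this toy chain with beta = 0.5, the pair (0, 2) is connected only by a single two-step path and so scores 0.25, while (0, 3) is connected by a single three-step path and scores 0.125. Pairs that are already close in the network rank highest as candidate future links.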
-
Testing the science/technology relationship by analysis of patent citations of scientific papers after decomposition of both science and technology
Authors:
Fang Han,
Christopher L. Magee
Abstract:
The relationship of scientific knowledge development to technological development is widely recognized as one of the most important and complex aspects of technological evolution. This paper adds to our understanding of the relationship through use of a more rigorous structure for differentiating among technologies based upon technological domains (defined as consisting of the artifacts over time that fulfill a specific generic function using a specific body of technical knowledge).
Submitted 29 April, 2017;
originally announced May 2017.
-
Quantitative identification of technological discontinuities using simulation modeling
Authors:
Hyunseok Park,
Christopher L. Magee
Abstract:
The aim of this paper is to develop and test metrics to quantitatively identify technological discontinuities in a knowledge network. We developed five metrics based on innovation theories and tested the metrics on a simulation model-based knowledge network with a hypothetically designed discontinuity. The designed discontinuity is modeled as a node which combines two different knowledge streams and whose knowledge is dominantly persistent in the knowledge network. The performance of the proposed metrics was evaluated by how well the metrics can distinguish the designed discontinuity from other nodes on the knowledge network. The simulation results show that the persistence times number of converging main paths (Metric 5) provides the best performance in identifying the designed discontinuity: the designed discontinuity was identified as one of the top 3 patents with 96~99% probability by Metric 5, which is, depending on the size of the domain, 12~34% better than the performance of the second best metric. Beyond the simulation analysis, we tested the metrics using a patent set representative of the Magnetic information storage domain. The three representative patents associated with a well-known breakthrough technology in the domain, the giant magneto-resistance (GMR) spin valve sensor, were selected based on qualitative studies, and the metrics were tested by how well they identify the selected patents as top-ranked patents. The empirical results fully support the simulation results, and therefore the persistence times number of converging main paths is recommended for identifying technological discontinuities for any technology.
Submitted 13 September, 2016;
originally announced September 2016.
-
Tracing technological development trajectories: A genetic knowledge persistence-based main path approach
Authors:
Hyunseok Park,
Christopher L. Magee
Abstract:
The aim of this paper is to propose a new method to identify main paths in a technological domain using patent citations. Previous approaches for using main path analysis have greatly improved our understanding of actual technological trajectories but nonetheless have some limitations. They have high potential to miss some dominant patents from the identified main paths; moreover, the high network complexity of their main paths makes qualitative tracing of trajectories problematic. The proposed method searches backward and forward paths from the high-persistence patents, which are identified based on a standard genetic knowledge persistence algorithm. We tested the new method by applying it to the desalination and the solar photovoltaic domains and compared the results to output from the same domains using a prior method. The empirical results show that the proposed method overcomes the aforementioned drawbacks, defining main paths that are almost 10x less complex while containing more of the relevant important knowledge than the main path networks defined by the existing method.
Submitted 26 August, 2016;
originally announced August 2016.
-
Decomposition and Analysis of Technological domains for better understanding of Technological Structure
Authors:
Xin Guo,
Hyunseok Park,
Christopher L. Magee
Abstract:
Patents represent one of the most complete sources of information related to technological change. This paper presents three months of research on U.S. patents in the field of patent analysis. The methodology consists of using search terms to locate the most representative international and US patent classes, determining the overlap of those classes to arrive at the final set of patents, and using the prediction model developed by Benson and Magee to calculate the technological improvement rate for the technological domains. My research focused on the Biochemical Pharmacology technological area and on selecting relevant patents for technological domains and sub-domains within this area. The goal is to better understand the structure of technology domains and how fast the domains and their sub-domains progress. The method I used, developed by Benson and Magee and called the Classification Overlap Method, provides a reliable and largely automated way to break the patent database into understandable technological domains where progress can be measured.
Submitted 19 April, 2016;
originally announced April 2016.
-
Modeling of technological performance trends using design theory
Authors:
Subarna Basnet,
Christopher L. Magee
Abstract:
Functional technical performance usually follows an exponential dependence on time but the rate of change (the exponent) varies greatly among technological domains. This paper presents a simple model that provides an explanatory foundation for these phenomena based upon the inventive design process.
The model assumes that invention (novel and useful design) arises through probabilistic analogical transfers that combine existing knowledge by combining existing individual operational ideas to arrive at new individual operating ideas. The continuing production of individual operating ideas relies upon injection of new basic individual operating ideas that occurs through coupling of science and technology simulations.
The individual operational ideas that result from this process are then modeled as being assimilated in components of artifacts characteristic of a technological domain. According to the model, two effects (differences in interactions among components for different domains and differences in scaling laws for different domains) account for the differences found in improvement rates among domains, whereas the analogical transfer process is the source of the exponential behavior. The model is supported by a number of known empirical facts; further empirical research is suggested to independently assess further predictions made by the model.
Submitted 11 February, 2016;
originally announced February 2016.
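The exponential dependence of performance on time noted in the abstract above can be written as q(t) = q0 * exp(k * t), so the exponent k is the slope of a straight-line fit to log-performance against time. The sketch below only illustrates that relationship and is not the paper's design-theory model; the function name and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def annual_improvement_rate(years, performance):
    """Estimate the exponent k in q(t) = q0 * exp(k * t) by a log-linear fit.

    Returns (k, yearly_rate), where yearly_rate = exp(k) - 1 is the
    fractional improvement from one year to the next.
    """
    years = np.asarray(years, dtype=float)
    # Centre time on the first year to keep the least-squares fit well conditioned.
    t = years - years[0]
    log_q = np.log(np.asarray(performance, dtype=float))
    k, _intercept = np.polyfit(t, log_q, 1)
    return float(k), float(np.exp(k) - 1.0)

# Synthetic series (not real data): performance generated with k = 0.3 per year.
years = np.arange(2000, 2011)
performance = 2.0 * np.exp(0.3 * (years - 2000))
k, yearly_rate = annual_improvement_rate(years, performance)
```

Fitting in log space means each domain's improvement rate reduces to a single slope, which is what makes rates comparable across domains with very different performance units.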