Automatic Discrimination between Biomedical-Engineering and Clinical-Medicine Papers Based on Decision-Tree Algorithms: How Does the Term Usage Differ?

Authors: Motoki Sakai
DIN: IJOER-JUL-2016-2
Abstract

Biomedical engineering (BM) is a successful example of integrated research. This research area is concerned with solving problems in clinical-medicine (CM) research using techniques such as information engineering. Novice investigators in this field sometimes have difficulty searching for and retrieving BM papers: because BM and CM research papers share common terms, such as disease names, a search on such a term returns both kinds of paper and cannot easily be narrowed to BM papers alone. This research therefore proposes a decision-tree and random-forest-based method to automatically discriminate between BM and CM papers, and reveals a difference in term usage between the two. The discrimination was examined by collecting papers containing five common terms: obstructive sleep apnea syndrome (OSAS), T-wave alternans (TWA), late potential (LP), epilepsy (EPY), and event-related potential (ERP). The gathered BM and CM papers were converted into document-term (D-T) matrices and discriminated with the decision-tree or random-forest algorithm. The decision tree discriminated them with approximately 80% average accuracy and sensitivity and approximately 70% specificity, and the random forest discriminated them with approximately 90% average accuracy, sensitivity, and specificity. In addition, the terms “signal”, “detection”, “method”, “based”, “patient”, and “with” were found to be effective for discriminating between BM and CM papers.

Keywords
Biomedical engineering, clinical medicine, discrimination, decision tree, random forest
Introduction

Sometimes, researchers in segmented academic fields have difficulty finding new ideas or solutions only from their field. A research field integrating various academic disciplines holds the possibility of discovering new academic insights or offering a new effective solution for a certain problem. Collaborative research among different academic fields has broadened researchers’ horizons, connected different logics or techniques, and overcome limitations in a research field.

Cooperation between clinical-medicine (CM) and engineering research areas is a typical successful example of integrated research; this collaborative research field is called biomedical engineering (BM). (To be exact, BM research is also a type of segmented academic field and is an entrenched research area. However, this is actually nothing more than collaborative research between CM and engineering study areas.) BM is a research field that resolves problems in the CM research area using information technology, mechanical engineering, or electrical and electronic engineering techniques. (BM studies relate extensively to several engineering-research areas, but this study focuses only on the information-engineering research field.) Concretely, BM researchers attempt to automate medical diagnoses or extract information about an imperceptible phenomenon using different signal-processing techniques. In general, BM researchers first collect information on the background of a certain CM research area because they have problems in common with CM researchers. Next, they survey and read previous BM research papers related to the CM research to crystallize their study at an early stage. 

In practice, novice BM researchers, such as undergraduate students, have trouble surveying related BM research papers, because the BM area is relevant to both information engineering and CM studies, and BM and CM studies share common academic terms. Take the term "arrhythmia" as an example. A CM paper may present a case report on arrhythmia patients, whereas a BM paper may propose an effective signal-processing algorithm to detect an arrhythmia episode from the electrocardiogram (ECG) signal. Thus, a common academic term appears in different contexts in BM and CM journal papers.

Conclusion

The goal of this study was to discriminate between BM and CM journal papers. The decision-tree and random-forest algorithms were adopted for this purpose. The results showed that BM and CM papers were discriminated with approximately 80% to 90% accuracy. In addition, the proposed methods revealed terms that are effective for discriminating between BM and CM papers, e.g., “signal”, “detection”, “patient”, and “method”.
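The pipeline summarized here (papers converted into a document-term matrix, a tree-based classifier, then accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. It is not the author's implementation: the toy titles are invented for illustration, the classifier is only a depth-one decision tree (a single split on one term's presence, chosen by Gini impurity) rather than a full decision tree or random forest, BM is assumed to be the positive class, and the metrics are computed on the same toy data used for fitting rather than on held-out papers. All function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy titles, invented for illustration only (1 = BM, 0 = CM).
DOCS = [
    ("signal detection method for sleep apnea based on ecg", 1),
    ("wavelet based detection of t-wave alternans signal", 1),
    ("automatic epilepsy detection method from eeg signal", 1),
    ("outcome of patient with sleep apnea", 0),
    ("case report of patient with epilepsy", 0),
    ("late potential in patient with myocardial infarction", 0),
]


def build_dt_matrix(texts):
    """Return (vocab, matrix); matrix[i][j] counts term vocab[j] in document i."""
    vocab = sorted({word for text in texts for word in text.split()})
    matrix = []
    for text in texts:
        counts = Counter(text.split())
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Most common label in a non-empty list (ties go to class 1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_stump(matrix, labels):
    """Fit a depth-one tree: split on the presence of the single best term."""
    best = None
    for j in range(len(matrix[0])):
        present = [y for row, y in zip(matrix, labels) if row[j] > 0]
        absent = [y for row, y in zip(matrix, labels) if row[j] == 0]
        if not present or not absent:
            continue  # this term does not split the documents
        score = (len(present) * gini(present)
                 + len(absent) * gini(absent)) / len(labels)
        if best is None or score < best["score"]:
            best = {"score": score, "term": j,
                    "if_present": majority(present),
                    "if_absent": majority(absent)}
    return best


def predict(stump, row):
    """Classify one document-term row with the fitted stump."""
    return stump["if_present"] if row[stump["term"]] > 0 else stump["if_absent"]


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, specificity; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {"accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}


texts = [text for text, _ in DOCS]
labels = [label for _, label in DOCS]
vocab, matrix = build_dt_matrix(texts)
stump = fit_stump(matrix, labels)
predictions = [predict(stump, row) for row in matrix]
metrics = evaluate(labels, predictions)
print("split term:", vocab[stump["term"]])
print(metrics)
```

On real data, the split term chosen at the root of a tree is one way such a method can surface discriminating terms. A realistic evaluation would need a held-out test set and a full tree or forest implementation.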
