TY - JOUR
T1 - Can feature structure improve model’s precision? A novel prediction method using artificial image and image identification
AU - He, Yupeng
AU - Sun, Qiwen
AU - Matsunaga, Masaaki
AU - Ota, Atsuhiko
N1 - Publisher Copyright: © 2024 Oxford University Press. All rights reserved.
PY - 2024/4/1
Y1 - 2024/4/1
N2 - Objectives: This study aimed to develop an approach that enhances model precision using artificial images. Materials and Methods: Given an epidemiological study designed to predict 1 response using f features with M samples, each feature was converted into a pixel with a certain value. These pixels were permuted into F orders, resulting in F distinct artificial image sample sets. Experience with image recognition techniques suggests that appropriate training images result in a more precise model. In the preliminary experiment, a binary response was predicted from 76 features; the sample set included 223 patients and 1776 healthy controls. Results: We randomly selected 10 000 artificial sample sets to train the model. The models’ performance (area under the receiver operating characteristic curve values) followed a bell-shaped distribution. Conclusion: The model construction strategy developed in this research has the potential to capture feature-order-related information and enhance model predictability. Lay Summary: We aimed to demonstrate a novel method to investigate the effect of feature structure on model predictability with epidemiological data. The concept was inspired by image identification. Pixels in digital images are used as features when training the identification model. The quality of a given digital image is damaged when pixels’ positions and values are changed arbitrarily, which hampers model training and reduces the model’s precision. We assume that a similar structure-related relationship exists in epidemiological data. Given a dataset, features are transformed into pixel values to generate artificial images. To explore the effect of feature structure, the order of the pixels is randomly permuted and the model is trained on the pixel-permuted artificial image sample sets. In the preliminary experiment, one binary response was predicted from 76 features. We randomly selected 10 000 artificial image sample sets to train the model. The models’ performance (area under the receiver operating characteristic curve values) followed a bell-shaped distribution. That is, the predictive performance of each model was examined, and the feature structure information had a strong impact on model performance. Our novel model construction strategy has the potential to capture feature-order-related information and enhance model predictability.
AB - Objectives: This study aimed to develop an approach that enhances model precision using artificial images. Materials and Methods: Given an epidemiological study designed to predict 1 response using f features with M samples, each feature was converted into a pixel with a certain value. These pixels were permuted into F orders, resulting in F distinct artificial image sample sets. Experience with image recognition techniques suggests that appropriate training images result in a more precise model. In the preliminary experiment, a binary response was predicted from 76 features; the sample set included 223 patients and 1776 healthy controls. Results: We randomly selected 10 000 artificial sample sets to train the model. The models’ performance (area under the receiver operating characteristic curve values) followed a bell-shaped distribution. Conclusion: The model construction strategy developed in this research has the potential to capture feature-order-related information and enhance model predictability. Lay Summary: We aimed to demonstrate a novel method to investigate the effect of feature structure on model predictability with epidemiological data. The concept was inspired by image identification. Pixels in digital images are used as features when training the identification model. The quality of a given digital image is damaged when pixels’ positions and values are changed arbitrarily, which hampers model training and reduces the model’s precision. We assume that a similar structure-related relationship exists in epidemiological data. Given a dataset, features are transformed into pixel values to generate artificial images. To explore the effect of feature structure, the order of the pixels is randomly permuted and the model is trained on the pixel-permuted artificial image sample sets. In the preliminary experiment, one binary response was predicted from 76 features. We randomly selected 10 000 artificial image sample sets to train the model. The models’ performance (area under the receiver operating characteristic curve values) followed a bell-shaped distribution. That is, the predictive performance of each model was examined, and the feature structure information had a strong impact on model performance. Our novel model construction strategy has the potential to capture feature-order-related information and enhance model predictability.
KW - artificial image
KW - image identification
KW - machine learning
KW - neural network
KW - prediction model
UR - http://www.scopus.com/inward/record.url?scp=85184931288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184931288&partnerID=8YFLogxK
U2 - 10.1093/jamiaopen/ooae012
DO - 10.1093/jamiaopen/ooae012
M3 - Article
AN - SCOPUS:85184931288
SN - 2574-2531
VL - 7
JO - JAMIA Open
JF - JAMIA Open
IS - 1
M1 - 7
ER -