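LIU Yu-Hang, QU Yuan, JIANG Jia-Ming, ZONG Wan-Li, ZHU Xi-Jun. Prediction of unqualified index of food inspection based on optimized random forest algorithm[J]. Journal of Food Safety & Quality (食品安全质量检测学报), 2021, 12(18): 7467-7472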
Prediction of unqualified index of food inspection based on optimized random forest algorithm
Received: 2021-06-09  Revised: 2021-08-24
Keywords: food safety data; decision tree; random forest; parameter optimization; hyperparameter grid search
Funding: Shandong Province Industry-Education Integration Graduate Joint Training Demonstration Base Project (2020-19)
Author  Institution
LIU Yu-Hang  College of Information Science and Technology, Qingdao University of Science and Technology
QU Yuan  College of Information Science and Technology, Qingdao University of Science and Technology
JIANG Jia-Ming  College of Information Science and Technology, Qingdao University of Science and Technology
ZONG Wan-Li  Weihai Food and Drug Inspection and Testing Center
ZHU Xi-Jun  College of Information Science and Technology, Qingdao University of Science and Technology
Abstract:
      Objective To establish an optimized random forest model for the classification and prediction of unqualified indicators in food. Methods Unqualified-sample data generated by food safety sampling inspections from 2015 to 2019, as published on the official website of the Shandong Food and Drug Administration, were collected and put through several data preprocessing steps. Hyperparameter grid search and 10-fold cross-validation were then used to build a random forest classification model for predicting unqualified food indicators. The parameters of the conventional random forest model were optimized, and its classification results were compared with those of the decision tree (DT), logistic regression (LR), and gradient boosting decision tree (GBDT) algorithms. Results After parameter optimization, the random forest model reached a prediction accuracy of 89.4% for unqualified indicators in food, which was 11.0% higher than the DT algorithm, 9.0% higher than the LR algorithm, and 8.1% higher than the GBDT algorithm. Conclusion The optimized random forest model can perform the classification and prediction of unqualified food indicators and has broad application prospects.
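
The workflow described in the Methods (grid search over random forest hyperparameters with 10-fold cross-validation, followed by a comparison with DT, LR, and GBDT) can be sketched roughly as follows. This is only an illustration, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the inspection records, and the parameter grid, class count, and random seeds are hypothetical values chosen to keep the run short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the paper's inspection records are not available here.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=3, random_state=0
)

# 10-fold cross-validation, stratified so each fold keeps the class proportions.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Hypothetical search grid; the abstract does not list the tuned parameters or their ranges.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}

# Exhaustive grid search: every parameter combination is scored by 10-fold CV accuracy.
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=cv, scoring="accuracy"
)
search.fit(X, y)

# Baseline models named in the abstract, evaluated on the same folds with default settings.
baselines = {
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "GBDT": GradientBoostingClassifier(random_state=0),
}
baseline_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in baselines.items()
}

print("Best RF parameters:", search.best_params_)
print("Best RF mean CV accuracy: %.3f" % search.best_score_)
for name, score in baseline_scores.items():
    print("%s mean CV accuracy: %.3f" % (name, score))
```

Because the best random forest score is selected on the same folds used to report it, it is slightly optimistic relative to the untuned baselines; a held-out test set or nested cross-validation would give a fairer comparison. The accuracies printed for the synthetic data have no relation to the figures reported in the abstract.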