Advances in Economics, Management and Political Sciences
- The Open Access Proceedings Series for Conferences
Series Vol. 46, 01 December 2023
Real estate price prediction is a key research topic today. With the rapid development of big data, machine learning has gradually become the mainstream tool for housing price prediction. XGBoost and LightGBM, two advanced models introduced in recent years, have received widespread attention in this application. This study therefore examines house price prediction based on the XGBoost and LightGBM models and compares them with other models in order to analyze the advantages and disadvantages of the two models in housing price prediction. According to the analysis, both models offer high accuracy, high efficiency, and fast training speed. However, although XGBoost has the smallest prediction error, it requires more computation time, which increases computational cost. In addition, LightGBM carries a high risk of overfitting on small samples and is more sensitive to noisy datasets. Therefore, beyond the models studied in this article, feature selection methods such as Filter and Wrapper can be introduced in subsequent studies to further improve prediction accuracy.
house price prediction, LightGBM, XGBoost
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.