Kosan, Muhammed Ali; Karacan, Hacer; Ürgen, Burcu Ayşen
2024-03-15
2024-03-15
2023-04-23
0941-0643
https://hdl.handle.net/11693/114786

Users' personality traits can provide different clues about them in the Internet environment. Some areas where these clues can be used are law enforcement, advertising agencies, recruitment processes, and e-commerce applications. In this study, it is aimed to create a dataset and a prediction model for predicting the personality traits of Internet users who produce Turkish content. The main contribution of the study is the personality traits dataset composed of the Turkish Twitter content. In addition, the preprocessing, vectorization, and deep learning model comparisons made in the proposed prediction system will contribute to both current usages and future studies in the relevant literature. It has been observed that the success of the Bidirectional Encoder Representations from Transformers vectorization method and the Stemming preprocessing step on the Turkish personality traits dataset is high. In the previous studies, the effects of these processes on English datasets were reported to have lower success rates. In addition, the results show that the Bidirectional Long Short-Term Memory deep learning method has a better level of success than other methods both for the Turkish dataset and English datasets. © 2023, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.

en
CC BY 4.0 Deed (Attribution 4.0 International)
https://creativecommons.org/licenses/by/4.0/
Personality dataset; Personality prediction model; Preprocessing; Turkish Twitter content
Personality traits prediction model from Turkish contents with semantic structures
Article
10.1007/s00521-023-08603-z
1433-3058