Cited by Lee Sonogan
Abstract by Kohei Watanabe
Many social scientists recognize that quantitative text analysis is a useful research methodology, but its application is still concentrated in documents written in European languages, especially English, and few sub-fields of political science, such as comparative politics and legislative studies. This seems to be due to the absence of flexible and cost-efficient methods that can be used to analyze documents in different domains and languages. Aiming to solve this problem, this paper proposes a semisupervised document scaling technique, called Latent Semantic Scaling (LSS), which can locate documents on various pre-defined dimensions. LSS achieves this by combining user-provided seed words and latent semantic analysis (word embedding). The article demonstrates its flexibility and efficiency in large-scale sentiment analysis of New York Times articles on the economy and Asahi Shimbun articles on politics. These examples show that LSS can produce results comparable to that of the Lexicoder Sentiment Dictionary (LSD) in both English and Japanese with only small sets of sentiment seed words. A new heuristic method that assists LSS users to choose a near-optimal number of singular values to obtain word vectors that best capture differences between documents on target dimensions is also presented.
Publication: Communication Methods and Measures (Peer-Reviewed Journal)
Pub Date: 1 Nov 2021
DOI: https://doi.org/10.1080/19312458.2020.1832976
Keywords: Latent Semantic Scaling, Text Analysis, Domains and Languages
https://www.tandfonline.com/doi/full/10.1080/19312458.2020.1832976 (Plenty more sections and references in this research article)
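
To make the abstract's core idea more concrete, here is a minimal Python sketch of how seed words and latent semantic analysis might be combined to scale documents. It is an illustration under stated assumptions, not the author's implementation: the function names, the toy corpus, the seed words and the choice of k are all hypothetical, and details such as sentence-level segmentation, term weighting and the heuristic for choosing the number of singular values are left out. Word vectors come from a truncated SVD of a document-feature matrix, each word's polarity is the seed-weighted mean of its cosine similarity to the seed words, and each document's score is the frequency-weighted mean of its word polarities.

```python
import numpy as np


def lss_word_polarity(dfm, vocab, seeds, k=2):
    """Polarity score for every vocabulary word (illustrative sketch only).

    dfm   : (n_docs, n_words) matrix of word counts
    vocab : list of words, one per column of dfm
    seeds : dict mapping seed word -> +1 or -1 (hypothetical seed set)
    k     : number of singular values to keep (assumed, not tuned)
    """
    # Truncated SVD: the columns of vt[:k] give each word a k-dimensional vector.
    _, _, vt = np.linalg.svd(dfm.astype(float), full_matrices=False)
    word_vecs = vt[:k].T

    # Normalise rows so that dot products become cosine similarities.
    norms = np.linalg.norm(word_vecs, axis=1, keepdims=True)
    unit = word_vecs / np.where(norms == 0, 1.0, norms)

    index = {word: i for i, word in enumerate(vocab)}
    seed_idx = [index[word] for word in seeds]
    seed_pol = np.array([seeds[word] for word in seeds], dtype=float)

    # Mean cosine similarity to the seed words, signed by seed polarity.
    sims = unit @ unit[seed_idx].T  # shape: (n_words, n_seeds)
    return sims @ seed_pol / len(seed_idx)


def lss_document_scores(dfm, polarity):
    """Frequency-weighted mean of word polarities for each document."""
    totals = dfm.sum(axis=1)
    return (dfm @ polarity) / np.where(totals == 0, 1, totals)


# Toy corpus (made up for illustration): three positive and three negative texts.
vocab = ["good", "strong", "growth", "bad", "weak", "recession"]
dfm = np.array([
    [1, 1, 1, 0, 0, 0],  # "good strong growth"
    [1, 0, 1, 0, 0, 0],  # "good growth"
    [0, 1, 1, 0, 0, 0],  # "strong growth"
    [0, 0, 0, 1, 1, 1],  # "bad weak recession"
    [0, 0, 0, 1, 0, 1],  # "bad recession"
    [0, 0, 0, 0, 1, 1],  # "weak recession"
])
seeds = {"good": 1, "bad": -1}

polarity = lss_word_polarity(dfm, vocab, seeds, k=2)
scores = lss_document_scores(dfm, polarity)
```

In this toy case the non-seed words "growth" and "recession" pick up positive and negative polarity purely from co-occurring with the seed words, which is the property that lets a small seed set stand in for a full sentiment dictionary. A real analysis would need a large corpus and a far larger k than this example uses.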