Computational approaches to understanding stylistic variation in online writing
Language use in online interactions varies from community to community, from individual to individual, and even for the same individual across contexts. While prior work has identified these differences, far less is understood about why they have arisen in online writing. My dissertation focuses on this question of why. The reasons for linguistic diversity in online writing could be manifold. As more and more interpersonal interactions are conducted through technology-mediated channels, there is an increasing need to express multiple social meanings in varied social situations through linguistic means. In the absence of non-verbal cues, technology-mediated channels provide several affordances for conducting interpersonal interactions. How do factors that are unique to online writing, such as the need to convey varied social meanings and the affordances of technology-mediated channels, shape online writing? My dissertation investigates this interplay through a series of large-scale computational studies of linguistic style variation in online writing. Using unsupervised methods and causal statistical analysis, I have investigated the social meaning of varied non-standard language usage in social media and the effects of new technological affordances in online social platforms on individuals' writing style. To quantitatively study community-level stylistic variation at scale, I have developed a multi-dimensional style lexicon using unsupervised techniques and used it to study style-shifting in online multi-community environments. Further, I have investigated how the enforcement of writing style norms on online platforms affects stylistic variation in online writing. My dissertation will advance our understanding of how individuals utilize the affordances of online social platforms and shift style to achieve varied social goals in online interpersonal interactions.
Understanding the social dimensions of linguistic style variation in online writing has important consequences for the design of language technology and social computing systems, and beyond.