Linguistic Profiling and Language-Based Discrimination by John Baugh LAST REVIEWED: 12 January 2021 LAST MODIFIED: 12 January 2021 DOI: 10.1093/obo/9780199772810-0267

Linguistic profiling and other forms of linguistic discrimination were first attested in the Old Testament. The coinage of the word shibboleth traces its origin to the Book of Judges 12:6, where the inability to pronounce that word correctly would result in death: “They told him, ‘Please say shibboleth.’ If he said, ‘sibboleth,’ because he could not pronounce it correctly, they seized him and killed him at the fords of the Jordan.” Since then, other manifestations of human conflict and discrimination have frequently exhibited linguistic demarcation in one form or another, and these shibboleths evolve over time. Warring factions may eventually make peace, as old rivalries come to be displaced or resolved. Advances in technology exacerbated these trends, as more-rapid modes of transportation increased contact and conflict among speakers of mutually unintelligible languages, accompanied by the development of increasingly efficient deadly weaponry that coincided with global expansionism along with sporadic conquests and the ensuing oppression of human enclaves throughout the world. The advent of global markets and multinational immigration has further accelerated circumstances where diverse human factions may use linguistic (dis)similarities as one of several means through which individuals formulate perceptual boundaries between groups that are familiar or unfamiliar. When compared to the historical longevity of discrimination based on language, linguistic evaluations of this phenomenon are relatively recent.

Haugen 1972 is among the first of many works by professional linguists to call attention to the stigmatization of bilingualism as experienced by various immigrant groups in the United States. Lambert 1972 and Tucker and Lambert 1969 are experimental studies that expose further evidence of linguistic bias in bilingual (e.g., French and English in Canada) and bidialectal circumstances (e.g., mainstream Standard American English and African American vernacular English in the United States). Preston 1989 is a formulation of perceptual dialectology that provides orthogonal confirmation of such biases, through elicitations of opinions about superior-to-inferior varieties of American English. Explicit accounts of linguistic profiling are described in Purnell, et al. 1999 regarding housing discrimination and in Baugh 2000 in relation to testimony about different speakers’ dialects and racial identities during murder trials. Squires and Chadwick 2006 produces complementary analyses by uncovering differential dialect discrimination against minorities seeking to purchase homeowners’ insurance. In Zentella 2014, evaluations of linguistic profiling share echoes of Einar Haugen’s early observations about prejudice against bilinguals, albeit with specific relevance to native speakers of Spanish who were obliged to speak English exclusively at their places of employment. Jones, et al. 2019 discovers unintended bias against black Americans by professional court reporters who regularly mischaracterized their statements during trials. When viewed collectively, matters of linguistic profiling and language discrimination persist in many social domains, thereby confirming the existence of one demographic dimension of human inequality.

Baugh, John. 2000. Racial identification by speech. American Speech 75.4: 362–364.

DOI: 10.1215/00031283-75-4-362

A survey of legal cases devoted to murder trials where the dialect of suspects or defendants was central to witness testimony. The phrase “linguistic profiling” first appeared in this article, and that concept has since expanded to include prejudicial, and often illegal, reactions to the speech or writing of individuals whose language usage was used as the basis of discrimination against them.

Haugen, Einar. 1972. The ecology of language. Stanford, CA: Stanford Univ. Press.

A collection of interdisciplinary chapters that describe the intersection between linguistic diversity and America’s expanding global immigrant population. “The Stigmata of Bilingualism” is most relevant to linguistic discrimination, and it stands out among an array of other chapters dealing with a wide range of bilingual considerations that address linguistic prejudice against those who are not native speakers of English, with special relevance to the United States.

Jones, Taylor, Jessica Rose Kalbfeld, Ryan Hancock, and Robin Clark. 2019. Testifying while black: An experimental study of court reporter accuracy in transcription of African American English. Language 95.2: 216–252.

DOI: 10.1353/lan.2019.0042

Court stenographers in Philadelphia repeatedly produced inaccurate transcriptions of African American English (AAE), including different morphosyntactic and phonological features of AAE. These errors were consequential, changing official court records that would have significant legal repercussions for typical speakers of AAE.

Lambert, Wallace. 1972. Language, psychology and culture: Essays. Selected and introduced by Anwar S. Dil. Stanford, CA: Stanford Univ. Press.

Foundational research on societal and personal dimensions of bilingualism and biculturalism, including results from carefully controlled matched-guise experiments that revealed differential attitudes toward French and English in Canada. These attitudinal differences were measured with Likert scales that considered friendliness, trustworthiness, and educational status, among other traits.

Preston, Dennis. 1989. Perceptual dialectology: Nonlinguists’ views of areal linguistics. Dordrecht, The Netherlands: Foris.

Opinions about language usage were solicited in Indiana and Hawaii regarding perceptions about dialect differences, with primary emphasis on preferred manners of speaking in contrast to speech that was deemed less desirable, or incorrect. The field of perceptual dialectology is introduced and formulated in this volume.

Purnell, Thomas, William Idsardi, and John Baugh. 1999. Perceptual and phonetic experiments on American English dialect identification. Journal of Language and Social Psychology 18.1: 10–30.

DOI: 10.1177/0261927X99018001002

Controlled experiments showed bias against vernacular African American and vernacular Mexican American varieties of English in contrast to the dominant Standard American English dialect, which was preferred by landlords. Listeners identified the race of speakers with high rates of accuracy upon hearing the single word “hello.”

Squires, Greg, and Jan Chadwick. 2006. Linguistic profiling: A continuing tradition of discrimination in the home insurance industry? Urban Affairs Review 41.3: 400–415.

DOI: 10.1177/1078087405281064

An audit-pair study that exposed linguistic and racial bias against minority speakers who were seeking to purchase homeowners’ insurance. These findings confirm that some insurance agents would deduce the race of prospective clients during telephone calls. In many instances, these agents then prevented minority prospects from purchasing homeowners’ insurance that would otherwise be available.

Tucker, Richard, and Wallace Lambert. 1969. White and Negro listeners’ reactions to various American-English dialects. Social Forces 47.4: 463–468.

DOI: 10.2307/2574535

An experimental study of diverse AAE speech styles that includes well-educated and less well-educated black speakers who were evaluated by diverse white and black (i.e., Negro) listeners. Each listener evaluated individual speech samples with a Likert scale that included various demographic categories such as education, wealth, and friendliness.

Zentella, Ana Celia. 2014. TWB (talking while bilingual): Linguistic profiling of Latina/os and other linguistic torquemadas. Latino Studies 12.4: 620–635.

DOI: 10.1057/lst.2014.63

Bilingual New Yorkers who were native speakers of Spanish were forced to speak English at their workplace under all circumstances, including personal conversations that were unrelated to their job. A survey was conducted, revealing differences of opinion about the importance and appropriateness of demanding exclusive usage of English.

The Student News Site of University of California - San Diego

The UCSD Guardian

Linguistic Profiling Rooted in Implicit Bias, Stereotyping

There is a decent pool of Americans who will insist they do not have an accent. Of course, there are accents that are found within Southern regions of America or on the East Coast, but still, some Americans may describe their way of speaking as lacking a specific “flair”; it’s just “normal.” “Accentless” Americans do, contrary to their belief, have an accent. The reason they might have never realized it is that it does not inconvenience them.

The denial of otherwise available goods or services by phone is what has been coined “linguistic profiling,” and it is the auditory counterpart to racial profiling, which relies on visual cues. Over the phone, employers or landlords, for example, can use the accents of potential job applicants or tenants to decide whether or not they want to turn them away.

It is not linguistic profiling for someone to acknowledge that a caller sounds Latino, Black, Indian, or of any other racial or ethnic background. It becomes linguistic profiling once someone attaches their implicit biases to the fact that callers sound as if they are part of a certain group or community and decides to discriminate against them. It is an easy way for people to quickly make assumptions about a caller and decide if they are part of a community they have a prejudice against. While turning away a caller on the phone may sound mundane, linguistic profiling is another way for those with implicit biases against people of color to potentially put minorities in disadvantaged situations. What is exceptionally upsetting is that people will linguistically profile communities for the way they sound, but will still actively use ways of speech (such as terms or vernacular) used by POC once it is trending.

Within America, the Civil Rights Act of 1964 and the Fair Housing Act of 1968 both account for linguistic profiling, and under the Civil Rights Act, discrimination based on “linguistic characteristics of a national origin group” is forbidden. While these laws were a step in the right direction, linguistic profiling has proven difficult to fight in a court of law, especially as it can be challenging to document that speech was the basis for discrimination. Does that mean it is not happening? No. It means someone can discriminate against someone else on the basis of an accent they find distasteful and potentially get off with no consequence. Lack of consequence means people are not held accountable when they do discriminate on the basis of linguistic profiling. That means not only are victims left without justice, but perpetrators are left without any reason to change their mentality or behavior.

Lack of consequence can also perpetuate the notion that linguistic profiling does not affect anyone. Studies that have focused on linguistic profiling have been able to find a correlation between the accents used and whether or not individuals get a call back. The correlation does not give the full picture, but it does encourage further research and investigation.

Current data on linguistic profiling remains conservative, as not many cases can be represented and few may admit to actively doing it. But in reality, linguistic profiling does happen, and even while someone disadvantages POC for the way they speak, they can be using popularized POC terms. African-American Vernacular English (AAVE), Mexican slang, Japanese words, and plenty of other cultures’ speech have been highlighted by trends, allowing anyone to suddenly take pieces of that culture’s vernacular and incorporate them into their own verbal communication. “Accentless” Americans have found themselves within a group that enjoys playing with words and phrases like “woke,” “por favor,” or even “sugoi.” It might make you feel quirky and it might feel harmless. In reality, it is tone deaf. POC, as humans, can be turned away, but their culture and way of speaking can be accepted because it’s “fun.”

Using POC terms while linguistically profiling them only supports the notion that minorities are accepted in capacities that are convenient for others. The issue is not entirely about accents or the way someone speaks; it’s about trying to distance yourself from whoever has the accent.

To reduce someone down to an unappealing voice and then turn around and try to speak in manners specific to their culture is a display of your privilege. But again, the issue does not solely lie within accents; it’s an issue with acting on implicit biases. That’s what disadvantages people. The fear and overgeneralizations of POC play into why there are many disparities within the healthcare system, within the housing market, and within the education system.

Implicit bias is defined as attitudes toward people, or stereotypes associated with them, that we hold without our conscious knowledge. Some psychologists say even those aware of implicit biases do not really see behavioral changes, as they are not in an environment where they would unlearn their assumptions about others. However, turning away POC over the phone to distance yourself from them is not going to put you in an environment where you reconstruct your generalizations about minorities.

For many college students, moving to their universities tends to be a culture shock, as it is the first instance where individuals are exposed to students whose identities differ from their own. These instances put students in situations where they are seeing more POC than they may have previously, and that is vital to getting people to recognize that these populations are not too different from them. Exposure doesn’t solve the problem, but it does create a space where individuals have to think about their perception of POC. Better understanding POC needs to be an active effort, one where you are willing to relearn what you may have thought you knew about them. You cannot just try to avoid encounters with them, mimic their vernacular, and hope that something changes within yourself.

Art by Ava Bayley for the UC San Diego Guardian.

Ava Bayley

Linguistic Profiling across International Geopolitical Landscapes

  • August 2023
  • Daedalus 152(3):167-177
  • CC BY-NC 4.0

Linguistic Profiling and Discrimination

  • Published 5 January 2017
  • Linguistics
  • DOI: 10.1093/OXFORDHB/9780190212896.013.13
  • Corpus ID: 151437404

The sound of racial profiling: When language leads to discrimination

The problem isn't with the speech itself but with attitudes that interpret the speech.

(Author's note: Recent events have highlighted the need to be introspective about our role in systematic and institutionalized racism, but research on linguistic discrimination has long sought to understand how language use and bias are connected. The following is my article on linguistic profiling that recently appeared in Psychology Today, where I have recently been invited to contribute regularly on various linguistic topics.)

Given the recent protests and riots stemming from the killing of George Floyd, many have been left wondering how racism might be steeped into less obvious facets of our lives. While his death certainly throws stark light on the continuing danger of being black in America, it also makes us face the inequities that pervade our society. But bias is based not only on how we look but also, often, on how we speak, and this raises the question of how the way we react to what people say is influenced by the color of their skin.

As a sociolinguist who specializes in how social identity and language are connected, this is a question that has very much been on my mind lately. Racial profiling is not just about what people look like, but very much also about what they sound like, and the credibility, employability or criminality we assign to voices has a very real impact on those who happen to speak (or even just look like they speak) non-standard dialects.

Linguistic hallucinations?

If you happen to be one of the roughly 25% of Americans who identify as an ethnic minority, you have probably been the victim of linguistic bias, even if you speak standard English. Much like the experience of Christian Cooper, the birdwatcher in Central Park falsely accused of threatening a white dog-walker just because he happened to be black, linguistic research on the impact of stereotypes shows we can hallucinate an accent just by seeing an ethnic-looking face.

A number of studies have looked at how simply telling listeners they are hearing an ethnic speaker, or showing them a photo of an ethnic face, while they are actually listening to a standard speech recording, influences the perception of accentedness or non-standardness and lowers scores on intelligibility and competence scales. This, further research suggests, influences the scores non-native instructors receive on teaching evaluations and lowers the expectations teachers have for educational achievement of African-American children.

What exactly do we mean by linguistic profiling?

More broadly, research both in linguistics and social psychology has looked at how subtle and often unconscious linguistic practices predispose us to react to and think about people differently depending on their race.  Current Washington University and former Stanford professor John Baugh coined the term ‘linguistic profiling’ several years ago in response to the discrimination he himself experienced when looking for a house in majority white Palo Alto.

In a study he designed with colleagues, Baugh, using either African-American, Chicano or Standard accented English (called linguistic guises), made calls inquiring about property for rent.  When using non-standard accents, the property listed was somehow no longer available, in contrast to when he was using his standard English guise.  His work was the basis for a widely seen public service campaign to advocate for fair housing practices.  

But this raises the question: what exactly are we tuning into when we ‘hear’ ethnicity in voices? We might think people sound black or Latino, but, since we clearly sometimes imagine an accent where there isn’t one, are we even very good at recognizing race on the basis of speech alone? In short, yes.

When we hear certain linguistic clues, things like pronouncing ‘th’ sounds as a ‘v’ or deleting ‘r’ sounds (e.g. ‘brovah’ for brother) or 3rd person singular deletion (‘he go’), we do often identify those features as part of ethnic varieties. But even less salient aspects of our speech seem to signal ethnic identity and, potentially, trigger activation of stereotyped associations.

In a follow-up study to the one described above, linguists Purnell, Idsardi and Baugh found that listeners were able to determine speakers’ ethnicity from as little as the first word said in the phone conversation, with around 70% accuracy.  In other words, listeners had them at ‘hello.’  The researchers discovered that, even without widely recognizable linguistic features like those just mentioned, listeners were sensitive to very subtle phonetic cues such as how the ‘e’ vowel in ‘hello’ was pronounced.

Of course, recognizing that some language features might indicate someone’s gender, age or race is not in itself problematic and is, in fact, something we all do.  But associating those features with generalized negative traits, or discriminating against speakers on that basis, is where the inherent danger lies.  Often these negative associations are not overt, and we might not even be aware of making them; instead, they are implicit in how we decide to interact with or evaluate those we hear.

Such consequences are magnified in the criminal justice system

Relevant to George Floyd’s death, linguistic discrimination contributes to significantly adverse outcomes in interactions with police.  In work examining the language in body camera footage from the Oakland police, Voigt et al. (2017) found that police officers used less respectful language during routine traffic stops when the driver was black.

In the criminal justice system more widely, research by linguists John Rickford and Sharese King illustrated how linguistic bias might have affected the outcome in the trial of George Zimmerman for the shooting of Trayvon Martin in 2012.  Zimmerman, who claimed the shooting was in self-defense, was found not guilty of second-degree murder in large part because of the use of African-American English by the prosecution’s key witness, Rachel Jeantel.  She was ridiculed as inarticulate, not credible and incomprehensible, and, due to unfamiliarity with the dialect, court transcripts of her testimony were highly inaccurate.

The consequences of such linguistic prejudice are very real.  For one, as Rickford and King’s work highlights, the lack of credibility and unintelligibility associated with disfavored varieties can affect judicial rulings.  Driving that point home, subsequent research on juror appraisals found an increase in negative evaluations and guilty verdicts when witnesses spoke African-American English. 

This problem, of course, is not limited to contexts where African-American English varieties are involved, as non-native speakers are also unfairly disadvantaged in court and other institutional settings as a result of linguistic barriers.  And even for those who never see the inside of a courtroom, speaking a disfavored variety has been shown to lead to increased discrimination in housing practices, in the educational system and in hiring contexts.  

Some might suggest this is simply a call for speakers to adopt standard dialects, but, as discussed above, just looking like you might speak something other than Standard English predisposes listeners to hear an accent, even if it doesn’t exist.  So, the problem is not really with the speech itself, but with the attitudes we hold about the speakers of these dialects.

The only real solution is, of course, to work to reduce and, eventually, eradicate linguistic prejudice by spending the time to understand the socio-historical and linguistic underpinnings of non-standard varieties.  After all, recalling the keen observation of linguist Max Weinreich, “A language is a dialect with an army and a navy.”  Thus, for a non-standard dialect speaker, the fight is far from fair.

Kurinec, Courtney and Charles Weaver III. 2019. Dialect on trial: use of African American Vernacular English influences juror appraisals. Psychology, Crime & Law, 25:8, 803-828

Purnell, T., Idsardi, W., & Baugh, J. 1999. Perceptual and phonetic experiments on American English dialect identification. Journal of Language and Social Psychology , 18, 10-31.

Rickford, John R., & King, Sharese. 2016. Language and linguistics on trial: Hearing Rachel Jeantel (and other vernacular speakers) in the courtroom and beyond. Language, 92:4, 948–88

Voigt, Rob, N. Camp, V. Prabhakaran, W. Hamilton, R. Hetey, C. Griffiths, D. Jurgens, D. Jurafsky, and J. Eberhardt. 2017. Racial disparities in police language. Proceedings of the National Academy of Sciences, 114:25, 6521-6526

Wolfram, Walt, and Erik R. Thomas. 2002.  The development of African American English.  Malden, MA, and Oxford, UK: Blackwell.

Linguistics Professor Valerie Fridland

By: Valerie Fridland Professor of Linguistics, Department of English

Guest editorial: Linguistic profiling and implications for career development

Career Development International

ISSN : 1362-0436

Article publication date: 21 June 2024

Issue publication date: 21 June 2024

Hughes, C. , Niu, Y. and Bowers, L. (2024), "Guest editorial: Linguistic profiling and implications for career development", Career Development International , Vol. 29 No. 3, pp. 289-296. https://doi.org/10.1108/CDI-06-2024-359

Emerald Publishing Limited

Copyright © 2024, Emerald Publishing Limited

Introduction

Current career literature examines systemic barriers that diverse individuals encounter and some of the contextual factors that enable or hinder their career progression. However, there is limited research on how the systemic barrier of linguistic profiling enables or hinders the role of the organization or individual career development (Hughes and Mamiseishvili, 2018). Linguistics is the scientific study of language and its structure. Several branches of linguistics, including sociolinguistics, dialectology and applied linguistics, among others, have been used to discriminate against and profile individuals in the workplace (Anderson, 2007; Barrett et al., 2022; Craft et al., 2020; Hughes and Mamiseishvili, 2018), thus stymieing their careers. Baugh (2000, p. 363) defines linguistic profiling as “identify[ing] an individual … as belonging to a linguistic subgroup within a given speech community, including a racial subgroup” and noted that one of the most famous occasions in which linguistic profiling took center stage was during the 1995 O. J. Simpson trial. Simpson’s attorney objected to the suggestion that race could be identified solely based on speech cues. Linguistic profiling is a term used to describe inferences derived from a person’s speech (Smalls, 2004) and has been shown to influence individuals’ development in many ways. When linguistic profiling describes discriminatory practices, it can be considered the auditory equivalent of racial profiling (Smalls, 2004).

Career development theory (Super and Jordaan, 1973) examines how individuals grow and develop in their careers. We are still learning ways to enhance the career progression of individuals (Hughes and Niu, 2021; Hughes et al., 2019; Varma et al., 2021). The influences of linguistic profiling on individuals’ lives have been both positive and negative. One aspect of linguistic profiling that has not been examined closely is individuals’ career experiences. Linguistic profiling occurs at all levels, up to and including the President of the United States of America, Joe Biden, whose political speeches and career capability have been scrutinized because of his stutter.

While stuttering is one example of how linguistic profiling occurs in the workplace, other examples are nowhere near as high-profile and do not have such a positive outcome. Despite the inspiration that some individuals with communication disabilities may draw from the Honorable President Biden’s achievement in overcoming linguistic profiling, there are cases where individuals are routinely screened out of jobs within 3–5 s due to linguistic profiling during telephone interviews (Purnell et al., 1999). Rahman (2008) found that racial identity was identified by listeners in 28 s and Anderson (2007) showed racial identity was determined in only 16 s. The effects of linguistic profiling may limit exposure to diverse cultures within groups at work and create homogeneous environments that lead to groupthink and a lack of innovation (Wanous and Youtz, 1986). Linguistic profiling, when used inappropriately, prohibits the influences of diverse practices and decreases the collaborative effectiveness of individuals within an organization. Individuals may be forced to alter their self-presentation at work (Dolezal, 2017; Goffman, 1949). There are cases where linguistic profiling can be appropriately used to find language speakers to communicate and facilitate understanding when no one is available who understands a language or accent. When linguistic profiling is used to harm individuals, its use is deemed inappropriate.

Baruch and Sullivan (2022, p. 146) suggested that scholars needed to “investigate the dark side of contemporary careers.” In response, the purpose of this special issue is premised on the notion that some diversity is already in the workplace (Hughes, 2018). Understanding the cultural, global economic and technological innovation effects on linguistic profiling as they relate to the career development of employees is limited, and there has always been a dark side to contemporary careers. This special issue adds a linguistic perspective to enrich career theory and practice. In doing so, it includes research that delved into the dark side of careers and sought to shed light on the harmful effects of linguistic profiling on career development processes and outcomes. Articles in this special issue bring forth several common themes that examine linguistic profiling’s negative impact on career development and possible solutions to alleviate the concerns.

Common themes in linguistic profiling and career development

Several common themes emerged from the included articles: influences of linguistic profiling on facets of career development, intersectionality of linguistic profiling for marginalized groups, forced adaptive strategies by victims of linguistic profiling, institutionalized linguistic profiling and integration of linguistic profiling with career theories.

Influences of linguistic profiling on facets of career development

Linguistic profiling profoundly influences various facets of career development, such as workplace inclusivity (Caldwell et al.; Carter and Sisco; Greer et al.; Park and Jeong; Ramjattan; Soomro; Stojanović and Robinson), employability (Caldwell et al.; Carter and Sisco; Greer et al.; Park and Jeong; Soomro; Stojanović and Robinson; and Ramjattan) and career progression (Carter and Sisco; Greer et al.; Soomro; Stojanović and Robinson; Park and Jeong; and Ramjattan). It leads to discriminatory practices and biases against individuals that can negatively impact perceptions of their professionalism, intelligence and other qualities crucial for career growth (Caldwell et al.; Ironsi and Chen; Carter and Sisco; Greer et al.; Stojanović and Robinson; Park and Jeong; and Ramjattan). Non-standard accents or dialects often result in diminished hiring and promotional prospects and lead to career stagnation and a lack of diversity within organizations (Carter and Sisco; Greer et al.; Ironsi and Chen; Soomro; Stojanović and Robinson; Park and Jeong; and Ramjattan). Therefore, the systematic bias of linguistic profiling influences initial employment opportunities and affects ongoing career progression and development.

Intersectionality of linguistic profiling for marginalized groups

Linguistic profiling intersects with racial, gender and national identities, which exacerbates challenges for marginalized groups (Caldwell et al.; Carter and Sisco; Greer et al.; Ironsi and Chen; Soomro; Stojanović and Robinson; Park and Jeong; and Ramjattan). Individuals from minority racial or ethnic backgrounds, or nonnative speakers, often face compounded discrimination due to the convergence of linguistic profiling with intersectionality biases. The perception of nonstandard accents and dialects is not just a linguistic issue but is deeply intertwined with racial and national stereotypes. It leads to systemic discrimination in career opportunities. Language becomes a proxy for racial or ethnic identity, which further marginalizes individuals and reinforces societal inequalities.

Forced adaptive strategies by victims of linguistic profiling

In response to linguistic profiling, individuals often adopt adaptive strategies such as code switching (Carter and Sisco) and accent modification (Ironsi and Chen; Soomro; and Ramjattan) to navigate workplace dynamics and enhance their perceived employability (Greer et al.). They may also rely upon technology to communicate, or in some instances, technology is being used in place of workers’ natural voices (Caldwell et al.; Soomro; Stojanović and Robinson; and Ramjattan). These strategies involve altering speech patterns or language use based on the social context, aiming to align more closely with the standard or preferred linguistic norms in a professional setting. While these mechanisms can improve immediate employability or social acceptance, they can also perpetuate and even reinforce discrimination. When individuals feel pressured to conform to dominant linguistic norms, it not only affects their sense of authenticity but can also sustain the discriminatory status quo by implicitly endorsing the idea that certain accents or dialects are less acceptable or professional. Alternatively, discrimination can occur when individuals who have a communication disorder use augmentative and alternative communication (AAC) as a primary mode of communication (Caldwell et al.). Those with communication disabilities who utilize required accommodations (e.g. AAC) and those who choose to retain their natural speech patterns (Ironsi and Chen; Ramjattan) should not face linguistic prejudice that can lead to further discrimination.

Institutionalized linguistic profiling

Linguistic profiling is deeply institutionalized within workplace practices and organizational dynamics (Caldwell et al.; Carter and Sisco; Greer et al.; Ironsi and Chen; Park and Jeong; Soomro; Stojanović and Robinson; Ramjattan). It manifests in recruitment, hiring and promotion processes, where standardized language varieties are often privileged and nonstandard accents or dialects are marginalized. This institutionalization of linguistic profiling perpetuates systemic discrimination, hinders diversity and inclusion efforts and creates a homogenized workforce. It reinforces the ideology of monolingualism: the belief that certain linguistic characteristics are inherently superior or more professional than others.

Integrating linguistic profiling with career theories

The integration of career theories in addressing linguistic profiling issues, as highlighted in Table 1 , provides an insightful approach to career development. The theories provide a framework for understanding the complexities of language as a component of identity and influencing career paths and opportunities. By applying these theories, potential solutions are proposed that focus on inclusivity, recognition of diverse communication styles and tailored coaching strategies. While integrating linguistic profiling with career theories addresses the immediate challenges of linguistic profiling, it also provides ways for a more equitable professional environment, where linguistic diversity is understood, respected and integrated into career development practices.

Divergences and unique perspectives

While the shared theme of linguistic profiling and career development unites the articles in this special issue, each piece also brings its own unique perspective and divergent insights. These individual narratives and analyses enrich our collective understanding and offer a multifaceted view of the subject. This section discusses these distinct viewpoints by exploring the diverse ways in which linguistic profiling manifests across different contexts and how these unique experiences contribute to our broader comprehension of the topic.

Firstly, challenges faced by specific groups, such as nonnative English-speaking teachers (Ironsi and Chen), international female faculty (Park and Jeong) and individuals with communication disabilities (Caldwell et al.), are discussed. For instance, the detrimental impact of categorizing teachers as “native” or “non-native” propagates false hierarchies and undermines professional competence based on linguistic identity. Moreover, the exploration of monolingual ideologies in the United States of America reveals the intricate layers of discrimination faced by multilingual individuals. This is even more complicated when considering factors such as ethnicity, race and social and economic background. In addition, societal perceptions and linguistic norms can marginalize individuals whose speech patterns deviate from the so-called standard and impact their social and professional interactions. These targeted insights emphasize the need for a deeper understanding and strategic interventions to support these specific groups in their career development.

Secondly, the transformative potential of personal growth and leadership coaching, especially for marginalized groups grappling with linguistic profiling (Carter and Sisco), is examined. By integrating a critical career development framework that accounts for linguistic diversity and multilingual identity, individuals are empowered to navigate and challenge the monolingual ideologies prevalent in their professional environments​​. This fosters personal growth and cultivates a more inclusive leadership style that values diversity and promotes equity.

Implications for career development and linguistic profiling

To address the implications, organizations must critically examine their policies and practices, raise awareness about linguistic diversity and implement inclusive strategies to ensure linguistic profiling does not hinder the career development of talented individuals from diverse linguistic backgrounds. In addition, career development professionals play an important role in facilitating the process and assisting individuals. Inclusive policies are essential to mitigate linguistic profiling in organizations. By implementing strategies that promote diversity and inclusivity, organizations can create environments where every individual’s linguistic background is respected and valued. Organizations should also foster a culture of acceptance and collaboration that enhances overall productivity and morale.

Education, training and awareness are key ways to address linguistic profiling. Informative programs and sensitivity training can enlighten individuals and management and lead to a more empathetic and inclusive workplace. Understanding the impact of language biases is the first step toward creating a supportive environment where every voice is valued. Support structures like mentorship programs and leadership coaching are vital in empowering individuals facing linguistic profiling. These resources provide guidance, build confidence and offer strategies to navigate and overcome workplace challenges by ensuring that linguistic diversity is not a barrier to professional growth.

This special issue invited authors to adopt a multidisciplinary approach to the topic of linguistic profiling’s influence on career development. We expected manuscripts to bring strong empirical contributions that develop and extend career theory as well as more conceptual papers that integrate, critique and expand existing career theories. We encouraged the use of appropriate methods for both the research context and related research questions. We welcomed both qualitative (Richardson et al., 2022) and quantitative designs (Schreurs et al., 2021). However, the state of research in this area is in its infancy; therefore, there were few quantitative studies on the topic. The richness of this issue is seen in the foundational theoretical and conceptual ideas brought forth within the eight articles included.

This special issue will serve as a foundational work to stimulate further research. It is a transdisciplinary approach that brings together articles written by experts from the fields of career development, human resource development, workforce development and communication sciences and disorders. Continued research is crucial to deepen our understanding of linguistic profiling and its impact on career development, especially quantitative inquiry. By exploring this phenomenon from various perspectives and contexts, researchers can uncover insights that lead to more effective strategies for fostering inclusivity and respect for linguistic diversity in the professional environment. The call for future research extends to empirical researchers, especially those utilizing quantitative methods.

Table 1. Using career theories in addressing linguistic profiling issues

| Linguistic profiling issues | Applied career theories and implications | Potential solutions | Implications for career development |
| --- | --- | --- | --- |
| Issues of linguistic hierarchies and discrimination through accent modification (Ramjattan) | Social cognitive career theory ( , 1994, ) | Rethinking intelligibility and the role of accents in professional settings | Promoting inclusivity and re-evaluating communication and hiring practices and processes |
| Linguistic profiling influences on career opportunities and growth in multilingual contexts (Soomro) | Levinson’s eras , and career development theories ( ; ; ) | Providing training programs for employees and management, changing policy to discourage linguistic discrimination and initiating to promote equal opportunities for career growth regardless of language background | Understanding the impact of language discrimination on career progression |
| Use of code-switching by Black women as a response to linguistic profiling (Carter and Sisco) | Boundaryless career theory ( ; ) | Tailored leadership coaching | Promoting career advancement and authenticity among Black women |
| Negative effects of linguistic profiling on perceived employability (Greer et al.) | Systems theory framework of career development ( ) | Organizational and employment process changes to mitigate linguistic profiling | Improving career development outcomes by addressing linguistic profiling |
| Impact of linguistic profiling on nonnative English-speaking teachers’ career development, self-esteem and motivation (Ironsi and Chen) | Career development theory ( ) career growth ( , 2020), career shock (2018, | Actively combat linguistic profiling by challenging stereotypes and advocating for themselves and their colleagues | Creating systemic changes and fostering an inclusive environment that respects and values linguistic diversity |
| Impact of monolingual ideology on linguistic profiling (Stojanović and Robinson) | Boundaryless Career ( ), Protean Career, ( , ), Organizational career ( ) and the Kaleidoscope career ( ) | Suggests a critical conceptual framework using critical theory ( ) to combat linguistic profiling issues | Addresses the gap in career development literature by proposing a critical conceptual framework that integrates language as an important element of one’s career identity |
| Linguistic profiling challenges and biases experienced by international female faculty (Park and Jeong) | Academic career ( ) | Expands the perspectives and practices related to the career challenges of international female faculty due to linguistic profiling | Helps researchers and career development practitioners by adding linguistic profiling specific diversity and inclusion perspectives to existing literature |
| Support employers in avoiding linguistic profiling of individuals with communication disabilities (Caldwell et al.) | Implications for career counseling professionals and organizational career development practitioners and professionals | Education, training and the use of inclusive practices can reduce linguistic profiling of individuals with communication disabilities in the workplace | Highlights communication disability in the linguistic profiling discussion so that organizations can be more aware of the impact and the need to create supportive and inclusive workplace environments and in turn reduce discrimination and increase diversity |

Akkermans, J., Seibert, S.E. and Mol, S.T. (2018), “Tales of the unexpected: integrating career shocks in the contemporary careers literature”, SA Journal of Industrial Psychology, Vol. 44 No. 1, pp. 1-10, doi: 10.4102/sajip.v44i0.1503.

Akkermans, J., Rodrigues, R., Mol, S.T., Seibert, S.E. and Khapova, S.N. (2021), “The role of career shocks in contemporary career development: key challenges and ways forward”, Career Development International, Vol. 26 No. 4, pp. 453-466, doi: 10.1108/cdi-07-2021-0172.

Anderson, K.T. (2007), “Constructing ‘otherness’ ideologies and differentiating speech style”, International Journal of Applied Linguistics, Vol. 17 No. 2, pp. 178-797, doi: 10.1111/j.1473-4192.2007.00145.x.

Arthur, M.B. (1994), “The boundaryless career: a new perspective for organizational inquiry”, Journal of Organizational Behavior, Vol. 15 No. 4, pp. 295-306, doi: 10.1002/job.4030150402.

Arthur, M.B. and Rousseau, D.M. (2001), The Boundaryless Career: A New Employment Principle for A New Organizational Era, Oxford University Press.

Barrett, R., Cramer, J. and McGowan, K.B. (2022), English with an Accent: Language, Ideology, and Discrimination in the United States, Taylor & Francis, New York, NY.

Baruch, Y. and Hall, D.T. (2004), “The academic career: a model for future careers in other sectors?”, Journal of Vocational Behavior, Vol. 64 No. 2, pp. 241-262, doi: 10.1016/j.jvb.2002.11.002.

Baruch, Y. and Sullivan, S.E. (2022), “The why, what and how of career research: a review and recommendations for future study”, Career Development International, Vol. 27 No. 1, pp. 135-159, doi: 10.1108/cdi-10-2021-0251.

Baugh, J. (2000), “Racial identification by speech”, American Speech, Vol. 75 No. 4, pp. 362-364, doi: 10.1215/00031283-75-4-362.

Breevaart, K., Lopez Bohle, S., Pletzer, J.L. and Munoz Medina, F. (2020), “Voice and silence as immediate consequences of job insecurity”, Career Development International, Vol. 25 No. 2, pp. 204-220, doi: 10.1108/cdi-09-2018-0226.

Clarke, M. (2013), “The organizational career: not dead but in need of redefinition”, The International Journal of Human Resource Management, Vol. 24 No. 4, pp. 684-703, doi: 10.1080/09585192.2012.697475.

Craft, J.T., Wright, K.E., Weissler, R.E. and Queen, R.M. (2020), “Language and discrimination: generating meaning, perceiving identities, and discriminating outcomes”, Annual Review of Linguistics, Vol. 6 No. 1, pp. 389-407, doi: 10.1146/annurev-linguistics-011718-011659.

Dolezal, L. (2017), “The phenomenology of self-presentation: describing the structures of intercorporeality with Erving Goffman”, Phenomenology and the Cognitive Sciences, Vol. 16 No. 2, pp. 237-254, doi: 10.1007/s11097-015-9447-6.

Goffman, E. (1949), “Presentation of self in everyday life”, American Journal of Sociology, Vol. 55, pp. 6-7.

Hall, D.T. (1976), Careers in Organizations, Scott Foresman, Glenview, IL.

Hall, D.T. (2004), “The protean career: a quarter-century journey”, Journal of Vocational Behavior, Vol. 65 No. 1, pp. 1-13, doi: 10.1016/j.jvb.2003.10.006.

Hughes, C. (Ed.) (2018), “The role of HRD in integrating diversity alongside intellectual, emotional, and cultural intelligences”, Advances in Developing Human Resources, Vol. 20 No. 3, pp. 259-262, doi: 10.1177/1523422318778016.

Hughes, C. and Mamiseishvili, K. (2018), “Linguistic profiling in the workforce”, in Byrd, M.Y. and Scott, C.L. (Eds), Diversity in the Workforce: Current and Emerging Trends and Cases, 2nd ed., Routledge, New York, NY, pp. 214-227.

Hughes, C. and Niu, Y. (Eds) (2021), “How COVID-19 is shifting career reality: ways to navigate career journeys”, Advances in Developing Human Resources, Vol. 23 No. 3, pp. 195-202, doi: 10.1177/15234223211017847.

Hughes, C., Robert, L., Frady, K. and Arroyos, A. (2019), Managing Technology and Middle and Low Skilled Employees: Advances for Economic Regeneration, Emerald Publishing, Bingley.

Lent, R.W., Brown, S.D. and Hackett, G. (1994), “Toward a unifying social cognitive theory of career and academic interest, choice, and performance”, Journal of Vocational Behavior, Vol. 45 No. 1, pp. 79-122, doi: 10.1006/jvbe.1994.1027.

Lent, R.W., Brown, S.D. and Hackett, G. (2002), “Social cognitive career theory”, in Brown, D. (Ed.), Career Choice and Development, 4th ed., Jossey-Bass, San Francisco, CA, pp. 255-311.

Levinson, D.J. (1986a), “A conception of adult development”, American Psychologist, Vol. 41 No. 1, pp. 3-13, doi: 10.1037//0003-066x.41.1.3.

Levinson, D.J. (1986b), The Seasons of a Man's Life: the Groundbreaking 10-Year Study that Was the Basis for Passages!, Ballantine Books.

Mainiero, L.A. and Sullivan, S.E. (2005), “Kaleidoscope careers: an alternate explanation for the ‘opt-out’ revolution”, Academy of Management Perspectives, Vol. 19 No. 1, pp. 106-123, doi: 10.5465/ame.2005.15841962.

Patton, W. and McMahon, M. (2006), “The systems theory framework of career development and counseling: connecting theory and practice”, International Journal for the Advancement of Counselling, Vol. 28 No. 2, pp. 153-166, doi: 10.1007/s10447-005-9010-1.

Purnell, T., Idsardi, W. and Baugh, J. (1999), “Perceptual and phonetic experiments on American English dialect identification”, Journal of Language and Social Psychology, Vol. 18 No. 1, pp. 10-30, doi: 10.1177/0261927x99018001002.

Rahman, J. (2008), “Middle-class African Americans: reactions and attitudes toward African American English”, American Speech, Vol. 83 No. 2, pp. 141-176, doi: 10.1215/00031283-2008-009.

Richardson, J., O'Neil, D.A. and Thorn, K. (2022), “Exploring careers through a qualitative lens: an investigation and invitation”, Career Development International, Vol. 27 No. 1, pp. 99-112, doi: 10.1108/cdi-08-2021-0197.

Schmitt-Rodermund , E. and Silbereisen , R.K. ( 1998 ), “ Career maturity determinants: individual development, social context, and historical time ”, The Career Development Quarterly , Vol. 47 No. 1 , pp. 16 - 31 , doi: 10.1002/j.2161-0045.1998.tb00725.x .

Schreurs , B. , Duff , A. , Le Blanc , P.M. and Stone , T.H. ( 2021 ), “ Publishing quantitative careers research: challenges and recommendations ”, Career Development International , Vol. 27 No. 1 , pp. 79 - 98 , doi: 10.1108/cdi-08-2021-0217 .

Smalls , D.L. ( 2004 ), “ Linguistic profiling and the law ”, Stanford Law and Policy Review , Vol. 15 , pp. 579 - 604 .

Super , D.E. and Jordaan , J.P. ( 1973 ), “ Career development theory ”, British Journal of Guidance and Counselling , Vol. 1 No. 1 , pp. 3 - 16 , doi: 10.1080/03069887308259333 .

Tollefson , J.W. ( 2006 ), “ Critical theory in language policy ”, in Ricento , T. (Ed.), An Introduction to Language Policy: Theory and Method , Blackwell , Oxford , pp. 42 - 59 .

Varma , A. , Kumar , S. , Sureka , R. and Lim , W.M. ( 2021 ), “ What do we know about career and development? Insights from Career Development International at age 25 ”, Career Development International , Vol. 27 No. 1 , pp. 113 - 134 , doi: 10.1108/cdi-08-2021-0210 .

Wanous , J.P. and Youtz , M.A. ( 1986 ), “ Solution diversity and the quality of groups decisions ”, Academy of Management Journal , Vol. 29 No. 1 , pp. 149 - 159 , doi: 10.5465/255866 .

Acknowledgements

As this is an analytical editorial authored by the guest editor of this issue, it has not been subject to the same double-blind anonymous peer review process as the other articles in this issue.

Linguistic profiling: The sound of your voice may determine if you get that apartment or not

Many Americans can guess a caller’s ethnic background from their first hello on the telephone.

However, the inventor of the term “linguistic profiling” has found in a current study that when a voice sounds African-American or Mexican-American, racial discrimination may follow.

In studying this phenomenon through hundreds of test phone calls, John Baugh, Ph.D., the Margaret Bush Wilson Professor and director of African and African American Studies in Arts & Sciences at Washington University in St. Louis, has found that many people made racist, snap judgments about callers with diverse dialects.

Some potential employers, real estate agents, loan officers and service providers did it repeatedly, says Baugh. Long before they could evaluate callers’ abilities, accomplishments, credit rating, work ethic or good works, they blocked callers based solely on linguistics.

Such racist reactions frequently break federal and state fair housing and equal employment opportunity laws.

In the first two years of his linguistic profiling study, Baugh has found that this kind of profiling is a skill that too often is used to discriminate and diminish the caller’s chance at the American dream of a house or equal opportunity in the job market.

Baugh’s study is backed by a three-year $500,000 grant from the Ford Foundation.

Racist telephone tactics

While Baugh coined the term linguistic profiling, many who suffer from twisted stereotypes about dialect have known for decades about the racist tactic. His mother knew and took protective action. When he was a youngster in Philadelphia, he could tell whether she was talking to a white person or a black person on the telephone.

His study shows that some companies screen calls on answering machines and don’t return calls of those whose voices seem to identify them as black or Latino.

Some companies instruct their phone clerks to brush aside any chance of a face-to-face appointment to view a sales property or interview for a job based on the sound of a caller’s voice. Other employees routinely write their guess about a caller’s race on company phone message slips.

Such discrimination occurs across America, says Baugh, who is also a professor of psychology and holds appointments in the departments of Anthropology, Education and English, all in Arts & Sciences.

If the availability of an advertised job or an apartment is denied at a face-to-face meeting with a person of color, employers and renters know that they can be accused of racism. However, when accused of racist and unfair tactics over the phone, many companies have played dumb about racial linguistic profiling.

Had you from ‘hello’

Baugh has found racist responses in hundreds of calls. He tests ads with a series of three calls. First someone speaking with an African-American dialect responds to an ad. Then, a researcher with a Mexican-style Spanish-English dialect calls. Finally, a third caller uses what most people regard as Standard English.

Many times researchers found that the person using the ethnic dialect got no return calls. If they did reach the company, frequently they were told that what was advertised was no longer available, though it was still available to the Standard English speaker.

In no test calls did researchers offer company employees information about the callers’ credit rating, educational background, job history or other qualifications.

“Those who sound white get the appointment,” Baugh says.

Lack of response or refusal to offer face-to-face appointments was higher for Latinos than for African-Americans, Baugh adds.

When challenged in lawsuits, many businesses deny that they can determine race or ethnicity over the phone. However, Baugh’s ongoing study shows that over the phone many Americans are able to accurately guess the age, race, sex, ethnicity, region of heritage and other social demographics based on a few sentences, even just a hello.

Baugh has prepared to be an expert witness in several court cases but so far all have been settled out of court.

Celebrating all dialects

Recognizing heritage in a voice does not make a person a racist, Baugh says.

In October 2002 on MSNBC, Baugh debated the late Johnnie Cochran, then one of the nation’s best-known defense attorneys, about dialect recognition. In the O.J. Simpson case, Cochran had argued that speculation about a speaker’s race based on hearing a person’s voice was inherently racist.

Such recognition is often made by many intelligent listeners. Millions of Americans speak with the lilting cadences of their ancestors.

“I celebrate all dialects,” Baugh says.

So do musicians, playwrights, storytellers, historians and actors. He and many other academic linguists have coached actors and actresses in preparing for roles that require the special tang of non-standard English accents.

Many professional speakers, especially those in broadcast, who learned to speak in South Boston, the Louisiana bayous, Minnesota’s Scandinavian-American crossroads, Los Angeles barrios, Native American reservations or Scotch-Irish Appalachian towns, have stripped their family’s cadences from their pronunciation.

They scrub down colorful, historic expressions that sometimes are shards of a second language their family once spoke, says Baugh.

Instead, these public speakers aim to speak General American English — what most Americans consider Standard American English.

Baugh’s research shows that not all accents get a neutral or negative reaction from the American public. He has found that many Americans consider people with a British upper-class accent to be more cultured or intelligent than those who use General American. Listeners’ snap judgments about the culture behind the British accent may reflect Americans’ insecurity about their own English, he says.

Speakers with German accents — even if they stumble into grammatical errors — are considered brilliant, his research has shown. The listeners may not even be able to name the accent as German-American. Baugh expects that the brainy stereotype comes from comics and cartoons mimicking Albert Einstein’s German-American accent and from a duck — Walt Disney’s Germanic scientist Ludwig Von Drake.

Tapping a vein

While Baugh coined the term linguistic profiling, there is nothing new about the prejudice, as observing his mother’s phone conversations taught him. Even now it still is only a sideline in his scholarship as the nation’s foremost expert on varied African-American English, also called Ebonics.

It was not until he was about 38, with a doctoral degree, that he considered researching linguistic profiling. After being appointed to the Center for Advanced Study in the Behavioral Sciences at Stanford University, he went shopping for a house for his family, then living in Los Angeles.

He telephoned agents advertising houses. When he made those calls he used what he calls his “professional” English. Even George Bernard Shaw’s fictitious linguist Henry Higgins would not conclude that he is African-American using that voice.

All agents seemed eager to show him houses for sale. When he showed up, most welcomed him warmly, but four, surprised by his race, told him the properties were no longer available.

“I could do a comedy routine about reactions and what they didn’t say.”

No one ever told him, “Oh, we didn’t know you were black on the phone,” but their eyes popped and the unsaid remarks would be the core of his stand-up comic monologue, he says.

Beyond the comedy, he recognized a serious racist problem.

Instead of just wondering what would have happened if he telephoned using an African-American dialect, he did an experiment. He made a series of three telephone calls using both styles of English and then a Mexican-American accent. The Standard English voice got better treatment. He set out to do wider research.

“I tapped a vein,” he says.

In a survey of his own accents, he had hundreds of listeners score his disembodied voices and try to identify his background. In those tests, 93 percent identified his “professional English voice” as that of a white person; 86 percent identified the black dialect as that of a black person; and 89 percent identified his Latino voice as that of a Mexican.

He laughed about getting the least convincing score as a black person. His vocal differences in those tests were only in intonation, not in grammar.

Extending empathy

Americans tempted to use their ear for linguistic profiling in racist ways should remember two things, he says.

• They should realize that by an accident of birth they have the privilege to speak Standard English.

• Standard English speakers, descended from non-English immigrants, should show respect for their own ancestors who were challenged to become fluent in English as their second language. They should extend empathy with patience and tolerance to those whose linguistic styles differ from their own use of the English language.

Descendants of African slaves were especially challenged, Baugh notes. Their slave ancestors often were deprived of their family’s language from the time of their capture in Africa.

Slave traders systematically separated captives — in holding pens, in ships and on these shores at auctions — from others who shared the same language. Once sold, slaves were often isolated from anyone who shared their language.

The varied linguistic traditions of black English — Ebonics — evolved over generations when it was illegal to teach African slaves to read or write and when many had limited opportunities to hear native Standard English speakers.

Linguistic Profiling in Education: How Accent Bias Denies Equal Educational Opportunity to Students of Color

12 Scholar 355 (2009-2010)

30 Pages Posted: 21 Jun 2017

William Y. Chin

Lewis & Clark College Paul L Boley Library

Date Written: 2010

Students of color have to contend with numerous obstacles in education including the "accent bias" obstacle. Accent bias exists in K-12 education. Just as accent bias is found in the workplace, it is also found in the classroom. Studies reveal that accent bias affects a range of speakers including Black, Asian, Latina/o, and Arab speakers. Accent bias harms students in numerous ways including denying them access to charter schools, access to high-track classes, and access to full classroom participation. Both litigation-based and school-based solutions are needed to remedy accent bias in order to ensure equal educational opportunities for all students.

Introduction: Language & Social Justice in the United States

Walt Wolfram, a Fellow of the American Academy since 2019, is one of the pioneers of sociolinguistics. He is the William C. Friday Distinguished University Professor at North Carolina State University, where he also directs the Language and Life Project. He has published more than twenty books and three hundred articles on language variation, and has served as executive producer of fifteen television documentaries, winning several Emmys. His recent publications include Fine in the World: Lumbee Language in Time and Place (with Clare Dannenberg, Stanley Knick, and Linda Oxendine, 2021) and African American Language: Language Development from Infancy to Adulthood (with Mary Kohn, Charlie Farrington, Jennifer Renn, and Janneke Van Hofwegen, 2021).

Anne H. Charity Hudley is Associate Dean of Educational Affairs and the Bonnie Katz Tenenbaum Professor of Education and African and African-American Studies and Linguistics, by courtesy, at the Graduate School of Education at Stanford University. She is the author of four books: The Indispensable Guide to Undergraduate Research (with Cheryl L. Dickter and Hannah A. Franz, 2017), We Do Language: English Language Variation in the Secondary English Classroom (with Christine Mallinson, 2013), Understanding English Language Variation in U.S. Schools (with Christine Mallinson, James A. Banks, Walt Wolfram, and William Labov, 2010), and Talking College: Making Space for Black Linguistic Practices in Higher Education (with Christine Mallinson and Mary Bucholtz, 2022). She is a Fellow of the Linguistic Society of America and the American Association for the Advancement of Science.

Guadalupe Valdés, a Fellow of the American Academy since 2020, is the Bonnie Katz Tenenbaum Professor of Education, Emerita, in the Graduate School of Education at Stanford University. She is also the Founder and Executive Director of the English coaching organization English Together. Her books Con Respeto: Bridging the Distances Between Culturally Diverse Families and Schools: An Ethnographic Portrait (1996) and Learning and Not Learning English: Latino Students in American Schools (2001) have been used in teacher preparation programs for many years. She has recently published in such journals as Journal of Language, Identity, and Education; Bilingual Research Journal; and Language and Education.

In recent decades, the United States has witnessed a noteworthy escalation of academic responses to long-standing social and racial inequities in its society. In this process, research, advocacy, and programs supporting diversity and inclusion initiatives have grown. A set of themes and their relevant discourses have now developed in most programs related to diversity and inclusion; for example, current models are typically designed to include a range of groups, particularly reaching people by their race/ethnicity, sexual orientation, religious affiliation, gender, and other demographic categories. Unfortunately, one of the themes typically overlooked, dismissed, or even refuted as necessary is language. Furthermore, the role of language subordination in antiracist activities tends to be treated as a secondary factor under the rubric of culture. Many linguists, however, see language inequality as a central or even leading component related to all of the traditional themes included in diversity and inclusion strategies. 1 In fact, writer and researcher Rosina Lippi-Green observes that “Discrimination based on language variation is so commonly accepted, so widely perceived as appropriate, that it must be seen as the last back door to discrimination. And the door stands wide open.” 2

Even academics, one of the groups that should be exposed to issues of comprehensive inclusion, have seemingly decided that language is a low-priority issue. As noted in a 2015 article in The Economist:

The collision of academic prejudice and accent is particularly ironic. Academics tend to the centre-left nearly everywhere, and talk endlessly about class and multiculturalism . . . . And yet accent and dialect are still barely on many people’s minds as deserving respect. 3

As such, as the editors of this collection, we have commissioned thirteen essays that address specific issues of language inequality and discrimination, both in their own right and directly related to traditional themes of diversity and inclusion.

Recent issues of Dædalus have addressed immigration, climate change, access to justice, inequality, and teaching in higher education, all of which relate to language in some way. 4 The theme of the Summer 2022 issue is “The Humanities in American Life: Transforming the Relationship with the Public.” As an extension of that work, the essays in this volume focus on a humanistic social science approach to transforming our relationship with language both in the academy and at large.

There is a growing inventory of research projects and written collections that consider issues of language and social justice, including dimensions such as raciolinguistics, linguistic profiling, multilingual education, gendered linguistics, and court cases that are linguistically informed. Those materials cover a comprehensive range of language issues related to social justice. The collection of essays in this Dædalus volume is unique in its breadth of coverage and extends from issues including linguistic profiling, raciolinguistics, and institutional linguicism to multilingualism, language teaching, migration, and climate change. The authors are experts in their respective areas of scholarship, who combine strong research records with extensive engagement in their topics of inquiry.

The initial goal of this Dædalus issue is to demonstrate the vast array of social and political disparity manifested in language inequality, ranging from ecological conditions such as climate change, to social conditions of inter- and intralanguage variation, to institutional policies that promulgate the notion and the stated practice of official languages and homogenized, monolithic norms of standardized language based on socially dominant speakers. These norms are socialized overtly and covertly into all sectors of society and often are adopted as consensus norms, even by those who are marginalized or stigmatized by these distinctions. As linguist Norman Fairclough notes in Language and Power, the exercise of power is most efficiently achieved through ideology, that is, by manufacturing consent rather than by coercion. 5 Practices that appear universal or common sense often originate in the dominant class, and these practices work to sustain an unequal power dynamic. Furthermore, there is power behind discourse, because the social order of discourses is held together as a hidden effect of power, as in standardization and national or official languages. There is also power in discourse, because strategies of discourse reflect asymmetrical power relations between interlocutors in sets of routines such as address forms, interruptions, and a host of other conversational routines. In this context, the first step in addressing these linguistic inequalities is to raise awareness of their existence, since many operate as implicit bias rather than overt, explicit bias recognized by the public.

Unfortunately, and somewhat ironically, higher education has been slow in this process; in fact, several essays in this collection show that higher education has been an active agent in the reproduction of linguistic inequality at the same time that it advocates for equality in many other realms of social structure. 6 Two essays in particular explore underlying notions of standardization and the use of language in social presentation and argumentation. The essays also address language rights as a fundamental human right. In “Language Standardization & Linguistic Subordination,” Anne Curzan, Robin M. Queen, Kristin VanEyk, and Rachel Elizabeth Weissler discuss how ideologies about standardized language circulate in higher education, to the detriment of many students, and they include a range of suggestions and examples for how to center linguistic justice and equity within higher education.

Curzan and coauthors give us an important overview of language standardization:

We have suggested some solutions to many of the issues we’ve highlighted in this essay; however, implementing solutions in a meaningful way first requires recognition of how important language variation is for our everyday interactions with others. Second, implementing solutions depends on recognizing how our ideas about language (standardized or not) can pose a true barrier to meaningful change. Such recognition includes the understanding that much of what we think about language often stands as a proxy for what we think about people, who we are willing to listen to and hear, and who we want to be with or distance ourselves from. 7

In “Addressing Linguistic Inequality in Higher Education: A Proactive Model,” Walt Wolfram describes a proactive “campus-infusion” program that includes activities and resources for student affairs, academic affairs, human resources, faculty affairs, and offices of institutional equity and diversity. Wolfram’s essay shows directly and specifically how academics aren’t always the solution but, as a whole, are complicit in linguistic exclusion. He writes:

A casual survey of university diversity statements and programs indicates that a) there is an implicitly recognized set of diversity themes within higher education and b) it traditionally excludes language issues. 8 Topics related to race, ethnicity, gender, religion, sexual preference, and age are commonly included in these programs, but language is noticeably absent, either by explicit exclusion or by implicit disregard. Ironically, issues of language intersect with all of the themes in the canonical catalog of diversity issues. 9

The absence of systemic language considerations from most diversity and inclusion programs and their limited role in antiracist initiatives is a major concern for these programs, since language is a critical component for discrimination among the central themes in the extant canon of diversity. Language is an active agent in discrimination and cannot be overlooked or minimized in the process.

Some of the essays in this volume of Dædalus address the sociopolitical dominance of a restricted set of languages and its impact on the lives of speakers of devalued languages. The authors of these essays consider the effects of climate, social, educational, legal, and political dissonance confronted by speakers of nondominant languages. They also show how the metaphors of “disappearance” and “loss” obscure the colonial processes responsible for the suppression of Indigenous languages. People who speak an estimated 90 percent of the world’s languages have now been linguistically and culturally harmed due to the increasing dominance of a selected number of “world languages” and changes in the physical and topographical ecology. The authors describe the implications of this extensive language subjugation and endangerment and the consequences for the speakers of these languages. Both physical and social ecology are implicated in this threat to multitudes of languages in the world.

Linguistics in general, and sociolinguistics in particular, has a significant history of engagement in issues of social inequality. From the educational controversies over the language adequacy of marginalized, racialized groups of speakers in the 1960s, as in linguist William Labov’s A Study of Non-Standard English , to ideological challenges to multilingualism and the social and cultural impact of the devaluing of the world’s languages, as described in the essays by Wesley Y. Leonard, Guadalupe Valdés, and Julia C. Fine, Jessica Love-Nichols, and Bernard C. Perley, the role of language is a prominent consideration in the actualization and dispensation of social justice. 10

In addition, this collection addresses areas of research that are complementary to the American Academy of Arts and Sciences’ 2017 report by the Commission on Language Learning, America’s Languages: Investing in Language Education for the 21st Century. 11 In spite of the long-term presence of the teaching of languages other than English in the American educational system, concern over “world language capacity” has surfaced periodically over a period of many years because of the perceived limitations in developing functional additional language proficiencies. The consensus view (as in Congressman Paul Simon’s 1980 report The Tongue-Tied American) has been that foreign/world language study in U.S. schools is generally unsuccessful, that Americans are poor language learners, and that focused attention must be given to the national defense implications of these language limitations. 12 In the 2017 Language Commission report, foreign/world language study is presented as 1) critical to success in business, research, and international relations in the twenty-first century and 2) a contributing factor to “improved learning outcomes in other subjects, enhanced cognitive ability, and the development of empathy and effective interpretive skills.” 13

The Academy’s report presents information about languages spoken at home by U.S. residents (76.7 percent English, 12.6 percent Spanish). It also includes a graphic illustrating the prevalence of thirteen other languages (including Chinese, Hindi, Filipino and Tagalog, and Vietnamese) commonly spoken by 0.13 percent to 0.2 percent of the population, as well as a category identified as all other languages (a small category comprising 2.2 percent of residents of the United States). 14 The report focuses on languages — rather than speakers — and recommends: 1) new activities that will increase the number of language teachers, 2) expanded efforts that can supplement language instruction across the education system, and 3) more opportunities for students to experience and immerse themselves in “languages as they are used in everyday interactions and across all segments of society.” It also specifically mentions needed support for heritage languages so these languages can “persist from one generation to the next,” and for targeted programming for Native American languages. 15

While it effectively interrupted the monolingual, English-only ideologies that permeate ideas on language in the United States, the conceptualization of language undergirding the report needs to be greatly expanded. The report focuses on developing expertise in additional language acquisition as the product of deliberative study. For example, in the case of heritage languages (defined as those non-English languages spoken by residents of the United States), the report highlights efforts such as the Seal of Biliteracy. Through this effort (now endorsed by many states around the country), high school students who complete a sequence of established language classes and pass a state-approved language assessment can obtain an official Seal of Biliteracy endorsement. Unfortunately, the series of courses and the assessments required to obtain the Seal are only available in a limited number of languages. The report mentions other efforts, including dual language immersion programs, yet it does not recognize family- and community-gained bilingualism and biliteracy. Notably, the report specifically laments what are viewed as limited literacy abilities of heritage language speakers and recommends making available curricula specially designed for heritage language learners and Native American languages.

The view of language that the report is based on is a narrow one and does not represent the linguistic realities of the majority of bilingual and multilingual students. In her contribution to this volume, “Social Justice Challenges of ‘Teaching’ Languages,” Guadalupe Valdés “specifically problematize[s] language instruction as it takes place in classroom settings and the impact of what I term the curricularization of language as it is experienced by Latinx students who ‘study’ language qua language in instructed situations.” 16 Valdés shows us how these specific issues play out in what is typically viewed as the neutral “teaching” of languages. She writes that challenges to

linguistic justice [result] from widely held negative perspectives on bi/multilingualism and from common and continuing misunderstandings of individuals who use resources from two communicative systems in their everyday lives. My goal is to highlight the effect of these misunderstandings on the direct teaching of English. 17

In “Refusing ‘Endangered Languages’ Narratives,” Wesley Y. Leonard draws from his experiences as a member of a Native American community whose language was wrongly labeled “extinct”:

Within this narrative, I begin with an overview of how language endangerment is described to general audiences in the United States and critique the way it is framed and shared. From there, I shift to an alternative that draws from Indigenous ways of knowing to promote social justice through language reclamation. 18

Leonard encourages us to directly refute “dominant endangered languages narratives” and replace the focus on the actors of harm in Indigenous communities with a focus on the creativity and resolve of native scholars working to revitalize native language and culture. As he states, the “ultimate goal of this essay is to promote a praxis of social justice by showing how language shift occurs largely as a result of injustices, and by offering possible interventions.” 19

In “Climate & Language: An Entangled Crisis,” Julia C. Fine, Jessica Love-Nichols, and Bernard C. Perley

note that these academic discourses — as well as similar discourses in nonprofit and policy-making spheres — rightly acknowledge the importance of Indigenous thought to environmental and climate action. Sadly, they often fall short of acknowledging both the colonial drivers of Indigenous language “loss” and Indigenous ownership of Indigenous language and environmental knowledge. We propose alternative framings that emphasize colonial responsibility and Indigenous sovereignty. 20

Fine, Love-Nichols, and Perley present models of how language and climate are intertwined. They write, “Scholars and activists have documented the intersections of climate change and language endangerment, with special focus paid to their compounding consequences.” The authors “consider the relationship between language and environmental ideologies, synthesizing previous research on how metaphors and communicative norms in Indigenous and colonial languages influence environmental beliefs and actions.” 21

The essays in this volume profile a wide range of language issues related to social justice, from everyday hegemonic comments to legislative policies and courtroom testimony that depend on language reliability and the linguistic credibility of witnesses who do not communicate in a mainstream American English variety. In 1972, the president of the Linguistic Society of America, Dwight Bolinger, gave his presidential address titled “Truth is a Linguistic Question” as a forewarning of the linguistic accountability of public reporting of national events. In his other work, he describes language as “a loaded weapon.” Through these essays, we find both concepts to be true. 22

Over recent decades, the field of linguistics has developed a robust specialization in areas that pay primary attention to the application of a full range of legal and nonlegal verbal, digital, and document communication that is at the heart of equitable communication strategies. Language variation is also a highly politicized behavior, extending from the construct of a “standardized language” considered essential for writing and speaking to the use of language in negotiating the administration of social and political justice. The essays on linguistic variation and sociopolitical ideology, by Curzan and coauthors, Jonathan Rosa and Nelson Flores, and H. Samy Alim, examine both the ideological underpinnings of consensual constructs such as “standard” versus “nonmainstream” and their use in the political process of persuasion and sociopolitical implementation. 23 The authors in this section address key issues of language variation and language discrimination that demonstrate the vitality of language in issues of social justice, both independent of and related to other attributes of social justice. This model includes standardization in media platforms, as described in Rosa and Flores’s essay, demonstrating the systemic othering of those who do not speak this variety as their default dialect.

In “Rethinking Language Barriers & Social Justice from a Raciolinguistic Perspective,” Rosa and Flores show how “the trope of language barriers and the toppling thereof is widely resonant as a reference point for societal progress.”

We argue that by interrogating the colonial and imperial underpinnings of widespread ideas about linguistic diversity, we can connect linguistic advocacy to broader political struggles. We suggest that language and social justice efforts must link affirmations of linguistic diversity to demands for the creation of societal structures that sustain collective well-being. 24

Rosa and Flores present and update their raciolinguistics model in current spaces where race meets technology. With this emerging technology as a reference point, they demonstrate why “it is crucial to reconsider the logics that inform contemporary digital accent-modification platforms and the broader ways that purportedly benevolent efforts to help marked subjects modify their language practices become institutionalized as assimilationist projects masquerading as assistance.” They also note that disability has always been part of the story — and needs to be brought back to light — sharing that Mabel Hubbard and Ma Bell, who were both influential on modern linguistic technology, were deaf women. 25

In “Black Womanhood: Raciolinguistic Intersections of Gender, Sexuality & Social Status in the Aftermaths of Colonization,” Aris Moreno Clemons and Jessica A. Grieser “call for an exploration of social life that considers the raciolinguistic intersections of gender, sexuality, and social class as part and parcel of overarching social formations.” They center the Black woman as the prototypical Other, her condition being interpreted neither by conventions of race nor gender. As such, we take “Black womanhood as the point of departure for a description of the necessary intersecting and variable analyses of social life.” Clemons and Grieser “interrogate the intersections of gender, sexuality, and social status, focusing on the experiences of Black women who fit into and lie at the margins of these categories.” They highlight the work of semiotician Krystal A. Smalls, who “reveals a model for how interdisciplinary reading across fields such as Black feminist studies, Black anthropology, Black geographies, and Black linguistics can result in expansive and inclusive worldmaking.” 26

In “Asian American Racialization & Model Minority Logics in Linguistics,” Joyhanna Yoo, Cheryl Lee, Andrew Cheng, and Anusha Ànand “consider historical and contemporary racializing tactics with respect to Asians and Asian Americans.” Such racializing tactics, which they call model minority logics,

weaponize an abstract version of one group to further racialize all minoritized groups and regiment ethnoracial hierarchies. We identify three functions of model minority logics that perpetuate white supremacy in the academy, using linguistics as a case study and underscoring the ways in which the discipline is already mired in racializing logics that differentiate scholars of color based on reified hierarchies. 27

The authors consider the often-overlooked linguistic experiences of Asian Americans in linguistics and show how “ideological positioning of Asian Americans as ‘honorary whites’ is based on selective and heavily skewed images of Asian American economic and educational achievements that circulate across institutional and dominant media channels.” 28

In “Inventing ‘the White Voice’: Racial Capitalism, Raciolinguistics & Culturally Sustaining Pedagogies,” H. Samy Alim explores

how paradigms like raciolinguistics and culturally sustaining pedagogies, among others, can offer substantive breaks from mainstream thought and provide us with new, just, and equitable ways of living together in the world. I begin with a deep engagement with Boots Riley and his critically acclaimed, anticapitalist, absurdist comedy Sorry to Bother You in hopes of demonstrating how artists, activists, creatives, and scholars might: 1) cotheorize the complex relationships between language and racial capitalism and 2) think through the political, economic, and pedagogical implications of this new theorizing for Communities of Color. 29

Alim digs deep into models of aspirational whiteness in Sorry to Bother You and shows how it goes past the mark. In the script, Boots states, “It’s not really a white voice. It’s what they wish they sounded like. So, it’s like, what they think they’re supposed to sound like.” All of the authors in this section examine varied kinds of intervention strategies and programs in institutional education and social action that can raise awareness of and help to ameliorate linguistic subordination and sociolinguistic inequality in American society.

From our perspective, it is not sufficient to raise awareness and describe linguistic inequality without attempting to confront and ameliorate that inequality. Thus, our third and final set of papers, by John Baugh, Sharese King and John R. Rickford, and Norma Mendoza-Denton, offers legal and policy alternatives that implement activities and programs that directly confront issues of institutional inequality. As linguist Jan Blommaert puts it, “we need an activist attitude, one in which the battle for power-through-knowledge is engaged, in which knowledge is activated as a key instrument for the liberation of people, and as a central tool underpinning any effort to arrive at a more just and equitable society.” 30 Our authors illustrate the communicative processes involved when we use our human capacity for language to work toward justice.

In “Linguistic Profiling across International Geopolitical Landscapes,” Baugh “explore[s] various forms of linguistic profiling throughout the world, culminating with observations intended to promote linguistic human rights and the aspirational goal of equality among people who do not share common sociolinguistic backgrounds.” 31 Baugh extends his previous work on linguistic profiling into the international geopolitical landscape and notes, in countries that have them, the role that language academies play in reinforcing narrow norms, showing how those practices relate to practices in countries where these processes are more organic and situated in the educational systems.

In “Language on Trial,” King and Rickford draw on their case study of the testimony of Rachel Jeantel, a close friend of Trayvon Martin, in the 2013 trial of George Zimmerman v. The State of Florida. 32 They show that despite being an ear-witness (by cell phone) to all but the final minutes of Zimmerman’s interaction with Trayvon, and despite testifying for nearly six hours about it, her testimony was dismissed in jury deliberations. “Through a linguistic analysis of Jeantel’s speech, comments from a juror, and a broader contextualization of stigmatized speech forms and linguistic styles,” they show that “lack of acknowledgment of dialectal variation has harmful social and legal consequences for speakers of stigmatized dialects.” 33 Their work complements legal scholar D. James Greiner’s essay on empiricism in law, from a previous volume of Dædalus, to show how empirical linguistic analysis should be included in such models. 34 As King and Rickford state:

Alongside the vitriol from the general public, evidence from jury members suggested that not only was Jeantel’s speech misunderstood, but it was ultimately disregarded in more than sixteen hours of deliberation. With no access to the court transcript, unless when requesting a specific playback, jurors did not have the materials to reread speech that might have been unfamiliar to most if they were not exposed to or did not speak the dialect. 35

In “Currents of Innuendo Converge on an American Path to Political Hate,” Norma Mendoza-Denton shows that politicians’ “innuendo such as enthymemes, sarcasm, and dog whistles” gave us “an early warning about the type of relationship that has now obtained between Christianity and politics, and specifically the rise of Christian Nationalism as facilitated by President Donald Trump.” She demonstrates that “two currents of indirectness in American politics, one religious and the other racial, have converged like tributaries leading to a larger body of water.” 36

Anne H. Charity Hudley concludes the collection with “Liberatory Linguistics,” offering the model as “a productive, unifying framework for the scholarship that will advance strategies for attaining linguistic justice [ . . . ] [e]merging from the synthesis of various lived experiences, academic traditions, and methodological approaches.” She highlights promising strategies from her work with Black undergraduates, graduate students, postdoctoral scholars, and faculty members as they endeavor to embed a justice framework throughout the study of language broadly conceived that can “improve current approaches to engaging with structural realities that impede linguistic justice.” 37 Charity Hudley ends by noting how this set of essays is in conversation with the 2022 Annual Review of Applied Linguistics on social justice in applied linguistics, and the forthcoming Oxford volumes Decolonizing Linguistics and Inclusion in Linguistics, which “set frameworks for the professional growth of those who study language and create direct roadmaps for scholars to establish innovative agendas for integrating their teaching and research and outreach in ways that will transform linguistic theory and practice for years to come.” 38

As our summaries suggest, this collection of essays is diverse and comprehensive, representing a range of situations and conditions calling for justice in language. We hope these essays, along with other publications on this topic, broaden the conversations across higher education on language and justice. We are extremely grateful to the authors who have shared their knowledge, research, advocacy, and perspectives in such lucid, accessible presentations.

  • 1 See, for example, the statement by the Linguistic Society of America, “ LSA Statement on Race ,” May 2019.
  • 2 Rosina Lippi-Green, English with an Accent: Language, Ideology, and Discrimination in the United States (New York: Routledge, 2012).
  • 3 R. L. G., “ The Last Acceptable Prejudice ,” The Economist , January 29, 2015.
  • 4 Cecilia Menjívar, “ The Racialization of ‘Illegality,’ ” Dædalus 150 (2) (Spring 2021): 91–105; Jessica F. Green, “ Less Talk, More Walk: Why Climate Change Demands Activism in the Academy, ” Dædalus 149 (4) (Fall 2020): 151–162; D. James Greiner, “ The New Legal Empiricism & Its Application to Access-to-Justice Inquiries ,” Dædalus 148 (1) (Winter 2019): 64–74; Irene Bloemraad, Will Kymlicka, Michèle Lamont, and Leanne S. Son Hing, “ Membership without Social Citizenship? Deservingness & Redistribution as Grounds for Equality ,” Dædalus 148 (3) (Summer 2019): 73–104; and Sandy Baum and Michael McPherson, “ The Human Factor: The Promise & Limits of Online Education ,” Dædalus 148 (4) (Fall 2019): 235–254.
  • 5 Norman Fairclough, Language and Power , 2nd ed. (New York: Routledge, 2001).
  • 6 Stephany Brett Dunstan, Walt Wolfram, Audrey J. Jaeger, and Rebecca E. Crandall, “Educating the Educated: Language Diversity in the University Backyard,” American Speech 90 (2) (2015): 266–280.
  • 7 Anne Curzan, Robin M. Queen, Kristin VanEyk, and Rachel Elizabeth Weissler, “ Language Standardization & Linguistic Subordination ,” Dædalus 152 (3) (Summer 2023): 31.
  • 8 Kendra Nicole Calhoun, “ Competing Discourses of Diversity and Inclusion: Institutional Rhetoric and Graduate Student Narratives at Two Minority Serving Institutions ” (PhD diss., University of California, Santa Barbara, 2021).
  • 9 Walt Wolfram, “ Addressing Linguistic Inequality in Higher Education: A Proactive Model ,” Dædalus 152 (3) (Summer 2023): 37.
  • 10 William Labov, A Study of Non-Standard English (Washington, D.C.: Center for Applied Linguistics, 1969); Wesley Y. Leonard, “ Refusing ‘Endangered Languages’ Narratives ,” Dædalus 152 (3) (Summer 2023); Guadalupe Valdés, “ Social Justice Challenges of ‘Teaching’ Languages ,” Dædalus 152 (3) (Summer 2023); and Julia C. Fine, Jessica Love-Nichols, and Bernard C. Perley, “ Climate & Language: An Entangled Crisis ,” Dædalus 152 (3) (Summer 2023): 84–98.
  • 11 American Academy of Arts and Sciences, America’s Languages: Investing in Language Education for the 21st Century (Cambridge, Mass.: American Academy of Arts and Sciences, 2017).
  • 12 Paul Simon, The Tongue-Tied American: Confronting the Foreign Language Crisis (New York: The Crossroad Publishing Company, 1980).
  • 13 American Academy of Arts and Sciences, America’s Languages , vii.
  • 14 Ibid., 4.
  • 15 Ibid., 6.
  • 16 Valdés, “Social Justice Challenges of ‘Teaching’ Languages,” 53.
  • 18 Leonard, “Refusing ‘Endangered Languages’ Narratives,” 69.
  • 20 Fine, Love-Nichols, and Perley, “Climate & Language,” 84.
  • 22 Dwight Bolinger, Language: The Loaded Weapon—The Use and Abuse of Language Today (New York: Routledge, 2021).
  • 23 Curzan et al., “Language Standardization & Linguistic Subordination”; Jonathan Rosa and Nelson Flores, “ Rethinking Language Barriers & Social Justice from a Raciolinguistic Perspective ,” Dædalus 152 (3) (Summer 2023): 99–114; and H. Samy Alim, “ Inventing ‘The White Voice’: Racial Capitalism, Raciolinguistics & Culturally Sustaining Pedagogies ,” Dædalus 152 (3) (Summer 2023): 147–166.
  • 24 Rosa and Flores, “Rethinking Language Barriers & Social Justice from a Raciolinguistic Perspective,” 99.
  • 25 Ibid., 101–102.
  • 26 Aris Moreno Clemons and Jessica A. Grieser, “ Black Womanhood: Raciolinguistic Intersections of Gender, Sexuality & Social Status in the Aftermaths of Colonization ,” Dædalus 152 (3) (Summer 2023): 115, 117, 119, 124.
  • 27 Joyhanna Yoo, Cheryl Lee, Andrew Cheng, and Anusha Ànand, “Asian American Racialization & Model Minority Logics in Linguistics,” Dædalus 152 (3) (Summer 2023): 130.
  • 28 Ibid., 134.
  • 29 Alim, “Inventing ‘The White Voice,’” 147.
  • 30 Jan Blommaert, “ Looking Back, What Was Important? ” Ctrl+Alt+Dem, April 20, 2020.
  • 31 John Baugh, “Linguistic Profiling across International Geopolitical Landscapes,” Dædalus 152 (3) (Summer 2023): 167.
  • 32 John R. Rickford and Sharese King, “ Language and Linguistics on Trial: Hearing Rachel Jeantel (and Other Vernacular Speakers) in the Courtroom and Beyond ,” Language 92 (4) (2016): 948–988.
  • 33 Sharese King and John R. Rickford, “ Language on Trial ,” Dædalus 152 (3) (Summer 2023): 178.
  • 34 Greiner, “The New Legal Empiricism & Its Application to Access-to-Justice Inquiries.”
  • 35 King and Rickford, “Language on Trial,” 181.
  • 36 Norma Mendoza-Denton, “ Currents of Innuendo Converge on an American Path to Political Hate ,” Dædalus 152 (3) (Summer 2023): 194.
  • 37 Anne H. Charity Hudley, “Liberatory Linguistics,” Dædalus 152 (3) (Summer 2023): 212.
  • 38 Alison Mackey, Erin Fell, Felipe de Jesus, et al., “Social Justice in Applied Linguistics: Making Space for New Approaches and New Voices,” Annual Review of Applied Linguistics 42 (2022): 1–10; Anne H. Charity Hudley and Nelson Flores, “Social Justice in Applied Linguistics: Not a Conclusion, but a Way Forward,” Annual Review of Applied Linguistics 42 (2022): 144–154; Anne H. Charity Hudley, Christine Mallinson, and Mary Bucholtz, eds., Decolonizing Linguistics (Oxford: Oxford University Press, forthcoming); and Anne H. Charity Hudley, Christine Mallinson, and Mary Bucholtz, eds., Inclusion in Linguistics (Oxford: Oxford University Press, forthcoming).

Language and white supremacy.

  • Jennifer Roth-Gordon, University of Arizona
  • https://doi.org/10.1093/acrefore/9780190854584.013.591
  • Published online: 19 April 2023

White supremacy is a racial order that relies on a presumed “natural” superiority of whiteness and assigns to all groups racialized as non-white biological or cultural characteristics of inferiority. Despite decades of scientific studies refuting these claims, beliefs in racial difference continue to rely on ideas of innate or genetic differences between groups. Scholars now widely agree that race is a social, cultural, and political distinction that was and continues to be forged through relations of transatlantic slavery, colonialism, and imperialism. A focus on white supremacy does not limit scholars to the study of white supremacists, that is, those individuals and groups that outwardly espouse a racial order that privileges whiteness and white people and frequently endorse physical violence to maintain this order. Under white supremacy, societies privilege whiteness even in the absence of explicit laws and sometimes while promoting ideologies of racial inclusion and equality. Contexts of white supremacy feature the consolidation of white power and wealth at the expense of people of color—an arrangement that is maintained through racial capitalism, settler colonialism, anti-blackness, imperial conquest, Islamophobia or anti-Muslim racism, and xenophobic or anti-immigrant sentiment. Widespread awareness of linguistic difference can be mobilized to support these pillars of white supremacy through a range of official language policies and overt acts of linguistic suppression, as well as more covert or subtle language practices and ideologies. 
While the term “white supremacy” has gained broader circulation in the 21st century, these topics have been studied by linguistic anthropologists and sociolinguists for decades under the more familiar headings of “race and language,” “racism and language,” and “raciolinguistics.” This scholarship examines how racial domination is consolidated, maintained, and justified through attention paid to language, but also the ways that marginalized speakers take up a broad range of linguistic practices to challenge assumptions about the superiority of whiteness and emphasize non-white racial pride, community ties, and cultural and linguistic heritage and traditions.

Racial and linguistic hierarchies work together to falsely connect whiteness and the use of “standard” (officially sanctioned) language with rationality, intelligence, education, wealth, and higher status. Under these racial logics, speakers of languages associated with non-whiteness are readily linked to danger, criminality, a lack of intelligence or ability, primitivism, and foreignness. Together these ideologies naturalize connections between languages or specific linguistic practices and types of people, producing the conditions under which racialized speakers experience discrimination, marginalization, exclusion, oppression, and violence. At the same time, speakers challenge these power dynamics through linguistic practices that range from codeswitching, bilingualism and multilingualism, and language revitalization efforts, to verbal traditions both old and new, including social media genres. Though racial hierarchy continues to be bolstered by a linguistic hierarchy that assigns higher value to English as well as other European or colonial languages, linguistic variation persists, as speakers proudly embrace linguistic practices that defy the push to assimilate or submit to language loss. Beliefs in the superiority of whiteness have global resonance, but local specificities are important, and a majority of research has thus far been conducted within the context of the United States. Scholars who study language, racial inequality, and oppression continue to weigh in on public policies and debates in an attempt to raise awareness on these issues and advocate for racial and social justice.

  • racial inequality
  • racial hierarchy
  • racial identity
  • raciolinguistics
  • language standardization
  • language ideologies
  • white supremacy

The Linguistic Construction of Race and Racial Hierarchy

As a racial order, white supremacy does not merely attribute positive characteristics to white people and negative supposedly “innate” traits to people of color; it also constantly reminds people of this ranking of human difference as part of a “racializing social imaginary” (Dick and Wirtz 2011). The study of racialization as a process helps us see the co-construction of race and language and why racial categories cannot be taken for granted as describing preexisting racial groups (Charity Hudley 2017). Instead, scholars analyze how race is constructed in linguistic interactions through linguistic and cultural ideologies about types of people. Research on racialization, linguistic stances toward whiteness, covert racializing discourse, and mock languages shows how these connections are embedded in daily life through a range of interactions that may not appear, on the surface, to be addressing race at all.

Racialization

Race is an invented and imagined form of social difference, only loosely based on phenotypic or visual cues, that has the real-world effects of separating, categorizing, and ranking groups of people in society. Racialization describes the process through which this difference is created and then naturalized. As Wirtz (2014) notes, “Racialization processes create a circular logic in which racial ascriptions seem always to be already present” (88). Chun and Lo (2016) define linguistic racialization as “the sociocultural processes through which race—as an ideological dimension of human differentiation—comes to be imagined, produced, and reified through language practices” (220). Thus, even as race tends to be popularly imagined in terms of obvious visual cues linked to putative invisible properties, scholarship on language and white supremacy draws attention to the ways that race is very often imagined and projected through spoken, signed, and written language. As Alim, Rickford, and Ball (2016) remind us with the blended and now popular term “raciolinguistics,” language is a powerful site through which racial meanings are produced and perpetuated. Without everyday linguistic, cultural, and other semiotic practices through which racial meaning is established, circulated, and sometimes challenged, race would cease to exist as a way of organizing our world.

One early example of research on linguistic racialization is Urciuoli’s (2013) ethnography of working-class Puerto Ricans living in New York. In this classic text, Urciuoli shows how “good English” comes to be associated with the speech of white, middle-class Americans while the speaking of Spanish—including any observable or perceived Spanish-English bilingualism—takes on connotations of danger and disorder. Her excellent discussion of race and ethnicity (in the chapter entitled “Racialization and Language”) shows how these constitute important points of contrast in the US context: Groups that are “ethnicized” are understood to demonstrate only safe, “cultural” differences from unmarked white Americans and are viewed as more assimilable into American society. By contrast, groups that are racialized as non-white are understood to have innate racial differences that make them dangerous threats to the nation. Under this still-present logic, some accents, typically those associated with white Europeans, may be perceived as “cute” or “exotic,” while non-white immigrants, often from Latin America and Asia, are described as linguistically unassimilable due to their public use of languages that threaten white English-speaking spaces (see also Hill 2008). Immigration policies can also produce and naturalize racialized difference, in particular through the widespread use of terms such as “illegal aliens” that allow white Americans to participate in a “legal racialization” of Mexican immigrants as dangerous racial “Others” (Dick 2011b). Anxieties surrounding migration also promote discourses of racialization that turn people into “distinct and hierarchically ranked kinds” (Dick 2011a, 227).

A wide range of practices can be taken up by individuals themselves to engage in racialization—both to reinforce racial differentiation between groups and as acts of racial identification and community belonging. Mendoza-Denton (2008) shows how Latina girls in Northern California choose English or Spanish to engage in a “racialized nationalism” that indicates their connections to “Norteña” (Northern) versus “Sureña” (Southern) gangs that are also aligned with an imagined white United States in contrast to an imagined non-white Mexico. She tracks their careful juxtaposition of cultural, linguistic, and embodied practices—including using different types of makeup to lighten or darken one’s skin color—to map the Global North and Global South onto their bodies in a process she calls “hemispheric localism.” In a later study, Mendoza-Denton (2011) illustrates how linguistic features such as “creaky voice” can be used to index or point to the desired image of a “hardcore Chicano gangster” (by Chicano gang members themselves), but also racialize them as dangerous, macho criminals (by outsiders). In an ethnographic study of a Chicago high school, Rosa (2019) describes how Latinx students can be associated with Spanish, even when speaking English, in what he describes as “looking like a language, sounding like a race.”

The normalization of connections between types of people, the languages they speak, and their ranking in society is also conveyed through mass media. In her study of how Indigenous peoples in the Amazon are portrayed in a wide range of films, Graham (2020) describes how Amazonian languages are treated as interchangeable, and speakers are often represented through unintelligible acoustic cues. She argues that this persistent portrayal of a lack of language creates a sense of “linguistic primitivism,” rendering Indigenous speakers closer to nature and naturally subordinate to speakers of “civilized” languages. Studies of linguistic racialization thus draw out racial ideologies that link languages, and specific linguistic features or practices, to groups of speakers and their supposedly “innate” characteristics or racial capacities—such as a tendency toward criminal behavior or a capacity to become “civilized.”

Speakers learn to navigate, and continue to be impacted by, racialization and the ranking of races and languages. For example, in a study of young Dominican Americans growing up in Providence, Rhode Island, Bailey (2002) illustrates how bilingual English/Spanish speakers choose to emphasize their status as fluent Spanish speakers. Their construction of a locally significant racial category of “Spanish” serves to distance them from the more stigmatized identification as “Black.” Drawing on ethnographic research conducted in Cuba, Wirtz (2014) brings together a wide range of images, religious rituals, and state-sanctioned carnival and folkloric performances to ask how similarly stigmatized notions of blackness are semiotically constructed at multiple levels, including through language. Unpacking embodied moments of racialization or “racializing performances” through which racial figures are created in colonial time-spaces or chronotopes, she explores how the racialized figure of the bozal reifies ideas of an “untamed” and parodied African-born slave (Wirtz 2020). She urges us to see how these instances of symbolic violence make possible the physical violence on which (ongoing) colonial domination depends. Also highlighting the central role of language in colonial contexts, Carlan (2018) examines the racial and linguistic ideologies that upheld the categorizing and ranking of dialects, including by evolutionary stage, in the Linguistic Survey of India (1903–1928). In this case, the British colonial entextualization of overdetermined connections between linguistic and racial difference continues to influence 21st-century struggles over political recognition in India.

Linguistic Stances toward Whiteness

Some studies of linguistic racialization have focused on relationships to whiteness to show how languages and their speakers are assigned differential values within linguistic and racial hierarchies. Contributors to an early collection ( Trechter and Bucholtz 2001 ) show how linguistic anthropology’s focus on identity and ideology offers powerful theoretical insights into the construction of whiteness and to the field of critical whiteness studies. In a now classic ethnographic study of white high school students in the United States, Bucholtz (2011) carefully tracks how youth linguistically identify with and distance themselves from racialized styles that often rely on linguistic features for their social recognizability. For example, while “nerds” draw on superstandard forms of English and diligently avoid slang, white hip-hop fans embrace African American Language (AAL), including slang, to align themselves with “cool” masculinity. Kiesling (2001) also finds that white fraternity men employ linguistic strategies, such as the borrowing of AAL, to mark racial difference and claim whiteness in indirect ways. While these studies argue that racialized linguistic forms can be used by white speakers to highlight and perform their own whiteness, Roth-Gordon (2011) suggests that white English speakers in the United States can also play with linguistic racialization, choosing to make direct connections to Spanish that temporarily distance them from whiteness. As speakers draw on naturalized connections between race and language within interactions, they continuously reestablish and renegotiate their positions within linguistic and racial hierarchies.

The meanings of these strategic maneuverings are not solely determined by speakers themselves. Contributors to the collection “Beyond Yellow English” (Reyes and Lo 2009) examine how language use can position Asian Pacific Americans as “forever foreigners,” “honorary whites,” or “inauthentic” speakers of heritage languages. In an ethnographic account in Rio de Janeiro, Brazil, Roth-Gordon (2017) shows how city residents “read” the body for racial signs, which include both visual and acoustic cues. Slang speaking, in particular, can dangerously align poor, dark-skinned youth who live in favelas (shantytowns) with blackness and criminality, while the avoidance of slang and reliance on a more “standard” Portuguese offers improved chances for the recognition of citizenship rights, including protection from police violence. In a study of Brazilian hip-hop, Roth-Gordon (2013) finds that the use of racialized language can sometimes destabilize unspoken preferences for whiteness. When rappers and rap fans draw on Black aesthetics, patterns of global/US consumption, and rap lyrics spoken within daily conversation, they transform their “sensory regimes” in a display of modern blackness that seeks to reject imperatives to “whiten” in a society that claims to ignore signs of racial difference.

Research conducted outside the US context often shows how racial categorization is not based solely, or even primarily, on phenotype. Exploring how the construction of racial difference in Peru hinges on an evaluation of one’s education, “cultural superiority,” and whiteness, contributors to Back and Zavala (2019) illustrate how linguistic practices allow speakers to convey and create distinctions among a Peruvian population that cannot easily rely on physical appearance to explain inequality. In South Africa, Williams and Stroud (2015) break down a rap battle to show how Colored (mixed-race) and white racial positions are interactionally negotiated in the hip-hop scene and how features of AAL can be reindexicalized to represent a transnational whiteness. In a study of passing and “linguistic whitening,” Telep (2022) examines the bodily and linguistic practices of a Cameroonian female activist in Paris, France, and shows how multiple semiotic cues project a “dominant Black woman ethos” and a “cosmopolitan Black beauty” that partially conforms to white Western notions of beauty. Linguistic anthropologists thus interrogate the role that language plays in constructing and normalizing racial order through a careful investigation of strategic moment-by-moment linguistic choices that position speakers in relation to whiteness.

Covert Racializing Discourse and Mock Languages

While race and racial hierarchy are linguistically constructed through the use and interpretations of racialized language, racialized linguistic choices are not always conscious and intentional on the part of speakers, nor are the racial implications always obvious to listeners. In her groundbreaking text, The Everyday Language of White Racism, Hill (2008) introduces the idea of “covert racist discourse” and walks readers through a wide variety of linguistic strategies that highlight racial difference, from slurs to gaffes, metaphors, mocking, and linguistic appropriation. She emphasizes that these discursive features can range from more obviously racist to less obviously racist, and often allow a speaker to defend themselves as someone who does not have racist intentions. Through a linguistic analysis of the presuppositions and ideologies behind these utterances, however, Hill (2008) shows how they continue to “reproduce racializing stigma, protect White virtue, and advance White privilege by denying the existence of White racism” (88). In a special issue dedicated to the topic of covert racializing discourse, Dick and Wirtz (2011) and fellow contributors take up this topic, asking “how race persists and moves across scales of imagination and interaction” (E7) and how it comes to be seen as a natural category that upholds white supremacy.

In her best-known example of covert racial discourse, Hill (1993, 2008) takes up the study of “Mock Spanish,” collecting examples of Spanish phrases and pseudo-Spanish exaggerations and imitations in greeting cards, advertisements, movies, and daily speech. She seeks to explain why white monolingual speakers embrace tokens of Spanish while living and speaking in the US Southwest, a political climate that tolerates increasing hostility toward Mexicans, Mexican Americans, and other Spanish speakers. She finds that though Spanish can sometimes be employed in contexts that are associated with wealth and cosmopolitanism (such as in street or store names in fancy developments), Mock Spanish more often relies on linguistic strategies such as hyperanglicization (“moochas grassy-ass”), pejoration (“hasta la vista, baby”), and ungrammaticality (“el cheapo”) to denigrate Spanish and Spanish speakers through layers of direct and indirect indexicality. Linguistic anthropologists and sociolinguists have further investigated her insight that monolingual speakers choose to incorporate other languages and linguistic varieties into their speech to subtly attack non-white speakers and reassert linguistic and racial dominance. In one notable example, Barrett (2006) shows how English-speaking managers in a Mexican restaurant often resort to limited or ungrammatical uses of Spanish, undermining the comprehension of their Spanish-speaking employees while reinforcing racial and linguistic hierarchies.

These strategic linguistic choices are especially powerful when they comment on salient debates such as the US Oakland Ebonics controversy in the late 1990s, when the American public discussed the possibility of overtly recognizing—but not teaching—African American English in schools where Black students were being undereducated ( Perry and Delpit 1998 ). In these cases, as Ronkin and Karn (1999) note, more egregious and obvious forms of linguistic mockery helped uphold anti-black attitudes and thwart efforts to improve educational equity (see also Rickford and Rickford 2000 ). Decades later, white linguistic performances imitating AAL continue to circulate online and in movies as forms of “linguistic minstrelsy” ( Bucholtz and Lopez 2011 ). Meek (2006) similarly documents the widespread mockery of Native Americans in mass media, through a code she calls Hollywood “Injun” English (HIE). She explains how HIE linguistically constructs “the White Man’s Indian” as foreign, silent, incompetent, and child-like—thus defending white control and contributing to the goals of settler colonialism. As Chun (2004) shows through an analysis of Korean American comedian Margaret Cho’s use of Mock Asian, in-group imitations sometimes attempt to complicate these often thinly veiled attempts to reassert white linguistic superiority by offering alternative readings of how “accented” or “broken” English sounds when spoken within one’s own family (see also Tan 1995 ). Representations of language thus constitute a powerful tool for negotiating racial hierarchy within white supremacist societies.

The Linguistic Defense of Whiteness and White Power

In the construction of social and racial dominance, language has long been used through processes of coercion and consent to buttress white positions of power in a global racial hierarchy ( McIntosh 2020 ). Language standardization is one notable example of how power relations are linguistically established, defended, and justified, often through institutions like schools. Language standardization attempts to shut down linguistic variation, including multilingualism or stigmatized varieties, to reify and naturalize assumptions about the superiority of a select—often white/European—group of speakers. Standard language ideology, which includes “common sense” beliefs in the importance or necessity of language standardization, thus works together with linguistic racialization to connect racial and linguistic hierarchies.

Standard Language Ideology

In her exceptionally comprehensive and accessible text, English with an Accent, Lippi-Green (2012) lays out for readers the misconceptions that support the idea that a “standard language” exists, in any language, and explains the very real reasons that certain, generally white, speakers benefit from upholding and defending this linguistic ideal. In and beyond the US context, where much of the research on race and language is conducted, ideas about “correct speaking” encourage those who have acquired these rules to diligently protect them. Alignments with “correctness” may often disingenuously suggest that other (racialized) speakers need only ditch their “accents,” their multilingualism, or their codeswitching, to earn similar levels of prestige, power, and access to resources. This more subtle level of coercion allows English to become “misrecognized,” or falsely viewed, as the key to a white, and thus American, identity, as Cho (2021) carefully documents through an examination of the diaries of Yun Chi-Ho, a prominent Korean intellectual living in the United States in the late 19th century. In another historical study of the making of “standard American,” Bonfiglio (2002) reveals how racism and xenophobia created a desire for distance from the English spoken by Eastern European immigrants, including people identifying as Jewish. This process of differentiation helped solidify ideas of proper pronunciation and the ideal of an “accentless” speaker associated with the purity of the white Anglo-Saxon race. In later work, Bonfiglio (2010) examines how a nationalist conflation of race and language continues to promote and perpetuate ethnolinguistic discrimination through non-neutral metaphors such as “mother tongue” and “native speakers.”

Illustrating how white supremacy engages in the linguistic dehumanization of people of color through language standardization, Vigouroux (2017) shows how Africans are mocked in France through the circulation of one racialized expression, “ y’a bon ” (there’s good—an example of so-called “broken French”), which linguistically stereotypes Africans as incompetent speakers of a more “civilized” language. Similarly attending to public mockery and outcry over the use of languages/language varieties other than English in the United States, Hill (2001) follows “language panics” including the Oakland Ebonics controversy and English-Only initiatives. On the surface, these metalinguistic debates appeared to take up neutral and straightforward questions about language competence, correctness, enhanced communication, and unity. She shows how these public discussions about language instead work as white racist projects that elevate white people, whiteness, and white ways of speaking at the expense of people of color (see also Hill 2008 ).

If language standardization helps naturalize notions of ideal speakers and “correct speech,” it can also bind language to race, patriotism, and national belonging. In the text Language in Immigrant America, Baran (2017) provides an overview of the history of the United States’ relationship to accented English, multilingualism, and codeswitching, critiquing the ways English monolingualism has been normalized as a requirement of Americanness. The push toward English and these (un)spoken requirements for full inclusion in the nation-state have precipitated language loss among many ethnicized and racialized groups, as well as the “dispossession” of Spanish for millions of US residents and citizens (Aparicio 2000). Perea (1998) has described the feelings of invisibility and erasure that English dominance inflicts on racialized Others, as white people who fear that “their” country is being overtaken by the presence of Spanish respond by discouraging and removing positive representations of Spanish speaking. He argues that language plays a critical role in exclusionary practices and the “symbolic deportation” of Latinx people, particularly from public spaces, and he laments the experience of “death by English . . . a death of the spirit, the slow death that occurs when one’s own identity is replaced, reconfigured, overwhelmed, or rejected by a more powerful, dominant identity” (Perea 1998, 583).

Describing how standard English carries in it “the sound of slaughter and conquest,” hooks (1995) poignantly lays out the connections between language and racial domination:

I know that it is not the English language that hurts me, but what the oppressors do with it, how they shape it to become a territory that limits and defines, how they make it a weapon that can shame, humiliate, colonize. (296)

In a study of linguistic oppression and state racism in northeast Tibet, Roche (2021) theorizes a “lexical necropolitics” which weaponizes lexical purism against minority languages and leaves racial/ethnic groups struggling to preserve language and identity “under conditions of material deprivation, grinding everyday violence, hyper-surveillance, and suspicion” (118). Standard language ideology can work hand-in-hand with racialized violence.

“Standard English,” English Only, and Monolingualism in US Institutions

Within the US context, the consolidation of white power, wealth, and prestige has been furthered by repeated attacks on languages other than standard English, primarily Spanish and AAL. These disparaging attitudes are often institutionalized in policies and even encoded into law. Since the early 1980s, English-Only legislation that would make English the official language of the United States has enjoyed high levels of political support. Such legislation has been implemented at the state level across dozens of states, and the resulting context of “English only” has played a strong role in reifying and naturalizing beliefs in the inferiority of native Spanish speakers. From this vantage point, Spanish is valued only when acquired as a second “foreign” language by European American students (Aparicio 2000). Valdés (1997) argues that assumptions about monolingualism harm members of immigrant communities by prioritizing the linguistic norms associated with elective bilinguals (who opt to learn a language later in life) over those associated with circumstantial bilinguals (who acquire a second language by necessity). She illustrates how this linguistic bias toward monolingualism fails to compensate bilingual speakers for their skills and the additional work that they do and can even make unfair and harmful requests of them, as in situations when bilingual Latinx jurors are asked to ignore part of their linguistic competencies and focus only on English input.

Calling out this “paradox of bilingualism,” Ruiz (2016) notes that “other languages are to be pursued by those who don’t have them, but they are to be abandoned by those who do” (182). Within this worldview, speakers are asked to relinquish any personal or intimate connection to languages that are better thought of as “utilitarian” and “academic” tools. Mena and García (2020) also caution against “conceptualizing language(s) and multilingualism as linguistic ‘assets’ and marketable global resources, which tends to undermine political struggles for equity of language minoritized populations in the United States and around the world” (344). Defending the various forms of Spanish and English spoken in the United States, Fought (2003) describes the grammaticality of Chicano English which, while influenced by Spanish, is not the same as Spanglish and is spoken by Mexican Americans who are English dominant. She shows how speakers and their new linguistic varieties manage to flourish and thrive, even under the harsh gaze of standard language ideology and the push toward English-Only monolingualism.

Growing up within this complicated sociopolitical context, new Mexican immigrant children in the US experience empowerment and ethnicization (as “good ethnics” because of their abilities to speak English) but may also be subjected to racial surveillance in white public space where they frequently find themselves serving as translators for Spanish-speaking relatives ( Reynolds and Orellana 2009 ). “English monoglotism” ( Haviland 2003 ) can be legislated and coercively upheld, denying Indigenous people from Mexico the right to court translation in their native languages and refusing Latinx people the right to freely converse in Spanish at their place of work. Haviland coins the term “linguistic paranoia” to illustrate how speaking a language other than English can be interpreted as inherently insulting or threatening to white people.

In his studies of “linguistic profiling,” Baugh (2015) similarly describes the negative repercussions for speakers of AAL when listeners assign race to the voices they hear, even in the absence of visual and cultural cues, and draw on readily available racial stereotypes and ideologies. He presents findings from the National Fair Housing Alliance that show the role of linguistic bias in housing discrimination experienced by African Americans and also Latinx house seekers. Conducting surveys and interviews with African Americans who filed complaints against co-workers speaking Spanish on the job, Zentella (2014) deepens our understanding of the complicated racial politics that surround acts of linguistic profiling and the linguistic oppression of the Latinx community. Ultimately, attacks on speakers for “talking while bilingual” ( Zentella 2014 ) or “speaking while Black” ( Baugh 2015 ) work together to keep white people’s position on top of racial and linguistic hierarchies firmly in place.

This unspoken privileging of “standard English,” and the real-world consequences for racialized speakers, can sometimes play out on a national stage, broadening the reach of the defense of white ways of speaking. Examining the well-publicized case against George Zimmerman for the murder of Trayvon Martin, Rickford and King (2016) meticulously document the silencing and dismissal of the testimony offered by key witness Rachel Jeantel, a fluent speaker of AAL with possible Caribbean influence who was made to seem linguistically, and cognitively, incompetent. They decry the discriminatory impact of dialect unfamiliarity that dangerously aligns with structural racism and anti-black racial ideologies. Baugh (2018) encourages the use of “forensic linguistics” to aid speakers of nonstandard language varieties during trials. His book, Linguistics in Pursuit of Justice, covers “ear-witness” testimonies, linguistic harassment, discrimination, and a range of ways that groups of people become linguistically disenfranchised, and not only due to race.

Linguistic Oppression in Schools

Institutional support for English-Only policies, monolingualism, and standard language ideology also extends beyond places of work, the housing market, and the court system to strongly impact school-aged children. The field of linguistic anthropology has long explored how white discourse practices can silence, marginalize, and oppress children and communities of color while rewarding students socialized into white middle-class linguistic norms. This research includes classic studies such as Philips (1983), which describes how the lack of recognition of critical differences between “Anglo” and Indigenous participant structures, or the expected norms surrounding participation in interactions, impedes learning for Indigenous students. In another important study published in the same year, entitled Ways with Words, Heath (1983) describes how children growing up in a white working-class community and a Black working-class community acquire and use language differently, but they differ most strongly from the white, and Black, middle-class children in a nearby community who are socialized into the narrative and literacy skills prioritized by schools.

Within the field of sociolinguistics, scholars such as Labov (1972) had already begun paying careful attention to how the linguistic structure and communicative norms of speakers of what was then described as Black English Vernacular (BEV) also failed to live up to the expected linguistic behavior in white spaces like schools. Linguists were directly involved in important court cases (such as King v. Ann Arbor ) in which Black families sued schools for linguistic discrimination and educational malpractice against Black students ( Alim and Baugh 2007 ). Struggles to counteract the attack on Black language used in educational contexts to improve student engagement and learning continued into the 1990s with the Oakland Ebonics controversy ( Perry and Delpit 1998 ; Rickford and Rickford 2000 ).

This line of research continues into the 21st century with an eye toward reducing racial disparities in education and shedding light on embedded systems of structural racism. Scholars continue to call out the white linguistic norms that (un)intentionally defend the educational and future career success of white speakers from middle-class backgrounds. As part of her lifelong crusade for an “anthropolitical linguistics” that directly engages with issues of social and racial justice, Zentella has fought against linguistic descriptors such as “limited English proficient” and “linguistically isolated” which assume that all children growing up in the US should have access to adults in their home that speak English “very well” (based on a question on the census). As Zentella (2018) notes, “the category suggests that those who speak languages other than English are isolated because they do not speak the ‘right’ language ‘very well’” (218). Mendoza-Denton’s (2008) ethnography also explores how these state-mandated classifications impact Latinx students in California schools. Examining language education from a raciolinguistic perspective, Flores and Rosa (2015) critique the way an imagined white listener they call “the white listening subject” continues to uphold standard English speaking as the objective norm and positions English language learners, heritage language learners, and nonstandard language users as racial Others who must be taught to speak “appropriately.” They note that such speakers are forced to speak, learn, and learn to speak within a “raciolinguistic regime [that] combines monoglossic language ideologies and the white listening subject” (161).

Along similar lines, Rosa (2019) reveals how bilingualism comes to be associated with “languagelessness,” even, and especially, in American schools with high percentages of Latinx students. Here, students’ varied linguistic repertoires come to be seen not as a resource that they, or schools, can draw upon, but as a deficiency and lack of linguistic skills that justify their political and social exclusion from American society. Framing linguistic differences in terms of deficiency and lack is a persistent strategy that often implicates children of color and denies the clear patterns of structural racism in education. In the early 21st century, parents of non-white and low-income children across the United States have been blamed for a “language gap” that is purportedly found in the tens of thousands of additional words uttered to young children in white middle-class families and “missed” by their counterparts (Avineri et al. 2015). Scholarly critiques point out that white middle-class families shore up their positions of privilege whenever public policies and popular perception treat linguistic differences as individual and community “problems,” allowing schools to continue to reward the supposedly “superior” communicative skills acquired in white middle-class households.

The Pull toward Black Language

Researchers trained in linguistics and attentive to the impacts of racism and racial inequality began studying racialized communities in the US context after the Civil Rights movement in the 1960s (see, for example, Labov 1972 ). Sociolinguists began to theorize how Black speakers could be caught between competing forces: On the one hand, they experience a strong push toward standard languages; on the other, they might feel pulled toward stigmatized language varieties that allow them to connect with other Black speakers ( Smitherman 2006 ). The use of racialized linguistic features continues to subject speakers to experiences of anti-blackness ( Spears 2021 ), but also allows for joyful expressions of what Rickford and Rickford (2000) call “Spoken Soul.” Their well-cited and frequently assigned text (of the same name) breaks down the sounds, lexicon, and grammar of AAL in accessible and celebratory prose that also lays out the language variety’s history, development, and broader sociopolitical context.

Embracing African American Language and Culture

Alternately described as African American Vernacular English (AAVE), African American English (AAE), Ebonics, Black English (BE), and Black English Vernacular (BEV), AAL was initially studied through attention paid to the language of working-class urban Black male youth who were assumed to diverge the most strongly from what Smitherman (2006) has called the “Language of Wider (Whiter) Communication.” King (2020) sounds the call to continue to broaden this focus, which has been critiqued for upholding the idea of an “ideal” Black speaker. Recent studies reveal the more intricate and skillful ways a range of speakers employ AAL to align with blackness—both inside and outside of the US context. For example, Ibrahim (2014) establishes the importance of Black English as a Second Language (BESL) for African migrant youth learning how to “become Black” in Canada, partly through an affiliation with North American hip-hop culture and language.

Hip-hop has increasingly been recognized for its potential to give voice to historically disenfranchised youth, offering a platform for resistance and the expression of day-to-day experiences of social, political, and racial marginalization ( Morgan 2009 ). Described as constituting a “Global Hip Hop Nation” ( Alim, Ibrahim, and Pennycook 2009 ), hip-hop, and more specifically rap, has been recognized as part of a distinguished legacy of Black verbal art traditions through which African Americans, in particular, draw on African-inspired speaking styles. These traditions have thrived in the language of pastors and sermons in Black churches in the United States, and this powerful speech register has been picked up by celebrated public speakers from stand-up comedians like Richard Pryor ( Britt 2016 ) to former President Barack Obama ( Alim and Smitherman 2012 ). For both Pryor and Obama, fluency and skill in Black cultural and linguistic practices helped them navigate their appeal to an audience of white people less familiar with these traditions while still creating racial solidarity with Black audiences.

Weldon (2021) finds that, like Obama, bi-dialectal middle-class African Americans display linguistic flexibility and maintain patterns of linguistic divergence from the standard through “camouflaged” features that allow them to more subtly display racial identity and affiliation and resist assimilation. In an excellent chapter reviewing the history of the racially politicized (read: anti-black) context in which AAL speakers are forced to make careful choices, Lanehart (2015) reminds readers, “My language was and is my power: you see/hear the ‘me’ I choose to reveal” (864). In a study of the strong connections between AAL and space, Grieser (2022) examines how residents in a historically Black neighborhood in Washington, DC, that is undergoing class gentrification, use language to display and negotiate their sense of belonging. Similarly concerned with questions of class politics, but also addressing intersectionality, or how various social categories and experiences intersect and overlap, Lane (2019) explores how the term “ratchet” (a racialized, gendered, sexualized, and classed term, similar to the slang term “ghetto”) is taken up by middle and upper-class Black Queer women, also in Washington, DC, to debate the terms of Black normativity and respectability. Black speakers thus strategically employ specific terms and linguistic features to navigate complicated sociopolitical contexts.

Updating her previous work on AAL as a “counterlanguage,” Morgan (2002, 2020) shows how Black women’s language is employed to expose and counter racist and sexist ideologies. She examines the phrase “We/I don’t play,” and explains: “In the end, the counterlanguage ideology behind I don’t play is a reference to intentionality and morality and located somewhere between cold indifference to sexism, white supremacy and privilege and a declaration of war against it” (Morgan 2020, 282). Highlighting the powerful use of AAL by Black women in the public eye, Washington (2020) explores the discursive strategies of signifying and resignification (semantic reclamation), including Congresswoman Maxine Waters’ strategic assertion, “reclaiming my time.” Also pursuing the connections between racialized language and power, Eberhardt and Vdoviak-Markow (2020) examine the use of zero copula in Beyoncé songs as “an important symbolic resource for the assertion of an unapologetically Black and feminist persona” (68). Smalls (2018b) introduces the concept of “emphatic blackness” to celebrate the linguistic, semiotic, and digital strategies Black people embrace as an “enduring mode of survival” (55) to challenge anti-blackness in white public spaces. She engages in theorizing racial semiotics or “raciosemiotics” (Smalls 2020) to hear the “deafening declarations of an uncontested humanity” (Smalls 2018b, 58) and to publicly affirm that, “Black people using Black language with purpose and delight are acts of healing and resistance” (Smalls 2020, 254).

Avoiding Whiteness through Black Language

Both within and outside of the US, white speakers of prestige languages can make choices to codeswitch or “borrow” features from other languages and more stigmatized language varieties. Diverging from examples of overt mocking, these situations feature linguistic choices intended to signal intimate personal relationships, to convey stances of solidarity, and to mark their desire for inclusion in a non-white or interracial community. White speakers may even use these linguistic features in their daily speech with a patterned regularity. In one of the earliest studies, set within the field of cross-cultural communication and emphasizing conversational harmony, Hewitt (1986) shows how white teenagers in London employ features of Jamaican Creole within Black–white friendship groups. In another study also set in London just a few decades later, Rampton (2005) coins the term “language crossing” to explain how teens take up features from Creole, Punjabi, and a “stylized” Indian English to negotiate racial identity within their multiracial peer group. Recording non-white, non-black speakers, Reyes (2005) finds that Southeast Asian American teenagers borrow AAL slang to achieve multiple social goals, including a desire to avoid being positioned as honorary whites.

White women with ties to Black communities in the United States have also been studied for their use of AAL features. This research has alternately focused on women who consider themselves, and are considered by others, to be in-group members ( Sweetland 2002 ) and a range of white women who take up varied stances toward blackness and AAL ( Fix 2014 ). White speakers have also been found to engage in (un)successful linguistic acts that make claims to authenticity and affiliation with communities to which they do not belong. Conducting research in the 1990s with white New York City-based hip-hoppers, Cutler (2014) shows how these teenagers employ racialized linguistic stylization to connect themselves to Black youth culture. Along similar lines but in a very different context, McIntosh (2016) finds that some white settler descendants in Kenya seek to distance themselves from earlier (colonizing) generations and white supremacist ideologies. She describes how they learn Kiswahili as acts of what she calls “linguistic atonement.” This valuing of an African language previously, and still, disparaged by other white settlers helps them shore up their rights to national belonging. At the same time, however, their strategic linguistic choices associate African languages with expressiveness, affect, and informality, in sharp contrast to European languages’ presupposed connections to rationality.

Related to these linguistic acts of solidarity that hinge on Black language, non-black speakers frequently engage in acts of linguistic appropriation, where out-group members take up language that remains socially stigmatized for in-group speakers. Here their goals may be less about a demonstration of respect and camaraderie and more geared toward gaining perceived positive attributes such as the “coolness” or “toughness” associated with AAL. Smitherman (1998, 2000, 2006) has spent decades describing how white America engages in the frequent theft of Black language and culture, denying credit and erasing the history of linguistic forms that have “crossed over” into standard English. She famously notes (in Smitherman 1998), “Whites pay no dues, but reap the psychological, social, and economic benefits of a language and culture born out of struggle and hard times” (218).

In a study of corporate advertisements on social media, Roth-Gordon, Harris, and Zamora (2020) find that tokens of Black language and culture, including slang, hip-hop lyrics, and AAL, continue to be stripped of their connections to actual Black people and anti-racist politics to keep white customers comfortable. Even as these “borrowings” seem on the surface to engage in positive imitation, negative associations may remain, connecting AAL to criminality, as in an IHOP tweet “The OG of OJ,” where OG stands for original gangster, a reference to rap music and gang/criminal/violent activity. “Positive” imitations can thus work hand-in-hand with linguistic mockery and situations of language “crossing,” elevating the white speakers who temporarily borrow or corrupt these forms. At the same time, these acts of linguistic appropriation and borrowing continue the stigmatization of speakers of these otherwise marginalized languages by emphasizing their distance from more standard and respected languages and linguistic varieties.

Challenging Whiteness and Linguistic Assimilation

Grappling with the formidable power of white supremacy, speakers appeal to a range of linguistic strategies to position themselves more favorably within contexts that are racially hostile to a wide range of groups including immigrants, their descendants, and Indigenous peoples. In these cases, codeswitching, bilingualism, multilingualism, language revitalization, and other discursive strategies help establish speakers as belonging to tribal, national, transnational, diasporic, and global communities. Language thus opens spaces for racially, politically, and socially disenfranchised groups to argue for their own inclusion and to fight for recognition on their terms.

Bilingualism and Multilingualism

Despite the strong push toward English and the popular perception that it is the “language of success,” racialized speakers in the United States embrace a wide range of linguistic varieties to foreground their linguistic and racial pride and connect themselves to their communities. In her classic work, Borderlands/La Frontera, Anzaldúa (1987) proudly codeswitches between Spanish and English and demands to be recognized in all her Spanglish glory. Her chapter, “How to Tame a Wild Tongue,” describes how Chicano Spanish developed under, and talks back to, conditions of linguistic terrorism: “Chicanas who grew up speaking Chicano Spanish have internalized the belief that we speak poor Spanish. It is illegitimate, a bastard language. . . . I will no longer be made to feel ashamed of existing” (58–59). In Growing up Bilingual, Zentella (1997) documents the rule-governed systematicity that New York Puerto Rican bilingual children display while mixing two languages and the level of communicative competence required for fluency in Spanglish. Studies have also shown that Latinx community members who do not fully command Spanish can find ways to forge symbolic connections to Spanish and to reclaim their heritage language at the college level (Aparicio 2000).

In an edited collection dedicated to the linguistic experiences of the Asian Pacific American community in the United States ( Reyes and Lo 2009 ), contributors explore how a wide range of Asian Americans, who differ in age, generation, and regional background, use language to negotiate their identities and the stereotypes through which they continue to be racialized. In her ethnography of California youth, Shankar (2008) follows South Asian teens as they redefine what it means to be “Desi” in relation to heritage languages, Bollywood films, and the model minority myth. In a related study, Sharma (2010) argues that South Asian hip-hop artists incorporate South Asian languages and immigrant themes to project their own version of an authentic Desi identity. Situated within the spaces of hip-hop culture in Cape Town, South Africa, Williams’ (2017) study asks how Colored (mixed-race) and Black participants embrace multilingual voices or “remix multilingualism” to accomplish multiple social goals—from establishing a local persona, to “keeping it real,” to opening up spaces to claim the linguistic citizenship of marginalized groups.

Citizenship, Belonging, and Transnational Connections

As a shared cultural practice, ways of speaking help connect people racialized as non-white across space and to places that do not always grant them rights to belong. In a study of videos made by “Dreamers” (migrants who traveled to the United States with their undocumented parents as young children and continue to lack access to US citizenship), De Fina (2018) examines how they narratively construct themselves and their families as desirable citizens. Recording the testimonios (first-person narratives) of undocumented Mexican mothers, Figueroa (2013) shows how this linguistic genre allows for symbolic and strategic forms of national participation in an anti-immigrant context that denies migrants both political and civil rights. Migrants who live within the United States/Mexico transnational context also employ everyday expressions as forms of transgressive verbal play to protest the dehumanizing logics of illegality and build alternate figures of personhood ( Chávez 2015 ). In another example from an earlier wave of migration, French Creoles in 19th-century Louisiana defied the country’s English-only logics and attempted to redefine patriotic American citizenship as multilingual, while still defending the requirement of whiteness based on ideas of racial purity and a reliance on European ancestry ( Urbain 2016 ).

Speakers can also appeal to multiple languages to mark their transnational belonging. Vigouroux (2015) has analyzed the multilingual/heteroglossic repertoires of French comics of African origin to show how they mix multiple languages and varieties of French to challenge linguistic ideologies of monolingualism and promote their own visibility in Hexagonal (mainland) France. In an ethnographic account of North African teens living in housing projects in Paris, France, Tetreault (2015) studies the daily interactions and creative linguistic strategies through which “transcultural teens” negotiate their place in a national context where Arab Muslims are heavily stigmatized, racialized, and marginalized. García-Sánchez (2014) describes a similar context of modern-day European Islamophobia, as she attends to the everyday language use of Moroccan immigrant children in Spain who must chart their own politics of inclusion, particularly in Spanish classrooms. In her research with rancheros, members of a more middle-class rural community in Mexico who also share experiences of migration, Farr (2010) documents the different communicative practices that speakers use to position themselves between Indigenous Mexicans and more educated urban Mexicans, but also as members of a transnational community.

Speakers do not need to migrate to seek out transnational connections, as Gaudio (2011) finds in his examination of the use of stigmatized, “broken” Nigerian Pidgin (NP) by Nigerian popular singers. Some of these performers draw on AAL and Jamaican Creole (which they identify as similar forms of “broken English”) to evoke shared cultural and political experiences and to locate themselves within Black Atlantic transnational space. Exploring the significance of whiteness in a globalized context, Lo and Kim (2011) examine how mixed-race South Korean celebrities are linguistically evaluated to determine both their racial standing—how white or Korean they are—and the legitimacy of their (trans)national belongings. These studies show how speakers can and do refuse monolingualism and linguistic assimilation to assert a sense of self and their rights to migration, citizenship, and belonging.

Indigeneity and Native American Language Revitalization

White supremacy has had a devastating impact on Indigenous peoples who fight to retain possession of, and connections to, their lands and languages. Within a settler-colonial context that includes violent and brutal acts of physical, cultural, and linguistic genocide, Indigenous communities have survived dramatic and painful experiences of language shift and language loss. Demonstrating how settler logics work to destroy and replace, Iyengar (2014) unpacks the ways that language ideologies and policies worked to eliminate Indigenous languages at the same time that white European settlers were given state support to import and maintain their heritage languages. In a reflexive piece on the framing of endangered language scholarship, Hill (2002) shows how even academic research, mostly conducted by outsiders, has contributed to Indigenous languages being viewed as “belonging to everyone,” especially as they are problematically described with hyperbolic valorization (as “priceless treasures”) and dramatically enumerated (through counts of remaining speakers). She suggests that this orientation to endangered language research alienates speakers from their own languages, objectifies languages as possessions, and subjects languages and communities to non-Indigenous regimes of power. Kroskrity (2021) similarly reveals the covert linguistic racism of earlier academic descriptions of Native languages, which characterized them in terms of lack, deficiency, and simplicity.

A growing number of Native American linguists and linguistic anthropologists theorize the ways that scholarship can uphold white supremacy. Perley (2012) critiques how existing metaphors of language death and endangerment further a settler-colonial project that emphasizes Indigenous erasure and extinction. He advocates for different metaphors and approaches that emphasize “emergent vitalities” and do not turn languages into collectible artifacts that are removed from living speakers and communities that have their own specific communicative goals. Leonard (2021) criticizes an exploitative approach that can make Indigenous scholars feel as though the lived experiences of their communities are being reduced to “interesting” grammatical structures. He proposes a “language reclamation” framework that “as a decolonial project . . . firmly rejects the neoliberal demands for ‘authentic’ Indigenous languages and language ecologies––those imagined by settlers” (223).

Davis (2017) calls out not only scholarly but also journalistic writing that engages in harmful acts of “linguistic extraction” that separate languages from their speakers and often (in)directly blame Indigenous speakers for their own language loss. To attract readers, these stories rely heavily on what has been called “lasting”—focusing on the diminishing numbers of speakers and describing individuals as the “last speakers” in ways that perpetuate notions of “vanishing” tribes and the “extinction” of Native Americans. This type of rhetoric can then be used to justify a lack of attention or concern for the present well-being and ongoing political struggles of Indigenous peoples. Davis (2017) asks what would happen if instead of repeating the same stories about Native American language loss which further settler-colonial goals of elimination, those with representational power sought to “highlight the incredible extent of Indigenous language and cultural maintenance against all odds as a decolonial act of breath-taking resistance, resilience, and survivance” (54).

Other studies that bridge the topics of Indigeneity, language revitalization, and critical race theory attend to racial identity construction and racial ideologies. Researching the complicated relationship between country music, language, physical appearance, and identity, Jacobsen (2017) explores how people use “sound” and “voice” to establish Navajo (Diné) belonging or “social citizenship.” Meek (2020) unpacks the semiotic (visual and linguistic) elements circulated through European racial logics and media representations that overdetermine conceptualizations of American Indianness. Traveling between New York and Puerto Rico, Feliciano-Santos (2021) describes how Taíno/Boricua activists engage in projects of language reconstruction and cultural reclamation to decolonize ethnoracial ideals of Puerto Ricanness that assume Indigenous people have disappeared. Indigenous people in other national contexts also protest against widespread myths that exclude or demean them. For example, French (2010) describes how urban bilingual Kaqchikel and K’iche’ speakers (who also speak Spanish) work against the state-promoted racial essentialization of “Indians” as hindering the progress of the Guatemalan state.

Across the Americas, racial ideologies have denigrated Indigenous peoples, suggesting that a shift toward colonial languages like Spanish was the best way to “salir adelante” (forge ahead and improve one’s socioeconomic position), as Messing (2007) finds in her research on Mexicano (Nahuatl) speaking communities in Mexico. Within these contexts, language revitalization efforts are advanced when language activists replace Spanish loan words with neologisms at an Aymara-language radio station in Bolivia (Swinehart 2012) and when local hip-hop artists challenge common beliefs in the racial superiority of ladinos (Spanish-speaking European descendants) and contest notions of Indigenous “primitivity” with their participation in contemporary global music culture (Barrett 2016). Through these widespread musical and linguistic practices, they also publicly expose the racial violence experienced by Indigenous communities (Navarro 2016). Hip-hop has also allowed Indigenous peoples across North America to express themselves as modern subjects, to assert Indigenous sovereignty, and to challenge the continued conditions wrought by settler colonialism and racism (Mays 2018).

From the Defense of White Supremacy to Its Challenges

Research in a wide range of white supremacist contexts shows that white people linguistically work to maintain their positions of privilege and their beliefs in the superiority of whiteness. This active, if not always intentional, defense includes broader social policies such as colorblindness and discourses of diversity and multiculturalism, as organizing frameworks through which societies and institutions attempt to understand and contain racial difference. White supremacy is not successfully naturalized for all, however, and people of color and others pursuing racial justice find ways to “call out” the privilege and power of whiteness.

Racism and Its Denials

Continued movement around the globe and political gains by marginalized people have, in many cases, heightened white people’s investment in the construction of a racial difference that affirms their sense of superiority. Stasch (2011) shows how the Korowai of West New Guinea are described as primitive, isolated, and archaic by white tourists who use their travel experiences to understand themselves as civilized, modern, and innovative. Within a Northern Italian context that features simultaneously rising levels of migration and nativist sentiment, Perrino (2020) analyzes the discursive strategies, including joking, mocking, codeswitching into local dialects, and racialized stance taking, that Italians use to create “intimacies of exclusion.” The narratives she examines include ethnonationalist claims that turn to DNA evidence, along with a defense of local artistic, historical, and linguistic patrimony, to actively resist multiculturalism and the new European norm of linguistic and racial heterogeneity, sometimes described as “superdiversity.” Both Perrino (2020) and Pagliai (2009, 2011), who also works in Northern Italy, investigate how conversational participants can wind up supporting xenophobic and white supremacist conversational themes through laughter or agreement with racial stances against immigrants, either explicit or unmarked, causing a “spiral effect” that deepens the resonance of racist discourse.

While Europe has found itself grappling with the inclusion of large numbers of African and other racialized migrants only in the late 20th century, Africa continues to struggle with the earlier, and violent, arrival of European colonial settlers. In her study of the descendants of white settlers in postindependence Kenya, McIntosh (2016) describes how white Kenyans still manage to uphold white supremacy despite a rapidly shifting context in which Black Africans loudly critique their colonial past and present. She coins the term “structural oblivion” to describe how white people overlook the ways that “one’s ideologies, practices and very habits of mind continue to uphold one’s privilege” (10). In an especially insightful chapter on linguistic stance taking, McIntosh (2009) explores how white Kenyans juggle their loyalty to Africa and deep sense of national belonging with claims to white rationality and discipline that they believe justify their positions of extreme racial privilege in a social order still strongly, but more quietly, committed to defending white supremacy.

In an article entitled “The Public Life of White Affects,” Bucholtz (2019) outlines discourse strategies that position whiteness and white people as weak, vulnerable, and disempowered to deflect accusations of racism and to resolve anxiety caused by changing US race demographics and the political rise of non-white groups. She argues that these strategies shift whiteness from a position that was unmarked and dominant to one that seems visible and wounded, thereby “protecting all white people’s possessive investment in white supremacy” (485). Along these lines, Bax (2018) shows how use of the euphemism “the c-word”—as a substitute for the term “cracker”—posits an equivalency for racial epithets used for white people and Black people, upholding mistaken beliefs in the existence of “reverse racism.”

Other research shows how broader and deeper understandings of structural racism can be shut down through subtle discourse patterns. Examining the politically and racially charged context of Trayvon Martin’s murder by George Zimmerman in the United States in 2012, among other cases, Hodges (2016) describes a language game, “hunting for racists,” that allows mainstream media, in particular, to focus on racial slurs and other bits of linguistic “evidence” that will demonstrate that an individual is racist or “nonracist,” thereby limiting the definition of racism to only active (individual, obvious, and intentional) forms of racism. Koven (2013) shows how Lusodescendants born and raised in France engage in similar judgments of self and other as either racist or anti-racist based on the use of competing linguistic forms that are heard to index different types of personhood: either a modern and urban French anti-racist or a nonmodern and rural Portuguese racist. After Chinese American YouTuber Jimmy Wong released a popular anti-racist video response to an anti-Asian rant that went viral, McKinney and Chun (2016) examine the lyrics of his song to show that while humor can effectively subvert racism, anti-racist discourse can also suffer from a reliance on the belief that only individuals can be held accountable for racism and anti-racism. The linguistic construction of personalist ideologies, which foreground individual beliefs and acts, thus continues to thwart broader awareness of and challenges to structural racism (see also Hill 2008).

Colorblindness, Multiculturalism, and “Diversity”

US colorblindness has created a taboo around the mention of racial difference and discourages recognition of, or talk about, present-day racism. While many see this as “progress,” and it does improve upon a pre-Civil Rights/Jim Crow era in which presumed white superiority and non-white inferiority could be spoken about with a matter-of-fact explicitness, colorblindness has been found to mask racial inequality and discourage anti-racist efforts. In an early linguistic study of the impact colorblindness has on daily discourse, McElhinny (2001) describes how white police officers turn to silence, colorblind stances, and “strategic inarticulateness” to avoid acknowledging the existence of racial hierarchy or white privilege within discussions of affirmative action policies. Also working within the US context of colorblindness, Modan (2008) offers an ethnographic account of everyday interactions in a multiethnic and multiclass community in Washington, DC, using discourse analysis to examine when neighbors talk, or avoid talk, about race and ethnicity, rights, tensions, and racial representations.

Universities, advertising agencies, and national governments also struggle with the question of when and how to talk about race, using terms like “culture” and “diversity” to manage public discussions of linguistic, cultural, and racial difference. Building on her earlier theorizing of ways of belonging to the nation, Urciuoli (2013) adds “diversity” to “race” and “ethnicity” as categories that mark differences from whiteness. In her book tracking understandings and uses of “diversity” on a college campus, Urciuoli (2022) shows how the term has come to entail “a neoliberalization of social markedness” in which students are seen to add value when they foreground their differences from normative white, male, middle-class students. At the same time, these differences are individualized rather than understood as structurally produced, and students’ on-campus experiences continue to remind them of their non-normative, marked, and only partly belonging status in what continues to be white public space. In an article entitled “Nothing Sells like Whiteness,” Shankar (2019) explores how advertising agencies have shifted from multicultural strategies that promoted racial and ethnic specificities to a discourse of “diversity” that sounds safer to white people and upholds white supremacy.

Analyzing a series of official apologies issued by the Canadian government to marginalized groups, McElhinny (2016) notes that this multicultural offer of “inclusion” is itself problematic in relation to Indigenous groups that seek their own sovereignty. Like Shankar and Urciuoli, McElhinny also finds that these apologies often address diversity through a neoliberal lens, treating multiculturalism as a (white) business advantage. Drawing on archival research within the Canadian context, Haque (2012) examines how a national policy of “multiculturalism within a bilingual framework” positions English and French as “founding races,” perpetuating a racial order conducive to the ongoing project of white settler nation-building. Working with public language posted online in Singapore, Pak (2021) explores how a “state listening subject” can criticize and censor racist and anti-racist discourse that it interprets as threatening to the multiracial order it seeks to uphold.

Calling Out Whiteness

Non-white speakers have linguistically challenged white supremacy and the high levels of prestige bestowed upon people who are closely associated with whiteness. Scholars working with and from communities of color have recorded a range of disdainful, mocking, and intentionally critical reactions to white speech that “call out” whiteness and cast it in a far less favorable light. In a classic study of joking imitations of “the Whiteman” performed by Western Apaches, Basso (1979) shows how racial boundaries can be constructed through language in ways that (temporarily) turn white people into the racial Others. Trechter (2001) describes pointed critiques made by the Lakhota at Pine Ridge Indian Reservation in South Dakota where whiteness can be negatively associated with capitalism, individualism, and greed, among other morally irresponsible qualities. Slobe (2018) identifies several linguistic features such as creaky voice and uptalk that are used in a register she calls “Mock White Girl” to create and parody the persona of a contemporary middle-class white female—a racialized figure that is also semiotically linked to social media, blondness, and Starbucks. Along similar lines, Mason Carris (2011) describes how a group of Latinx people distance themselves from white ways of speaking and challenge white hegemony through the policing and mocking of “la voz gringa” (a white woman’s heavily accented Spanish). In another study that emphasizes intersectionality, Barrett (1999) shows how use of white women’s language by African American drag queens allows skillful speakers to create layered identities through linguistic performances that both resist and reproduce race and gender hierarchies.

Breaking down the possibilities afforded through social media genres, Calhoun (2019) explores anti-hegemonic comedic performances that mock white racial insecurity and challenge white participation in racial profiling and linguistic stereotyping. This mockery can be lexicalized and enregistered, as Roth-Gordon (2007) and Reyes (2017) show through examination of out-group social labels like the “playboy” (in Brazil) and the “conyo” (in the Philippines) that critique youth who are seen to flaunt their race and class status in highly unequal societies. Examining even more fraught situations of interracial interaction, Jacobs-Huey (2006) describes how white women who attempt to participate in conversations about the racially charged topic of Black women’s haircare can be called out by Black women for their failure to carefully consider their own positionality before entering into these in-person and digital discussions. In an examination of the racial consciousness-raising strategies of Black NGOs in Brazil, Silva (2022) argues that participants are taught to speak in anti-racist voices that reveal and critique the hidden racial logics of white supremacy widely circulating in society. This metalinguistic training includes a preference for the double-voiced term escravizado (enslaved) instead of escravo (slave) to denaturalize assumptions of a lack of Black humanity and to challenge white supremacy and anti-blackness.

White people have also begun to call out other white people for their participation in upholding race, and class, hierarchies. In a study that explores linguistic stance taking in responses to the online blog “Stuff White People Like,” Walton and Jaffe (2011) show how strategic moves of alignment against white consumption (as a form of racial capitalism) allow some white people to claim superior white virtue through anti-racist stance-taking and the critiquing of other white people. Delfino (2021) similarly follows an anti-racist Facebook group after the US election of Donald Trump and shows that white allies position themselves as “virtuous” in relation to less “woke”/racist white individuals. She points out their problematic belief that white people who recognize racism can’t themselves be racist and how they ignore the ways that liberal democracies help maintain white supremacy. Outside the US context, Tebaldi (2020) analyzes a public Twitter debate in 2016 when French spelling reforms “canceled” the circumflex over certain vowels. While French nationalists defended the circumflex as an icon of lost Frenchness, others mocked the overt connections made between linguistic purism and racist myths of white genocide and the “great replacement” (a white nationalist belief that white people are being replaced by people of color).

Language and White Supremacy: The Future of the Field

Studies of the mainstreaming of white supremacy have already begun to investigate how language can become explicitly racial and politicized. In a section of the edited collection Language in the Trump Era (McIntosh and Mendoza-Denton 2020), contributors condemn racism, xenophobia, and other white nationalist themes that the former US president promoted from his position as the “leader of the free world.” Durrani (2018) highlights the role language has played in the recent racialization of Muslims and the rising levels of Islamophobia/anti-Muslim racism. Researchers have begun to critique racism and white supremacy within academia as well, including the “sociolinguistic labor” Black undergraduate students are asked to perform in these (white) spaces (Holliday and Squires 2021). Two preeminent scholars of AAL, John Rickford (2022) and “Dr. G” Geneva Smitherman (2022), have written memoirs that document their personal experiences living and working in a world that stigmatizes them for the linguistic expression of their “soul,” a term they both include in their titles. They provide historical background and context to their professional experiences fighting for racial and social justice.

Scholars have turned to address how language continues to uphold white supremacy through anti-blackness and settler colonialism but can also work toward decolonization. Two notable collections include The Oxford Handbook of Language and Race ( Alim, Reyes, and Kroskrity 2020 ) and a special issue of the Journal of Linguistic Anthropology dedicated to the theme of “Language and White Supremacy” ( Smalls, Spears, and Rosa 2021 ). In their introduction, Smalls, Spears, and Rosa (2021) note that such scholarship is needed to document and denounce “the grotesqueness of White supremacy in all of its targeted, capacious manifestations—interpersonal and institutional, mundane and spectacular, insidious and obvious, ritualized and emergent, local, and global” (155). Along these lines, Heller and McElhinny (2017) offer a critical history that excavates terms and theories in the field of sociolinguistics through the lens of capitalism and colonialism. Alim et al. (2021) describe how Afrikaaps artists are rewriting history in South Africa by destabilizing the linguistic authority and legitimacy attributed to Afrikaans and shifting the locus of power. They note, “The transformational racial-linguistic-land politics of the Afrikaaps movement call for South Africa to move beyond the rainbow politics of reconciliation and toward the radical politics of redistribution and reorganization” (212).

Spears (2021) argues that we would do best attending to “capitalist, antiblack, racializing, systemic white supremacy” by emphasizing more theory and lived experience (157). Drawing on a broad range of theorists in Black studies and anti-blackness scholarship, Smalls (2018a) examines the discursive means through which Black people are constructed as “immanently and exceptionally violent, and African Americans especially so” (356). Applying this lens to academia, Davis and Smalls (2021) critique practices within linguistic anthropology that continue “Black and Native dis/re/possession.” They offer up critical suggestions for the field, refocus our gaze on white supremacy, and reclaim authorship: “We not only acknowledge that the theft and death and partitioning integral to racial-chattel slavery and settler colonialism are ongoing, but also acknowledge the ways we continue to refuse, create, and live” (276).

  • Alim, H. Samy, and John Baugh, eds. 2007. Talkin Black Talk: Language, Education, and Social Change. New York: Teachers College Press.
  • Alim, H. Samy, Awad Ibrahim, and Alastair Pennycook, eds. 2009. Global Linguistic Flows: Hip Hop Cultures, Youth Identities, and the Politics of Language. New York: Routledge.
  • Alim, H. Samy, Angela Reyes, and Paul V. Kroskrity, eds. 2020. The Oxford Handbook of Language and Race. New York: Oxford University Press.
  • Alim, H. Samy, John R. Rickford, and Arnetha F. Ball, eds. 2016. Raciolinguistics: How Language Shapes Our Ideas about Race. New York: Oxford University Press.
  • Alim, H. Samy, and Geneva Smitherman. 2012. Articulate While Black: Barack Obama, Language, and Race in the U.S. New York: Oxford University Press.
  • Alim, H. Samy , Quentin E. Williams , Adam Haupt , and Emile Jansen . 2021. “‘Kom Khoi San, kry trug jou land’: Disrupting White Settler Colonial Logics of Language, Race, and Land with Afrikaaps.” Journal of Linguistic Anthropology 31 (2): 194–217.
  • Anzaldúa, Gloria . 1987. Borderlands/la frontera: The New Mestiza . San Francisco: Aunt Lute.
  • Aparicio, Frances R. 2000. “Of Spanish Dispossessed.” In Language Ideologies: Critical Perspectives on the Official English Movement , edited by Roseann Dueñas González and Ildiko Melis , 259–275. Mahwah, NJ: Lawrence Erlbaum.
  • Avineri, Netta , Eric Johnson , Shirley Brice-Heath , Teresa McCarty , Elinor Ochs , Tamar Kremer-Sadlik , Susan Blum , Ana Celia Zentella , Jonathan Rosa , Nelson Flores, H. Samy Alim , and Django Paris . 2015. “Invited Forum: Bridging the ‘Language Gap.’” Journal of Linguistic Anthropology 25 (1): 66–86.
  • Back, Michele , and Virginia Zavala , eds. 2019. Racialization and Language: Interdisciplinary Perspectives from Peru . New York: Routledge.
  • Bailey, Benjamin . 2002. Language and Negotiation of Ethnic/Racial Identity among Dominican Americans . El Paso, TX: LFB Scholarly.
  • Baran, Dominika . 2017. Language in Immigrant America . New York: Cambridge University Press.
  • Barrett, Rusty . 1999. “Indexing Polyphonous Identity in the Speech of African American Drag Queens.” In Reinventing Identities: Gendered Self in Discourse , edited by Mary Bucholtz , A. C. Liang , and Laurel A. Sutton , 313–331. New York: Oxford University Press.
  • Barrett, Rusty . 2006. “Language Ideology and Racial Inequality: Competing Functions of Spanish in an Anglo Owned Mexican Restaurant.” Language in Society 35 (2): 163–204.
  • Barrett, Rusty . 2016. “Mayan Language Revitalization, Hip Hop, and Ethnic Identity in Guatemala.” Language & Communication 47: 144–153.
  • Basso, Keith H. 1979. Portraits of “the Whiteman”: Linguistic Play and Cultural Symbols among the Western Apache . New York: Cambridge University Press.
  • Baugh, John . 2015. “SWB (Speaking While Black): Linguistic Profiling and Discrimination Based on Speech as a Surrogate for Race against Speakers of African American Vernacular English.” In The Oxford Handbook of African American Language , edited by Sonja Lanehart , 755–771. New York: Oxford University Press.
  • Baugh, John . 2018. Linguistics in Pursuit of Justice . New York: Cambridge University Press.
  • Bax, Anna . 2018. “The ‘C-Word’ Meets the ‘N-Word’: The Slur-Once-Removed and the Discursive Construction of ‘Reverse Racism.’” Journal of Linguistic Anthropology 28 (2): 114–136.
  • Bonfiglio, Thomas Paul . 2002. Race and the Rise of Standard American . New York: Mouton de Gruyter.
  • Bonfiglio, Thomas Paul . 2010. Mother Tongues and Nations: The Invention of the Native Speaker . New York: Mouton de Gruyter.
  • Britt, Erica . 2016. “Stylizing the Preacher: Preaching, Performance, and the Comedy of Richard Pryor.” Language in Society 45 (5): 685–708.
  • Bucholtz, Mary . 2011. White Kids: Language, Race, and Styles of Youth Identity . New York: Cambridge University Press.
  • Bucholtz, Mary . 2019. “The Public Life of White Affects.” Journal of Sociolinguistics 23 (5): 485–504.
  • Bucholtz, Mary , and Qiuana Lopez . 2011. “Performing Blackness, Forming Whiteness: Linguistic Minstrelsy in Hollywood Film.” Journal of Sociolinguistics 15 (5): 680–706.
  • Calhoun, Kendra . 2019. “Vine Racial Comedy as Anti-Hegemonic Humor: Linguistic Performance and Generic Innovation.” Journal of Linguistic Anthropology 29 (1): 27–49.
  • Carlan, Hannah . 2018. “‘In the Mouth of an Aborigine’: Language Ideologies and Logics of Racialization in the Linguistic Survey of India.” International Journal of the Sociology of Language 252: 97–123.
  • Charity Hudley, Anne H. 2017. “Language and Racialization.” In The Oxford Handbook of Language and Society , edited by Ofelia García , Nelson Flores , and Massimiliano Spotti , 381–402. New York: Oxford University Press.
  • Chávez, Alex E. 2015. “So ¿te Fuiste a Dallas? (So You Went to Dallas?/So You Got Screwed?): Language, Migration, and the Poetics of Transgression.” Journal of Linguistic Anthropology 25 (2): 150–172.
  • Cho, Jinhyun . 2021. “Constructing a White Mask through English: The Misrecognized Self in Orientalism.” International Journal of the Sociology of Language 271: 17–34.
  • Chun, Elaine W. 2004. “Ideologies of Legitimate Mockery: Margaret Cho’s Revoicings of Mock Asian.” Pragmatics 14 (2–3): 263–289.
  • Chun, Elaine W. , and Adrienne Lo . 2016. “Language and Racialization.” In The Routledge Handbook of Linguistic Anthropology , edited by Nancy Bonvillain , 220–233. New York: Routledge.
  • Cutler, Cecelia . 2014. White Hiphoppers, Language, and Identity in Postmodern America . New York: Routledge.
  • Davis, Jenny L. 2017. “Resisting Rhetorics of Language Endangerment: Reclamation through Indigenous Language Survivance.” Language Documentation and Description 14: 37–58.
  • Davis, Jenny L. , and Krystal A. Smalls . 2021. “Dis/Possession Afoot: American (Anthropological) Traditions of Anti-Blackness and Coloniality.” Journal of Linguistic Anthropology 31 (2): 275–282.
  • De Fina, Anna . 2018. “What Is Your Dream? Fashioning the Migrant Self.” Language & Communication 59: 42–52.
  • Delfino, Jennifer B. 2021. “White Allies and the Semiotics of Wokeness: Raciolinguistic Chronotopes of White Virtue on Facebook.” Journal of Linguistic Anthropology 31 (2): 238–257.
  • Dick, Hilary Parsons . 2011a. “Language and Migration to the United States.” Annual Review of Anthropology 40 (1): 227–240.
  • Dick, Hilary Parsons . 2011b. “Making Immigrants Illegal in Small Town USA.” Journal of Linguistic Anthropology 21 (1): E35–E55.
  • Dick, Hilary Parsons , and Kristina Wirtz . 2011. “Racializing Discourses.” Journal of Linguistic Anthropology 21 (1): E2–E10.
  • Durrani, Mariam . 2018. “Communicating and Contesting Islamophobia.” In Language and Social Justice in Practice , edited by Netta Avineri , Laura R. Graham , Eric J. Johnson , Robin Conley Riner , and Jonathan Rosa , 44–51. New York: Routledge.
  • Eberhardt, Maeve, and Madeline Vdoviak-Markow. 2020. “‘I Ain’t Sorry’: African American English as a Strategic Resource in Beyoncé’s Performative Persona.” Language & Communication 72: 68–78.
  • Farr, Marcia . 2010. Rancheros in Chicagoacán: Language and Identity in a Transnational Community . Austin: University of Texas Press.
  • Feliciano-Santos, Sherina . 2021. A Contested Caribbean Indigeneity: Language, Social Practice, and Identity within Puerto Rican Taíno Activism . New Brunswick, NJ: Rutgers University Press.
  • Figueroa, Ariana Mangual . 2013. “¡Hay que hablar! Testimonio in the Everyday Lives of Migrant Mothers.” Language & Communication 33 (4): 559–572.
  • Fix, Sonya . 2014. “AAE as a Bounded Ethnolinguistic Resource for White Women with African American Ties.” Language & Communication 35: 55–74.
  • Flores, Nelson , and Jonathan Rosa . 2015. “Undoing Appropriateness: Raciolinguistic Ideologies and Language Diversity in Education.” Harvard Educational Review 85 (2): 149–171.
  • Fought, Carmen . 2003. Chicano English in Context . New York: Palgrave Macmillan.
  • French, Brigittine M. 2010. Maya Ethnolinguistic Identity: Violence, Cultural Rights, and Modernity in Highland Guatemala . Tucson: University of Arizona Press.
  • García-Sánchez, Inmaculada M. 2014. Language and Muslim Immigrant Childhoods: The Politics of Belonging. Malden, MA: Wiley-Blackwell.
  • Gaudio, Rudolf P. 2011. “The Blackness of ‘Broken English.’” Journal of Linguistic Anthropology 21 (2): 230–246.
  • Graham, Laura R. 2020. “From ‘Ugh’ to Babble (or Babel): Linguistic Primitivism, Sound-Blindness, and the Cinematic Representation of Native Amazonians.” Current Anthropology 61 (6): 732–762.
  • Grieser, Jessica A. 2022. The Black Side of the River: Race, Language, and Belonging in Washington, DC . Washington, DC: Georgetown University Press.
  • Haque, Eve . 2012. Multiculturalism within a Bilingual Framework: Language, Race, and Belonging in Canada . Toronto: University of Toronto Press.
  • Haviland, John B. 2003. “Ideologies of Language: Some Reflections on Language and U.S. Law.” American Anthropologist 105 (4): 764–774.
  • Heath, Shirley Brice . 1983. Ways with Words: Language, Life, and Work in Communities and Classrooms . New York: Cambridge University Press.
  • Heller, Monica , and Bonnie McElhinny . 2017. Language, Capitalism, Colonialism: Toward a Critical History . Toronto: University of Toronto Press.
  • Hewitt, Roger . 1986. White Talk, Black Talk: Interracial Friendship and Communication amongst Adolescents . Cambridge, UK: Cambridge University Press.
  • Hill, Jane H. 1993. “Hasta la Vista, Baby: Anglo Spanish in the American Southwest.” Critique of Anthropology 13 (2): 145–176.
  • Hill, Jane H. 2001. “The Racializing Function of Language Panics.” In Language Ideologies: Critical Perspectives on the Official English Movement; Volume 2: History, Theory, and Policy , edited by Roseann Dueñas González and Ildiko Melis , 245–267. New York: National Council of Teachers of English.
  • Hill, Jane H. 2002. “‘Expert Rhetorics’ in Advocacy for Endangered Languages: Who Is Listening and What Do They Hear?” Journal of Linguistic Anthropology 12 (2): 119–133.
  • Hill, Jane H. 2008. The Everyday Language of White Racism . Malden, MA: Wiley-Blackwell.
  • Hodges, Adam . 2016. “Accusatory and Exculpatory Moves in the Hunting for ‘Racists’ Language Game.” Language & Communication 47: 1–14.
  • Holliday, Nicole R. , and Lauren Squires . 2021. “Sociolinguistic Labor, Linguistic Climate, and Race(ism) on Campus: Black College Students’ Experiences with Language at Predominantly White Institutions.” Journal of Sociolinguistics 25 (3): 418–437.
  • hooks, bell . 1995. “‘This Is the Oppressor’s Language/Yet I Need It to Talk to You’: Language, a Place of Struggle.” In Between Languages and Cultures: Translation and Cross-Cultural Text , edited by Anuradha Dingwaney Needham and Carol Maier , 295–301. Pittsburgh: University of Pittsburgh Press.
  • Ibrahim, Awad . 2014. The Rhizome of Blackness: A Critical Ethnography of Hip Hop Culture, Language, Identity, and the Politics of Becoming . New York: Peter Lang.
  • Iyengar, Malathi Michelle . 2014. “Not Mere Abstractions: Language Policies and Language Ideologies in US Settler Colonialism.” Decolonization: Indigeneity, Education, and Society 3 (2): 33–59.
  • Jacobs-Huey, Lanita . 2006. From the Kitchen to the Parlor: Language and Becoming in African American Women’s Hair Care . New York: Oxford University Press.
  • Jacobsen, Kristina M. 2017. The Sound of Navajo Country: Music, Language, and Diné Belonging . Chapel Hill: University of North Carolina Press.
  • Kiesling, Scott . 2001. “Stances of Whiteness and Hegemony in Fraternity Men’s Discourse.” Journal of Linguistic Anthropology 11 (1): 101–115.
  • King, Sharese . 2020. “From African American Vernacular English to African American Language: Rethinking the Study of Race and Language in African Americans’ Speech.” Annual Review of Linguistics 6 (1): 285–300.
  • Koven, Michele . 2013. “Antiracist, Modern Selves and Racist, Unmodern Others: Chronotopes of Modernity in Lusodescendants’ Race Talk.” Language & Communication 33 (4): 544–558.
  • Kroskrity, Paul V. 2021. “Covert Linguistic Racisms and the (Re-)production of White Supremacy.” Journal of Linguistic Anthropology 31 (2): 180–193.
  • Labov, William . 1972. Language in the Inner City: Studies in the Black English Vernacular . Philadelphia: University of Pennsylvania Press.
  • Lane, Nikki . 2019. The Black Queer Work of Ratchet: Race, Gender, Sexuality, and the (Anti)politics of Respectability . New York: Palgrave Macmillan.
  • Lanehart, Sonja L. 2015. “African American Language and Identity: Contradictions and Conundrums.” In The Oxford Handbook of African American Language , edited by Jennifer Bloomquist , Lisa J. Green , and Sonja L. Lanehart , 863–879. New York: Oxford University Press.
  • Leonard, Wesley Y. 2021. “Toward an Anti-racist Linguistic Anthropology: An Indigenous Response to White Supremacy.” Journal of Linguistic Anthropology 31 (2): 218–237.
  • Lippi-Green, Rosina . 2012. English with an Accent: Language, Ideology, and Discrimination in the United States . New York: Routledge.
  • Lo, Adrienne , and Jenna Kim . 2011. “Manufacturing Citizenship: Metapragmatic Framings of Language Competencies in Media Images of Mixed Race Men in South Korea.” Discourse & Society 22 (4): 440–457.
  • Mason Carris, Lauren . 2011. “La voz gringa: Latino Stylization of Linguistic (In)authenticity as Social Critique.” Discourse & Society 22 (4): 474–490.
  • Mays, Kyle T. 2018. Hip Hop Beats, Indigenous Rhymes: Modernity and Hip Hop in Indigenous North America . Albany: State University of New York Press.
  • McElhinny, Bonnie . 2001. “See No Evil, Speak No Evil: White Police Officers’ Talk about Race and Affirmative Action.” Journal of Linguistic Anthropology 11 (1): 65–78.
  • McElhinny, Bonnie . 2016. “Reparations and Racism, Discourse and Diversity: Neoliberal Multiculturalism and the Canadian Age of Apologies.” Language & Communication 51: 50–68.
  • McIntosh, Janet . 2009. “Stance and Distance: Social Boundaries, Self-Lamination, and Metalinguistic Anxiety in White Kenyan Narratives about the African Occult.” In Stance: Sociolinguistic Perspectives , edited by Alexandra Jaffe , 72–91. New York: Oxford University Press.
  • McIntosh, Janet . 2016. Unsettled: Denial and Belonging among White Kenyans . Oakland: University of California Press.
  • McIntosh, Janet . 2020. “Whiteness and Language.” In The International Encyclopedia of Linguistic Anthropology , edited by James Stanlaw . New York: John Wiley & Sons.
  • McIntosh, Janet , and Norma Mendoza-Denton , eds. 2020. Language in the Trump Era: Scandals and Emergencies . Cambridge, UK: Cambridge University Press.
  • McKinney, Julia , and Elaine W. Chun . 2016. “Celebrations of a Satirical Song: Ideologies of Antiracism in the Media.” In Multiple Perspectives on Language Play , edited by Nancy Bell , 377–401. New York: De Gruyter Mouton.
  • Meek, Barbra A. 2006. “And the Injun Goes ‘How!’: Representations of American Indian English in White Public Space.” Language in Society 35 (1): 93–110.
  • Meek, Barbra A. 2020. “Racing Indian Language, Languaging an Indian Race: Linguistic Racisms and Representations of Indigeneity.” In The Oxford Handbook of Language and Race , edited by H. Samy Alim , Angela Reyes , and Paul V. Kroskrity , 369–397. New York: Oxford University Press.
  • Mena, Mike , and Ofelia García . 2020. “‘Converse Racialization’ and ‘Un/Marking’ Language: The Making of a Bilingual University in a Neoliberal World.” Language in Society 50 (3): 343–364.
  • Mendoza-Denton, Norma . 2008. Homegirls: Language and Cultural Practice among Latina Youth Gangs . Malden, MA: Blackwell.
  • Mendoza-Denton, Norma . 2011. “The Semiotic Hitchhiker’s Guide to Creaky Voice: Circulation and Gendered Hardcore in a Chicana/o Gang Persona.” Journal of Linguistic Anthropology 21 (2): 261–280.
  • Messing, Jacqueline . 2007. “Multiple Ideologies and Competing Discourses: Language Shift in Tlaxcala, Mexico.” Language in Society 36 (4): 555–577.
  • Modan, Gabriella Gahlia . 2008. Turf Wars: Discourse, Diversity, and the Politics of Place . Malden, MA: Wiley-Blackwell.
  • Morgan, Marcyliena . 2002. Language, Discourse, and Power in African American Culture . New York: Cambridge University Press.
  • Morgan, Marcyliena . 2009. The Real Hiphop: Battling for Knowledge, Power, and Respect in the LA Underground . Durham, NC: Duke University Press.
  • Morgan, Marcyliena . 2020. “‘We Don’t Play’: Black Women’s Linguistic Authority across Race, Class, and Gender.” In The Oxford Handbook of Language and Race , edited by H. Samy Alim , Angela Reyes , and Paul V. Kroskrity , 261–290. New York: Oxford University Press.
  • Navarro, Jenell . 2016. “WORD: Hip-Hop, Language, and Indigeneity in the Americas.” Critical Sociology 42 (4–5): 567–581.
  • Pagliai, Valentina . 2009. “Conversational Agreement and Racial Formation Processes.” Language in Society 38 (5): 549–579.
  • Pagliai, Valentina . 2011. “Unmarked Racializing Discourse: Facework, and Identity in Talk about Immigrants in Italy.” Journal of Linguistic Anthropology 21 (S1): E94–E112.
  • Pak, Vincent . 2021. “(De)coupling Race and Language: The State Listening Subject and Its Rearticulation of Antiracism as Racism in Singapore.” Language in Society , 1–22.
  • Perea, Juan F. 1998. “Death by English.” In The Latino/a Condition: A Critical Reader , edited by Richard Delgado and Jean Stefancic , 583–595. New York: New York University Press.
  • Perley, Berney C. 2012. “Zombie Linguistics: Experts, Endangered Languages and the Curse of Undead Voices.” Anthropological Forum 22 (2): 133–149.
  • Perrino, Sabina . 2020. Narrating Migration: Intimacies of Exclusion in Northern Italy . New York: Routledge.
  • Perry, Theresa , and Lisa Delpit , eds. 1998. The Real Ebonics Debate: Power, Language, and the Education of African American Children . Boston, MA: Beacon.
  • Philips, Susan U. 1983. The Invisible Culture: Communication in Classroom and Community on the Warm Springs Indian Reservation . New York: Longman.
  • Rampton, Ben . 2005. Crossing: Language and Ethnicity among Adolescents . New York: Longman.
  • Reyes, Angela . 2005. “Appropriation of African American Slang by Asian American Youth.” Journal of Sociolinguistics 9 (4): 510–533.
  • Reyes, Angela . 2017. “Inventing Postcolonial Elites: Race, Language, Mix, Excess.” Journal of Linguistic Anthropology 27 (2): 210–231.
  • Reyes, Angela , and Adrienne Lo , eds. 2009. Beyond Yellow English: Toward a Linguistic Anthropology of Asian Pacific America . New York: Oxford University Press.
  • Reynolds, Jennifer F. , and Marjorie Faulstich Orellana . 2009. “New Immigrant Youth Interpreting in White Public Space.” American Anthropologist 111 (2): 211–223.
  • Rickford, John Russell . 2022. Speaking My Soul: Race, Life, and Language . New York: Routledge.
  • Rickford, John R. , and Sharese King . 2016. “Language and Linguistics on Trial: Hearing Rachel Jeantel (and Other Vernacular Speakers) in the Courtroom and Beyond.” Language 92 (4): 948–988.
  • Rickford, John Russell , and Russell John Rickford . 2000. Spoken Soul: The Story of Black English . Malden, MA: Wiley-Blackwell.
  • Roche, Gerald . 2021. “Lexical Necropolitics: The Raciolinguistics of Language Oppression on the Tibetan Margins of Chineseness.” Language & Communication 76: 111–120.
  • Ronkin, Maggie , and Helen E. Karn . 1999. “Mock Ebonics: Linguistic Racism in Parodies of Ebonics on the Internet.” Journal of Sociolinguistics 3 (3): 360–380.
  • Rosa, Jonathan . 2019. Looking like a Language, Sounding like a Race: Raciolinguistic Ideologies and the Learning of Latinidad . New York: Oxford University Press.
  • Roth-Gordon, Jennifer . 2007. “Racing and Erasing the Playboy: Slang, Transnational Youth Subculture, and Racial Discourse in Brazil.” Journal of Linguistic Anthropology 17 (2): 246–265.
  • Roth-Gordon, Jennifer . 2011. “Discipline and Disorder in the Whiteness of Mock Spanish.” Journal of Linguistic Anthropology 21 (2): 211–229.
  • Roth-Gordon, Jennifer . 2013. “Racial Malleability and the Sensory Regime of Politically Conscious Brazilian Hip Hop.” Journal of Latin American and Caribbean Anthropology 18 (2): 294–313.
  • Roth-Gordon, Jennifer . 2017. Race and the Brazilian Body: Blackness, Whiteness, and Everyday Language in Rio de Janeiro . Oakland: University of California Press.
  • Roth-Gordon, Jennifer, Jessica Harris, and Stephanie Zamora. 2020. “Producing White Comfort through ‘Corporate Cool’: Linguistic Appropriation, Social Media, and @BrandsSayingBae.” International Journal of the Sociology of Language 265: 419–440.
  • Ruiz, Richard . 2016. “Paradox of Bilingualism.” In Honoring Richard Ruiz and His Work on Language Planning and Bilingual Education , edited by Nancy H. Hornberger , 182–190. Bristol: Multilingual Matters.
  • Shankar, Shalini . 2008. Desi Land: Teen Culture, Class, and Success in Silicon Valley . Durham, NC: Duke University Press.
  • Shankar, Shalini . 2019. “Nothing Sells like Whiteness: Race, Ontology, and American Advertising.” American Anthropologist 122 (1): 112–119.
  • Sharma, Nitasha Tamar . 2010. Hip Hop Desis: South Asian Americans, Blackness, and a Global Race Consciousness . Durham, NC: Duke University Press.
  • Silva, Antonio José Bacelar da . 2022. Between Brown and Black: Anti-Racist Activism in Brazil . New Brunswick, NJ: Rutgers University Press.
  • Slobe, Tyanna . 2018. “Style, Stance, and Social Meaning in Mock White Girl.” Language in Society 47 (4): 541–567.
  • Smalls, Krystal A. 2018a. “Fighting Words: Antiblackness and Discursive Violence in an American High School.” Journal of Linguistic Anthropology 28 (3): 356–383.
  • Smalls, Krystal A. 2018b. “Languages of Liberation: Digital Discourses of Emphatic Blackness.” In Language and Social Justice in Practice , edited by Netta Avineri , Laura R. Graham , Eric J. Johnson , Robin Conley Riner , and Jonathan Rosa , 52–60. New York: Routledge.
  • Smalls, Krystal A. 2020. “Race, Signs, and the Body: Towards a Theory of Racial Semiotics.” In The Oxford Handbook of Language and Race , edited by H. Samy Alim , Angela Reyes , and Paul V. Kroskrity , 233–260. New York: Oxford University Press.
  • Smalls, Krystal A. , Arthur K. Spears , and Jonathan Rosa , eds. 2021. “Language and White Supremacy.” Journal of Linguistic Anthropology 31 (2): 149–311.
  • Smitherman, Geneva . 1998. “Word from the Hood: The Lexicon of African-American Vernacular English.” In African-American English: Structure, History, and Use , edited by Guy Bailey , John Baugh , Salikoko S. Mufwene , and John R. Rickford , 203–225. New York: Routledge.
  • Smitherman, Geneva . 2000. Black Talk: Words and Phrases from the Hood to the Amen Corner . New York: Houghton Mifflin.
  • Smitherman, Geneva . 2006. Word from the Mother: Language and African Americans . New York: Routledge.
  • Smitherman, Geneva Napoleon . 2022. My Soul Look Back in Wonder . New York: Routledge.
  • Spears, Arthur K. 2021. “White Supremacy and Antiblackness: Theory and Lived Experience.” Journal of Linguistic Anthropology 31 (2): 157–179.
  • Stasch, Rupert . 2011. “Textual Iconicity and the Primitivist Cosmos: Chronotopes of Desire in Travel Writing about Korowai in West Papua New Guinea.” Journal of Linguistic Anthropology 21 (1): 1–21.
  • Sweetland, Julie . 2002. “Unexpected but Authentic Use of an Ethnically Marked Dialect.” Journal of Sociolinguistics 6 (4): 514–536.
  • Swinehart, Karl F. 2012. “Metadiscursive Regime and Register Formation on Aymara Radio.” Language & Communication 32 (2): 102–113.
  • Tan, Amy . 1995. “Mother Tongue.” In Under Western Eyes: Personal Essays from Asian America , edited by Garrett Hongo , 313–320. New York: Anchor Books.
  • Tebaldi, Catherine . 2020. “‘#JeSuisSirCornflakes’: Racialization and Resemiotization in French Nationalist Twitter.” International Journal of the Sociology of Language 265: 9–32.
  • Telep, Suzie . 2022. “Performing an Ethos of a Dominant Black Woman in Paris through Body and Language: Passing at the Intersection of Race, Gender, and Class.” CFC Intersections 1 (1): 71–84.
  • Tetreault, Chantal . 2015. Transcultural Teens: Performing Youth Identities in French Cités . Malden, MA: Wiley-Blackwell.
  • Trechter, Sara . 2001. “White between the Lines: Ethnic Positioning in Lakhota Discourse.” Journal of Linguistic Anthropology 11 (1): 22–35.
  • Trechter, Sara , and Mary Bucholtz . 2001. “Introduction: White Noise: Bringing Language into Whiteness Studies.” Journal of Linguistic Anthropology 11 (1): 3–21.
  • Urbain, Emilie . 2016. “Towards a ‘Bilingual American Citizen’: Language Ideologies, Citizenship and Race in 19th Century French Louisiana.” Language & Communication 51: 17–29.
  • Urciuoli, Bonnie. 2013. Exposing Prejudice: Puerto Rican Experiences of Language, Race, and Class. Long Grove, IL: Waveland.
  • Urciuoli, Bonnie . 2022. Neoliberalizing Diversity in Liberal Arts College Life . New York: Berghahn Books.
  • Valdés, Guadalupe . 1997. “Bilinguals and Bilingualism: Language Policy in an Anti-immigrant Age.” International Journal of the Sociology of Language 127: 25–52.
  • Vigouroux, Cécile B. 2015. “Genre, Heteroglossic Performances, and New Identity: Stand-Up Comedy in Modern French Society.” Language in Society 44 (2): 243–272.
  • Vigouroux, Cécile B. 2017. “The Discursive Pathway of Two Centuries of Raciolinguistic Stereotyping: ‘Africans as Incapable of Speaking French.’” Language in Society 46 (1): 5–21.
  • Walton, Shana , and Alexandra Jaffe . 2011. “‘Stuff White People Like’: Stance, Class, Race, and Internet Commentary.” In Digital Discourse: Language in the New Media , edited by Crispin Thurlow and Kristine Mroczek , 199–219. New York: Oxford University Press.
  • Washington, Adrienne Ronee . 2020. “‘Reclaiming My Time’: Signifying, Reclamation and the Activist Strategies of Black Women’s Language.” Gender & Language 14 (4): 358–385.
  • Weldon, Tracey L. 2021. Middle-Class African American English . New York: Oxford University Press.
  • Williams, Quentin . 2017. Remix Multilingualism: Hip Hop, Ethnography, and Performing Marginalized Voices . New York: Bloomsbury.
  • Williams, Quentin E. , and Christopher Stroud . 2015. “Battling the Race: Stylizing Language and Coproducing Whiteness and Colouredness in a Freestyle Rap Performance.” Journal of Linguistic Anthropology 24 (3): 277–293.
  • Wirtz, Kristina . 2014. Performing Afro-Cuba: Image, Voice, Spectacle in the Making of Race and History . Chicago: University of Chicago Press.
  • Wirtz, Kristina . 2020. “Racializing Performances in Colonial Time-Spaces.” In The Oxford Handbook of Language and Race , edited by H. Samy Alim , Angela Reyes , and Paul V. Kroskrity , 207–229. New York: Oxford University Press.
  • Zentella, Ana Celia . 1997. Growing up Bilingual: Puerto Rican Children in New York . Malden, MA: Blackwell.
  • Zentella, Ana Celia . 2014. “TWB (Talking While Bilingual): Linguistic Profiling of Latina/os, and Other Linguistic Torquemadas.” Latino Studies 12 (4): 620–635.
  • Zentella, Ana Celia. 2018. “‘Linguistically Isolated’: Challenging the U.S. Census Bureau’s Harmful Classification.” In Language and Social Justice in Practice, edited by Netta Avineri, Laura R. Graham, Eric J. Johnson, Robin Conley Riner, and Jonathan Rosa, 217–225. New York: Taylor & Francis Group.


Printed from Oxford Research Encyclopedias, Anthropology. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 14 September 2024





Revesz, P.Z. Preface to the Special Issue on Computational Linguistics and Natural Language Processing. Information 2024 , 15 , 281. https://doi.org/10.3390/info15050281


Italian Journal of Computational Linguistics

Issue 9-1

Linguistic Profile of a Text and Human Ratings of Writing Quality: a Case Study on Italian L1 Learner Essays

This paper presents a study based on the linguistic profiling methodology to explore the relationship between the linguistic structure of a text and how it is perceived in terms of writing quality by humans. The approach is tested on a selection of Italian L1 learner essays taken from a larger longitudinal corpus of essays written by Italian L1 students enrolled in the first and second year of lower secondary school. Human ratings of writing quality by Italian native speakers were collected through a crowdsourcing task, in which annotators were asked to read pairs of essays and indicate which one they believed to be better written. By analyzing these ratings, the study identifies a variety of linguistic phenomena spanning distinct levels of linguistic description that distinguish the essays considered as ‘winners’ and evaluates the impact of students’ errors on the human perception of writing quality.

1. Introduction

1 As an effect of the global COVID-19 pandemic, distance learning has become more prevalent, showing the importance of endowing teachers and students with advanced language technologies able to support teaching and learning in online environments. With respect to language learning and teaching, many of the opportunities and challenges associated with these new learning paradigms have been tackled by Intelligent Computer-Assisted Language Learning (ICALL), an interdisciplinary research field that aims at integrating insights from computational linguistics and artificial intelligence into computer-aided language learning. Within the last twenty years, this field has experienced considerable growth, especially in the area of assessment. This is due to the development of Automated Essay Scoring (AES) systems (Attali and Burstein 2006; Rudner, Garcia, and Welch 2006; Landauer, Laham, and Foltz 2003; McNamara, Crossley, and Roscoe 2013), i.e. computer-based assessment tools able to automatically score or grade students’ responses by considering appropriate features derived from a training set of annotated responses, and of tools for automatic error detection and correction (Ng et al. 2013), which automatically identify linguistic errors of different types in essays in order to suggest adequate corrections, provide individualized feedback to learners on exercises, and automatically create and use detailed learner models.

2 A fundamental requirement for developing such educational applications is the availability of electronically accessible corpora of authentic learner productions. The corpora created so far differ in many respects. For instance, with regard to the type of learner, they can gather productions written by second language (L2) students or by native speakers: the former have been built for many languages (e.g. English, Arabic, German, Hungarian, Basque, Czech, Italian), while the latter are mainly available for English.

  • 1 The corpus is freely available for research purposes at the following link: http://www.italianlp.it (...)

3 A further dimension of variation concerns the data collection method. The majority of existing corpora are cross-sectional, while very few are longitudinal. In the context of Italian as a first language (L1), which is the focus of our contribution, cross-sectional resources include the synchronic corpus of 2,500 compositions written by students in the first year of several high schools in Rome (Borghi 2013), as well as the diachronic corpus of 5,000 productions written by pupils during the five years of elementary school all over Italy (Marconi et al. 1994). The only available longitudinal corpus is CItA ( Corpus Italiano di Apprendenti L1 ), which was jointly developed by the Institute for Computational Linguistics of the Italian National Research Council (CNR) of Pisa and the Department of Social and Developmental Psychology at Sapienza University of Rome (Barbagli et al. 2016): it is the first digitalized collection of essays written by the same group of Italian L1 learners in the first two years of lower secondary school 1 .

4 CItA contains essays written by the same students, chronologically ordered and covering a two-year time span. Its diachronic and longitudinal nature makes the corpus particularly suitable for studying the evolution of L1 learners’ writing competence over the two years, assuming that many remarkable changes in writing skills occur in this period. For instance, in their recent work, Miaschi, Brunato, and Dell’Orletta (2021) showed that it is possible to automatically learn the writing development curve of students: they extracted a wide set of linguistic features from the essays and used them to train a binary classification algorithm able to predict the chronological order of two productions written by the same pupil at different times. The present study belongs to the same line of research based on CItA, but takes a different approach from the one just mentioned: instead of automatically tracking the development of students’ writing competence, we focus here on assessing the perception of writing quality by Italian L1 speakers, with the aim of understanding whether it is possible to find a set of linguistic features that are crucially involved in distinguishing ‘good’ and ‘bad’ essays according to the evaluation of our target readership.

Contributions

5 To the best of our knowledge, this is the first study that (i) introduces an Italian dataset of learner essays evaluated in terms of perceived quality by means of a crowdsourcing task, (ii) investigates the contribution of a wide set of linguistic features covering lexical, morpho-syntactic and syntactic phenomena – that altogether define the linguistic profile of a text – in modeling the individual perception of writing quality and (iii) assesses the impact of students’ errors covering different domains on human judgments.

6 In what follows, we first discuss related work that has approached the problem of modeling writing quality according to human evaluation using NLP techniques (Section 2). We then present our starting corpus, i.e. CItA ( Corpus Italiano di Apprendenti L1 ), and discuss the theoretical and methodological framework that informed its construction (Section 3). Section 4 focuses on the approach we adopted to set up the crowdsourcing task for collecting human judgements of perceived writing quality on a selection of CItA essays. Then, we present the results of our analysis along two main lines: the first aims at characterizing the linguistic profile of essays that were on average perceived as well written (Section 5); the second focuses on understanding whether and to what extent linguistic errors play a role in native speakers’ perception of writing quality (Section 6). In the conclusions, we discuss some relevant applications that this study would enable and propose further improvements in several directions.

2. Related work

2 http://cohmetrix.com/

7 As reported by Crossley and McNamara (2011), progress in disciplines such as computational linguistics, discourse processing and information retrieval paved the way for computational investigations into the textual features that affect human judgments of essay quality. According to Crossley et al. (2014), the most common approach to assessing writing proficiency is to identify relationships between linguistic ‘microfeatures’ extracted from a text (covering aspects such as length, complexity, cohesion, relevance, topic, and rhetorical style) and the scores attributed to it by expert human raters. A first insightful contribution towards a better understanding of this relationship was provided by the above-mentioned study by Crossley and McNamara (2011). It investigated the role of human perception of coherence in predicting overall judgments of essay quality by modelling raters’ coherence judgments through several computational indices, which were calculated using the Coh-Metrix tool 2 (McNamara et al. 2014). The particular focus on coherence was motivated by previous studies (McNamara, Crossley, and McCarthy 2010; Crossley and McNamara 2012) showing that human ratings of text coherence were the most informative predictors of holistic judgments of writing quality, while no evident relation between cohesion cues and essay quality emerged. The analyses were conducted on a corpus of 135 argumentative essays written by as many college freshmen attending either the ‘Composition One’ or the ‘Composition Two’ course at Mississippi State University (MSU). Every student was randomly assigned one of two selected SAT ( Scholastic Assessment Test ) prompts, to be answered in 25 minutes. Each essay was read and scored by at least two of eight trained composition professors according to both an analytic rubric, whose creation involved the collaboration of experts in composition studies, cognitive scientists and specialized raters, and a holistic one. The choice of first-year students is based on the assumption that learning how to competently convey messages in written texts is a crucial skill for academic and professional success. This makes the understanding of writing and, in particular, of the difference between good and poor writing an important objective for both theoretical and applied purposes.

3 http://www.adaptiveliteracy.com/writing-pal

8 Further analyses on a similar corpus, described by McNamara, Crossley, and Roscoe (2013), led to the development of the Writing Pal 3 , an intelligent tutoring system (ITS) designed to assist high school and college students in the acquisition and improvement of writing skills. It provides lessons on the most effective strategies for the various phases of writing (i.e., generating and organizing ideas, drafting and revising an essay), in addition to an area where students can put the learned concepts into practice by writing prompt-based essays. The system automatically scores them and returns formative feedback intended to be meaningful, with suggestions to improve the structural and rhetorical quality of the essay. For instance, students are taught to write conclusions that succinctly summarize the main arguments without presenting additional or new information. Since students’ responses are open-ended and potentially ambiguous, the performance of such systems in producing valid feedback depends on the level of sophistication of the NLP algorithms that process and interpret the input.

9 As regards L2 written proficiency, it is worth mentioning the investigation by Crossley et al. (2014) on the potential of many computational indices calculated by two automated text analysis tools, the aforementioned Coh-Metrix and the Writing Assessment Tool (WAT), to predict human scores of essay quality. The analyses were carried out on a corpus of 480 texts collected from two administrations of the TOEFL-iBT (Test of English as a Foreign Language Internet-Based Test) to two groups of 240 candidates from a variety of home countries and linguistic backgrounds. Each production was first assessed by two expert raters trained by the Educational Testing Service (ETS) according to a standardized TOEFL independent writing rubric. Then, it was associated with an overall score, corresponding to the average of the two grades if their difference was smaller than two points; otherwise, a third expert evaluated it and the final score was the average of the two closest ones. By following this approach, the authors of the study could discriminate between higher and lower quality essays. The distinction led them to the identification of the linguistic microfeatures that correlate with L2 essay quality and to the training of a regression model that automatically scores TOEFL essays according to the same dimensions. They finally evaluated the model’s strengths and weaknesses. Overall, the contribution represents a significant effort towards the modeling of L2 writing quality by means of textual microfeatures.

10 While sharing the purpose of modeling the human perception of writing quality from learner texts, our work differs in many respects: from the language and the authors’ characteristics of the analysed essays, to the approaches adopted for gathering human judgments of writing quality and studying how they relate to the features characterising the linguistic structure of a text.
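The score adjudication rule described for the TOEFL essays can be sketched in a few lines of Python. This is only an illustration of the rule as summarized above, not code from Crossley et al. (2014): the function name `overall_score` is hypothetical, and the handling of ties between equally close pairs (the first pair found wins) is an assumption.

```python
from itertools import combinations
from typing import Optional


def overall_score(r1: float, r2: float, r3: Optional[float] = None) -> float:
    """Combine expert ratings into one overall score.

    If the two initial ratings differ by less than two points, the score is
    their average. Otherwise a third rating is required and the score is the
    average of the two closest ratings.
    """
    if abs(r1 - r2) < 2:
        return (r1 + r2) / 2
    if r3 is None:
        raise ValueError("a third rating is needed when the first two differ by two points or more")
    # Pick the pair of ratings with the smallest absolute difference.
    # Assumption: when two pairs are equally close, the first one found is used.
    a, b = min(combinations((r1, r2, r3), 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2
```

For example, ratings of 3 and 4 are simply averaged, while ratings of 2 and 5 trigger the third rater.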

3. The CItA Corpus

11 As previously mentioned, our study is based on CItA, a longitudinal corpus of essays written by the same L1 learners in the first two years of lower secondary school and chronologically ordered. It was collected during the two school years 2012-2013 and 2013-2014 as part of a broader study carried out in the framework of the IEA–IPS ( Association for the Evaluation of Educational Achievement ) activities (Lucisano 1988; Lucisano and Benvenuto 1991). The two-year period was chosen based on the hypothesis that native speakers’ writing competence changes significantly in the transition from the first to the second year of middle school, as a consequence of a more formal approach to writing adopted by teachers. According to Barbagli et al. (2016), these transformations can emerge by inspecting the differences over the considered time frame in the distribution of a wide range of linguistic features automatically extracted from the texts.

12 The creators of CItA also supposed that the evolution of writing skills could be related to the cultural context in which students are born and/or live. To look for evidence of this, the essays were gathered from seven schools in Rome, each represented by a class: three in the historical center and four in the suburbs. The two areas are assumed to be representative of a medium-high socio-cultural context and a medium-low one, respectively. Moreover, all students involved in the collection were asked to fill in a questionnaire to provide information about their biographical, socio-cultural and sociolinguistic background. It consists of 34 questions, divided into two groups: the first thirteen concern learners’ biographical data (e.g. language(s) spoken at home, date and place of birth, parents’ education and employment, etc.), while the remaining twenty-one explore their writing habits, for instance whether they like writing outside school, which kind of text is their favourite, how much time they spend writing, reading or listening to music, and so on. The distribution of the answers to the first set of questions seems to reveal an actual link between the location of the school and the socio-cultural context: the schools in the center are mostly attended by pupils who usually speak ‘Italian’ or ‘Italian and a foreign language’ at home and whose parents have high-paying jobs; on the other hand, their peers in the suburbs more frequently speak dialects and foreign languages, and their parents hold lower-ranked positions. Interestingly, these results align with previous research in sociolinguistics, such as the studies by Chini (2004) and Chini and Andorno (eds.) (2018), which aimed to characterize plurilingualism within the Italian school context.

3.1 Corpus composition

13 The corpus comprises 1,352 essays (369,456 tokens altogether) written by a total of 156 students, 153 in the first year and 155 in the second. Overall, the compositions respond to 124 writing prompts that pertain to five textual typologies: reflexive, narrative, descriptive, expository and argumentative. Each one requires specific communication and writing abilities.

14 Furthermore, all pupils were asked to develop a “common prompt” at the end of each school year. In particular, the one assigned at the end of the second year was the Italian version of Task 9 of the IEA-IPS study (Lucisano 1984; Corda Costa and Visalberghi 1995), i.e. a letter advising a younger student on how to write in order to get good grades at school; the one given at the end of the previous year was a modified version of the same Task 9, adapted to the learners’ class and age. Table 1 reports their formulation, as well as an example prompt for each typology. The common prompts were aimed at understanding how learners internalize the writing instructions received in the considered period. In this regard, Barbagli et al. (2015) showed that first-year students’ suggestions tend to concern the emotional sphere (e.g. non aver paura , ‘have no fear’, rifletti prima di scrivere , ‘think before writing’), while the second-year pieces of advice focus more on metalinguistic aspects, such as the use of verbs or the adherence to the prompt.

Table 1. Prompt examples based on the different textual typologies

| Textual typology | Prompt |
|---|---|
| Reflexive | What does reading a good book or listening to good music represent to you? Give some examples if you want. |
| Narrative | Invent a myth on the following topic: laughter. |
| Descriptive | Hi, I am… describe yourself in a detailed way. |
| Expository | Child exploitation and slavery: a problem that directly affects us. |
| Argumentative | In your opinion, how much do mass media and advertising influence people’s choices and behaviors? |
| Common Prompt (I year) | A friend of yours is beginning the fifth year of primary school with your teachers and has confessed to being particularly afraid of the writing assignments they will be asked to do. Write them a letter telling about your experience, the positive aspects and also your difficulties in the writing assignments you were asked to do in the fifth grade. Tell them about the assignments that you liked most and those you liked least, and also about the suggestions that teachers gave you to teach you how to write well and how they used to correct writing assignments. Give them useful tips to get by. |
| Common Prompt (II year) | A boy younger than you has decided to enroll at your school. He wrote to you to ask how to write an essay that can get good grades from your teachers. Send him a friendly letter describing at least five points that you believe are important for your teachers when they evaluate an essay. |

15 Observing the distribution of the five typologies (Table 2), some differences emerge over the two years and the seven schools. The first is merely numerical: the number of prompts given by teachers in the historical center tends to be higher than in the suburban schools. According to Barbagli, Lucisano, and Sposetti (2017), two teachers in the suburbs decided to get their pupils to practice in class and at home, giving them only one examination per quarter, after realising that their starting language competence was very low. Secondly, while ‘reflexive’ is the most frequent textual type in both years, from the first to the second year the number of narrative prompts more than halves (from 22 to 9), while expository and argumentative prompts more than double (from 1 and 4, respectively, to 9 each). This different distribution is a consequence of the teachers’ approach to teaching writing: composing a narrative text is considered an easier task, since it requires more rudimentary cognitive and writing skills, than writing an argumentative or expository essay, for which more complex linguistic and discourse-structuring abilities become relevant (Kellogg 2008; Barbagli et al. 2016).

Table 2. Distribution of the textual typologies in CItA

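| Year | Textual typology | Center | Suburbs | Total |
|---|---|---|---|---|
| First year | Reflexive | 25 | 13 | 38 |
| First year | Narrative | 18 | 4 | 22 |
| First year | Descriptive | 2 | 1 | 3 |
| First year | Expository | 0 | 1 | 1 |
| First year | Argumentative | 2 | 2 | 4 |
| First year | Sub-total | 47 | 21 | 68 |
| Second year | Reflexive | 24 | 5 | 29 |
| Second year | Narrative | 3 | 6 | 9 |
| Second year | Descriptive | 0 | 0 | 0 |
| Second year | Expository | 4 | 5 | 9 |
| Second year | Argumentative | 5 | 4 | 9 |
| Second year | Sub-total | 36 | 20 | 56 |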
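As a quick check of the shift in typologies discussed with Table 2, the yearly totals can be turned into percentage shares. The sketch below only re-uses the totals reported in the table; the names `PROMPTS` and `shares` are illustrative and do not come from the original study.

```python
# Total number of prompts per textual typology, as reported in Table 2.
PROMPTS = {
    "first": {"Reflexive": 38, "Narrative": 22, "Descriptive": 3,
              "Expository": 1, "Argumentative": 4},
    "second": {"Reflexive": 29, "Narrative": 9, "Descriptive": 0,
               "Expository": 9, "Argumentative": 9},
}


def shares(counts: dict) -> dict:
    """Return each typology's share of the yearly prompts, in percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for year, counts in PROMPTS.items():
    print(year, sum(counts.values()), shares(counts))
```

Working with shares rather than raw counts makes the two years comparable even though the second year has fewer prompts overall.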

3.2 Error annotation

16 In addition to its longitudinal nature, the most significant trait that distinguishes CItA from the other Italian L1 learner corpora is the annotation of many types of errors with the corresponding corrections. It has to be noted that error annotation is quite a challenging matter for at least two reasons. First of all, it assumes the occurrence of a deviation from a linguistic norm, which is in itself a conventionally accepted, arbitrary concept. Secondly, while this kind of annotation is commonly practiced on L2 corpora in order to, e.g., investigate the properties of interlanguage (Brooke and Hirst 2012) or automatically detect and correct errors (Dahlmeier, Ng, and Wu 2013), an L1 error taxonomy did not exist for the Italian language.

17 To fill this gap and be able to annotate the errors contained in CItA essays, a new annotation schema was defined. In line with the literature on the evaluation of the writing skills of L1 Italian learners (Corda Costa and Visalberghi 1995; De Mauro 1983; Colombo 2011), Berruto’s definition of "neo-standard Italian" (Berruto 1987) was adopted as the linguistic norm. Similarly to schemas already existing for other languages (e.g. the one defined by Granger (2003) for French L2 learners), it is a three-level schema including: the macro-class of error (i.e. grammatical, orthographic and lexical); the class of error, that is to say the linguistic element involved; and the corresponding type of modification required to correct it (Table 3). Following the format introduced by Ng et al. (2013), CItA errors are annotated as follows:

[...] Non mi sembra giusto che uno <M t="112" c="sia">è</M> “uguale” agli altri avendo un Samsung Galaxy e che se uno <M t="112" c="compra">comprato</M> un iPhone diventa subito popolare [...] ("[...] it does not seem fair to me that one who has a Samsung Galaxy is "equal" to others and that, if one buys an iPhone, they immediately become popular [...]")

18 The tags <M> and </M> (‘Mistake’) mark the textual area occupied by the error, the attribute t (‘type’) specifies its macro-class and class, and c (‘correction’) indicates the correct form. In the reported example, there are two mistakes related to the misuse of verbal moods: the indicative form è instead of the subjunctive sia and the past participle of the verb comprare (‘to buy’) instead of the third person singular of the present indicative. Applying the scheme and following this format, CItA errors were manually annotated by a teacher of middle school, helped by two undergraduate students in Digital humanities who had been adequately trained.
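To make the annotation format concrete, the following sketch extracts the annotated errors from a string and applies the corrections. It is a minimal illustration and not the tooling used to build CItA: it assumes that the attributes always appear in the order `t`, `c`, that `<M>` tags are never nested, and the names `M_TAG`, `extract_errors` and `apply_corrections` are hypothetical.

```python
import re

# One <M t="..." c="...">...</M> span; assumes the attribute order t, c and no nesting.
M_TAG = re.compile(r'<M t="(?P<type>[^"]*)" c="(?P<correction>[^"]*)">(?P<error>.*?)</M>')


def extract_errors(text: str) -> list:
    """Return (type code, erroneous form, corrected form) for each annotated error."""
    return [(m["type"], m["error"], m["correction"]) for m in M_TAG.finditer(text)]


def apply_corrections(text: str) -> str:
    """Replace every annotated span with its corrected form."""
    return M_TAG.sub(lambda m: m["correction"], text)


sample = ('Non mi sembra giusto che uno <M t="112" c="sia">è</M> “uguale” agli altri '
          'avendo un Samsung Galaxy e che se uno <M t="112" c="compra">comprato</M> '
          'un iPhone diventa subito popolare')

for code, wrong, right in extract_errors(sample):
    print(code, wrong, "->", right)
print(apply_corrections(sample))
```

Keeping the error and its correction in the same tag means both the original learner text and a corrected version can be recovered from a single annotated file.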

19 The statistical distribution of errors (Table 3) seems to support the hypothesis underlying the collection of CItA, namely that several common trends in the evolution of writing competence occur during the transition from the first to the second year, since most categories of errors (marked with an asterisk) vary in a statistically significant way over the two years. It can be noted that in both years (rows ‘Total’) orthographic and grammatical errors have the highest frequencies (47.63/44.72% and 46.41/48.7%, respectively), while lexical ones are far less frequent (about 6%). Going into detail, the unclassified orthographic mistakes (i.e. the class ‘Other’) are the most frequent ones (22.32%), followed by the incorrect use of verb tenses (11.26%), the unclassified grammatical errors (6.37%) and the wrong use of prepositions (6.6%).

20 Concerning error frequency distribution per year, it emerges that almost all categories are similarly distributed in the two years. However, second-year essays include a considerably higher percentage of mistakes referring to verb morphology, especially in terms of incorrect tense inflection. As previously stated, from one year to the next narrative prompts are replaced by argumentative and expository ones, which involve more complex linguistic and discourse-structuring abilities. Moreover, older students are more aware that "good" writing requires organizing ideas in larger passages, in which the temporal relationships between actions and events should be reconstructed through appropriate shifts in verb tenses and moods. Therefore, the higher number of errors related to verbs could depend both on the more challenging prompts assigned in the second year and on students’ intention to put teachers’ writing instructions into practice in order to produce more elaborate essays. Nevertheless, this ability develops across school years, as indicated by previous studies in the literature (Wilcox, Yagelski, and Yu 2013).

Table 3. Error annotation schema. Error categories varying significantly over the two years (i.e. p < 0.05) are marked with an asterisk

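| Macro-class | Class | Type of modification | Freq % (first year) | Freq % (second year) |
|---|---|---|---|---|
| Grammar | Verbs | Use of tense * | 7.78 | 15.67 |
| Grammar | Verbs | Use of mood * | 4.25 | 4.92 |
| Grammar | Verbs | Subject-Verb agreement * | 2.85 | 4 |
| Grammar | Prepositions | Erroneous use | 6.48 | 6.75 |
| Grammar | Prepositions | Omission/Redundancy | 1.03 | 0.72 |
| Grammar | Pronouns | Erroneous use | 5.09 | 3.54 |
| Grammar | Pronouns | Omission * | 0.41 | 0.59 |
| Grammar | Pronouns | Redundancy | 2.70 | 1.57 |
| Grammar | Pronouns | Erroneous use of relative pronoun * | 2.13 | 1.70 |
| Grammar | Articles | Erroneous use | 5.81 | 3.54 |
| Grammar | Conjunctions | Erroneous use | 0.57 | 0.52 |
| Grammar | Other | | 7.31 | 5.18 |
| Grammar | Total | | 46.41 | 48.7 |
| Orthography | Double consonants | Omission * | 6.74 | 5.05 |
| Orthography | Double consonants | Redundancy | 3.27 | 3.67 |
| Orthography | Use of h | Omission * | 3.21 | 1.64 |
| Orthography | Use of h | Redundancy | 1.66 | 1.11 |
| Orthography | Monosyllables | Erroneous use of monosyllabic words * | 4.87 | 4.07 |
| Orthography | Monosyllables | adverb and instead of | 1.66 | 1.64 |
| Orthography | Apostrophe | Erroneous use * | 4.82 | 4.52 |
| Orthography | Other | | 21.77 | 23.02 |
| Orthography | Total | | 47.63 | 44.72 |
| Lexicon | Vocabulary | Erroneous use | 5.60 | 6.56 |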

21 To conclude, it is worth mentioning that the statistical distribution of grammatical errors varies significantly with respect to the city areas. As shown in Table 4, their average frequency diminishes over the two years in all the schools located in the historical center and in two suburban institutes, while it increases in the remaining two. Surprisingly, the highest average number of errors is observed in a school of the center, even though its decrease over the two years is more than double that of any other school. Orthographic errors, instead, do not vary significantly in relation to any background information. This aligns with previous studies suggesting that mastering orthography requires a longer time (Colombo 2011; Ferreri 1971; Lavinio 1975; De Mauro 1977). However, it could also indicate a general insensitivity to spelling mistakes at this level of education and within this particular age group, potentially reflecting a common characteristic of neo-standard Italian.

Table 4. Average number of grammatical errors with respect to school years and city areas.

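| Area | School | First year | Second year | Difference |
|---|---|---|---|---|
| Center | 1 | 2.6 | 0.9 | 1.7 |
| Center | 2 | 5.2 | 3.1 | 2.1 |
| Center | 3 | 15.1 | 9.3 | 5.8 |
| Suburbs | 4 | 3.5 | 8.2 | -4.8 |
| Suburbs | 5 | 6.4 | 4.6 | 1.9 |
| Suburbs | 6 | 5.4 | 4.6 | 0.8 |
| Suburbs | 7 | 1.5 | 2.8 | -1.3 |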

4. Dataset construction

22 To fulfill the two main purposes of our investigation, i.e. identifying the linguistic features that make an essay be perceived as good and evaluating the impact of linguistic errors on such a perception, we needed to collect evidence of what a well-written production is according to a native speaker. We thus decided to model the perception of writing quality as a manual classification task: we presented two texts to our target users and asked them to choose the better written one. The underlying idea was that, by gathering a substantial number of preferences on a pair of essays, we could assume that the most frequently chosen one was actually the better of the two. In order to collect judgments on many pairs, we grouped the essay pairs into different questionnaires and administered them as a crowdsourcing task. In a broad sense, crowdsourcing refers to any type of online collaborative activity, but many different, and often contrasting, definitions have been given. Analysing about forty of them, Estellés and González (2012) extracted the common elements and proposed the following integrated definition:

“Crowdsourcing is a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task. The undertaking of the task, of variable complexity and modularity, and in which the crowd should participate bringing their work, money, knowledge and/or experience, always entails mutual benefit. The user will receive the satisfaction of a given type of need, be it economic, social recognition, self-esteem, or the development of individual skills, while the crowdsourcer will obtain and utilize to their advantage that what the user has brought to the venture, whose form will depend on the type of activity undertaken.”

23 In our study, the “participative online activity” is the completion of a survey and the “group of individuals” involved is formed by Italian native speakers of different ages and cultural backgrounds. We would like to underline that the reliability of data obtained via crowdsourcing has been well acknowledged in recent years in the linguistics and computational linguistics communities as well. For instance, the thorough survey by Munro et al. (2010) has shown that the quality of findings obtained from the crowd is often comparable, if not superior, to that of controlled laboratory experiments. Besides, crowdsourcing makes it possible to reach a broader population in terms of age, education, profession, etc., and it is thus better suited to capturing the ‘layman’ perception of writing quality, an aspect that distinguishes our study from similar ones, which instead focused on judgments given by experts (namely, teachers).

4.1 Essay selection

24 To collect native speakers’ evaluations, we designed ten surveys, each including ten pairs of essays of the same grade. We selected 200 essays from CItA that ranged from a minimum of 141 tokens to a maximum of 1153 tokens and whose average length was 359.4 tokens.

25 Table 5 reports the criteria we defined for selecting the pairs of texts to be included in the questionnaires. The first questionnaire comprises ten pairs – five for each school year – responding to the common prompts given at the end of each year: such a composition allows the comparison between texts written at the same time by students attending different schools and discussing the same topic. Questionnaires 2-8 gather essays that develop prompts pertaining to the same textual typology, paired according to the school year in which they were written. This choice was based on the assumption that similarity of content would let annotators focus on stylistic issues when forming their judgment; for example, it was meant to prevent a text on a serious and committed topic from being preferred to a better-written fairy tale. These questionnaires were designed according to the distribution of textual typologies in CItA already seen in Table 2: narrative and reflective texts are each assigned two questionnaires, one per year. Instead, each of the other three typologies (i.e. descriptive, expository, argumentative) occupies only one questionnaire, in which the proportions of pairs with respect to the school year reflect their general distribution in the corpus. Finally, in surveys 9 and 10 the essays were paired according to their number of errors: for each year, we divided the range between the minimum number of errors (0) and the maximum one (49 for the first year, 43 for the second one) into ten error bins and designed the two surveys by choosing a pair of productions for each bin. Surveys comparing essays with a similar number of errors were meant to investigate which categories of errors have a greater impact on human judgment.
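The error-bin assignment used for surveys 9 and 10 can be illustrated with a short sketch. It assumes ten equal-width bins over the observed range of errors; the text does not state how the bin boundaries were drawn, and the function name `error_bin` is hypothetical.

```python
def error_bin(n_errors, max_errors, n_bins=10):
    """Return the 0-based index of the bin that contains n_errors.

    Assumption: the range [0, max_errors] is split into n_bins bins of
    (nearly) equal width; the exact boundaries are not given in the text.
    """
    if not 0 <= n_errors <= max_errors:
        raise ValueError("n_errors must lie between 0 and max_errors")
    # Integer arithmetic avoids floating-point edge cases at bin boundaries.
    return n_errors * n_bins // (max_errors + 1)


# First-year essays range from 0 to 49 errors, second-year ones from 0 to 43.
for errors in (0, 4, 5, 49):
    print(errors, error_bin(errors, max_errors=49))
print(43, error_bin(43, max_errors=43))
```

Under this assumption each first-year bin spans five error counts, so two essays with 5 and 9 errors would be candidates for the same pair.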

26 While designing the essay selection criteria, we also took the geographical location of the schools into consideration. Indeed, 30 out of the total 100 pairs – 16 for the first and 14 for the second year – bring together a text written in a suburban school and one written in a school in the historical center.

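Table 5. Composition of the ten questionnaires (number of essay pairs per school year)

| Questionnaire | Selection criterion | I year | II year |
|---|---|---|---|
| 1 | Common prompts | 5 | 5 |
| 2 | Narrative | 10 | 0 |
| 3 | Narrative | 0 | 10 |
| 4 | Reflexive | 10 | 0 |
| 5 | Reflexive | 0 | 10 |
| 6 | Descriptive | 8 | 2 |
| 7 | Expository | 3 | 7 |
| 8 | Argumentative | 3 | 7 |
| 9 | Error bins | 10 | 0 |
| 10 | Error bins | 0 | 10 |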

4.2 Creation and distribution of the questionnaires

4 https://questbase.com/en/home-questbase/

27 After designing the surveys, we moved on to their implementation. We went through the main free web applications dedicated to the creation of questionnaires (e.g. Google Forms, Microsoft Forms), but none was equipped with the customization facilities we needed. Therefore, we chose to rely on the QuestBase platform 4 . As shown in Figure 1, we juxtaposed the essays through its built-in HTML editor and gave the surveys a graphical layout with a CSS stylesheet.

Figure 1. Comparison of a pair of essays extracted from one of the ten surveys


  • 5 For the sake of completeness, we report the English translation of the guidelines: "Hello! This sur (...)

28 The definitive structure of the questionnaire comprises twelve pages: the first one reports the filling-in instructions; the second contains the personal data entry form, in which we asked the annotators to provide some personal information (i.e. age, sex, education), with a full guarantee of anonymity and only for statistical purposes. Finally, each of the remaining ten pages is occupied by two side-by-side essays and a radio-button field to be used to express the answer (Figure 1): annotators have to choose option ‘1’ if they prefer the first essay, ‘2’ otherwise. Once the form is submitted, the following message is displayed: Hai completato il sondaggio. Grazie per il tuo prezioso contributo! (‘You completed the survey. Thank you for your precious help!’). It is worth focusing a little more closely on the submission instructions. Trying to provide clear and exhaustive directions, we proposed the following guidelines 5 :

Ciao! Il presente sondaggio è rivolto a partecipanti di madrelingua italiana. La sua compilazione richiede circa 20 minuti. Prima di proseguire, dando il consenso alla partecipazione, ti spieghiamo in cosa consiste. Nelle pagine che seguono leggerai dieci coppie di temi scritti da studenti del primo e del secondo anno di scuola media. I testi possono contenere un certo numero di errori. Per ciascuna coppia ti chiediamo di indicare quale dei due temi ritieni sia scritto meglio. Non esistono risposte giuste o sbagliate: conta semplicemente quello che pensi! Tieni presente che i temi di una stessa coppia possono trattare argomenti diversi, ma questo non deve influire sul tuo giudizio. La tua partecipazione al sondaggio è completamente libera. Se in qualsiasi momento dovessi cambiare idea e volessi interrompere il test, potrai farlo liberamente. Un’ultima cosa: prima di iniziare il sondaggio, ti chiediamo di darci alcune tue informazioni anagrafiche, che serviranno solo a fini statistici. I dati rimarranno completamente anonimi e in nessun modo le risposte verranno associate alla tua persona. Se hai dubbi, curiosità o proposte di miglioramento, scrivimi all’indirizzo: [email protected]. Buona lettura!

29 The users were simply asked to choose the better text of each pair. This is an intentionally generic instruction, since we wanted them to rely on their native speaker’s intuition instead of focusing on specific aspects (e.g. topics discussed or linguistic errors). In other words, their answers had to arise from an instinctive reaction to a quick reading of the essay pairs, based on the entirety of the linguistic knowledge acquired over time.

30 To assess the adequacy of the defined structure for our purposes, we created a test survey and distributed its link through WhatsApp and social networks (i.e. Facebook and Instagram). It included eight essay pairs randomly extracted from CItA and two ‘control pairs’, consisting of clearly unbalanced texts, whose aim was to evaluate annotation accuracy. The administration returned interesting results. As a sign of the efficiency of the propagation method, the survey was correctly submitted 43 times by a heterogeneous sample of people ranging from 17 to 51 years of age, mostly holding a high school diploma (48.8%) or an academic degree (41.9%). The answers to the ‘control pairs’ also met our expectations, since the two better essays were preferred 40/43 and 37/43 times, respectively.

6 https://linktr.ee/

31 At this point, we started collecting evaluations. Using Linktree 6 we added the URLs of the ten questionnaires to a single web page and shared its link through the previously mentioned social media platforms: by clicking on it, users were redirected to the page and could access every survey.

4.3 Collection and analysis of human judgments

32 We collected 223 annotations distributed quite homogeneously among the ten surveys, except for the first, which was submitted 28 times. It is interesting to focus on the heterogeneous composition of the sample of readers. Concerning ‘gender’, the majority of answers (183 units = 82.1%) were given by women, against 38 (17%) by men; just two people preferred not to specify it. As regards ‘age’, the sample was partitioned into six bins (Figure 2): ‘20-24 years’ was the most frequent class (97 units), followed by ‘25-29 years’ (64 units). This means that the great majority of the readers (72.5%) ranged from 20 to 29 years of age. Furthermore, 35 evaluations (15.8%) were made by natives between 30 and 39 years of age, while people belonging to the remaining bins contributed an overall 11.7% of the collection. Finally, Figure 3 shows the distribution of submissions with respect to readers’ education: almost all the annotations (91.9%) were given by people holding an academic degree (118 units, equal to 53.2%) or a high school diploma (86 units, equal to 38.7%). Twelve annotators (5.4%) had a middle school certificate and just 4 (1.8%) held a doctoral degree. The last two indicated a non-specific ‘Other’.

Figure 2. Distribution of annotations with respect to readers’ age bins


Figure 3. Distribution of annotations with respect to readers’ education


33 Since the questionnaires received a number of responses ranging from 20 to 28, we decided to select the same number of most coherent annotations for each. For this purpose, we defined a selection function to discard the most inaccurate submissions and consequently improve the quality of the dataset. We first built the average vector of every survey ( \(a\) ) as the sequence of ten values, ‘1’ or ‘2’, each being the label most frequently assigned to the corresponding pair of essays; then, we calculated the distance between each survey’s average vector and each of its annotations ( \(v\) ) using the Euclidean metric generalized to n-dimensional space (Equation 1), which computes the distance between two vectors as the square root of the sum of the squared differences between their components.

34 However, simply calculating the difference between the average vector of a survey and each of its submissions is a partial evaluation unless the other annotations of the same questionnaire are also taken into account. In other words, if an essay pair is given an answer diverging from the average, the impact of the deviation should be higher the fewer annotators made it. So, to reflect how strongly an answer deviates from the majority, we assigned each pair a weight ( \(w_{i}\) ) equal to the number of preferences received by the ‘winning’ essay. Thus, we computed the weighted distance between annotations and average vectors (Equation 2).

\[\tag{1}\displaystyle d(a,v)=\sqrt{\sum_{i=1}^{n} (a_{i} - v_{i})^2}\]

\[\tag{2}\displaystyle wd(a,v)=\sqrt{\sum_{i=1}^{n} w_{i} (a_{i} - v_{i})^2}\]
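Equations 1 and 2 can be sketched in a few lines of Python. This is only an illustration on invented data: the function names are hypothetical, labels are coded as the integers 1 and 2, and ties in the majority vote (not discussed in the text) are broken in favour of the label that appears first.

```python
from collections import Counter
from math import sqrt


def majority_vector(annotations):
    """Return the most frequent label per essay pair and its vote count.

    annotations: list of submissions, each a list with one label per pair.
    """
    labels, weights = [], []
    for votes in zip(*annotations):  # one tuple of votes per essay pair
        label, count = Counter(votes).most_common(1)[0]
        labels.append(label)
        weights.append(count)
    return labels, weights


def distance(a, v, w=None):
    """Equation 1 when w is None, Equation 2 when w holds the pair weights."""
    if w is None:
        w = [1] * len(a)
    return sqrt(sum(wi * (ai - vi) ** 2 for wi, ai, vi in zip(w, a, v)))


# Toy survey with three essay pairs and four annotators (invented data).
annotations = [[1, 2, 1], [1, 2, 2], [1, 1, 1], [1, 2, 1]]
a, w = majority_vector(annotations)
for v in annotations:
    print(v, round(distance(a, v), 3), round(distance(a, v, w), 3))
```

Sorting the submissions by either distance in ascending order gives a ranking from the most to the least consistent annotator.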

35 Then, for every survey we created two rankings of annotations – from the closest to the most distant from the average – by sorting weighted and unweighted distances in ascending order. To choose the most consistent group per survey, we estimated the Inter-Annotator Agreement (IAA) of the first 15 and 20 annotations according to Krippendorff’s alpha ( \(\alpha\) ), a coefficient that expresses IAA in terms of observed ( \(D_{o}\) ) and chance-expected ( \(D_{e}\) ) disagreement (Krippendorff 2011):

\[\tag{3}\displaystyle\alpha = 1-\frac{D_{o}}{D_{e}}\]
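For readers who want to reproduce Equation 3, the following is a minimal sketch of Krippendorff’s alpha for nominal labels, built from the coincidence matrix. It assumes the nominal variant of the coefficient, which the text does not specify, and the function name and toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the labels that the
    annotators assigned to one item (here, one essay pair).
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # items with a single label carry no agreement information
        for c, k in permutations(labels, 2):  # ordered pairs of distinct annotators
            coincidences[(c, k)] += 1 / (m - 1)
    totals = Counter()
    for (c, _), value in coincidences.items():
        totals[c] += value
    n = sum(totals.values())
    # Observed and chance-expected disagreement (D_o and D_e in Equation 3).
    d_o = sum(v for (c, k), v in coincidences.items() if c != k) / n
    d_e = sum(totals[c] * totals[k] for c in totals for k in totals if c != k) / (n * (n - 1))
    return 1.0 if d_e == 0 else 1 - d_o / d_e


# Two essay pairs, three annotators each, in perfect agreement (invented data).
print(krippendorff_alpha_nominal([[1, 1, 1], [2, 2, 2]]))
```

Values close to 1 indicate strong agreement, values around 0 indicate agreement no better than chance, and negative values indicate systematic disagreement.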

  • 7 The corpus of evaluated essays is available at the following link: http://www.italianlp.it/Evaluate (...)

36 We noticed that the IAA values of the first 15 annotations ordered by increasing weighted distance were the highest. Therefore, we took them into account (150 annotations in total) for the following analysis and discarded the remaining 73 7 . It is noteworthy that the selection led to an average IAA of 0.26, a much higher value than the initial 0.12.

37 The analysis of discarded submissions reveals interesting trends with respect to annotators’ personal data. Concerning ‘gender’, in addition to those by the two people who did not specify it, 54 rejected responses were made by women and 14 by men. However, in percentage terms the rejection rate is higher for men than for women (36.8% and 31.1%, respectively). Regarding ‘age’, ‘50-64 years’ is the class from which the highest percentage of annotations was dropped (9 out of 16 = 56.25%), while the lowest proportions – around 20% in both cases – refer to the bins ‘30-39’ and ‘40-49’ (8 out of 35 and 1 out of 5, respectively). The two most populated classes (‘20-24’ and ‘25-29’) lost about 32% of responses (31 out of 97 and 21 out of 64, respectively). As for ‘education’, most submissions by people with a ‘Middle school certificate’ were rejected (7 out of 12 = 58.3%). The classes ‘High school diploma’ and ‘Academic degree’ had about the same number of discards (32 out of 86 and 33 out of 118, respectively), but in percentage terms the gap is wider: 37% for the former and 28% for the latter. These values could suggest that the higher the cultural level of natives, the more accurate their annotations are.

38 Relying on the selected annotations, we established the ‘winning’ and ‘loser’ essays of every pair: the former was the one that received a higher number of preferences and the latter was the less chosen one. Consequently, we could split our annotated corpus into two subsets of 100 texts each, one comprising all ‘winning’ essays and the other the ‘loser’ ones.

39 As discussed in Section 3, the collection of CItA was also based on the assumption that the development of L1 learners’ writing competence could be affected by some variables of their socio-cultural background, among which the location of the school. Thus, the essays were gathered from schools in both the historical center and the suburbs. In Section 4.1, we already noted that 30 pairs of essays – 16 for the first year and 14 for the second – compare texts composed in the two city areas. Interestingly enough, in 18 cases (60%), the ‘winning’ production was written by a student from the center. This would support the hypothesis that they have higher writing skills than their suburban peers. Considering each year independently, we found a significant difference: while essays from the downtown schools were preferred in 11/16 first-year pairs (68.75%), in the second-year pairs the numbers of ‘winning’ texts coincide (7 for both areas). This could be further evidence of the "two different speeds of development" mentioned by Barbagli et al. (2016): suburban students’ starting level of linguistic competence is lower, but it improves more rapidly from the first to the second year of lower secondary school.

5. Data Analysis

5.1 Studying the linguistic phenomena underlying the perception of writing quality

8 http://linguistic-profiling.italianlp.it/

9 https://universaldependencies.org/

40 The first purpose of our investigation was to identify whether essays perceived as well-written have a peculiar style which can be represented in terms of a specific set of linguistic features. To this end we adopted the linguistic profiling framework, an NLP-based methodology in which a large array of linguistically motivated features automatically extracted from annotated texts is used to obtain a vector-based representation of each text. Such representations can then be compared across texts representative of different textual genres and varieties to identify the peculiarities of each (Montemagni 2013; Halteren). To perform this analysis, we relied on Profiling-UD 8 , a recently introduced tool that implements the linguistic profiling methodology and allows the extraction of a wide set of features covering lexical, morpho-syntactic and syntactic phenomena from a text (or collection of texts) linguistically annotated according to the Universal Dependencies (UD) 9 formalism. An overview of the features computed by Profiling-UD and used in this study is shown in Table 6. For a complete description of them, the reader is referred to Brunato et al. (2020).

Table 6. Overview of the linguistic features used in this study

| Category | Feature | Label |
|---|---|---|
| Raw text properties | Total number of sentences | n_sentences |
| | Total number of tokens | n_tokens |
| | Avg. number of tokens per sentence | tokens_per_sent |
| | Avg. number of characters per word | char_per_tok |
| Lexical variety | Type/Token Ratio in the first 100 or 200 lemmas | ttr_lemma_chunks_100, ttr_lemma_chunks_200 |
| | Type/Token Ratio in the first 100 or 200 words | ttr_form_chunks_100, ttr_form_chunks_200 |
| POS tagging | Dist. of the 17 UD POS-tags | upos_dist_* |
| | Lexical density (i.e., content words/total words) | lexical_density |
| Inflectional morphology | Dist. of verbs and auxiliaries according to their tense, mood, form, gender, number and person | verbs_*_dist_*, aux_*_dist_* |
| Verbal predicate structure | Avg. dist. of verbal heads | verbal_head_per_sent |
| | Avg. dist. of roots headed by a verbal lemma | verbal_root_perc |
| | Verbal arity | avg_verb_edges |
| | Dist. of verbs for arity class (from 0 to 6) | verb_edges_dist_* |
| Global and local parsed tree structures | Mean of the maximum tree depths of each sentence | avg_max_depth |
| | Avg. number of tokens per clause | avg_token_per_clause |
| | Avg. length of dependency links | avg_links_len |
| | Mean of the longest dependency links of each sentence | avg_max_links_len |
| | Length (n. tokens) of the longest dependency link | max_links_len |
| | Avg. length of prepositional chains | avg_prepositional_chain_len |
| | Total number of prepositional chains | n_prepositional_chains |
| | Dist. of prepositional chains by depth (from 1 to 4) | prep_dist_* |
| Order of elements | Dist. of subjects/objects preceding the verb | subj_pre, obj_pre |
| | Dist. of subjects/objects following the verb | subj_post, obj_post |
| Syntactic relations | Avg. dist. of UD 37 universal dependency relations | dep_dist_* |
| Use of subordination | Dist. of principal/subordinate clauses | principal_proposition_dist, subordinate_proposition_dist |
| | Dist. of subordinate clauses following the main clause | subordinate_post |
| | Dist. of subordinate clauses preceding the main clause | subordinate_pre |
| | Avg. length of subordinate chains | avg_subordinate_chain_len |
| | Dist. of subordinate chains by depth (from 0 to 5) | subordinate_dist_* |

41 It has to be noted that these features have proved highly predictive in many scenarios, all related to modeling the formal aspects of a text rather than its content: for example, in authorship profiling analyses, where they proved helpful in identifying specific traits of an author or group of authors (e.g. gender, native language) from the texts they write (Cocciu et al. 2018; Cimino et al. 2018), or in the automatic assessment of ‘perceived’ linguistic complexity according to conscious readers’ judgments (Brunato et al. 2018; Iavarone, Brunato, and Dell’Orletta 2021). In light of this, we expect them to be useful also for investigating what influences human judgments of writing quality.

42 Using Profiling-UD, we first analyzed each text in the two subsets – i.e. the ‘winning’ and the ‘loser’ essays – thus converting each text into its feature-based vector representation, where each dimension of the vector corresponds to the average value of a given linguistic feature in the examined essay. We then estimated three statistical indices for each considered feature in the two groups: the arithmetic mean, to summarize the set of values associated with the same feature; the standard deviation, as an indicator of data dispersion around the average; and the coefficient of variation, to normalize phenomena measured on different scales and make them comparable. Table 7 shows the mean and standard deviation of an excerpt of the tracked characteristics. As can be noticed, average values computed for the same feature in the two subsets are often similar, but in some cases they diverge considerably. To gain a better understanding of these data, we carried out two separate statistical evaluations sharing the goal of identifying which linguistic features have a greater impact on the rating assigned by annotators.
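The three indices can be computed per feature as in the sketch below. It assumes that each essay profile is available as a dictionary from feature label to value and that the sample standard deviation is used; neither detail is specified in the text, and the function names and toy values are hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, standard deviation, coefficient of variation)."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1 denominator)
    cv = s / abs(m) if m != 0 else float("inf")
    return m, s, cv


def summarize_features(profiles):
    """profiles: list of dicts mapping a feature label to its value in one essay."""
    features = profiles[0].keys()
    return {f: summarize([p[f] for p in profiles]) for f in features}


# Three invented essay profiles restricted to two features.
subset = [
    {"n_tokens": 300, "char_per_tok": 4.2},
    {"n_tokens": 360, "char_per_tok": 4.4},
    {"n_tokens": 420, "char_per_tok": 4.6},
]
for feature, (m, s, cv) in summarize_features(subset).items():
    print(feature, round(m, 2), round(s, 2), round(cv, 3))
```

Because the coefficient of variation divides the dispersion by the mean, a feature counted in hundreds of tokens and one expressed as a ratio below 1 can be compared on the same footing.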

Table 7. Mean and (standard deviation) of an excerpt of the tracked phenomena with respect to ‘winning’ and ‘loser’ essays. Features varying in a statistically significant way between the two groups are marked with an asterisk. Features highlighted in bold are also the ones that turned out to be more uniformly widespread in the ‘winning’ group, according to the ranking established by the coefficient of variation (see Subsection 5.3)

| Feature | ‘Winning’ essays | ‘Loser’ essays |
|---|---|---|
| n_sentences | 17.46 (9.40) | 16.12 (8.14) |
| n_tokens * | 374.95 (127.33) | 342.74 (116.27) |
| tokens_per_sent | 24.48 (10.77) | 23.60 (8.35) |
| char_per_tok | 4.40 (0.20) | 4.38 (0.22) |
| ttr_lemma_chunks_100 | 0.61 (0.05) | 0.60 (0.06) |
| ttr_lemma_chunks_200 | 0.48 (0.13) | 0.47 (0.13) |
| ttr_form_chunks_100 * | 0.72 (0.057) | 0.71 (0.06) |
| ttr_form_chunks_200 | 0.58 (0.154) | 0.57 (0.15) |
| upos_dist_ADJ | 5.08 (1.91) | 5.13 (2.07) |
| upos_dist_ADV | 7.049 (2.29) | 6.80 (2.46) |
| upos_dist_CCONJ | 4.17 (1.28) | 4.51 (1.61) |
| upos_dist_DET | 14.03 (2.36) | 14.42 (2.43) |
| upos_dist_NOUN * | 16.31 (2.49) | 16.98 (2.63) |
| upos_dist_PRON | 8.36 (2.38) | 7.98 (2.52) |
| upos_dist_PUNCT | 10.17 (3.35) | 9.27 (2.86) |
| upos_dist_SCONJ | 2.33 (1.25) | 2.15 (1.15) |
| upos_dist_VERB | 12.95 (2.17) | 12.97 (2.58) |
| lexical_density | 0.49 (0.03) | 0.49 (0.03) |
| verbs_tense_dist_Fut * | 2.75 (4.37) | 2.47 (6.90) |
| verbs_tense_dist_Past | 41.29 (23.73) | 40.79 (24.55) |
| verbs_tense_dist_Pres | 42.61 (28.27) | 43.07 (28.19) |
| verbs_mood_dist_Ind | 93.91 (8.87) | 94.33 (6.78) |
| verbs_mood_dist_Sub | 2.51 (3.79) | 3.16 (4.80) |
| verbs_form_dist_Fin | 52.31 (15.20) | 54.47 (16.41) |
| verbs_form_dist_Ger * | 3.13 (3.52) | 2.32 (3.25) |
| verbs_form_dist_Inf | 24.32 (11.05) | 22.79 (12.55) |
| verbs_form_dist_Part | 20.29 (13.75) | 20.42 (15.69) |
| verbs_num_pers_dist_+3 | 0.02 (0.22) | 0.022 (0.22) |
| verbs_num_pers_dist_Plur+ | 0.05 (0.52) | 0.05 (0.40) |
| aux_tense_dist_Fut | 2.55 (7.38) | 3.35 (11.74) |
| aux_tense_dist_Imp | 26.32 (27.94) | 21.98 (25.41) |
| aux_tense_dist_Past | 5.0 (8.67) | 6.43 (11.10) |
| aux_mood_dist_Ind | 90.52 (11.82) | 91.10 (10.61) |
| aux_mood_dist_Sub * | 4.41 (7.22) | 2.48 (4.51) |
| aux_form_dist_Fin | 92.28 (8.17) | 92.72 (7.83) |
| verbal_head_per_sent | 3.56 (1.53) | 3.48 (1.48) |
| verbal_root_perc | 88.63 (10.45) | 87.94 (10.57) |
| avg_verb_edges | 2.73 (0.23) | 2.72 (0.24) |
| | 1.23 (1.62) | 1.06 (1.74) |
| | 13.45 (5.44) | 12.48 (6.30) |
| avg_max_depth | 4.632 (1.11) | 4.56 (1.03) |
| avg_max_links_len | 11.28 (5.99) | 10.68 (3.92) |
| avg_links_len | 2.766 (0.47) | 2.73 (0.41) |
| max_links_len | 31.23 (17.07) | 32.01 (17.82) |
| n_prepositional_chains * | 10.70 (6.29) | 9.5 (5.92) |
| obj_pre | 31.35 (13.02) | 30.017 (15.87) |
| obj_post | 68.65 (13.02) | 69.983 (15.87) |
| subj_pre | 83.59 (11.64) | 83.707 (11.47) |
| subj_post | 16.41 (11.64) | 16.293 (11.47) |
| dep_dist_compound | 0.09 (0.17) | 0.18 (0.33) |
| dep_dist_cop | 1.85 (0.98) | 1.93 (1.24) |
| | 0.27 (0.26) | 0.24 (0.30) |
| | 0.03 (0.14) | 0.02 (0.17) |
| | 0.31 (0.52) | 0.32 (0.79) |
| | 0.13 (0.21) | 0.15 (0.31) |
| dep_dist_punct | 10.167 (3.36) | 9.24 (2.84) |
| dep_dist_root | 4.62 (1.44) | 4.69 (1.42) |
| principal_proposition_dist | 36.41 (11.95) | 37.64 (12.21) |
| subordinate_proposition_dist | 63.59 (11.95) | 62.35 (12.21) |
| subordinate_dist_1 | 74.89 (13.39) | 75.87 (13.71) |

43 In what follows we describe the method underlying the two evaluations and discuss our most interesting findings.

5.2 Linguistic features that vary significantly

44 The first evaluation was meant to assess whether the variation between the average values of the features extracted from the ‘winning’ and the ‘losing’ essays was statistically significant. Relying on the Wilcoxon rank sum test (Wilcoxon, Katti, and Wilcox 1970), we found that seven linguistic characteristics (marked with an asterisk in Table 7) vary significantly ( \(p<0.05\) ) between the groups, though no variation turned out to be strongly significant ( \(p<0.001\) ).
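As an illustration of the test, the sketch below implements a two-sided Wilcoxon rank sum test with the normal approximation, using only the Python standard library. It is a simplified stand-in rather than the exact procedure used in the study: tied values receive average ranks, no tie correction is applied to the variance, and the function name and sample values are hypothetical.

```python
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns the z statistic for sample x and the approximate p-value.
    """
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j shared by tied values
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2
    std = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (rank_sum_x - expected) / std
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Invented values of one feature in the two groups of essays.
winning = [380, 402, 351, 415, 390, 366]
loser = [340, 355, 322, 371, 333, 347]
z, p = rank_sum_test(winning, loser)
print(round(z, 3), round(p, 4))
```

In the setting described above, the test would be run once per feature, comparing its 100 values in the ‘winning’ subset with its 100 values in the ‘loser’ subset.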

45 In particular, it emerged that ‘winning’ compositions are, on average, longer (+32.2 tokens) than ‘losing’ ones in terms of number of tokens ( n_tokens ). This finding may suggest that longer productions are evaluated as more developed, organised and content-rich. Although this is usually true, Crossley, Roscoe, and McNamara (2014) reasonably warn that it may not be the case for all writers. Interestingly, this also reflects the perception that CItA learners have of school writing instruction. Indeed, in an investigation focused on the essays that respond to ‘common prompts’, Barbagli et al. (2015) showed that two of the most frequent suggestions given by students to a hypothetical younger friend are Leggi/scrivi molto (‘Read/write a lot’) and Lavora sodo, fai vedere che ti impegni (‘Work hard, show your dedication’). Thus, pupils possibly write more so as to show their dedication and get higher grades. Not by chance, the most salient term extracted from second-year texts is Voti al tema (‘grades assigned to essays’). Secondly, we observed that a richer vocabulary (in terms of Type/Token Ratio, ttr_form_chunks_100 ) plays a crucial role in annotators’ judgment. This reflects another piece of advice included in the above-mentioned ranking, that is Usa un vocabolario ricco ed espressivo (‘Use a rich and expressive vocabulary’). It could be a consequence of teachers’ encouragement to vary the vocabulary in writing assignments by using synonyms, so as to write clearer and more readable compositions and avoid word repetition as much as possible. The impact of these two features on quality perception was already shown by previous studies dealing with corpora of English L2 learners: higher rated essays comprise more words (Carlson et al. 1985; Ferris 1994; Reid 1990) and exhibit greater lexical diversity (Engber 1995; Grant and Ginther 2000; Jarvis 2002; Reppen 2002). Crossley et al. (2014) also found that the strongest quality predictor is the number of word types in a text – i.e., its vocabulary – which strongly correlates ( r = .836) with the number of words. This would indicate that essays containing more types (and thus more words) receive higher scores. Values related to the third feature ( upos_dist_NOUN ) reveal that ‘winning’ essays contain fewer nouns, although the difference with respect to ‘loser’ ones is very narrow (-0.67%). As observed in the literature (see Montemagni 2013; Biber, Conrad, and Reppen 1998, among others), the nominal style is typical of written texts, and especially of highly informative ones (e.g., newspaper articles, laws), while genres closer to speech contain more verbs. Creative texts like learner essays lie in between, and we can expect the typology of prompts to play a role in emphasizing the similarities with written prose or spoken texts. Our results suggest that readers prefer essays that are less complex and closer to spoken discourse, which is similar to what was already shown by Crossley et al. (2014), who demonstrated that more ‘verbal’ essays are rated higher than essays relying more on nouns and nominalizations. However, we intend to deepen this analysis by also considering the typology of prompts under evaluation. With regard to verbal inflection, ‘better’ productions include on average more future verbs (+0.28%) ( verbs_tense_dist_Fut ), gerund verbs (+0.81%) ( verbs_form_dist_Ger ) and subjunctive auxiliaries (+1.93%) ( aux_mood_dist_Sub ). Verbal tenses other than the present and moods other than the indicative require elevated linguistic skills, which positively influence annotators’ judgments. In this regard, Crossley and McNamara (2011) and Crossley et al. (2014) also noticed the strong effect of complex verb forms on the positive evaluation of a text. Once again, according to the above-mentioned survey by Barbagli et al. (2015), this is something L1 learners are well aware of: specifically, Usa correttamente pronomi, verbi e congiunzioni (‘Use pronouns, verbs and conjunctions correctly’) and Usa correttamente i verbi, modi e tempi (‘Use verbs, moods and tenses correctly’) are among the most frequent suggestions given in the first and the second year; in addition, Uso dei verbi (‘Usage of verbs’) is the most salient expression in second-year texts. The last feature significantly varying between the two groups is the number of prepositional chains ( n_prepositional_chains ), which is a feature of syntactic complexity: ‘winning’ compositions have, on average, 1.2 more of them.

46 To sum up, phenomena pertaining to all levels of linguistic description are involved in the choice of a ‘better’ essay over a ‘worse’ one: the length in tokens is a raw text property, while the Type/Token Ratio belongs to the class of lexical features; the distribution of nouns and that of verbs and auxiliaries across moods and tenses are morpho-syntactic characteristics, and the presence of prepositional chains is a syntactic one. However, it is thought-provoking that only one feature belongs to the last category, which is the most populated one (see again Table 6). The most likely reason lies in the very nature of syntax: since it is the deepest and most fine-grained level, two much larger subsets would be needed to capture the phenomena whose mean values vary in a statistically significant way.

5.3 Degree of variability of linguistic features

47 As a second evaluation, we calculated the degree of variability of each linguistic feature in the two subsets of essays in order to identify which features are more uniformly distributed in the ‘winning’ set, on the assumption that these features exemplify the linguistic phenomena that are likely to cause the annotator to perceive an essay as better written. For this purpose, we first ranked the features of each subset by increasing coefficient of variation. Table 8 reports the characteristics that occupy the first twenty positions – that is, the most uniform ones – in both subsets. Given their homogeneous distribution, we can assume that they are intrinsic properties of both the Italian language and the literary genre ‘middle school essay’. The former include, e.g., the average word length – in terms of number of characters – and the lexical density, calculated as the ratio of the number of content words to the total number of words. Not coincidentally, they are positioned at the beginning of both lists. The same applies to the distribution of subjects that precede verbs ( subj_pre ), since the canonical, unmarked constituent order of the Italian sentence is SVO (Subject–Verb–Object). Among the latter, instead, it is worth mentioning the vocabulary variation with respect to word forms ( ttr_form_chunks_100 ) and lemmas ( ttr_lemma_chunks_100 ), the distribution of verbs and auxiliaries in indicative mood ( verbs_mood_dist_Ind , aux_mood_dist_Ind ) and finite form ( aux_form_dist_Fin ) and the distribution of first-degree subordinates ( subordinate_dist_1 ), i.e., those directly depending on the main clause.
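The ranking by coefficient of variation can be sketched as follows, using four means and standard deviations of the ‘winning’ subset taken from Table 7. The function names are hypothetical, and the input format (a dictionary from feature label to mean and standard deviation) is an assumption made for the example.

```python
def coefficient_of_variation(mean, std):
    """Standard deviation relative to the magnitude of the mean."""
    return std / abs(mean)


def rank_by_cv(stats):
    """stats: dict mapping a feature label to (mean, std).

    Returns the labels ordered from the most uniform feature (lowest
    coefficient of variation) to the most variable one.
    """
    return sorted(stats, key=lambda f: coefficient_of_variation(*stats[f]))


# Mean and standard deviation of four features in the 'winning' subset (Table 7).
winning = {
    "n_tokens": (374.95, 127.33),
    "char_per_tok": (4.40, 0.20),
    "lexical_density": (0.49, 0.03),
    "ttr_form_chunks_100": (0.72, 0.057),
}
print(rank_by_cv(winning))
# -> ['char_per_tok', 'lexical_density', 'ttr_form_chunks_100', 'n_tokens']
```

The order of the first three features matches the top of the ‘winning’ ranking, while the essay length in tokens is far more variable.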

Table 8. First twenty linguistic features of the two subsets ordered by increasing coefficient of variation.

| Rank | First subset | Second subset |
|---|---|---|
| 1 | char_per_tok | char_per_tok |
| 2 | lexical_density | lexical_density |
| 3 | ttr_form_chunks_100 | verbs_mood_dist_Ind |
| 4 | avg_verb_edges | ttr_form_chunks_100 |
| 5 | ttr_lemma_chunks_100 | aux_form_dist_Fin |
| 6 | aux_form_dist_Fin | avg_verb_edges |
| 7 | verbs_mood_dist_Ind | ttr_lemma_chunks_100 |
| 8 | verbal_root_perc | avg_prepositional_chain_len |
| 9 | subordinate_post | aux_mood_dist_Ind |
| 10 | aux_mood_dist_Ind | verbal_root_perc |
| 11 | avg_prepositional_chain_len | prep_dist_1 |
| 12 | subj_pre | subj_pre |
| 13 | prep_dist_1 | subordinate_post |
| 14 | upos_dist_NOUN | avg_links_len |
| 15 | avg_token_per_clause | upos_dist_NOUN |
| 16 | upos_dist_VERB | avg_subordinate_chain_len |
| 17 | upos_dist_DET | upos_dist_DET |
| 18 | avg_links_len | avg_token_per_clause |
| 19 | avg_subordinate_chain_len | subordinate_dist_1 |
| 20 | subordinate_dist_1 | subordinate_proposition_dist |
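As a rough illustration of how a ranking like the one in Table 8 can be obtained, the following Python sketch orders features by increasing coefficient of variation (standard deviation divided by the mean). The feature values are invented toy numbers, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation relative to the mean (population estimator assumed)."""
    m = mean(values)
    if m == 0:
        return float("inf")  # undefined for a zero mean; rank such features last
    return pstdev(values) / abs(m)


def rank_by_uniformity(profiles):
    """Return feature names ordered from most to least uniform.

    `profiles` maps a feature name to its values, one value per essay.
    """
    return sorted(profiles, key=lambda f: coefficient_of_variation(profiles[f]))


# Toy values for three essays (not taken from the corpus).
toy_subset = {
    "verbs_tense_dist_Fut": [0.0, 2.0, 4.0],
    "char_per_tok": [4.5, 4.6, 4.4],
    "lexical_density": [0.55, 0.60, 0.50],
}
print(rank_by_uniformity(toy_subset))
```

Because the coefficient of variation is scale-free, features measured in different units (characters, ratios, percentages) can be compared in a single ranking.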

48 To pursue our objective, we then computed another ranking based on the difference between the position of each feature in the previous ranking for the ‘better’ essays and its position in the ranking for the ‘worse’ ones, sorting the results in ascending order. Table 7 reports (in bold) the last ten linguistic features of the new list, i.e., those that vary most in the ‘losing’ subset but are more uniformly distributed in the ‘winning’ one. Among them, it is worth mentioning the distribution of future verbs (verbs_tense_dist_Fut). As already mentioned in Subsection 5.2, their frequency is higher in ‘better’ essays. This may provide further evidence for the view that native speakers tend to interpret the use of complex verbal forms as an indicator of higher writing skills. Another feature that is more homogeneous among the ‘winners’ is the distribution of the parataxis dependency relation (dep_dist_parataxis); since its average value is slightly higher in the ‘loser’ subset, it can be deduced that annotators prefer hypotaxis. This is not surprising: hypotaxis allows writers to build more complex and elegant periods, which require refined knowledge and mastery of subordination relationships. This syntactic observation also seems to be supported at the morphosyntactic level, given that ‘better’ compositions include 0.34% fewer coordinating conjunctions (upos_dist_CCONJ), which connect sentences in paratactic periods. In this regard, it was also found that higher-rated essays include more subordination. It also appears that ‘worse’ productions have 0.08% more copulas (dep_dist_cop), whose function is to link the subject to a subject complement in a nominal predicate structure. This could suggest that annotators do not appreciate this kind of predication. Moreover, it is curious that ‘better’ essays have, on average, 0.1% more foreign terms (dep_dist_flat:foreign) and 0.1% fewer compound proper nouns (dep_dist_flat:name). Finally, it is worth highlighting a higher and more uniform percentage of verbs with few modifiers in the ‘winning’ corpus (verb_edges_dist_0, verb_edges_dist_1).
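The rank comparison described in this paragraph can be sketched in Python as follows. The sign convention (position among the ‘worse’ essays minus position among the ‘better’ ones) is an assumption chosen so that the last items of the ascending list are the features that are more uniform in the ‘winning’ subset; the rankings below are toy data and the function name is hypothetical.

```python
def rank_shift(better_ranking, worse_ranking):
    """Sort features by how much more uniform they are among 'better' essays.

    Both arguments are lists of feature names ordered by increasing
    coefficient of variation. ASSUMPTION: shift = position in the 'worse'
    ranking minus position in the 'better' ranking.
    """
    pos_better = {f: i for i, f in enumerate(better_ranking, start=1)}
    pos_worse = {f: i for i, f in enumerate(worse_ranking, start=1)}
    shared = pos_better.keys() & pos_worse.keys()
    shifts = {f: pos_worse[f] - pos_better[f] for f in shared}
    # Ascending order; ties are broken alphabetically for reproducibility.
    return sorted(shifts.items(), key=lambda item: (item[1], item[0]))


# Toy rankings (not the real ones from the study).
better = ["char_per_tok", "verbs_tense_dist_Fut", "dep_dist_parataxis", "lexical_density"]
worse = ["char_per_tok", "lexical_density", "dep_dist_parataxis", "verbs_tense_dist_Fut"]

for feature, shift in rank_shift(better, worse):
    print(feature, shift)
```

In this toy case verbs_tense_dist_Fut ends up last because it sits two positions higher in the ‘better’ ranking than in the ‘worse’ one.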

6. Studying the impact of errors on human ratings

49 The last phase of our investigation was aimed at assessing the influence of students’ errors on human ratings of writing quality.

50 The 200 evaluated texts contained a total of 1,595 errors, of which 785 (48.7%) refer to ‘Grammar’, 721 (44.7%) to ‘Orthography’ and 98 (6.14%) to ‘Lexicon’. As expected, ‘loser’ essays contain more errors than ‘winner’ ones (56.9% vs 43.1%, respectively). This can be interpreted as initial evidence of the connection between errors and annotators’ choices. Further evidence comes from simply counting the essay pairs in which the ‘winning’ essay includes fewer errors than the ‘loser’, those in which it includes more, and those in which both essays contain the same number of errors: in 56 out of the 100 pairs the essay with fewer errors is the preferred one, while in only 21 cases does the ‘winner’ contain more errors. The remaining 23 pairs belong to the third category and are particularly concentrated in the last two questionnaires (see again Table 5).
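The three-way count described above amounts to a simple tally over pairs. The Python sketch below shows one way to compute it; the function name and the input layout (one tuple of error counts per pair, winner first) are assumptions of this illustration, and only the first toy pair (9 vs 34 errors) reflects figures reported in this article.

```python
from collections import Counter


def tally_pairs(pairs):
    """Count pairs by whether the 'winner' has fewer, more, or as many errors.

    `pairs` is an iterable of (winner_errors, loser_errors) tuples.
    """
    counts = Counter()
    for winner_errors, loser_errors in pairs:
        if winner_errors < loser_errors:
            counts["winner_fewer"] += 1
        elif winner_errors > loser_errors:
            counts["winner_more"] += 1
        else:
            counts["same"] += 1
    return counts


# Toy input: the first pair uses the 9 vs 34 errors of the pair discussed
# in the text; the other pairs are invented.
toy_pairs = [(9, 34), (3, 3), (5, 2), (0, 4)]
print(tally_pairs(toy_pairs))
```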

51 In order to identify which error macro-classes are more involved in the distinction between ‘better’ and ‘worse’ essays, we first calculated the average number of errors and the standard deviation for each macro-class in both subsets. Then, relying again on the Wilcoxon rank sum test, we found that grammatical and orthographic mistakes vary significantly between the two groups (Table 9). As expected, ‘loser’ essays have, on average, a higher number of grammatical and orthographic errors (+1.29 and +0.85, respectively). It is worth adding that the variation in orthographic mistakes ( \(p=0.007\) ) is more significant than that in grammatical ones ( \(p=0.029\) ). This could indicate that native speakers judge orthographic deviations more harshly than grammatical ones. Once again, our findings are in line with the writing instructions that teachers give to L1 learners: Usa una corretta ortografia (‘Use correct orthography’) and Ortografia aspetti generali (‘Orthography general aspects’) are among the most frequent suggestions given in the second year; moreover, Errori di ortografia (‘Orthography errors’) ranks among the most salient terms of both the first and the second year. The non-significant variation in lexical errors ( \(p=0.581\) ) is probably related to their scarcity in the analysed corpus. For the same reason we preferred to consider only the three error macro-classes, rather than the many error classes included in the CItA annotation scheme (as seen in Table 3). Such a fine-grained study would certainly have given more significant and interesting results, but it would have required a much larger number of annotated essays in order to curb the problem of data sparsity. We plan to carry it out in future work.

Table 9. Average number of errors in the two subsets per macro-class. Categories whose mean varies significantly between the two subsets are marked with an asterisk

| Macro-class | ‘Winner’ essays, Avg (StDev) | ‘Loser’ essays, Avg (StDev) |
|---|---|---|
| Grammar * | 3.28 (5.52) | 4.57 (6.13) |
| Orthography * | 3.18 (4.52) | 4.03 (4.83) |
| Lexicon | 0.41 (0.71) | 0.48 (0.82) |
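For readers unfamiliar with the Wilcoxon rank sum test, the following self-contained Python sketch shows how such a comparison can be computed. It uses the large-sample normal approximation without a tie correction, which is an assumption of this sketch and may differ from the exact procedure used in the study; the per-essay error counts are invented, since the article reports only means and standard deviations.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                       # rank sum of the first sample
    expected = n1 * (n1 + n2 + 1) / 2         # mean of w under the null hypothesis
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of w
    z = (w - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Invented per-essay counts of orthographic errors.
winner_errors = [0, 1, 2, 3, 4]
loser_errors = [5, 6, 7, 8, 9]
z, p = rank_sum_test(winner_errors, loser_errors)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A negative z here means that the first sample (the ‘winner’ essays) tends to have lower ranks, i.e. fewer errors, than the second.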

52 Interestingly enough, the pair on which all annotators agreed in preferring the second essay is also the most unbalanced one with respect to the number of errors: the first text has 34 mistakes (i.e., 14 grammatical, 19 orthographic and 1 lexical), while the second has only 9 (i.e., 6 grammatical and 3 orthographic). It is worth noting that the two productions respond to the same prompt, which should reduce the influence of the topic on the assessment. In what follows, we report and comment on this pair in order to show concretely how crucial errors are in determining an essay’s quality and the native speaker’s perception of it.

Il film si svolge in Belfast, Irlanda, il film narra di un ragazzo, Jerry che va a Londra con una borsa dentro salsicia, soldi e maglette. Va a Londra con un suo amico. Arrivato a Londra entra in una casa dentro un gruppo di persona che chiamato IRA poi va alla zia a dare le salsicie. Una notte Jerry e il suo amico decidono a dormire nel parco dove incontrarono un vecchio senza casa che dorme in una sedia nel parco, dopo in poi incontrano una prostituta che cade le sue chiave della casa, Jerry entra nella casa della prostituta e ruba dei soldi, poi cambia scena che scopia una bomba in appartamento. Torna a Belfast con soldi e vestito elegante. Mentre Dorme Jerry entrano le polizie che arrestano Jerry. Va in carciere per 30 anni con il Padre. Un anno dopo, il vero che ha messo la bomba confessa alla polizia. Diciendo che ci sono inocenti. 15 anni dopo esce dal carciere perché hanno sapputo la verita.

Il episodio che mi è piacciuto e quando Jerry viene liberato con i tre compagni che escono nella porta principale. E quella di meno è quando le polizie turturano il suo amico e Jerry per dire solo che hanno messo la bomba.

Il personaggio mi ha colpito è Jerry perché lui rischia di restare in carciere, perché non vuole dire la verita alla vocato. In senzo negativo i capo della polizia perché hanno trattato male Jerry. Le scene che mi hanno colpito è quando Jerry torna a Belfast vestito elegante con soldi, Quando stava morendo il Padre, Quando Jerry ha saputo che morto il padre che gli altri persone in carciere votano un foglio con fuoco nella finestra, E quando Jerry viene liberato.

Il film narra la storia di un ragazzo, Jerry Conlon, nato a Belfast che venne accusato di un reato non commesso in Inghilterra, durante la guerra. Venne messo in un carcere a vita con il padre, la zia Annie e i suoi tre figli di cui uno aveva tredici anni. Dopo un po di tempo venne rinchiuso il vero colpevole che confessò di aver fatto scoppiare la bomba nel PUB, i giudici e il direttore erano al corrente che Jerry non era colpevole ma per far vedere al popolo che erano capaci di catturare i colpevoli misero Jerry in carcere, quando scoprirono che un uomo aveva confessato il reato cercarono di fare qualcosa in modo tale che nessuno al di fuori del carcere sapesse dell’accaduto. Dopo quindici lunghi anni si fece un altra sentenza sul caso in cui vi partecipò l’avvocato interpretato da Emma Thompson che da poco aveva scoperto che i giudici e il direttore erano consapevoli che Jerry era innocente e che per far bella figura non dicevano niente. L’avocatessa prese l’articolo di Jerry Conlon e scoprì l’accaduto, in tribunale lo fece vedere al giudice che archiviò tutti gli accusati di reato non commesso. Quando Jerry fu archiviato uscì dalla porta principale e disse: «esco dalla porta principale perché ora sono un cittadino libero», fuori lo aspettavano giornalisti con macchine fotografiche e registrazione, venne intervistato e fu riportato in televisione, inquadrando solo il suo volto. Il messaggio del film è che ci può essere un buon rapporto tra padre e figlio in qualunque situazione.

Mi è colpito molto l’affetto di Jerry nei confronti del padre dopo tutto il periodo in cui non sono andati d’accordo, alla fine sono stati molto uniti e Jerry quando il padre è morto gli dispiaque tantissimo.

53 In the first essay, almost all error classes included in the annotation scheme (see again Table 3) are represented. As regards verbs, some mistakes concern the misuse of tense (e.g., decidono di dormire nel parco dove incontrarono, ‘they decide to sleep in the park where they met’) or mood (e.g., un gruppo di persona che chiamato IRA, ‘a group of person that called IRA’), as well as the lack of agreement between subject and verb (e.g., Le scene che mi hanno colpito è quando, ‘The scenes that touched me is when’). Secondly, we detect many mistakes related to the ‘Erroneous use’ of prepositions (e.g., in Belfast instead of a Belfast, decidono a dormire rather than decidono di dormire) or articles (e.g., il episodio instead of l’episodio, gli altri persone rather than le altre persone). Moreover, several misspellings involve the ‘Omission’ of double consonants (e.g., inocenti instead of innocenti) or their ‘Redundancy’ (e.g., sapputo rather than saputo), or pertain to the category ‘Other’ (e.g., carciere instead of carcere). Besides, the punctuation is totally arbitrary. The second composition also has errors, related, for example, to the use of apostrophes (e.g., un altra sentenza) or the use of po instead of po’ (e.g., un po di tempo), but they are clearly fewer. These factors, combined with a more canonical use of punctuation and a more structured organization of content, led all annotators to prefer the latter text.

7. Conclusions

54 In this article we have presented a first study for the Italian language aimed at assessing the relationship between the linguistic structure of a text and the native speaker’s perception of its writing quality. We framed our investigation within linguistic profiling, an NLP-based methodology that makes it possible to characterize a text in terms of the distribution of a wide set of features representing phenomena across language domains, with the purpose of understanding which of them are more involved in the human assessment of writing quality.

55 Although our study falls within a longstanding research area focusing on the interplay between the textual features of a composition and the written proficiency of its author (Crossley et al. 2014), the typology of texts we examined represents quite a novelty in this scenario. In fact, the majority of existing contributions focus on English learners’ corpora, especially of L2 speakers, and take into account few linguistic phenomena, such as those involved in text coherence or lexical sophistication. A further distinction from previous work concerns the approach adopted for modeling human perception. Instead of resorting to some kind of structured scoring rubric, as done by Crossley and McNamara (2011) and Crossley et al. (2014), we tackled quality assessment as a manual binary classification task: given a pair of essays, readers were asked to choose the one they considered better written. The simplicity of this evaluation method allowed us to propose it as a crowdsourcing task and to gather ratings from native speakers of various ages and cultural backgrounds, rather than limiting it to expert raters only. Based on a careful analysis of the distribution of raters’ preferences among the collected annotations, we were able to establish the ‘better’ and ‘worse’ essay for each pair and, consequently, to split the corpus into two subsets, comprising the former and the latter, respectively. Statistical analyses carried out on the linguistic profiles characterizing the two subcorpora yielded some significant results. For example, we found that longer compositions are preferred to shorter ones and that lexical variety, as well as the use of non-indicative moods and non-present tenses, positively affects the perceived quality of an essay, while an overuse of nouns relative to verbs affects it negatively. It also seems that annotators better appreciate a subordinating style, plausibly because prose constructed via hypotaxis is more organized and elegant. Interestingly enough, not only are some of these findings in line with those of previous studies on writing quality perception, but they also reveal a quite unexpected correspondence between annotators’ judgments and the writing instructions that L1 learners receive from teachers (Barbagli et al. 2015). Such a finding could be explained by the fact that readers – especially the youngest ones – were given similar instructions during their schooling. Comparing the average number of students’ errors per category in the two subsets, we confirmed our initial idea that mistakes substantially affect human judgements, and we also found that grammatical and orthographic ones do so more strongly.
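The pairwise set-up summarized above can be pictured with a small Python sketch that assigns each essay of a pair to the ‘better’ or ‘worse’ subset. A plain majority vote is assumed here purely for illustration; the actual decision relied on the analysis of raters’ preferences described earlier in the article, and the essay identifiers and function name are hypothetical.

```python
from collections import Counter


def split_by_majority(votes_per_pair):
    """Split essays into 'better' and 'worse' subsets by simple majority.

    `votes_per_pair` maps a pair (essay_a, essay_b) to the list of essay
    identifiers chosen by the raters. Pairs with a tied vote are returned
    separately instead of being forced into a subset.
    """
    better, worse, undecided = [], [], []
    for (essay_a, essay_b), votes in votes_per_pair.items():
        counts = Counter(votes)
        if counts[essay_a] > counts[essay_b]:
            better.append(essay_a)
            worse.append(essay_b)
        elif counts[essay_b] > counts[essay_a]:
            better.append(essay_b)
            worse.append(essay_a)
        else:
            undecided.append((essay_a, essay_b))
    return better, worse, undecided


# Toy annotations with hypothetical essay identifiers.
toy_votes = {
    ("essay_1", "essay_2"): ["essay_2", "essay_2", "essay_1"],
    ("essay_3", "essay_4"): ["essay_3", "essay_4"],
}
print(split_by_majority(toy_votes))
```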

56 Altogether, our findings appear consistent enough to be interpreted as indicators of the reliability of our collected data and, more generally, could suggest the effectiveness of crowdsourcing techniques for gathering large and reliable amounts of annotated data. Such data would be valuable resources to train and test NLP algorithms, especially considering the lack of Italian corpora of graded essays. Despite the promising findings, the limited size of our dataset certainly reduced the number of results, as already touched upon in Section 5.2. This motivates us to enlarge it by (i) creating and distributing new surveys grouping other essay pairs and (ii) collecting more annotations for the already existing ones. By carrying out the same analyses on a wider dataset, we expect to be able to identify stronger linguistic predictors that are more likely to be associated with compositions perceived as well written. Besides, following Miaschi, Brunato, and Dell’Orletta (2021), we could rely on the results to train a binary classification model that, given a pair of texts, automatically predicts the better one. Such a tool could be the starting point for the development of an automated scorer able to grade a composition and return (hopefully) formative feedback, like the Writing Pal (McNamara, Crossley, and Roscoe 2013). Without presuming to replace teachers, AES systems can be a valuable teaching aid for both teachers and students: the former, freed from many time-consuming and cost-prohibitive elements of essay grading, can focus more on aspects that these tools are poor at assessing (e.g., argumentation, style, and idea development) (Crossley et al. 2014); the latter can get an immediate and preliminary self-assessment of their written productions so as to better understand their mistakes and hopefully avoid repeating them. Generally speaking, these systems reduce the demands and complications often associated with human writing assessment, such as time, cost, and reliability (Page 2003; Burstein 2003; Bereiter 2003). An AES system for Italian L1 written productions would be particularly useful if integrated into educational processes based on distance learning paradigms, which in turn need adequate technological infrastructures to be really efficient.

Bibliography

Yigal Attali and Jill Burstein. 2006. “Automated Essay Scoring With e-rater V.2.” The Journal of Technology, Learning, and Assessment 4 (3). https://ejournals.bc.edu/index.php/jtla/article/view/1650/1492.

Alessia Barbagli, Pietro Lucisano, Felice Dell’Orletta, and Giulia Venturi. 2015. “Il Ruolo Delle Tecnologie Del Linguaggio Nel Monitoraggio Dell’evoluzione Delle Abilità Di Scrittura: Primi Risultati.” Italian Journal of Computational Linguistics (IJCoL) 1 (1): 99–117. http://www.italianlp.it/wp-content/uploads/2016/03/IJCL_competenze_linguistiche.pdf .

Alessia Barbagli, Pietro Lucisano, Felice Dell’Orletta, Simonetta Montemagni, and Giulia Venturi. 2016. “CItA: An L1 Italian Learners Corpus to Study the Development of Writing Competence.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 88–95. Portorož, Slovenia: European Language Resources Association (ELRA). https://aclanthology.org/L16-1014.pdf.

Carl Bereiter. 2003. “Foreword.” In Automated Essay Scoring: A Cross-Disciplinary Perspective, edited by Mark D. Shermis and Jill C. Burstein. Mahwah, New Jersey (USA): Lawrence Erlbaum Associates.

Gaetano Berruto. 1987. Sociolinguistica Dell’italiano Contemporaneo . Roma: Carocci.

Douglas Biber, Susan Conrad, and Randi Reppen. 1998. “Corpus Linguistics - Investigating Language Structure and Use.” In Cambridge Approaches to Linguistics .

Carlotta Caterina Borghi. 2013. “Analisi Di Produzioni Scritte. Valutazioni e Misure Automatizzate Di Elaborati Scolastici.” PhD thesis, Università di Roma, La Sapienza.

Julian Brooke and Graeme Hirst. 2012. “Measuring Interlanguage: Native Language Identification with L1-Influence Metrics.” In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC ’12) , 779–84. Istanbul, Turkey: European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2012/pdf/129_Paper.pdf .

Dominique Brunato, Lorenzo De Mattei, Felice Dell’Orletta, Benedetta Iavarone, and Giulia Venturi. 2018. “Is This Sentence Difficult? Do You Agree?” In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing , 2690–99. Brussels, Belgium: Association for Computational Linguistics. https://doi.org/10.18653/v1/D18-1289 .

Jill Burstein. 2003. The e-Rater Scoring Engine: Automated Essay Scoring with Natural Language Processing . Lawrence Erlbaum Associates Publishers.

Sybil B. Carlson, Brent Bridgeman, Roberta Camp, and Janet Waanders. 1985. Relationship of Admission Test Scores to Writing Performance of Native and Non-Native Speakers of English . Princeton, New Jersey (USA): ETS.

Andrea Cimino, Felice Dell’Orletta, Dominique Brunato, and Giulia Venturi. 2018. “Sentences and Documents in Native Language Identification.” In Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-It 2018) . Torino, Italy.

Eleonora Cocciu, Dominique Brunato, Giulia Venturi, and Felice Dell’Orletta. 2018. “Gender and Genre Linguistic Profiling: A Case Study on Female and Male Journalistic and Diary Prose.” In Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-It 2018) . Torino, Italy.

Adriano Colombo. 2011. «A Me Mi»: Dubbi, Errori, Correzioni Nell’italiano Scritto . Milano: FrancoAngeli.

Maria Corda Costa and Aldo Visalberghi, eds. 1995. Misurare e Valutare Le Competenze Linguistiche: Guida Scientifico-Pratica Per Gli Insegnanti . Firenze: La Nuova Italia.

Scott A. Crossley, Kristopher Kyle, Laura K. Allen, Liang Guo, and Danielle S. McNamara. 2014. “Linguistic Microfeatures to Predict L2 Writing Proficiency: A Case Study in Automated Writing Evaluation.” Journal of Writing Assessment 7 (1). http://files.eric.ed.gov/fulltext/ED585968.pdf .

Scott A. Crossley and Danielle S. McNamara. 2012. “Predicting Second Language Writing Proficiency: The Roles of Cohesion and Linguistic Sophistication.” Journal of Research in Reading 35 (2): 115–35. https://doi.org/10.1111/j.1467-9817.2010.01449.x .

Daniel Dahlmeier, Hwee Tou Ng, and Siew Mei Wu. 2013. “Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English.” In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications , 22–31. Atlanta, Georgia: Association for Computational Linguistics. https://aclanthology.org/W13-1703.pdf .

Tullio De Mauro. 1977. Scuola e Linguaggio . Roma: Editori Riuniti.

Tullio De Mauro. 1983. “Per Una Nuova Alfabetizzazione.” In Teoria e Pratica Del Glotto-Kit: Una Carta d’identità Per l’educazione Linguistica , edited by Stefano Gensini and Massimo Vedovelli, 19–29. Milano: FrancoAngeli.

Cheryl A. Engber. 1995. “The Relationship of Lexical Proficiency to the Quality of ESL Compositions.” Journal of Second Language Writing 4 (2): 139–55. https://doi.org/10.1016/1060-3743(95)90004-7 .

Silvana Ferreri. 1971. “Italiano Standard, Italiano Regionale e Dialetto in Una Scuola Media Di Palermo.” In L’insegnamento Dell’italiano in Italia e All’estero: Atti Del Quarto Convegno Internazionale Di Studi, 205–24. Roma, Italy: Bulzoni Editore.

Dana R. Ferris. 1994. “Lexical and Syntactic Features of ESL Writing by Students at Different Levels of L2 Proficiency.” TESOL Quarterly 28 (2): 414–20.

Leslie Grant and April Ginther. 2000. “Using Computer-Tagged Linguistic Features to Describe L2 Writing Differences.” Journal of Second Language Writing 9 (2): 123–45. https://doi.org/10.1016/S1060-3743(00)00019-9 .

Hans van Halteren. 2004. “Linguistic Profiling for Author Recognition and Verification.” In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics (ACL ’04), 200–207. Barcelona, Spain.

Benedetta Iavarone, Dominique Brunato, and Felice Dell’Orletta. 2021. “Sentence Complexity in Context.” In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics , 186–99. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.cmcl-1.23 .

Scott Jarvis. 2002. “Short Texts, Best-Fitting Curves and New Measures of Lexical Diversity.” Language Testing 19 (1): 57–84.

Ronald T. Kellogg. 2008. “Training Writing Skills: A Cognitive Developmental Perspective.” Journal of Writing Research (JoWR) 1 (1): 1–26. https://neillthew.typepad.com/files/training-writing-skills.pdf .

Klaus Krippendorff. 2011. “Computing Krippendorff’s Alpha-Reliability.” University of Pennsylvania. https://repository.upenn.edu/cgi/viewcontent.cgi?article=1043&context=asc_papers .

Thomas K. Landauer, Darrell Laham, and Peter Foltz. 2003. “Automated Scoring and Annotation of Essays with the Intelligent Essay Assessor.” In Automated Essay Scoring: A Cross-Disciplinary Perspective , edited by Mark D. Shermis and Jill C. Burstein. Mahwah, New Jersey (USA): Lawrence Erlbaum Associates.

Maria Cristina Lavinio. 1975. L’insegnamento Dell’italiano. Un’inchiesta Campione in Una Scuola Media Sarda . Cagliari: Edes.

Pietro Lucisano. 1984. “L’indagine IEA Sulla Produzione Scritta.” Giornale Italiano Della Ricerca Educativa 5: 41–46.

Pietro Lucisano. 1988. “La Ricerca IEA Sulla Produzione Scritta.” Giornale Italiano Della Ricerca Educativa 1: 3–13.

Pietro Lucisano and Guido Benvenuto. 1991. “Insegnare a Scrivere: Dalla Parte Degli Insegnanti.” Scuola e Città 6: 265–79.

Lucia Marconi, Michela Ott, Elia Presenti, Daniela Ratti, and Mauro Tavella. 1994. Lessico Elementare. Dati Statistici Sull’italiano Scritto e Letto Dai Bambini Delle Elementari . Bologna: Zanichelli.

Danielle S. McNamara, Scott A. Crossley, and Philip M. McCarthy. 2010. “Linguistic Features of Writing Quality.” Written Communication 27 (1): 57–86. https://doi.org/10.1177/0741088309351547.

Danielle S. McNamara, Scott A. Crossley, and Rod Roscoe. 2013. “Natural Language Processing in an Intelligent Writing Strategy Tutoring System.” Behavior Research Methods 45 (2): 499–515. https://doi.org/10.3758/s13428-012-0258-1.

Danielle S. McNamara, Arthur C. Graesser, Philip M. McCarthy, and Zhiqiang Cai. 2014. Automated Evaluation of Text and Discourse with Coh-Metrix . Cambridge: Cambridge University Press.

Simonetta Montemagni. 2013. “Tecnologie Linguistico-Computazionali e Monitoraggio Della Lingua Italiana.” Studi Italiani Di Linguistica Teorica e Applicata (SILTA) , 145–72. http://www.italianlp.it/wp-content/uploads/2014/04/montemagni_silta_submission_rif.pdf .

Robert Munro, Steven Bethard, Victor Kuperman, Vicky T. Lai, Robin Melnick, Christopher Potts, Tyler Schnoebelen, and Harry Tily. 2010. “Crowdsourcing and Language Studies: The New Generation of Linguistic Data.” In NAACL Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk , 122–30. Los Angeles, CA, USA: Association for Computational Linguistics.

Hwee Tou Ng, Siew Mei Wu, Yuanbin Wu, Christian Hadiwinoto, and Joel Tetreault. 2013. “The CoNLL-2013 Shared Task on Grammatical Error Correction.” In Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task , 1–12. Sofia, Bulgaria: Association for Computational Linguistics. https://aclanthology.org/W13-3601.pdf .

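Ellis Batten Page. 2003. “Project Essay Grade: PEG.” In Automated Essay Scoring: A Cross-Disciplinary Perspective, edited by Mark D. Shermis and Jill C. Burstein. Mahwah, New Jersey (USA): Lawrence Erlbaum Associates.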

Joy Reid. 1990. “Responding to Different Topic Types: A Quantitative Analysis from a Contrastive Rhetoric Perspective.” In Second Language Writing: Research Insights for the Classroom , edited by Barbara Kroll, 191–210. Cambridge (England): Cambridge University Press.

Randi Reppen. 2002. “A Genre-Based Approach to Content Writing Instruction.” In Methodology in Language Teaching: An Anthology of Current Practice , edited by Jack C. Richards and Willy A. Renandya, 321–27. Cambridge (England): Cambridge University Press.

Lawrence M. Rudner, Veronica Garcia, and Catherine Welch. 2006. “An Evaluation of the IntelliMetric Essay Scoring System.” Journal of Technology, Learning, and Assessment 4 (4): 1–21. https://ejournals.bc.edu/index.php/jtla/article/view/1651/1493 .

Kristen Campbell Wilcox, Robert Yagelski, and Fang Yu. 2013. “The Nature of Error in Adolescent Student Writing.” Reading and Writing 27 (6): 1073–94.

Frank Wilcoxon, S. K. Katti, and Roberta A. Wilcox. 1970. “Critical Values and Probability Levels for the Wilcoxon Rank Sum Test and the Wilcoxon Signed Rank Test.” In Selected Tables in Mathematical Statistics , 1:171–259. Providence, Rhode Island (USA): American Mathematical Society.

1 The corpus is freely available for research purposes at the following link: http://www.italianlp.it/resources/cita-corpus-italiano-di-apprendenti-l1/

5 For the sake of completeness, we report the English translation of the guidelines: "Hello! This survey is addressed to Italian native speakers. Its submission requires about 20 minutes. By completing it, you give your consent to participation. Before going on, we explain to you what it consists of. In the following pages you will read ten pairs of essays written by Italian L1 learners during the first two years of lower secondary school. The essays may contain linguistic errors. For each pair, you are asked to choose the best written of the two essays. No answers are right or wrong: you only have to express your opinion! Bear in mind that the essays of a pair can concern different topics, but this must not affect your judgment. Your participation to the survey is completely free. You may withdraw from it at any time. Before starting the survey, we ask you to provide some personal information that will be used for statistical purposes. Data will remain completely anonymous and will not be connected to you in any way. If you have doubts, curiosities or improvement proposals, please write me to the address: [email protected]. Have a good read!"

7 The corpus of evaluated essays is available at the following link: http://www.italianlp.it/EvaluatedEssays.zip

List of illustrations

Figure 1. Comparison of a pair of essays extracted from one of the ten surveys
Figure 2. Distribution of annotations with respect to readers’ age bins
Figure 3. Distribution of annotations with respect to readers’ education

Electronic reference

Aldo Cerulli, Dominique Brunato and Felice Dell’Orletta, “Linguistic Profile of a Text and Human Ratings of Writing Quality: a Case Study on Italian L1 Learner Essays”, IJCoL [Online], 9-1 | 2023, Online since 01 August 2023, connection on 14 September 2024. URL: http://journals.openedition.org/ijcol/1104; DOI: https://doi.org/10.4000/ijcol.1104

About the authors

Aldo Cerulli

Dipartimento di Filologia Letteratura Linguistica Piazza Torricelli, 2, 56126, Pisa. E-mail: [email protected]

Dominique Brunato

ItaliaNLP Lab (www.italianlp.it), CNR–ILC - Via Moruzzi, 1 - 56124, Pisa, Italy. E-mail: [email protected]


Felice Dell’Orletta

ItaliaNLP Lab (www.italianlp.it), CNR–ILC - Via Moruzzi, 1 - 56124, Pisa, Italy. E-mail: [email protected]


CC-BY-NC-ND-4.0

The text only may be used under licence CC BY-NC-ND 4.0 . All other elements (illustrations, imported files) are “All rights reserved”, unless otherwise stated.

Full text issues

  • 10-1 | 2024 Italian Journal of Computational Linguistics vol. 10, n. 1 june 2024
  • 9-2 | 2023 Italian Journal of Computational Linguistics vol. 9, n. 2 december 2023
  • 9-1 | 2023 Italian Journal of Computational Linguistics vol. 9, n.1 june 2023
  • 8-2 | 2022 Italian Journal of Computational Linguistics vol. 8, n. 2 december 2022
  • 8-1 | 2022 Miscellanea
  • 7-1, 2 | 2021 Special Issue: Computational Dialogue Modelling: The Role of Pragmatics and Common Ground in Interaction
  • 6-2 | 2020 Further Topics Emerging at the Sixth Italian Conference on Computational Linguistics
  • 6-1 | 2020 Emerging Topics at the Sixth Italian Conference on Computational Linguistics
  • 5-2 | 2019 Further Topics Emerging at the Fifth Italian Conference on Computational Linguistics
  • 5-1 | 2019 Emerging Topics from the Fifth Italian Conference on Computational Linguistics
  • 4-2 | 2018 Emerging Topics at the Fourth Italian Conference on Computational Linguistics (Part 2)
  • 4-1 | 2018 Emerging Topics at the Fouth Italian Conference on Computational Linguistics (Part 1)
  • 3-2 | 2017 Special Issue: Natural Language and Learning Machines
  • 3-1 | 2017 Emerging Topics at the Third Italian Conference on Computational Linguistics and EVALITA 2016
  • 2-2 | 2016 Special Issue: Digital Humanities and Computational Linguistics
  • 2-1 | 2016 Emerging Topics at the Second Italian Conference on Computational Linguistics
  • 1-1 | 2015 Emerging Topics at the First Italian Conference on Computational Linguistics

In collaboration with AILC

Electronic ISSN 2499-4553


IMAGES

  1. (PDF) Linguistic Profiling of a Neural Language Model

  2. Essay- Translation

  3. (DOC) CRITICAL REVIEW OF LINGUISTIC PROFILING

  4. Linguistic profiling of texts for the purpose of language verification

  5. Linguistic Features of Language Free Essay Example

  6. (PDF) Sub-Profiling by Linguistic Dimensions to Solve the Authorship

VIDEO

  1. A Level English Language (9093) Paper 4- Section B: Language and the Self (Part 2)

  2. Convergent Linguistic Evolution #etymology #linguistics #language

  3. Some Linguistic Differences

  4. Linguistic Profiling

  5. linguistic profiling experiment

  6. Usage-Based Approaches to SLA

COMMENTS

  1. Linguistic Profiling and Language-Based Discrimination

    The phrase "linguistic profiling" first appeared in this article, and that concept has since expanded to include prejudicial, and often illegal, reactions to the speech or writing of individuals whose language usage was used as the basis of discrimination against them. Haugen, Einar. 1972. The ecology of language.

  2. Linguistic Profiling across International Geopolitical Landscapes

    Abstract. Voice recognition lies at the heart of linguistic profiling, a discriminatory practice whereby goods, services, or opportunities that might otherwise be available are denied to someone, typically sight unseen, based on the sound of their voice. The technology that faithfully recreates one's voice during phone conversations provides the basis on which nefarious, if not illegal, voice ...

  3. 17 Linguistic Profiling and Discrimination

    Linguistic profiling is not restricted to the United States, and this discussion describes the relevance of this phenomenon globally, including relevance to language attitudes, education, and employment. Efforts to promote greater linguistic acceptance in support of improved social harmony are advocated in the conclusion.

  4. ASHA Voices: The Effects of Linguistic Profiling

    Baugh, president of the Linguistic Society of America, also shares how dialect can be used to discriminate against people, which he refers to as linguistic profiling. Baugh explains how linguistic profiling can affect all facets of people's lives.

  5. Linguistic Profiling Rooted in Implicit Bias, Stereotyping

    It becomes linguistic profiling once someone attaches their implicit biases to the fact that callers sound as if they are part of a certain group or community and decides to discriminate against them. It is an easy way for people to quickly make assumptions about a caller and decide if they are part of a community they have a prejudice against.

  6. Language & Social Justice in the United States: An Introduction

    Those materials cover a comprehensive range of language issues related to social justice. The collection of essays in this Dædalus volume is unique in its breadth of coverage and extends from issues including linguistic profiling, raciolinguistics, and institutional linguicism to multilingualism, language teaching, migration, and climate change.

  7. PDF Microsoft Word

    Introduction. Linguistic profiling occurs when a listener uses auditory cues to identify social characteristics, such as race, gender, sexual orientation, or geographic origin. Linguistic profiling is a natural and automatic psychological process. Though it is not itself inherently discriminatory, it can contribute to racial profiling, which is ...

  8. Linguistic Profiling across International Geopolitical Landscapes

    Linguistic Profiling across International Geopolitical Landscapes. Abstract Voice recognition lies at the heart of linguistic profiling, a discriminatory practice whereby goods, services, or opportunities that might otherwise be available are denied to someone, typically sight unseen, based on the sound of their voice.

  9. Linguistic Profiling across International Geopolitical Landscapes

    Linguistic Profiling across International Geopolitical Landscapes. John Baugh. Voice recognition lies at the heart of linguistic profiling, a discriminatory practice whereby goods, services ...

  10. The Significance of Linguistic Profiling

    What is Linguistic Profiling and why is it so prominent in our society? Dr. Baugh explores the field and explains dialects, accents, and our linguistic herit...

  11. Linguistic Profiling and Discrimination

    Applied Linguistics. 2024. TLDR. Analysis of the architectures of large language models being used increasingly in daily life exposes emerging AI accent modification technology and services as agents of racial commodification and linguistic dominance, as it rests on the perceived superiority of standardized US English.

  12. The sound of racial profiling: When language leads to discrimination

    Current Washington University and former Stanford professor John Baugh coined the term 'linguistic profiling' several years ago in response to the discrimination he himself experienced when looking for a house in majority-white Palo Alto. In a study he designed with colleagues, Baugh, using either African-American, Chicano or Standard ...

  13. Linguistic profiling

    Linguistic profiling is the practice of identifying the social characteristics of an individual based on auditory cues, in particular dialect and accent. The theory was first developed by Professor John Baugh to explain discriminatory practices in the housing market based on the auditory redlining of prospective clientele by housing administrators. Linguistic profiling extends to issues of ...

  14. PDF Other People's English Accents Matter: Challenging Standard ...

    ... linguistic minority groups speaking with accents deviating from the alleged Standard English accent. Linguists, such as Baugh (2018), examined the salient effects of discrimination on linguistic minorities. Specifically, Baugh's scholarly linguistic work helps understand various kinds of accent discrimination, including linguistic profiling, that ...

  15. Linguistic Profiling across International Geopolitical ...

    Abstract Voice recognition lies at the heart of linguistic profiling, a discriminatory practice whereby goods, services, or opportunities that might otherwise be available are denied to someone, typically sight unseen, based on the sound of their voice. The technology that faithfully recreates one's voice during phone conversations provides the basis on which nefarious, if not illegal, voice ...

  16. Guest editorial: Linguistic profiling and implications for career

    Linguistic profiling is a term used to describe inferences derived from a person's speech (Smalls, 2004) and has been shown to influence individuals' development in many ways. When linguistic profiling describes discriminatory practices, it can be considered the auditory equivalent of racial profiling (Smalls, 2004).

  17. Linguistic profiling: The sound of your voice may determine if you get

    Many Americans can guess a caller's ethnic background from their first hello on the telephone. Can the sound of your voice be used against you? However, the inventor of the term "linguistic profiling" has found that when a voice sounds African-American or Mexican-American, racial discrimination may follow. In studying this phenomenon through hundreds of test phone calls, John Baugh, Ph.D., the ...

  18. Full article: The complexities of linguistic discrimination

    Abstract. Linguistic discrimination is a complex phenomenon. How should it be investigated? The evidential pool is of key importance. In this paper, we present specific conceptual and methodological challenges in the study of linguistic discrimination, with a focus on linguistic discrimination resulting from implicit attitudes and the steadily growing research on biases and structural approaches to ...

  19. Linguistic Profiling in Education: How Accent Bias Denies Equal ...

    Linguistic Profiling in Education: How Accent Bias Denies Equal Educational Opportunity to Students of Color 12 Scholar 355 (2009-2010) 30 Pages Posted: 21 Jun 2017

  20. Introduction: Language & Social Justice in the United States

    Those materials cover a comprehensive range of language issues related to social justice. The collection of essays in this Dædalus volume is unique in its breadth of coverage and extends from issues including linguistic profiling, raciolinguistics, and institutional linguicism to multilingualism, language teaching, migration, and climate change.

  21. Language and White Supremacy

    Racial and linguistic hierarchies work together to falsely connect whiteness and the use of "standard" (officially sanctioned) language with rationality, intelligence, education, wealth, and higher status. Under these racial logics, speakers of languages associated with non-whiteness are readily linked to danger, criminality, a lack of intelligence or ability, primitivism, and foreignness ...

  22. Preface to the Special Issue on Computational Linguistics and ...

    Linguistic Profiling. Many of the papers proposed various novel methods of linguistic profiling and categorizing texts. The paper "Computing the Sound-Sense Harmony: A Case Study of William Shakespeare's Sonnets and Francis Webb's Most Popular Poems" by Delmonte [2] proposes a novel sentiment analysis metric.

  23. Linguistic Profile of a Text and Human Ratings of Writing Quality:

    This paper presents a study based on the linguistic profiling methodology to explore the relationship between the linguistic structure of a text and how it is perceived in terms of writing quality by humans. The approach is tested on a selection of Italian L1 learners' essays, which were taken from a larger longitudinal corpus of essays written by Italian L1 students enrolled in the first and ...