Amidst the ongoing “India AI Impact Summit 2026”, a report highlighted a concern that many researchers have been raising. If you give an AI model just an Indian surname, it often guesses a person’s social background — sometimes linking dominant-caste names to high-status jobs and Dalit names to less respected work.
India is not the only country with this problem. Language models trained on ordinary web text pick up human-like biases, as Aylin Caliskan and her colleagues demonstrated in 2017; these are the same associations psychologists find in implicit-bias tests. Their point was not that the machines had been designed to be biased, but that social hierarchies are reflected in language itself, and a model trained on that language will absorb these patterns.
Real-world systems have shown this pattern. Amazon discontinued an internal AI recruiting tool after it learned to penalise signals associated with women, such as the word “women’s” (as in “women’s chess club captain”), according to a 2018 Reuters report. Engineers attempted to correct it, but the key takeaway was clear: if a system is trained on biased data, it treats that bias as a meaningful signal.
In India, caste is ideally suited to becoming such a “signal,” because it is heavily encoded in text, particularly in surnames. A model trained to minimise next-word prediction error will unavoidably pick up correlations such as which surnames co-occur with “IIT/IIM,” “manager,” and “English-medium,” and which surnames more frequently appear near “contract work,” “relief,” “scavenging,” or descriptions of deprivation. The model does not need to understand what caste is; it only needs these co-occurrences to appear often enough.
Behind the scenes, these patterns become geometry in the model’s internal representations. Tolga Bolukbasi and his colleagues demonstrated nearly a decade ago that word embeddings group names and words in ways that encode stereotypes, famously completing the analogy “man is to computer programmer as woman is to homemaker.” Because the mathematics tracks recurring patterns rather than social categories, caste can be encoded in the same way as gender.
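To make that concrete, here is a minimal sketch, in the spirit of embedding-association tests, of how such a skew can be measured: it compares how similar a name’s vector is to two sets of occupation words. The vectors, word sets, and the `association_score` function below are illustrative stand-ins, not the method or data of any study cited here.

```python
# Minimal sketch (not the Bolukbasi method itself): measure how strongly a
# name vector associates with two sets of occupation words. All vectors are
# toy stand-ins; in practice they would come from a trained embedding model.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_score(name_vec, high_status_vecs, low_status_vecs):
    """Mean similarity to 'high-status' words minus mean similarity to
    'low-status' words. A large gap in either direction indicates a skew."""
    high = np.mean([cosine(name_vec, v) for v in high_status_vecs])
    low = np.mean([cosine(name_vec, v) for v in low_status_vecs])
    return high - low

rng = np.random.default_rng(0)
dim = 50
# Hypothetical placeholder vectors standing in for real embeddings of a
# surname and of occupation words.
surname_vec = rng.normal(size=dim)
high_status = [rng.normal(size=dim) for _ in range(3)]  # e.g. "manager", "engineer"
low_status = [rng.normal(size=dim) for _ in range(3)]   # e.g. "cleaner", "labourer"

print(round(association_score(surname_vec, high_status, low_status), 3))
```

With real embeddings, a consistently positive score for some surnames and a consistently negative one for others is exactly the kind of uneven arrangement of meaning the research describes.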
For this reason, research on India is crucial. The 2025 DECASTE study by Prashanth Vijayaraghavan and colleagues examines a number of models across social, economic, educational, and political dimensions. It finds that adding caste cues, such as Indian surnames and personas, consistently elicits stereotypes. Another point frequently overlooked in public discussion is that safety precautions do not always address the deeper problem. Even when a model is tuned to reject or soften overtly casteist prompts, biased patterns can persist inside it. According to a 2025 PNAS paper by Xuechunzi Bai and her colleagues, systems that pass explicit bias tests can still harbour hidden biases, much like people who endorse equality yet hold unconscious associations. A chatbot might, for instance, avoid derogatory language but still make assumptions that reinforce social hierarchies.
Why does this matter beyond chat? Because the most consequential uses of AI are not free-form conversation; they are pipelines: screening, ranking, summarising, and recommending. ProPublica’s 2016 investigation by Julia Angwin and colleagues into the COMPAS risk-assessment tool in the US showed how statistical systems can harm groups unequally even when overall accuracy is comparable across them, because the errors and their costs can be distributed unequally. Different domain, same governance problem: once an automated score enters decision-making, the harm is often incremental, cumulative, and hard to contest case by case.
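A toy calculation makes the point. The numbers below are invented, not COMPAS figures; they only show that two groups can share the same overall accuracy while one group absorbs far more of the costlier kind of error.

```python
# Toy illustration with invented numbers: equal accuracy, unequal error types.
def rates(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    false_positive_rate = fp / (fp + tn)  # wrongly flagged as "high risk"
    false_negative_rate = fn / (fn + tp)  # wrongly cleared
    return accuracy, false_positive_rate, false_negative_rate

group_a = rates(tp=40, fp=30, tn=20, fn=10)  # errors skew toward false positives
group_b = rates(tp=20, fp=10, tn=40, fn=30)  # errors skew toward false negatives

print("Group A: acc=%.2f FPR=%.2f FNR=%.2f" % group_a)
print("Group B: acc=%.2f FPR=%.2f FNR=%.2f" % group_b)
# Both groups have accuracy 0.60, but Group A's false-positive rate is 0.60
# against Group B's 0.20, so the costlier error falls mostly on one group.
```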
India is now entering exactly that phase, deploying AI in hiring, credit, and citizen-facing governance while measurement capacity is still catching up. As Mohana Basu’s report in Nature on caste bias in AI models notes, tools to detect such bias are starting to appear, but reducing bias while preserving usefulness is harder than detecting it.
Khyati Khandelwal and her colleagues’ 2024 work on Indian-BhED makes this evident. It argues that much of the bias literature has concentrated on Western contexts and that Indian social categories need dedicated attention. Their dataset pairs stereotypical and anti-stereotypical examples specific to caste and religion, and they find that several popular models show a measurable tendency to favour stereotypical continuations in these Indian-centric prompts.
Some of the most robust measurement work is now being built. G S Santhosh and colleagues’ 2025 work, IndiCASA, proposes an India-context benchmark of stereotypes and anti-stereotypes across caste and other identities and finds that the models tested exhibit stereotypical tendencies to differing degrees. That is the direction India needs to move in: testing that reflects Indian social structure, rather than importing Western-only bias checklists.
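For readers who want the intuition behind such benchmarks, the sketch below shows the general idea of a continuation-preference test: score a stereotypical and an anti-stereotypical sentence with a language model and compare. The model choice, the sentence pair, and the scoring function are assumptions for illustration, not the Indian-BhED or IndiCASA protocol.

```python
# General idea of a continuation-preference test (illustrative only): which
# of two sentences does a causal language model find more probable?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(text):
    """Total log-probability the model assigns to the token sequence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token;
    # multiply back by the number of predicted tokens to get the total.
    return -out.loss.item() * (ids.shape[1] - 1)

# A neutral gender example stands in for the benchmark's sentence pairs.
stereotype = "The nurse said that she would be late."
anti_stereotype = "The nurse said that he would be late."

print(sentence_log_prob(stereotype), sentence_log_prob(anti_stereotype))
# A systematic preference for the stereotypical member across many such
# pairs is what these benchmark studies quantify.
```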
However, this is exactly where the policy gap appears. If AI is acquired and deployed faster than such audits become routine, whether through public-sector chatbot rollouts, HR tech procurement, or procurement rules, the country risks institutionalising bias as “automation.” The accountability literature provides a framework here. The 2020 FAccT paper by Inioluwa Deborah Raji and colleagues makes the case for end-to-end internal auditing as a governance discipline rather than a post-scandal afterthought.
The most pressing question, therefore, is not whether AI “knows” caste. It is whether India will measure what it deploys, before these systems quietly become infrastructure that determines who is seen as employable, credible, risky, or deserving. Addressing this is crucial if we aim for “Welfare for All, Happiness for All”.
The writer works on caste discrimination in digital spaces
