Syntactic Variation in Urdu, Saraiki, and English
A Construction-Grammar Account of Individuals, Dialects, and Registers
by Riaz Laghari, Lecturer in English, NUML, Islamabad
Part I: Conceptual Foundations
1: Why Compare Urdu, Saraiki, and English?
1.1 Theoretical imbalance in syntactic variation research
1.2 Indo-Aryan vs Germanic: typological contrasts
1.3 Urdu–Saraiki as a dialect–language continuum
1.4 English as a global comparative baseline
1.5 Contribution to Construction Grammar and variation theory
Argument: Typological distance + genealogical proximity = ideal test bed for CxG.
2.1 Constructions beyond Eurocentric syntax
2.2 Word order (SOV vs SVO) as constructional choice
2.3 Morphological exponence and constructional packaging
2.4 Argument structure constructions across the three languages
2.5 Productivity, entrenchment, and abstraction cross-linguistically
3.1 Complex adaptive systems in syntax
3.2 Constructional networks in analytic vs inflectional systems
3.3 Degree of abstraction in Urdu and Saraiki constructions
3.4 English periphrasis vs South Asian morphology
3.5 Predicting variation from system architecture
4.1 Challenges of corpus building in under-resourced languages
4.2 Written vs spoken corpora
4.3 Dialectal representation in Saraiki
4.4 Regional and global English corpora
4.5 Register annotation across three languages
5.1 Identifying constructions across languages
5.2 Morphosyntactic glossing and abstraction
5.3 Measuring similarity without one-to-one equivalence
5.4 Computational models adapted for morphologically rich systems
5.5 Evaluation and reproducibility
6.1 Multilingual exposure in Pakistan
6.2 Urdu–Saraiki code-interaction as constructional blending
6.3 Individual construction inventories
6.4 Stability and fluctuation across registers
6.5 Cognitive implications
7.1 Measuring individual differences
7.2 Which constructions vary most?
7.3 Morphological vs periphrastic constructions
7.4 English speakers vs Urdu/Saraiki speakers
7.5 Summary of individual variation patterns
8.1 Saraiki as a dialect cluster
8.2 Grammatical variation across regions
8.3 Contact with Punjabi, Sindhi, and Urdu
8.4 Exposure-based dialect emergence
8.5 Dialect networks in Pakistan
9.1 Dialect in morphologically rich vs analytic languages
9.2 Core vs peripheral constructions in Urdu and Saraiki
9.3 Why English dialects pattern differently
9.4 Generalization and convergence
9.5 Implications for dialect theory
10.1 Formality, function, and genre
10.2 Literary vs administrative Urdu
10.3 Oral vs written Saraiki
10.4 Academic, media, and digital English
10.5 Register as construction–function mapping
11.1 Register variation and grammatical core
11.2 Function-driven selection of constructions
11.3 Why register behaves differently from dialect
11.4 Cross-linguistic comparison
11.5 Case studies
12.1 Where variation lives in each language
12.2 Order of emergence and abstraction
12.3 Morphology vs syntax in variation
12.4 Cross-linguistic convergence and divergence
12.5 A unified model
13.1 Individual exposure as the engine of variation
13.2 Generalization as a stabilizing force
13.3 Why Urdu and Saraiki converge at higher abstraction
13.4 Why English relies on periphrastic smoothing
13.5 System-level explanation
14.1 Moving beyond rule-based descriptions
14.2 Reinterpreting “free word order”
14.3 Construction Grammar for Indo-Aryan languages
14.4 Implications for linguistic typology
15.1 Teaching grammar in multilingual societies
15.2 Myth of a single “standard Urdu”
15.3 Register awareness in education
15.4 English teaching in Pakistan
15.5 Applied outcomes
16.1 What comparative CxG reveals
16.2 Variation as emergent across languages
16.3 Directions for future research
A. Construction inventories for Urdu, Saraiki, and English
B. Glossing conventions (Leipzig)
C. Corpus metadata and reproducibility
Part I: Conceptual Foundations
1: Why Compare Urdu, Saraiki, and English?
1.1 Theoretical Imbalance in Syntactic Variation Research
Research on syntactic variation has long suffered from a structural imbalance. Despite decades of empirical work in sociolinguistics and dialectology, theoretical models of variation remain disproportionately grounded in a narrow set of languages, most notably English and a handful of Western European varieties. As a consequence, many generalizations about how and where syntactic variation arises have been inferred from languages with relatively analytic morphosyntax, limited inflectional marking, and well-documented standard varieties. This skewed empirical base has produced theories that are elegant but parochial.
Within formal syntax, variation has traditionally been treated as peripheral, either relegated to the lexicon, attributed to parametric toggles, or explained away as performance noise. Even when variation is acknowledged, it is often operationalized through a small number of arbitrarily selected variables, detached from the wider grammatical system. Such approaches struggle to explain why variation clusters in certain constructions, why it weakens at higher levels of abstraction, or why different types of variation (individual, dialectal, register-based) behave differently across the grammar.
Usage-based and constructionist approaches have gone some way toward correcting this imbalance by foregrounding frequency, exposure, and gradient representations. Yet even here, large-scale comparative work remains rare, particularly for languages outside the Euro-American core. South Asian languages, despite their rich morphology, flexible word order, and intense contact dynamics, remain underrepresented in computational and construction-based models of syntactic variation. This theoretical gap is not merely descriptive; it limits our ability to test claims about the universality of grammatical organization and the emergence of variation in complex linguistic systems.
This post responds to this imbalance by situating syntactic variation within a comparative framework that deliberately spans typological distance and genealogical proximity. Urdu, Saraiki, and English provide an empirically and theoretically powerful constellation for rethinking how variation emerges, stabilizes, and diffuses across grammars.
1.2 Indo-Aryan vs Germanic: Typological Contrasts
From a typological perspective, Urdu and Saraiki stand in sharp contrast to English. Both Indo-Aryan languages exhibit predominantly SOV word order, rich verbal morphology, overt case marking, and relatively flexible constituent ordering conditioned by discourse and information structure. English, by contrast, is largely analytic, rigidly SVO, and reliant on periphrastic constructions rather than inflectional morphology to encode tense, aspect, and modality.
These contrasts are not superficial. They reflect fundamentally different ways in which grammatical information is distributed across constructions. In Urdu and Saraiki, grammatical relations are often encoded morphologically within the verb complex or through postpositions, allowing a single construction to carry a dense bundle of syntactic and semantic features. In English, similar functions are frequently distributed across multiple constructions, auxiliary sequences, and fixed word orders.
For Construction Grammar, such typological differences raise important questions. Does variation behave differently in grammars where constructions are morphologically compact versus periphrastically distributed? Are highly inflected constructions more resistant to variation, or do they merely localize variation differently? How does abstraction operate in languages where surface variability is high but underlying constructional patterns remain stable?
By juxtaposing Indo-Aryan and Germanic systems, this study treats typological contrast not as a complication but as a diagnostic tool. Differences in architectural design allow us to observe how the same cognitive and usage-based mechanisms (exposure, entrenchment, and generalization) play out in grammars with radically different surface properties.
1.3 Urdu–Saraiki as a Dialect–Language Continuum
While English provides typological distance, the relationship between Urdu and Saraiki offers genealogical proximity combined with sociolinguistic complexity. Saraiki is often described as a dialect cluster rather than a single, standardized language, while Urdu occupies a unique position as both a lingua franca and an ideologically elevated standard. The boundary between the two is not sharp; it is porous, negotiated, and deeply shaped by education, media exposure, and regional identity.
From a constructionist perspective, this continuum is especially revealing. Rather than assuming discrete grammars corresponding to named languages, Construction Grammar allows us to model overlapping construction inventories with differing degrees of entrenchment and productivity. Many constructions are shared across Urdu and Saraiki, yet differ in frequency, morphological realization, or contextual restriction. Others are peripheral in one variety but central in another.
This continuum challenges traditional dichotomies between “language” and “dialect.” Variation does not align neatly with political or standardization boundaries; instead, it reflects gradients of exposure and contact within a shared constructional space. By modeling Urdu–Saraiki variation at the level of constructions rather than variables, this study demonstrates how population-level differences emerge from individual experience without presupposing categorical divisions.
1.4 English as a Global Comparative Baseline
English occupies a different role in this comparative framework. Unlike Urdu and Saraiki, it is neither genealogically related nor geographically co-localized in most contexts of use. Instead, it functions as a global language with extensive internal variation across regions, registers, and speaker populations. Its inclusion is not motivated by prestige or availability of data alone, but by its value as a benchmark for testing theoretical claims.
English has been the primary testing ground for Construction Grammar, usage-based models, and computational approaches to syntax. As such, it provides a well-mapped reference system against which claims derived from South Asian data can be evaluated. At the same time, English itself is far from uniform. Global Englishes exhibit substantial syntactic variation shaped by contact, institutional norms, and communicative function.
By placing English alongside Urdu and Saraiki, this study avoids treating it as a theoretical default. Instead, English becomes one instantiation of a constructional system shaped by exposure, abstraction, and social network structure, subject to the same forces as any other language, though manifested through different formal means.
1.5 Contribution to Construction Grammar and Variation Theory
The central contribution of this post lies in demonstrating that syntactic variation can only be fully understood when typological distance and genealogical proximity are examined together within a unified constructional framework. Urdu, Saraiki, and English collectively form an ideal test bed: close enough to share comparable functional pressures, yet distant enough to reveal how grammatical architecture shapes the distribution of variation.
This comparative perspective advances Construction Grammar in three key ways. First, it extends constructionist theory beyond its traditional empirical core, testing its claims against morphologically rich and under-described languages. Second, it reconceptualizes variation not as an attribute of isolated constructions but as an emergent property of interacting constructional networks. Third, it integrates individual, dialectal, and register-based variation within a single explanatory model grounded in exposure and generalization.
The guiding argument of this section, and of the post as a whole, is therefore straightforward: typological distance combined with genealogical proximity provides the strongest possible empirical foundation for a theory of syntactic variation in Construction Grammar. By leveraging this foundation, the sections that follow aim to move variation theory from fragmented observation to system-level explanation.
2: Construction Grammar and Typological Variation
2.1 Constructions beyond Eurocentric Syntax
Construction Grammar (CxG) was developed largely in response to limitations within rule-based and derivational models of syntax, yet much of its empirical grounding has remained implicitly Eurocentric. Core examples and theoretical intuitions have been drawn predominantly from English and closely related languages, often assuming relatively fixed word order, limited inflection, and a clear separation between syntax and morphology. While Construction Grammar is, in principle, typologically neutral, its explanatory reach can only be evaluated by extending it to languages whose grammatical organization departs radically from these assumptions.
Urdu and Saraiki present precisely such a challenge. Their grammars blur boundaries that are often taken for granted in Eurocentric syntax: morphology and syntax are tightly integrated, word order is flexible rather than rigid, and grammatical relations are frequently encoded through postpositions and verbal agreement rather than positional cues. From a constructionist perspective, these properties are not anomalies requiring special mechanisms. Instead, they invite a more explicit recognition that constructions may package information differently across languages while remaining cognitively and functionally comparable.
By situating Construction Grammar within a typologically diverse empirical domain, this post treats constructions as abstract pairings of form and meaning that are not tied to specific surface configurations. This move allows constructionist theory to be evaluated on its own terms: if constructions are truly usage-based and gradient, they must accommodate the full range of morphosyntactic diversity observed across languages without resorting to ad hoc distinctions.
2.2 Word Order (SOV vs SVO) as Constructional Choice
Traditional syntactic theory often treats word order as a parametric property of a language, fixed at a high level of grammatical organization. In contrast, Construction Grammar views word order as an emergent property of constructions shaped by usage, information structure, and communicative efficiency. From this perspective, the SOV order characteristic of Urdu and Saraiki, and the SVO order dominant in English, are not global constraints imposed on all syntactic structures, but default patterns associated with highly entrenched constructions.
In Urdu and Saraiki, canonical SOV order coexists with substantial flexibility, particularly in contexts involving topicalization, focus, or emphasis. This variability does not signal syntactic freedom in an unconstrained sense; rather, it reflects a rich inventory of constructions in which linear order interacts with discourse function. English, while more rigid overall, also exhibits construction-specific deviations from its default SVO pattern, such as passives, clefts, and extraposition.
Viewing word order as constructional choice rather than parametric setting allows us to compare these languages without privileging one system over another. The relevant question is not why Urdu and Saraiki “permit” scrambling or why English “forbids” it, but how different constructional inventories encode similar communicative functions through different linear and morphological strategies. This shift in perspective is essential for understanding how syntactic variation emerges within and across typologically distinct systems.
2.3 Morphological Exponence and Constructional Packaging
One of the most salient differences between the languages under study lies in the distribution of grammatical information across morphology and syntax. Urdu and Saraiki rely heavily on inflectional morphology and postpositional marking to encode tense, aspect, agreement, and argument roles. English, by contrast, externalizes much of this information through periphrastic constructions involving auxiliaries and fixed word order.
Construction Grammar provides a natural framework for capturing these differences through the notion of constructional packaging. Rather than assuming a universal mapping between grammatical functions and structural positions, CxG allows constructions to bundle semantic, pragmatic, and syntactic information in language-specific ways. A single verbal construction in Urdu or Saraiki may correspond functionally to a multi-word construction in English, without implying any deeper asymmetry in grammatical complexity.
This perspective has important implications for variation. In morphologically rich systems, variation may concentrate in affixal selection, agreement patterns, or case marking, whereas in analytic systems it may surface in auxiliary choice, word order, or optional elements. Yet at a higher level of abstraction, these differences often align: both types of systems instantiate comparable constructional schemas shaped by frequency and communicative need. Recognizing this alignment is crucial for developing cross-linguistic models of syntactic variation that are not biased toward surface form.
2.4 Argument Structure Constructions across the Three Languages
Argument structure provides a particularly revealing domain for comparative constructional analysis. Across Urdu, Saraiki, and English, speakers routinely express similar event types (transfer, causation, motion, perception), yet the grammatical resources used to encode participants differ markedly. English relies heavily on fixed argument positions and prepositional phrases, while Urdu and Saraiki employ case marking, agreement, and complex predicates.
From a constructionist standpoint, these differences are best modeled as distinct argument structure constructions rather than as derivations from a universal underlying representation. For example, ditransitive constructions in English foreground word order and prepositional alternations, whereas their counterparts in Urdu and Saraiki foreground morphological marking and complex predicates. Despite these differences, the constructions are functionally comparable and participate in similar patterns of productivity and restriction.
Importantly, variation within argument structure constructions does not distribute evenly across languages. In Urdu and Saraiki, highly abstract argument structure schemas tend to be stable across dialects, while lower-level constructions show greater variability. In English, variation often appears in the selection and extension of periphrastic constructions. These asymmetries provide a testing ground for claims about abstraction, entrenchment, and the localization of variation within the grammar.
2.5 Productivity, Entrenchment, and Abstraction Cross-Linguistically
A central claim of Construction Grammar is that constructions differ in productivity and entrenchment depending on frequency of use and degree of abstraction. This claim gains empirical depth when examined across typologically diverse languages. Urdu, Saraiki, and English exhibit different pathways through which constructions become entrenched and generalized, shaped by their morphological and syntactic architectures.
In Urdu and Saraiki, frequent exposure to morphologically complex constructions promotes early entrenchment of concrete patterns, which are later generalized into higher-order schemas. In English, abstraction often proceeds through repeated exposure to periphrastic patterns that span multiple constructions. Despite these differences, the end result is strikingly similar: highly abstract constructions tend to be more stable and less variable across speakers and populations, while low-level, lexically specific constructions remain fertile ground for variation.
This cross-linguistic convergence supports the view that productivity and abstraction are emergent properties of usage rather than language-specific rules. Variation, in turn, is not randomly distributed but systematically constrained by the architecture of the constructional network. By examining these processes across Urdu, Saraiki, and English, this section lays the groundwork for the empirical sections that follow, where individual, dialectal, and register-based variation will be mapped onto the constructional systems of each language.
3: Language as a Complex System across Typologies
3.1 Complex Adaptive Systems in Syntax
Recent advances in linguistic theory increasingly converge on the view that language is best understood as a complex adaptive system. Such systems are characterized by large numbers of interacting components, non-linear dynamics, sensitivity to initial conditions, and emergent structure arising from local interactions rather than centralized control. When applied to syntax, this perspective shifts attention away from static rule inventories and toward the dynamic processes through which grammatical patterns are formed, reinforced, and reshaped through use.
Within this framework, syntactic knowledge is not a fixed set of rules shared uniformly across speakers, but a probabilistic network of constructions shaped by individual experience and social interaction. Variation is therefore not an anomaly requiring explanation, but a natural outcome of the adaptive nature of the system. Each speaker’s grammar reflects a unique trajectory of exposure, while population-level regularities emerge from overlapping patterns of usage across individuals.
Importantly, a complex-systems approach predicts that properties observed at one level of analysis, such as individual preferences or dialectal tendencies, cannot be straightforwardly extrapolated to the grammar as a whole. This insight is particularly relevant for typologically diverse languages, where surface differences may obscure deeper commonalities in system behavior. By adopting a complex adaptive systems perspective, this study seeks to explain syntactic variation not through isolated mechanisms, but through the interaction of grammatical architecture, usage frequency, and population structure.
3.2 Constructional Networks in Analytic vs Inflectional Systems
Construction Grammar conceptualizes grammar as a network of interconnected constructions varying in specificity, abstraction, and frequency. This network-based view aligns naturally with complex-systems thinking, as it emphasizes relationships among constructions rather than linear derivations. However, the structure of constructional networks differs systematically across typological profiles.
In analytic systems such as English, constructional networks tend to be relatively wide and shallow, with grammatical functions distributed across multiple constructions linked through periphrastic patterns. Information about tense, aspect, modality, and argument structure is often spread across auxiliaries, fixed word order, and lexical items, resulting in networks with numerous nodes connected through shared schematic properties.
In contrast, inflectional systems such as Urdu and Saraiki exhibit denser and more vertically organized networks. Individual constructions often bundle multiple grammatical functions into compact morphological units, creating tightly integrated clusters within the network. These clusters support strong local cohesion, while higher-level generalizations link them to broader constructional schemas.
These architectural differences have direct consequences for how variation propagates through the system. In analytic networks, variation may diffuse across multiple constructions simultaneously, while in inflectional networks it may remain localized within specific morphological paradigms. Understanding these differences is essential for predicting where variation is likely to emerge and how it will be constrained.
3.3 Degree of Abstraction in Urdu and Saraiki Constructions
Urdu and Saraiki provide especially revealing cases for examining the role of abstraction in constructional systems. Both languages exhibit a rich inventory of low-level constructions tied closely to specific morphological forms and lexical items. These constructions are often acquired early and become highly entrenched through repeated exposure, particularly in spoken registers.
Over time, however, speakers generalize across these concrete patterns to form higher-order schemas that capture commonalities in argument structure, agreement, and word order. Crucially, this process of abstraction does not eliminate surface variability; instead, it organizes it. Lower-level constructions retain their idiosyncrasies, while higher-level schemas impose probabilistic constraints on their distribution.
From a complex-systems perspective, this stratification explains why variation in Urdu and Saraiki is often concentrated in peripheral or first-order constructions, while core, highly abstract constructions remain remarkably stable across individuals and dialects. The shared mechanisms of abstraction act as a smoothing force, reducing the impact of divergent exposure at higher levels of the grammatical network.
3.4 English Periphrasis vs South Asian Morphology
The contrast between English periphrasis and South Asian morphological packaging offers a particularly clear illustration of how different system architectures give rise to comparable adaptive outcomes. In English, grammatical distinctions are frequently expressed through combinations of auxiliaries, particles, and fixed linear arrangements. This periphrastic strategy creates multiple overlapping constructions that jointly encode meaning.
In Urdu and Saraiki, similar distinctions are often realized through inflectional morphology and complex predicates, resulting in constructions that are formally compact but functionally rich. While these strategies differ markedly at the surface level, both represent adaptive responses to communicative pressures and historical development.
From the standpoint of syntactic variation, the key insight is that periphrastic and morphological systems distribute variability differently. English allows fine-grained variation across auxiliary selection and word order, whereas Urdu and Saraiki localize variation within morphological choices and agreement patterns. Yet in both cases, variation is constrained by the architecture of the constructional network, not by arbitrary rules.
3.5 Predicting Variation from System Architecture
A central advantage of viewing language as a complex adaptive system is that it allows variation to be predicted, rather than merely described. System architecture (specifically, the organization of constructional networks, the degree of abstraction, and the distribution of grammatical information) places strong constraints on where variation can arise and how robust it will be.
Across the three languages examined here, a consistent pattern emerges. Variation is most pronounced in low-frequency, low-abstraction constructions that occupy peripheral positions in the network. As constructions become more abstract, more frequent, and more central, variation diminishes. This pattern holds despite substantial typological differences, suggesting that it reflects a general property of constructional systems rather than language-specific idiosyncrasy.
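One way to make this prediction testable is to quantify, for each construction, how much its relative frequency differs from speaker to speaker. The sketch below is illustrative only: the construction labels, the per-speaker counts, and the choice of the coefficient of variation as the dispersion measure are all assumptions introduced here, not the study's actual inventory or metric.

```python
from statistics import mean, pstdev

def relative_frequencies(counts):
    """Convert raw construction counts for one speaker into proportions."""
    total = sum(counts.values())
    return {cxn: n / total for cxn, n in counts.items()}

def variation_by_construction(speakers):
    """Coefficient of variation of each construction's relative frequency
    across speakers (higher value = more inter-speaker variation)."""
    profiles = [relative_frequencies(c) for c in speakers.values()]
    constructions = set().union(*profiles)
    result = {}
    for cxn in constructions:
        values = [p.get(cxn, 0.0) for p in profiles]
        mu = mean(values)
        result[cxn] = pstdev(values) / mu if mu > 0 else 0.0
    return result

# Toy input with hypothetical labels; these are not corpus counts.
toy_speakers = {
    "speaker_1": {"ABSTRACT_SCHEMA": 90, "LEXICALLY_SPECIFIC": 10},
    "speaker_2": {"ABSTRACT_SCHEMA": 80, "LEXICALLY_SPECIFIC": 20},
}
cv = variation_by_construction(toy_speakers)
```

Ranking constructions by this score, and then checking the ranking against independent estimates of frequency and abstraction, would be one way to evaluate the pattern described above on real data.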
Typological contrasts do, however, shape the location and form of variation. Analytic systems favor distributed variation across multiple constructions, while inflectional systems concentrate variation within tightly integrated morphological domains. By integrating these insights, this study advances a system-level theory of syntactic variation in which exposure, abstraction, and network structure jointly determine how grammars differ across individuals, populations, and contexts.
Transitional Paragraph
Having established a comparative and system-theoretic foundation, the next part turns to the methodological implications of modeling syntactic variation at scale. Specifically, it addresses how corpora, computational models, and constructional annotation can be used to observe complex grammatical systems without reducing them to isolated variables.
Part II: Data and Methods
4: Corpora for Urdu, Saraiki, and English
4.1 Challenges of Corpus Building in Under-Resourced Languages
Large-scale studies of syntactic variation presuppose access to representative corpora, yet such resources are unevenly distributed across languages. English benefits from decades of corpus development, standardized annotation practices, and open-access infrastructures. By contrast, Urdu and especially Saraiki remain under-resourced, not because of limited speaker populations, but due to historical, political, and institutional factors that have shaped language documentation priorities.
The challenges of corpus building in under-resourced languages extend beyond data scarcity. Orthographic variation, inconsistent standardization, limited availability of spoken data, and uneven genre coverage complicate the creation of balanced corpora. In Urdu, multiple orthographic conventions and script-related encoding issues affect tokenization and morphological analysis. In Saraiki, the lack of a universally accepted written standard introduces additional variability that must be treated as an empirical feature rather than a methodological flaw.
This study approaches these challenges from a constructionist perspective. Rather than attempting to impose uniformity through heavy normalization, variation in form is preserved where possible, allowing constructional patterns to emerge from usage. Corpus design thus prioritizes breadth of exposure and contextual diversity over strict standardization, reflecting the assumption that grammatical knowledge itself is shaped by heterogeneous input.
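The policy of preserving variation in form can be pictured as a preprocessing step that repairs encoding-level inconsistencies but leaves spelling choices alone. The following is a minimal sketch under that assumption; the function name `light_normalize` is hypothetical, and the source does not describe its actual preprocessing pipeline.

```python
import re
import unicodedata

def light_normalize(text):
    """Encoding-level cleanup only; orthographic variants are left untouched."""
    # NFC merges canonically equivalent code point sequences,
    # e.g. alef (U+0627) + maddah above (U+0653) becomes U+0622.
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of spaces and tabs without touching line breaks.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()

# Canonically equivalent encodings are unified...
composed = light_normalize("\u0627\u0653  \u0628")
# ...but distinct letters that writers may alternate between,
# such as Farsi yeh (U+06CC) and Arabic yeh (U+064A), stay distinct.
yeh_variants_kept = light_normalize("\u06CC") != light_normalize("\u064A")
```

The design choice here is to stop at canonical equivalence: anything beyond that, such as mapping one letter variant onto another, would erase the orthographic variability that the corpus is meant to retain.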
4.2 Written vs Spoken Corpora
The distinction between written and spoken corpora is central to any study of syntactic variation, particularly in languages where literacy practices and spoken norms diverge significantly. Written data tend to reflect institutional norms, editorial intervention, and genre-specific conventions, while spoken data provide access to more immediate, interaction-driven constructions. Both are indispensable for modeling the full constructional network of a language.
In Urdu, written corpora are dominated by literary, journalistic, and administrative genres, often exhibiting conservative syntactic patterns and high degrees of morphological regularity. Spoken Urdu, by contrast, reveals greater flexibility in word order, reduced morphological marking, and extensive contact-induced features. Saraiki shows an even sharper contrast, as spoken usage far exceeds written production in both volume and functional range.
English, while benefiting from extensive coverage in both modalities, also exhibits systematic differences between spoken and written registers, particularly in the use of periphrastic constructions and clause-combining strategies. For comparative purposes, this study treats written and spoken corpora not as competing representations of a single grammar, but as complementary windows onto different regions of the constructional network. Register-sensitive analysis ensures that observed variation is interpreted in light of modality rather than misattributed to dialectal or individual differences.
4.3 Dialectal Representation in Saraiki
Saraiki presents a unique methodological case due to its internal diversity and sociolinguistic status. Often described as a dialect continuum rather than a unified language, Saraiki encompasses multiple regional varieties with distinct phonological, lexical, and morphosyntactic profiles. Capturing this diversity is essential for any serious account of syntactic variation.
Rather than selecting a single “representative” variety, this study incorporates data from multiple regions, including both rural and urban contexts. Dialectal variation is treated as an object of analysis rather than noise to be filtered out. This approach aligns with the complex-systems view that population-level patterns emerge from overlapping but non-identical individual grammars.
Importantly, dialectal representation is achieved through exposure-based sampling rather than rigid geographic boundaries. Speakers are grouped based on shared interactional environments and communicative networks, allowing dialectal similarity to be modeled probabilistically. This method avoids reifying political or administrative divisions and instead reflects the lived linguistic reality of Saraiki speakers.
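One way to make exposure-based grouping concrete is to compare speakers by the interactional environments they share. The sketch below is illustrative only: the speaker IDs, the environment labels, and the choice of Jaccard overlap as the similarity measure are all assumptions, not the sampling procedure used in the study.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of environments two speakers have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def exposure_similarity(speakers):
    """Pairwise exposure overlap, keyed by sorted speaker-ID pairs."""
    return {
        (s, t): jaccard(speakers[s], speakers[t])
        for s, t in combinations(sorted(speakers), 2)
    }

# Hypothetical speakers and interactional environments.
speakers = {
    "S1": {"market_A", "school_X", "radio_local"},
    "S2": {"market_A", "school_X", "tv_national"},
    "S3": {"office_B", "tv_national"},
}
sims = exposure_similarity(speakers)
```

Because the result is a graded score rather than a region label, speakers from different administrative districts can still come out as close neighbors when their communicative networks overlap.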
4.4 Regional and Global English Corpora
English serves a dual function in this study: it provides a typological contrast to Urdu and Saraiki, and it offers a benchmark for evaluating methodological choices. To fulfill these roles, the English data include both regional corpora representing localized varieties and global corpora capturing transnational usage across registers.
Regional English corpora allow for direct comparison with Saraiki dialects, particularly in examining how syntactic variation correlates with population contact and mobility. Global English corpora, in turn, reveal how constructions adapt to diverse communicative contexts, often decoupled from specific geographic locations. These data are especially valuable for analyzing register variation, as many global English registers, such as academic writing or digital communication, operate across national boundaries.
Crucially, English data are not treated as theoretically privileged. Instead, they are subjected to the same constructional modeling and similarity measures applied to Urdu and Saraiki. This symmetrical treatment ensures that cross-linguistic comparisons reflect genuine differences in system architecture rather than disparities in data richness.
4.5 Register Annotation across Three Languages
Register annotation poses particular challenges in a multilingual, typologically diverse study. Registers are not universally defined categories; they are language- and culture-specific configurations of context, function, and form. A register classification scheme suitable for English cannot be applied wholesale to Urdu or Saraiki without distortion.
This study adopts a function-oriented approach to register annotation, focusing on communicative purpose rather than surface features. Registers are defined in terms of recurring situational contexts, such as narration, instruction, persuasion, or interpersonal interaction, and are mapped onto constructions that preferentially realize these functions. This approach allows register variation to be compared across languages even when the formal realization differs.
Register annotation is applied consistently across written and spoken corpora, enabling analysis of how context-sensitive constructions interact with typological constraints. By integrating register information into the corpus design, this study ensures that subsequent analyses of variation can distinguish between differences arising from exposure, population structure, and communicative function.
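The function-oriented annotation described above can be pictured as a small record attached to each corpus unit, from which per-language, per-function construction profiles are tallied. This is a minimal sketch under stated assumptions: the class name, field names, function labels, and construction labels are hypothetical and do not reproduce the study's annotation scheme.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical function labels, based on the situational contexts named above.
FUNCTIONS = {"narration", "instruction", "persuasion", "interaction"}

@dataclass(frozen=True)
class AnnotatedUnit:
    language: str      # "urdu", "saraiki", or "english"
    modality: str      # "written" or "spoken"
    function: str      # one of FUNCTIONS
    construction: str  # illustrative construction label

def register_profiles(units):
    """Count constructions per (language, function) pair."""
    profiles = defaultdict(Counter)
    for unit in units:
        if unit.function not in FUNCTIONS:
            raise ValueError(f"unknown register function: {unit.function}")
        profiles[(unit.language, unit.function)][unit.construction] += 1
    return profiles

units = [
    AnnotatedUnit("urdu", "spoken", "narration", "PFV_CLAUSE"),
    AnnotatedUnit("urdu", "written", "narration", "PFV_CLAUSE"),
    AnnotatedUnit("english", "written", "narration", "PAST_CLAUSE"),
    AnnotatedUnit("saraiki", "spoken", "instruction", "IMPERATIVE"),
]
profiles = register_profiles(units)
```

Keying the profiles by communicative function rather than by surface form is what lets the same register be compared across languages whose formal realizations differ.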
Summary
This section has outlined the corpus foundation of the study, emphasizing methodological decisions shaped by typological diversity and theoretical commitments. By combining written and spoken data, preserving dialectal diversity, and applying function-based register annotation, the corpora provide a robust empirical basis for modeling syntactic variation as an emergent property of complex constructional systems.
The next section turns to the computational and analytical methods used to extract constructional patterns from these corpora and to measure similarity and variation across individuals, populations, and registers.
5: Modeling Constructional Similarity
5.1 Identifying Constructions across Languages
A central methodological challenge in comparative Construction Grammar is identifying constructions in a way that allows meaningful cross-linguistic comparison without presupposing structural equivalence. Constructions are language-specific pairings of form and meaning, and their surface realizations differ substantially across Urdu, Saraiki, and English. The goal, therefore, is not to identify identical constructions across languages, but to identify functionally comparable constructional schemas.
This study adopts a bottom-up, usage-based approach to construction identification. Rather than imposing predefined syntactic categories, constructions are inferred from recurring form–function pairings observed in corpus data. In Urdu and Saraiki, this process foregrounds morphological patterns, postpositional frames, and verb-complex structures. In English, it foregrounds word order patterns, auxiliary sequences, and clause-level configurations.
Crucially, constructions are identified at multiple levels of abstraction. Low-level constructions capture lexically specific or morphologically bound patterns, while higher-level schemas generalize across these instances. This multi-level representation enables comparison across languages without collapsing distinct grammatical systems into a single template.
5.2 Morphosyntactic Glossing and Abstraction
Morphosyntactic glossing plays a critical role in mediating between surface diversity and abstract representation. Glossing is not treated merely as a descriptive convenience, but as an analytical step that enables abstraction across typologically distinct systems. By consistently glossing grammatical functions, such as tense, aspect, agreement, and case, this study creates a shared analytical space in which constructions can be compared without erasing language-specific form.
For Urdu and Saraiki, Leipzig-style glossing captures rich inflectional detail, allowing constructions to be analyzed in terms of functional components rather than orthographic forms. For English, glossing highlights periphrastic sequences and syntactic dependencies that encode comparable functions. This approach ensures that abstraction proceeds from observed usage rather than theoretical fiat.
Importantly, abstraction is gradual and probabilistic. Constructions are not reduced to maximally general schemas prematurely; instead, higher-order abstractions emerge through observed clustering of lower-level patterns. This mirrors the cognitive process of grammatical generalization and preserves the distributional information necessary for modeling variation.
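Gradual abstraction over glossed instances can be sketched as follows: a feature enters the higher-level schema only if it recurs in a sufficient share of the lower-level patterns. The gloss bundles, the function name, and the threshold parameter are illustrative assumptions; the study's own clustering procedure is not specified here.

```python
from collections import Counter

def abstract_schema(instances, threshold=1.0):
    """Keep gloss features attested in at least `threshold` of the instances."""
    counts = Counter(f for inst in instances for f in set(inst))
    n = len(instances)
    return {f for f, c in counts.items() if c / n >= threshold}

# Hypothetical gloss bundles for three low-level verbal patterns.
instances = [
    {"V", "PFV", "M", "SG"},
    {"V", "PFV", "F", "SG"},
    {"V", "PFV", "M", "PL"},
]

strict = abstract_schema(instances, threshold=1.0)  # features shared by all
loose = abstract_schema(instances, threshold=0.6)   # features shared by most
```

Lowering the threshold yields a more specific intermediate schema, while raising it yields a more general one, so several levels of abstraction can be read off the same set of instances.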
5.3 Measuring Similarity without One-to-One Equivalence
Comparative syntactic research has often assumed that similarity requires one-to-one correspondence between structures. This assumption is untenable in a constructionist framework, particularly when comparing morphologically rich and analytic languages. Instead, similarity must be understood as a gradient relation based on shared functional, distributional, and network properties.
This study operationalizes constructional similarity through a combination of usage profiles and network overlap. Constructions are compared based on their distribution across contexts, their connectivity within the constructional network, and their functional associations. Two constructions may be considered similar if they occupy analogous positions in their respective networks, even if their surface forms differ radically.
This approach allows, for example, an English periphrastic tense construction and an Urdu inflectional tense construction to be treated as comparable without being equated. Similarity is thus relational rather than absolute, reflecting the idea that grammatical systems can be functionally aligned without structural isomorphism.
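As a rough illustration of similarity without one-to-one equivalence, the sketch below scores two constructions by combining the cosine similarity of their usage profiles with the overlap of their functional links. The weighting parameter alpha, the dictionary layout, and all labels and counts are assumptions made for the example, not the measure actually used in the study.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0) * q.get(k, 0) for k in keys)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Overlap between two sets of functional links."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0

def construction_similarity(c1, c2, alpha=0.5):
    """Weighted mix of usage-profile similarity and network overlap."""
    usage = cosine(c1["usage"], c2["usage"])
    network = jaccard(c1["links"], c2["links"])
    return alpha * usage + (1 - alpha) * network

# Hypothetical profiles: counts per register function, plus linked functions.
english_past = {
    "usage": {"narration": 8, "instruction": 2},
    "links": {"PAST_REFERENCE", "EVENT_SEQUENCE"},
}
urdu_past = {
    "usage": {"narration": 7, "instruction": 1, "persuasion": 2},
    "links": {"PAST_REFERENCE", "EVENT_SEQUENCE", "BACKGROUNDING"},
}
score = construction_similarity(english_past, urdu_past)
```

Nothing in the score refers to surface form, so a periphrastic and an inflectional construction can come out as close neighbors without being treated as the same construction.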
5.4 Computational Models Adapted for Morphologically Rich Systems
Computational models of syntax have historically been optimized for analytic languages with relatively simple morphology. Applying these models to Urdu and Saraiki requires careful adaptation. Tokenization, morphological segmentation, and sparsity management are particularly challenging in morphologically rich systems, where a single word form may encode multiple grammatical functions.
This study employs models that operate over constructional features rather than raw tokens, reducing sensitivity to surface variation. Morphological features are treated as integral components of constructions, enabling the model to capture systematic variation within paradigms. Network-based representations further mitigate sparsity by linking related constructions through shared abstract features.
By using the same modeling principles across all three languages, this approach ensures comparability while respecting typological differences. The resulting models are not language-neutral in a superficial sense, but language-adaptive in a theoretically principled way.
5.5 Evaluation and Reproducibility
Evaluation in comparative constructional modeling must balance predictive accuracy with interpretability. Rather than relying solely on aggregate performance metrics, this study evaluates models based on their ability to recover known patterns of variation and to generate plausible generalizations across levels of abstraction. Cross-validation across individuals, dialects, and registers provides a robust test of model stability.
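Cross-validation across individuals, dialects, or registers can be implemented as a leave-one-group-out split, in which all data from one group is held out at a time. The following is a minimal sketch; the record layout and key names are hypothetical.

```python
def leave_one_group_out(records, key):
    """Yield (held_out_group, train, test) for each value of `key`."""
    groups = sorted({r[key] for r in records})
    for group in groups:
        train = [r for r in records if r[key] != group]
        test = [r for r in records if r[key] == group]
        yield group, train, test

# Hypothetical records: one construction token per row.
records = [
    {"speaker": "S1", "register": "narration", "construction": "C1"},
    {"speaker": "S1", "register": "instruction", "construction": "C2"},
    {"speaker": "S2", "register": "narration", "construction": "C1"},
    {"speaker": "S3", "register": "narration", "construction": "C3"},
]

folds = list(leave_one_group_out(records, key="speaker"))
```

Passing `key="register"` instead of `key="speaker"` gives the register-level split, so the same routine covers each grouping mentioned above.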
Reproducibility is treated as a foundational principle. All analytical steps, from corpus preprocessing and glossing conventions to construction identification and similarity measurement, are documented in detail. Where possible, data and code are made available to enable independent replication and extension of the analyses.
By foregrounding reproducibility, this study aligns with broader methodological shifts in computational linguistics and sociolinguistics. More importantly, it reinforces the central theoretical claim of the book: that syntactic variation can only be understood through transparent, scalable models that respect the complexity of grammatical systems.
Summary
This section has outlined a constructionist and computational framework for modeling syntactic similarity across typologically diverse languages. By identifying constructions bottom-up, abstracting through morphosyntactic glossing, and measuring similarity relationally rather than structurally, the methodology enables systematic comparison without theoretical distortion. The empirical sections that follow build on this framework to examine how variation unfolds across individuals, populations, and contexts.
Part III: Individual-Level Variation
6: Individual Grammars in Multilingual Contexts
6.1 Multilingual Exposure in Pakistan
Pakistan presents one of the most linguistically dense environments in the contemporary world. For many speakers, daily linguistic experience involves regular interaction across multiple languages and varieties, including Urdu, regional languages such as Saraiki, Punjabi, and Sindhi, and English in educational, professional, and digital domains. This layered exposure challenges monolingual assumptions that have long underpinned models of grammatical competence.
From a constructionist perspective, multilingual exposure does not entail the coexistence of separate, autonomous grammars. Instead, it results in overlapping constructional networks shaped by differential frequency and contextual activation. Speakers acquire constructions through use, and in multilingual settings, exposure to structurally related and unrelated constructions occurs simultaneously. The outcome is not grammatical instability but a highly adaptive system capable of context-sensitive selection.
This ecological reality has profound implications for the study of syntactic variation. Individual grammars in Pakistan are best understood as dynamic systems in which exposure histories differ not only in quantity but in quality. The variation observed at the individual level is therefore a direct reflection of the heterogeneity of linguistic experience, rather than a deviation from an abstract norm.
6.2 Urdu–Saraiki Code-Interaction as Constructional Blending
Interaction between Urdu and Saraiki provides a particularly instructive case of constructional blending. Speakers frequently alternate between the two varieties within conversations, and sometimes within clauses, yet this behavior cannot be reduced to simple code-switching in the traditional sense. Rather than switching between discrete grammatical systems, speakers draw on a shared pool of constructions with varying degrees of entrenchment.
In many cases, Urdu and Saraiki constructions overlap functionally while differing in form, morphological realization, or pragmatic nuance. Individual speakers may favor one construction over another depending on interlocutor, topic, or setting. Over time, repeated exposure to such alternations leads to blending, whereby constructions are partially generalized across languages, producing hybrid usage patterns.
Construction Grammar offers a principled way to model this phenomenon. Blending is not viewed as grammatical interference or erosion, but as a natural consequence of network overlap. Constructions that are functionally similar but formally distinct become linked within the individual’s grammar, allowing probabilistic selection rather than categorical choice. This mechanism explains why multilingual speakers often exhibit stable yet non-uniform grammatical patterns across contexts.
6.3 Individual Construction Inventories
At the heart of individual-level variation lies the notion of a construction inventory: the set of constructions that a speaker has acquired, along with their associated frequencies, contexts of use, and degrees of abstraction. No two speakers possess identical inventories, even within the same community, because no two exposure histories are identical.
In multilingual settings, individual inventories are especially diverse. Some speakers exhibit dense inventories in Urdu with peripheral Saraiki constructions, while others show the reverse pattern. English constructions may occupy central or peripheral positions depending on educational background and professional domain. These differences are not merely quantitative; they shape the internal structure of the constructional network itself.
This study models individual grammars as probabilistic subsets of the broader population network. Variation arises from differences in which constructions are entrenched, how strongly they are connected, and which higher-order schemas have emerged through generalization. Importantly, variation is distributed unevenly: highly abstract constructions tend to be shared across individuals, while low-level constructions are the primary locus of individual differentiation.
6.4 Stability and Fluctuation across Registers
Individual grammars are not static; they are context-sensitive systems that adapt to communicative demands. Register variation provides a key lens through which to observe this adaptability. The same speaker may employ different constructions when writing academically in English, engaging in informal conversation in Urdu, or narrating personal experiences in Saraiki.
Despite this surface variability, a core of stability persists. Highly entrenched constructions, particularly those associated with fundamental argument structure and clause organization, remain consistent across registers. Fluctuation is concentrated in peripheral constructions and in the selection among functionally equivalent options. This pattern aligns with the complex-systems view that stability emerges from highly connected nodes in the network, while flexibility is localized at the edges.
By examining register-sensitive usage at the individual level, this study demonstrates that variability does not imply grammatical inconsistency. Rather, it reflects the adaptive deployment of a shared constructional repertoire across contexts. Understanding this distinction is essential for interpreting individual variation without conflating it with change or error.
6.5 Cognitive Implications
The findings on individual-level variation have important implications for theories of grammatical cognition. First, they challenge the notion of a uniform mental grammar shared across speakers. Instead, grammatical knowledge appears as a gradient, experience-dependent system shaped by exposure and use. Second, they support models of cognition in which abstraction emerges gradually through repeated encounters with concrete patterns, rather than being innately specified.
Multilingual speakers in particular illustrate the cognitive flexibility of constructional systems. The ability to blend constructions across languages without loss of communicative effectiveness suggests that grammatical representations are not rigidly compartmentalized. Instead, they are organized around functional and usage-based similarities that transcend language boundaries.
Finally, individual variation underscores the importance of probabilistic representation in grammatical theory. Speakers do not choose constructions categorically; they navigate a landscape of weighted options influenced by frequency, context, and social factors. This probabilistic nature of grammar is not a deficiency but a defining feature of language as a complex adaptive system.
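The idea of weighted rather than categorical choice can be sketched as sampling from a probability distribution over functionally equivalent constructions. In this illustration the entrenchment weights, the context multipliers, and the construction labels are all hypothetical, and the multiplicative way of combining frequency with context is an assumption made for simplicity.

```python
import random

def selection_probabilities(entrenchment, context_boost=None):
    """Turn entrenchment weights, scaled by context, into probabilities."""
    context_boost = context_boost or {}
    weights = {c: w * context_boost.get(c, 1.0) for c, w in entrenchment.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def select_construction(probs, rng):
    """Sample one construction according to its probability."""
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]

# Hypothetical speaker: two competing constructions for the same function.
entrenchment = {"URDU_VARIANT": 6.0, "SARAIKI_VARIANT": 4.0}
neutral = selection_probabilities(entrenchment)
at_home = selection_probabilities(entrenchment, {"SARAIKI_VARIANT": 3.0})

rng = random.Random(0)  # fixed seed so the sample is repeatable
choice = select_construction(at_home, rng)
```

Under this view a context shifts the odds between options without eliminating either, which is consistent with stable yet non-uniform usage across settings.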
Summary
This section has situated individual-level syntactic variation within the multilingual ecology of Pakistan, demonstrating how exposure, constructional blending, and register sensitivity shape individual grammars. By modeling individual grammars as dynamic constructional networks, it provides a foundation for the quantitative analyses of individual similarity and variation developed in subsequent sections.
7: Idiolectal Variation across the Three Languages
7.1 Measuring Individual Differences
Idiolectal variation refers to systematic differences in grammatical behavior at the level of individual speakers. Within a constructionist framework, such variation is not treated as noise but as an inherent property of usage-based grammars. Measuring individual differences therefore requires methods that capture gradient preferences rather than categorical distinctions.
In this study, individual variation is operationalized through constructional frequencies, distributional patterns, and degrees of abstraction within individual grammars. Rather than assuming uniform competence, each speaker is modeled as possessing a probabilistic constructional network shaped by exposure, register, and interactional context. Similarity between individuals is thus assessed not by identical forms, but by the relative weighting of shared constructions.
Crucially, measurement avoids imposing one-to-one equivalence across languages. Instead, constructions are compared at an abstract functional level, allowing variation to be assessed even when formal realizations differ. This approach enables meaningful comparison between morphologically rich systems such as Urdu and Saraiki and more periphrastic systems such as English.
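Comparing speakers by the relative weighting of shared constructions amounts to comparing two probability distributions. The sketch below uses Jensen-Shannon divergence for this purpose; that choice of measure is an assumption made for illustration, not necessarily the one used in this study, and the construction labels and counts are invented.

```python
import math

def normalize(counts):
    """Convert raw construction counts into relative frequencies."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = disjoint."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

# Hypothetical counts over abstract, function-level construction labels.
speaker_a = normalize({"TENSE_PAST": 40, "MODAL_OBLIG": 10, "FOCUS_PARTICLE": 5})
speaker_b = normalize({"TENSE_PAST": 38, "MODAL_OBLIG": 4, "FOCUS_PARTICLE": 15})

distance = js_divergence(speaker_a, speaker_b)
```

Because the labels are defined at the functional level, the same comparison can be run between speakers of different languages, provided their constructions have first been mapped to a shared set of functional categories.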
7.2 Which Constructions Vary Most?
Idiolectal variation is not evenly distributed across the grammatical system. Across the three languages, variation clusters around low-frequency, low-level constructions rather than highly abstract schemas. Core constructions governing basic clause structure, argument realization, and tense–aspect organization show strong stability across individuals.
In Urdu and Saraiki, variation is most pronounced in constructions involving optional morphology, pragmatic marking, and information-structural choices. Speakers differ in their preference for particular postpositions, aspectual markers, or clause-final particles, reflecting differences in exposure and sociolinguistic alignment. These constructions are highly sensitive to context and therefore more susceptible to individual differentiation.
In English, idiolectal variation concentrates in periphrastic constructions, such as auxiliary selection, modal usage, and complex predicate formation. While English lacks the dense inflectional morphology of Urdu and Saraiki, it compensates through a wide range of analytic constructions whose usage patterns vary considerably across speakers. Thus, variation emerges not from morphological choice but from constructional competition within the periphrastic system.
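To ask which constructions vary most, one simple option is to compute, for each construction, how much its relative frequency spreads across speakers. The sketch below uses the coefficient of variation; both this statistic and the profile values are assumptions for illustration and do not report results from the study.

```python
from statistics import mean, pstdev

def variation_by_construction(speaker_profiles):
    """Coefficient of variation of each construction's rate across speakers."""
    constructions = {c for p in speaker_profiles.values() for c in p}
    scores = {}
    for c in constructions:
        rates = [p.get(c, 0.0) for p in speaker_profiles.values()]
        mu = mean(rates)
        scores[c] = pstdev(rates) / mu if mu else 0.0
    return scores

# Hypothetical relative frequencies per speaker.
speaker_profiles = {
    "S1": {"CORE_CLAUSE": 0.5, "CLAUSE_FINAL_PARTICLE": 0.1},
    "S2": {"CORE_CLAUSE": 0.5, "CLAUSE_FINAL_PARTICLE": 0.3},
    "S3": {"CORE_CLAUSE": 0.5, "CLAUSE_FINAL_PARTICLE": 0.2},
}
scores = variation_by_construction(speaker_profiles)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Dividing by the mean keeps rare and frequent constructions on a comparable scale, which matters when the claim under test is that low-frequency constructions are the main locus of variation.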
7.3 Morphological vs Periphrastic Constructions
A central finding of this comparative analysis is that morphological richness and periphrastic complexity give rise to different loci of idiolectal variation. In Urdu and Saraiki, morphological constructions tend to be highly entrenched and therefore relatively stable at higher levels of abstraction. Variation emerges primarily in the selection and combination of morphological exponents, rather than in the underlying constructional schema.
By contrast, English periphrastic constructions exhibit greater variability at the level of constructional packaging. Because grammatical meaning is distributed across multiple elements (auxiliaries, particles, and word order), speakers differ more noticeably in how they assemble these components. This leads to a wider range of individual preferences, even when communicative intent remains constant.
From a Construction Grammar perspective, this contrast underscores the role of form–meaning mappings in shaping variation. Morphological constructions favor compact, routinized mappings that resist individual divergence, whereas periphrastic constructions offer greater combinatorial flexibility, increasing the space for idiolectal differentiation.
7.4 English Speakers vs Urdu/Saraiki Speakers
Comparing English speakers with Urdu and Saraiki speakers reveals important asymmetries in the nature of individual variation. English-speaking individuals tend to show greater variation in constructional choice across registers, particularly between spoken and written domains. This reflects the strong role of stylistic conventions and genre-specific constructions in English.
Urdu and Saraiki speakers, by contrast, exhibit stronger stability across registers at the level of core constructions, with variation emerging more subtly through morphological alternations and discourse-sensitive elements. Individual differences are often encoded in frequency and distribution rather than overt structural divergence.
In multilingual speakers, these patterns interact. English constructions may display greater fluctuation within the same individual, while Urdu and Saraiki constructions remain comparatively stable. This asymmetry supports the claim that typological structure interacts with multilingual exposure to shape idiolectal profiles, rather than individual variation being uniform across languages.
7.5 Summary of Individual Variation Patterns
This section has demonstrated that idiolectal variation across Urdu, Saraiki, and English is systematic, structured, and typologically conditioned. Variation is concentrated in peripheral constructions, while highly abstract schemas remain stable across individuals. Morphological systems constrain variation through entrenchment, whereas periphrastic systems expand the space for individual choice.
The comparative perspective further reveals that individual grammars cannot be meaningfully analyzed without reference to typological architecture. English, Urdu, and Saraiki do not merely differ in form; they distribute variability across different levels of the constructional network. Understanding idiolectal variation therefore requires a model that integrates usage, typology, and cognition.
Transition Paragraph
The patterns identified here set the stage for the final synthesis of the study, where individual-level variation is integrated with population-level tendencies to evaluate broader claims about constructional universals and language-specific constraints.
Part IV: Population and Dialectal Variation
8: Saraiki Dialects and Regional Urdu
8.1 Saraiki as a Dialect Cluster
Saraiki is best understood not as a monolithic language but as a cluster of closely related dialects distributed across southern Punjab, parts of Sindh, Khyber Pakhtunkhwa, and Balochistan. From a constructionist perspective, this clustering reflects overlapping constructional inventories rather than sharply bounded grammatical systems.
Speakers across the Saraiki-speaking region share a core set of constructions governing clause structure, argument realization, and tense–aspect marking. At the same time, peripheral constructions, particularly those involving phonologically conditioned morphology, case marking, and discourse particles, vary systematically across regions. These patterns support a view of Saraiki as a network of partially overlapping grammars rather than a discrete entity.
This dialect-cluster model aligns naturally with Construction Grammar, which treats grammars as gradient, usage-based systems. Rather than asking whether a variety “is” or “is not” Saraiki, the relevant question becomes how strongly a given speaker’s constructional network overlaps with those of other speakers in the Saraiki continuum.
8.2 Grammatical Variation across Regions
Regional grammatical variation within Saraiki and between Saraiki and regional Urdu manifests most clearly in morphosyntactic constructions that are sensitive to frequency and contact. These include differential object marking, auxiliary selection, agreement patterns, and postpositional choice.
Some regions exhibit stronger retention of conservative constructions, while others show innovation or convergence toward Urdu-like patterns. Importantly, these differences are not evenly distributed across the grammar. Core constructions remain highly stable, while variation concentrates in lower-level constructions with weaker entrenchment.
Regional Urdu varieties likewise display constructional differentiation, particularly in argument structure alternations and auxiliary constructions. These regional grammars diverge from standardized Urdu not through categorical deviations but through shifting probabilities across competing constructions. Such variation is best captured through population-level modeling rather than discrete feature lists.
8.3 Contact with Punjabi, Sindhi, and Urdu
Language contact plays a central role in shaping constructional variation across Pakistan. Saraiki speakers are often in sustained contact with Punjabi, Sindhi, and Urdu, resulting in complex patterns of constructional borrowing, calquing, and realignment.
From a Construction Grammar perspective, contact-induced change does not require direct transfer of forms. Instead, speakers may map familiar functions onto constructions drawn from another language, gradually reshaping their constructional networks. For example, Urdu-based periphrastic constructions may coexist with Saraiki morphological constructions, with individual and regional preferences determining relative frequency.
Crucially, contact effects are asymmetric. Punjabi influence is often evident in phonologically conditioned morphology, Sindhi influence in certain argument-structure patterns, and Urdu influence in high-register and written constructions. These influences accumulate through repeated exposure, reinforcing some constructions while weakening others.
8.4 Exposure-Based Dialect Emergence
Dialect emergence in Saraiki and regional Urdu can be modeled as an outcome of differential exposure across populations. Communities with higher levels of interregional mobility, education, and media access show greater convergence toward supraregional constructions, particularly those associated with Urdu.
Conversely, geographically or socially less mobile communities exhibit denser local constructional networks, preserving region-specific variants. This pattern mirrors findings in computational sociolinguistics, where dialect similarity correlates strongly with contact intensity rather than geographic distance alone.
Within a complex-systems framework, dialects emerge not through top-down standardization but through the aggregation of individual grammars shaped by shared exposure. Over time, these shared exposure profiles stabilize into recognizable regional patterns, even in the absence of formal codification.
8.5 Dialect Networks in Pakistan
Viewing dialects as networks provides a powerful lens for understanding variation in Pakistan’s linguistic landscape. Each locality can be modeled as a node defined by its aggregate constructional profile, with edges representing degrees of similarity based on shared usage patterns.
Such networks reveal that Saraiki dialects form dense clusters with gradual transitions rather than sharp boundaries. Regional Urdu varieties often occupy intermediary positions, linking otherwise distinct clusters. These network structures explain why speakers frequently perceive mutual intelligibility alongside grammatical difference.
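The network view can be sketched by linking localities whose constructional profiles are sufficiently similar and then inspecting the resulting clusters. Everything specific below is hypothetical: the locality names, the similarity values, and the thresholds are invented to show the mechanics and do not represent measured data.

```python
from collections import defaultdict, deque

def build_network(similarities, threshold):
    """Link two localities when their similarity reaches the threshold."""
    graph = defaultdict(set)
    for (a, b), s in similarities.items():
        graph[a], graph[b]  # register both nodes, even if left unlinked
        if s >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def components(graph):
    """Connected components, found by breadth-first search."""
    seen, result = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        seen.add(node)
        queue, comp = deque([node]), set()
        while queue:
            cur = queue.popleft()
            comp.add(cur)
            for nb in graph[cur]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        result.append(comp)
    return result

# Hypothetical similarities: two Saraiki clusters and one regional Urdu node.
similarities = {
    ("sar_A1", "sar_A2"): 0.90,
    ("sar_B1", "sar_B2"): 0.85,
    ("sar_A2", "urdu_R"): 0.60,
    ("urdu_R", "sar_B1"): 0.60,
    ("sar_A1", "sar_B2"): 0.10,
}
tight = components(build_network(similarities, threshold=0.8))
loose = components(build_network(similarities, threshold=0.5))
```

At the stricter threshold the two clusters and the Urdu node stand apart; at the looser one the Urdu node links them into a single component, which is the kind of intermediary position described above.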
Most importantly, this network-based view challenges traditional taxonomies that treat languages and dialects as discrete units. Instead, it supports a constructionist model in which population-level variation emerges from continuous interaction among overlapping grammars.
Transition Paragraph
This section has shown that population-level variation in Pakistan is best understood as an emergent property of exposure-driven constructional networks. The next section extends this perspective by comparing dialectal variation cross-linguistically, showing how typological profile shapes where in the grammar dialectal variation concentrates.
9: Comparing Dialectal Variation Cross-Linguistically
9.1 Dialect in Morphologically Rich vs Analytic Languages
Dialectal variation manifests differently depending on typological profile. Urdu and Saraiki, as morphologically rich languages, encode syntactic and semantic distinctions within inflectional and derivational morphology. This richness constrains variation in core constructions while permitting localized differences in peripheral forms, such as optional case marking, aspectual morphology, or discourse particles.
By contrast, English, an analytic language, relies heavily on word order and periphrastic constructions. Variation in English dialects therefore emerges less from inflectional alternation and more from preferences in auxiliary selection, clause-combining strategies, and pragmatic marking. Consequently, while Urdu and Saraiki dialects exhibit stability in central grammatical schemas, English dialects demonstrate greater flexibility in core syntactic constructions, reflecting typological affordances rather than sociolinguistic divergence alone.
9.2 Core vs Peripheral Constructions in Urdu and Saraiki
Empirical evidence from Saraiki dialects and regional Urdu reveals that variation is systematically concentrated in peripheral constructions. Core constructions, such as canonical SOV order, agreement paradigms, and argument structure templates, remain stable across localities. Peripheral constructions, including optional postpositions, discourse markers, and stylistic particles, show high local specificity and sensitivity to contact with neighboring languages.
This core–periphery distinction aligns with the network perspective in Construction Grammar. Highly connected nodes representing core constructions resist divergence, while peripheral nodes, less entangled in the network, are free to vary and adapt to local usage. Such patterns illustrate how grammar behaves as a complex adaptive system, with dialectal variation emerging in low-stakes nodes while core functionality is maintained.
9.3 Why English Dialects Pattern Differently
English dialects display a distinct profile of variation due to typology and historical sociolinguistic processes. Unlike Urdu and Saraiki, where morphology encodes much grammatical information, English relies on distributed periphrastic constructions. This structural difference leads to greater flexibility in constructions that count as core in English, such as tense–aspect auxiliaries, modal verbs, and clause-chaining strategies.
Furthermore, historical settlement patterns, media exposure, and educational standardization have shaped English dialects differently from South Asian languages. Variation is often register-sensitive, occurring across spoken, written, and digital contexts, whereas in Urdu and Saraiki, register variation overlays but does not overwhelm the underlying dialectal network.
9.4 Generalization and Convergence
Across languages, two mechanisms mediate dialectal variation: generalization and convergence. In Urdu and Saraiki, exposure-based generalization reduces individual differences in higher-order constructions, smoothing variation across dialects. Peripheral constructions, however, remain locally divergent, producing measurable dialectal identity.
In English, generalization operates across both core and peripheral constructions, particularly in highly mobile or connected populations. Convergence is further facilitated by contact through media, education, and interregional communication, leading to less rigid dialect boundaries. This contrast illustrates that typology interacts with social exposure to shape both the locus and magnitude of dialectal variation.
9.5 Implications for Dialect Theory
Comparative analysis of Urdu, Saraiki, and English dialects challenges several traditional assumptions in dialectology. First, the notion of discrete, bounded dialects is insufficient; exposure-based networks better capture the continuous and gradient nature of variation. Second, core constructions are typologically constrained: morphologically rich languages preserve them, while analytic languages allow greater fluctuation even in high-centrality constructions. Third, register, contact, and individual exposure interact in complex ways, making dialectal identity an emergent property rather than a fixed attribute.
From a Construction Grammar perspective, these findings reinforce the view that dialects are not simply variations on a common underlying grammar. Instead, dialectal patterns emerge from the interaction of individual grammars within a population, mediated by typology, contact, and functional pressure. This perspective bridges micro-level idiolectal variation with macro-level population patterns, offering a unified framework for understanding cross-linguistic syntactic variation.
The patterns of dialectal variation elucidated here provide a foundation for the final integrative analysis. The next chapter examines register and contextual variation, highlighting how functional pressures reorganize constructional networks independently of geography, while linking findings to both idiolectal and dialectal structures.
Part V: Register and Contextual Variation
10: Register in Urdu, Saraiki, and English
10.1 Formality, Function, and Genre
Register reflects systematic variation in linguistic behavior driven by social, functional, and situational constraints. In Construction Grammar terms, registers emerge as probabilistic shifts in the activation and selection of constructions according to communicative need. These shifts are not random but patterned, reflecting the mapping of constructions onto functionally defined contexts.
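One way to make "probabilistic shifts in selection" concrete is to estimate, for each register, the relative frequency with which each construction is chosen. The following is a minimal sketch under stated assumptions: the register labels, construction labels, and observations are invented for illustration and do not come from the corpora discussed in this book.

```python
from collections import Counter

# Hypothetical (register, construction) observations; not real corpus counts.
OBSERVATIONS = [
    ("formal", "nominalization"), ("formal", "nominalization"),
    ("formal", "embedded_clause"), ("formal", "discourse_particle"),
    ("informal", "discourse_particle"), ("informal", "discourse_particle"),
    ("informal", "ellipsis"), ("informal", "nominalization"),
]


def selection_probabilities(observations):
    """Estimate P(construction | register) by relative frequency."""
    by_register = {}
    for register, construction in observations:
        by_register.setdefault(register, Counter())[construction] += 1
    return {
        register: {c: n / sum(counts.values()) for c, n in counts.items()}
        for register, counts in by_register.items()
    }


probs = selection_probabilities(OBSERVATIONS)
for register, dist in probs.items():
    print(register, dist)
```

The same construction inventory yields different distributions per register, which is the sense in which register shifts selection probabilities without changing the inventory itself.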
In Urdu, Saraiki, and English, registers are distinguished along multiple dimensions: formality, domain, and modality. Formal registers, such as literary Urdu, administrative correspondence, or academic English, favor highly entrenched and abstract constructions. Informal registers, such as oral Saraiki, social media English, or colloquial Urdu, allow more peripheral constructions, idiomatic patterns, and morphosyntactic innovation. Across languages, register operates as a lens through which individual and population-level variation is filtered and expressed.
10.2 Literary vs Administrative Urdu
Urdu provides a rich case study of register-driven constructional variation. Literary Urdu exhibits elaborate nominalization, postpositional marking, and syntactic embedding, reflecting both historical codification and stylistic preferences. Administrative Urdu, in contrast, prioritizes clarity, brevity, and standardization, leading to constructions that are more analytic, minimally embedded, and predictable.
These differences are mirrored in constructional networks. Literary constructions occupy high-centrality positions with complex dependency structures, while administrative constructions emphasize peripheral, formulaic schemas optimized for functional efficiency. Speakers navigating multiple registers adapt their constructional choices, illustrating register-sensitive flexibility within a single grammar.
10.3 Oral vs Written Saraiki
Saraiki exhibits pronounced oral–written variation, with oral registers characterized by flexible word order, discourse markers, and context-dependent morphology. Written Saraiki, often influenced by Urdu and standardized orthographies, exhibits greater stability, regularization, and formalized constructions.
From a constructionist perspective, these patterns reveal the interaction of frequency, exposure, and functional necessity. Oral constructions are highly adaptive and responsive to real-time communicative pressures, whereas written constructions are more generalized and abstract. Individual variation interacts with register variation, as speakers must select and deploy constructions appropriate for context, blending conservative and innovative patterns.
10.4 Academic, Media, and Digital English
English demonstrates both typological and domain-specific register effects. Academic English favors complex nominal and clause-level constructions, dense subordination, and specialized periphrastic patterns. Media English, including journalism and broadcast, prioritizes clarity, concision, and narrative strategies. Digital English, particularly social media, foregrounds creative constructions, ellipsis, and hybrid forms that often blur standard grammatical boundaries.
Across these registers, constructional selection is shaped by functional pressures rather than innate grammar. The same speaker may deploy markedly different constructions across academic writing, news reporting, and online interaction, illustrating that register is a primary driver of constructional variability.
10.5 Register as Construction–Function Mapping
A unifying insight from cross-linguistic analysis is that register operates primarily through construction–function mapping. While individual and dialectal differences reflect exposure and typology, register variation reflects context-specific functional demands. Constructions are differentially activated depending on situational requirements, communicative goals, and genre conventions.
This mapping explains why some constructions, stable across dialects or idiolects, vary systematically with register. Peripheral constructions and low-abstraction forms are particularly sensitive, while core, highly abstract schemas remain robust. The result is a layered pattern of variation in which language is both stable and flexible, reflecting the interplay of typology, exposure, and functional adaptation.
Summary
This section has shown that register is a potent source of syntactic variation, distinct from individual or dialectal differences yet operating within the same constructional network. By comparing Urdu, Saraiki, and English, we see that register effects are functionally motivated and typologically mediated, shaping how constructions are selected, generalized, and combined across contexts. Register provides a crucial lens for understanding grammar as a complex, adaptive system, responsive to both social and cognitive pressures.
11: Functional Pressure and Core Constructions
11.1 Register Variation and the Grammatical Core
Core constructions, those governing basic clause structure, argument realization, and central morphological paradigms, remain highly stable across individuals and dialects. However, register introduces functional pressures that selectively influence constructional activation without destabilizing the core network.
In Urdu, literary and administrative registers reveal that while peripheral constructions shift dramatically, core clause-level constructions (e.g., SOV order, agreement paradigms) remain robust. Similarly, in Saraiki, core argument structure and tense–aspect marking are largely invariant across oral and written contexts. In English, despite greater flexibility in periphrastic constructions, core constructions, such as auxiliary–main verb sequences in declaratives, are maintained across academic, journalistic, and digital registers.
These patterns demonstrate that functional pressure acts more strongly on constructions whose role is peripheral or optional, allowing the core to function as an anchor in the constructional network.
11.2 Function-Driven Selection of Constructions
Constructional selection is determined by communicative function as much as by typological constraints. Speakers deploy constructions probabilistically based on the functional requirements of the register: clarity, brevity, politeness, narrative structure, or informational density.
For example, academic Urdu favors nominalization and embedded clauses to convey complex information, while speakers of informal oral Saraiki rely on flexible word order and discourse particles for pragmatic emphasis. In English, digital registers favor ellipsis, contractions, and hybrid constructions for efficiency and social signaling.
This functional perspective aligns with Construction Grammar: form–meaning pairings are activated in context-sensitive ways, and usage frequency within functional domains determines entrenchment and productivity.
11.3 Why Register Behaves Differently from Dialect
Unlike dialectal variation, which arises from differential exposure and population-level network effects, register variation is contextually induced. While dialectal differences are relatively stable across settings, register effects are transient and adaptive.
Core constructions remain resistant to register pressure, while peripheral constructions are selectively amplified or suppressed depending on the communicative context. This distinction clarifies why the same speaker can maintain dialectal identity while varying constructions across registers. The functional flexibility inherent in registers thus complements rather than competes with idiolectal or dialectal variation.
11.4 Cross-Linguistic Comparison
Cross-linguistic comparison highlights typology-specific responses to functional pressure. Morphologically rich languages such as Urdu and Saraiki favor peripheral variation in morphology and optional constructions, preserving core schemas. Analytic languages such as English exhibit more variation in periphrastic core constructions due to distributed encoding of grammatical information.
Despite these differences, the principle of function-driven selection holds universally: registers prioritize constructions that best serve communicative goals, producing systematic and predictable patterns of variation. Core–periphery dynamics, frequency effects, and network connectivity explain cross-linguistic convergence in the resilience of core constructions.
11.5 Case Studies
Literary Urdu vs Administrative Urdu: Embedded relative clauses and nominalized predicates are dominant in literary contexts, while administrative texts reduce subordination, favoring analytic verb phrases. Peripheral discourse markers are attenuated in formal registers.
Oral vs Written Saraiki: Oral narratives exploit flexible SOV–OSV alternation and optional particles to mark focus, whereas written Saraiki, influenced by Urdu, stabilizes canonical SOV structures and reduces optional morphology.
Academic vs Digital English: Academic writing favors nominalizations, complex clauses, and formal auxiliary use. Digital English, in contrast, promotes ellipsis, multiword units, and informal modals, illustrating register-induced peripheral shifts while maintaining declarative auxiliaries as core constructions.
These cases illustrate the interaction of functional pressure, register, and typological structure in shaping constructional networks at both peripheral and core levels.
Summary
Functional pressure selectively organizes constructional deployment across registers while leaving the core grammar largely intact. Peripheral constructions, sensitive to context and frequency, adapt flexibly to communicative demands. Cross-linguistically, morphologically rich and analytic systems differ in which constructions are peripheral and which are core, but the underlying mechanism of function-driven selection is consistent. This reinforces the view of language as a complex, adaptive system, in which core constructions anchor stability while peripheral constructions accommodate flexibility and functional requirements.
Part VI: Integrated Analysis
12: Mapping Variation onto Constructional Networks
12.1 Where Variation Lives in Each Language
Variation is not uniformly distributed across grammatical systems. In Urdu and Saraiki, variation is concentrated in peripheral morphological constructions, optional particles, and discourse-sensitive forms, while core SOV syntax, argument structure, and agreement paradigms remain stable across individuals and dialects.
In English, variation is more pronounced in periphrastic constructions, including auxiliary sequences, clause-combining strategies, and modality. Despite this, core declarative and interrogative patterns remain resilient across speakers, dialects, and registers.
Mapping these loci of variation reveals that each language exhibits a distinctive pattern, shaped by typology, historical development, and frequency-driven entrenchment. The core–periphery distinction provides a consistent framework for locating where variation emerges and stabilizes.
12.2 Order of Emergence and Abstraction
Constructional networks develop hierarchically: concrete, first-order constructions emerge from early exposure, and higher-order constructions arise through abstraction and generalization. Individual differences, dialectal divergence, and register effects interact differently with these levels:
First-order constructions: High susceptibility to individual and regional variation, sensitive to frequency and exposure.
Second- and third-order constructions: Partially generalized, showing moderate stability with localized flexibility.
Higher-order abstract constructions: Highly generalized, resistant to both individual and dialectal variation, but still responsive to register-driven functional pressures.
Across all three languages, order of emergence predicts variation density: peripheral constructions at lower abstraction levels vary more, while core abstract schemas remain robust.
12.3 Morphology vs Syntax in Variation
Morphological richness and analytic structure shape how variation is distributed.
Urdu and Saraiki: Morphology encodes much of the grammatical load, producing stable core syntax and localized variation in morphological exponents and optional constructions. Peripheral morphology serves as the primary locus of variation.
English: As an analytic language, English relies on word order, auxiliary selection, and periphrastic forms. Here, variation occurs across both peripheral and higher-order syntactic constructions, reflecting the distributed nature of grammatical encoding.
These patterns demonstrate that typology mediates the locus and extent of variation while conforming to the broader principle that high-centrality constructions resist change.
12.4 Cross-Linguistic Convergence and Divergence
Despite typological differences, several general principles emerge:
Core constructions are the most stable part of each system, providing functional and cognitive anchors.
Peripheral constructions are sensitive to exposure, context, and register, and show both individual and dialectal divergence.
Abstraction mediates the impact of variation, with higher-order constructions smoothing differences across populations.
Register effects overlay, rather than overwrite, dialectal and idiolectal variation, producing context-specific convergence.
Divergence occurs primarily in language-specific constructions and typologically constrained patterns, while convergence arises in response to functional pressures, exposure overlap, and cognitive constraints.
12.5 A Unified Model
Integrating individual, dialectal, and register variation yields a unified model of syntactic variation as an emergent property of constructional networks. Key principles of the model include:
Network perspective: Constructions are nodes with weighted connections, shaped by frequency, function, and exposure.
Core–periphery dynamics: Highly connected core constructions remain stable; peripheral nodes vary with context and usage.
Exposure-driven adaptation: Individual and dialectal differences reflect the aggregation of exposure histories across populations.
Function-driven flexibility: Register modulates constructional activation according to situational needs.
Typology-sensitive distribution: Morphological richness confines variation largely to peripheral constructions; analytic systems distribute variation across periphrastic constructions.
This framework allows for predictive modeling of syntactic variation, cross-linguistic comparison, and the identification of loci of robust versus flexible constructions. It bridges micro-level idiolectal patterns, meso-level dialect networks, and macro-level register effects into a coherent theoretical system.
Summary
Section 12 demonstrates that variation in Urdu, Saraiki, and English is systematic, hierarchical, and typology-sensitive. By mapping individual, dialectal, and register variation onto constructional networks, the section provides a coherent architecture for understanding language as a complex adaptive system, where core constructions anchor stability while peripheral constructions accommodate both functional and exposure-driven variability.
13: Exposure, Generalization, and Grammatical Stability
13.1 Individual Exposure as the Engine of Variation
Variation originates primarily from individual exposure. Speakers encounter distinct frequencies of constructions across contexts, registers, and social networks. These experiences shape probabilistic constructional networks that differ subtly between individuals, producing idiolectal diversity.
In Urdu and Saraiki, exposure to local dialects, media, education, and multilingual interaction generates minor divergences in peripheral constructions while preserving core grammatical structures. In English, exposure effects are amplified by the flexibility of periphrastic constructions, producing greater observable variation even in core syntactic patterns.
13.2 Generalization as a Stabilizing Force
Generalization mechanisms act to reduce variability by abstracting patterns from repeated instances of construction use. High-frequency and highly connected constructions are reinforced, forming a stable core across speakers.
In Urdu and Saraiki, generalization stabilizes argument structure templates and canonical SOV patterns. Peripheral constructions, being less frequent or less connected, remain sites of idiolectal and dialectal variation.
In English, generalization operates across periphrastic schemas, smoothing irregularities and aligning constructions with functional norms, particularly in written and formal registers.
Thus, generalization functions as a system-wide corrective, balancing the divergent effects of heterogeneous exposure.
13.3 Why Urdu and Saraiki Converge at Higher Abstraction
The convergence of Urdu and Saraiki at higher abstraction levels is a consequence of their shared genealogical heritage and similar morphosyntactic architecture. First-order constructions exhibit local divergence, reflecting regional exposure differences. However, as constructions generalize to second- and higher-order schemas, the effects of individual variation are averaged out, leading to remarkable cross-dialectal stability.
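The averaging-out effect has a simple statistical analogue: when counts for several concrete variants are pooled into one schema, between-speaker differences in individual variants can partly cancel. The sketch below is illustrative only. The speakers, variants, and counts are hypothetical, and the coefficient of variation is used here as one convenient measure of relative spread, not as the measure adopted in this study.

```python
from statistics import mean, pstdev

# Hypothetical per-speaker counts for three concrete variants of one schema.
SPEAKER_COUNTS = {
    "speaker_a": {"variant_1": 30, "variant_2": 10, "variant_3": 20},
    "speaker_b": {"variant_1": 10, "variant_2": 30, "variant_3": 18},
    "speaker_c": {"variant_1": 22, "variant_2": 18, "variant_3": 23},
}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


variants = ["variant_1", "variant_2", "variant_3"]

# Spread across speakers for each first-order (concrete) variant.
variant_cv = {
    v: coefficient_of_variation([counts[v] for counts in SPEAKER_COUNTS.values()])
    for v in variants
}

# Spread across speakers once variants are pooled into the higher-order schema.
schema_totals = [sum(counts.values()) for counts in SPEAKER_COUNTS.values()]
schema_cv = coefficient_of_variation(schema_totals)

print("variant-level CV:", variant_cv)
print("schema-level CV:", schema_cv)
```

With these invented counts the pooled schema varies less across speakers than any single variant. Whether real Urdu and Saraiki data behave this way is an empirical question that the toy numbers cannot settle.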
This process illustrates how typologically related languages, even when part of a dialect continuum, maintain convergent high-level constructions despite peripheral variability.
13.4 Why English Relies on Periphrastic Smoothing
English exhibits a different pattern due to its analytic, periphrastic architecture. Grammatical meaning is distributed across multiple auxiliaries, particles, and word-order configurations. Variation in exposure therefore affects both peripheral and some core constructions. Generalization in English acts as periphrastic smoothing, aligning distributed elements across contexts to maintain functional stability.
For example, auxiliary combinations and tense–aspect constructions are smoothed across registers and dialects, producing convergence in functionally critical constructions despite individual variation in usage.
13.5 System-Level Explanation
At the system level, grammatical stability emerges from the interaction of three forces:
Exposure-driven differentiation: Individual experiences introduce variability across constructional networks.
Network-mediated entrenchment: Core constructions, highly connected and frequently used, resist divergence.
Generalization and abstraction: Higher-order constructions integrate input from multiple individuals and contexts, smoothing local divergences.
Together, these mechanisms explain why peripheral constructions are highly variable, core constructions are stable, and typological architecture shapes where and how variation manifests. Urdu and Saraiki converge at abstract levels due to shared structure, while English relies on periphrastic generalization to maintain systemic coherence.
Summary
Section 13 demonstrates that exposure and generalization jointly regulate variation, producing a dynamic equilibrium between innovation and stability. Variation is not random; it is predictable according to the connectivity of constructions, the frequency of exposure, and the typological architecture of each language. This system-level perspective provides a unified explanation for the distribution of syntactic variation across individual, dialectal, and register levels, and across typologically distinct languages.
Part VII: Implications
14: Rethinking South Asian Syntax
14.1 Moving Beyond Rule-Based Descriptions
Traditional grammars of South Asian languages have often relied on rigid, rule-based descriptions that fail to capture gradient, usage-driven variation. Our analyses of Urdu and Saraiki demonstrate that syntactic behavior is probabilistic, context-sensitive, and distributed across complex constructional networks.
Construction Grammar provides a more faithful model by treating grammar as a set of learned form–meaning pairings rather than a prescriptive set of rules. This approach accommodates individual, dialectal, and register variation while preserving predictability in core constructions.
14.2 Reinterpreting “Free Word Order”
The notion of “free word order” in Indo-Aryan languages has long been overstated. Corpus-based analysis reveals that word order is only partially flexible: canonical SOV order dominates in core constructions, while peripheral constructions exhibit greater alternation.
Variation is conditioned by discourse, register, and function rather than by arbitrary freedom. This perspective reframes “flexibility” as probabilistic variation governed by constructional networks, moving away from static categorizations.
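Treating word order as probabilistic rather than free suggests a simple diagnostic: the entropy of the distribution of attested orders, where zero means a rigid order and higher values mean more alternation. The sketch below assumes clauses have already been annotated for constituent order. The two samples and their labels are hypothetical and serve only to show the calculation.

```python
import math
from collections import Counter

# Hypothetical clause-order annotations (not drawn from any real corpus).
CORE_CLAUSES = ["SOV"] * 9 + ["OSV"]
PERIPHERAL_CLAUSES = ["SOV"] * 5 + ["OSV"] * 3 + ["SVO"] * 2


def order_entropy(orders):
    """Shannon entropy (in bits) of the word-order distribution."""
    counts = Counter(orders)
    total = len(orders)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


core_entropy = order_entropy(CORE_CLAUSES)
peripheral_entropy = order_entropy(PERIPHERAL_CLAUSES)

print("core:", round(core_entropy, 3))
print("peripheral:", round(peripheral_entropy, 3))
```

A sample dominated by SOV yields low entropy, while a sample with frequent alternation yields a higher value, which gives "partial flexibility" a number that can be compared across construction types.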
14.3 Construction Grammar for Indo-Aryan Languages
Applying Construction Grammar to Urdu and Saraiki offers several advantages:
Gradient representation: Peripheral constructions, optional morphology, and discourse-sensitive patterns are naturally modeled.
Typologically informed abstraction: Hierarchical constructions reflect first-order concrete forms and higher-order generalized schemas.
Cross-level integration: Individual, dialectal, and register effects are incorporated into a unified network framework.
Predictive modeling: Frequency, exposure, and functional constraints can forecast likely variation in both new and under-documented contexts.
This framework bridges descriptive and computational approaches, offering a robust platform for both theoretical and applied research in Indo-Aryan syntax.
14.4 Implications for Linguistic Typology
Our findings have significant typological implications:
Morphologically rich languages such as Urdu and Saraiki stabilize core constructions while permitting peripheral flexibility.
Analytic languages like English distribute variation more evenly across periphrastic constructions.
Functional pressure, exposure, and abstraction operate universally, but their effects are mediated by typology.
A network-based, usage-informed model of grammar accommodates gradient variation and cross-linguistic comparison more effectively than rule-based typologies.
These insights challenge simplistic typological classifications and encourage a dynamic, usage-based approach to understanding syntax across languages.
Summary
Section 14 highlights how a constructionist, usage-based approach reshapes our understanding of South Asian syntax. Free word order is probabilistic rather than absolute; variation is distributed across a complex network of constructions; and both individual experience and register shape grammatical behavior. By integrating these insights, Construction Grammar provides a coherent, typologically informed framework for modeling Urdu, Saraiki, and related Indo-Aryan languages.
15: Pedagogy, Policy, and Standardization
15.1 Teaching Grammar in Multilingual Societies
Grammar instruction in multilingual societies such as Pakistan must reflect the reality of diverse individual exposures and dialectal repertoires. Students bring heterogeneous linguistic experiences to the classroom, with Urdu, Saraiki, Punjabi, and English influencing their constructions.
A Construction Grammar approach encourages teaching patterns and functions rather than rigid prescriptive rules, enabling learners to navigate variation across registers and contexts. Exposure-based pedagogy, which emphasizes frequency, function, and real-world usage, better prepares students to handle both standardized and colloquial forms.
15.2 Myth of a Single “Standard Urdu”
The notion of a single, monolithic “standard Urdu” is misleading. Corpus evidence shows variation across regions, registers, and social groups. Administrative, literary, and spoken registers each exhibit distinct constructional profiles, and peripheral constructions vary widely while core SOV structures remain relatively stable.
Educational and policy frameworks should acknowledge this spectrum of variation, avoiding overemphasis on prescriptive norms and instead teaching register-sensitive usage. Recognizing the dialect continuum with Saraiki and other regional varieties also fosters linguistic inclusivity and sociolinguistic awareness.
15.3 Register Awareness in Education
Register awareness is essential for effective communication in multilingual contexts. Students must learn not only which constructions are grammatically correct but also which are functionally appropriate for academic, formal, digital, or oral registers.
For example, formal written Urdu emphasizes nominalization, embedded clauses, and standardized morphology, whereas oral Saraiki favors flexible word order and discourse-sensitive constructions. English teaching similarly benefits from explicit attention to register, including academic writing, media language, and digital communication.
15.4 English Teaching in Pakistan
English instruction should incorporate insights from variation and usage-based grammar. Learners often encounter both British and North American forms, both academic and digital registers, and a range of context-sensitive constructions. Pedagogy should highlight:
Core constructions that are stable across dialects and registers.
Peripheral constructions that vary according to function, audience, and modality.
Constructional patterns that facilitate effective code-switching and multilingual competence.
This approach promotes communicative competence rather than rote memorization of prescriptive rules.
15.5 Applied Outcomes
The integrated insights from this study have practical implications:
Curriculum design: Emphasize constructional patterns, register awareness, and typologically informed grammar instruction.
Assessment: Evaluate students’ ability to use constructions functionally across contexts rather than simply marking prescriptive forms.
Policy: Support inclusive recognition of dialectal and register variation in Urdu and Saraiki.
Teacher training: Equip educators with tools to explain probabilistic variation, register sensitivity, and constructional networks.
Language planning: Guide standardization initiatives while respecting natural variation and functional adaptability in multilingual communities.
These outcomes demonstrate that insights from Construction Grammar, typology, and variation research can directly inform education, policy, and applied linguistics, bridging theory and practice.
Summary
Section 15 underscores the importance of context-sensitive grammar pedagogy in multilingual societies. By acknowledging dialectal diversity, register-driven variation, and functional adaptation, educators and policymakers can foster effective communication, linguistic inclusivity, and applied competence in Urdu, Saraiki, and English. Construction Grammar provides a theoretically grounded, empirically informed framework to guide these initiatives.
16: Conclusion: Variation across Typological Space
16.1 What Comparative CxG Reveals
Comparative analysis of Urdu, Saraiki, and English through a Construction Grammar lens demonstrates that grammar is not a static set of prescriptive rules but a dynamic network of form–meaning pairings. Key insights include:
Core–periphery distinction: Core constructions remain stable across individuals, dialects, and registers, while peripheral constructions are highly sensitive to exposure, function, and context.
Typology mediates variation: Morphologically rich Indo-Aryan languages concentrate variation in peripheral constructions, whereas analytic English distributes variation across periphrastic constructions.
Integration across levels: Individual exposure, dialectal contact, and register pressure interact systematically to shape constructional networks.
These findings validate CxG as a robust framework for cross-linguistic, multi-level analysis of syntactic variation.
16.2 Variation as Emergent Across Languages
Variation is an emergent property arising from the interaction of multiple factors:
Exposure-driven diversity: Individual and population-level experiences introduce probabilistic differences in constructional inventories.
Generalization and abstraction: Repeated exposure and network connectivity stabilize higher-order constructions, smoothing local divergences.
Functional adaptation: Registers selectively activate constructions according to communicative goals, producing context-sensitive variation.
Typological constraints: Morphology, word order, and syntactic encoding shape where variation is permitted and how it is expressed.
Emergence occurs consistently across typologically distinct systems, revealing universal principles of constructional organization while accommodating language-specific patterns.
16.3 Directions for Future Research
This study opens multiple avenues for further investigation:
Extended typological comparisons: Apply the unified CxG framework to additional morphologically rich and analytic languages to test universality of core–periphery dynamics.
Longitudinal exposure studies: Examine how changing social networks, media, and education influence constructional stability and peripheral variation over time.
Computational modeling: Refine probabilistic, network-based models of individual and population-level variation to predict emerging patterns.
Pedagogical interventions: Explore how register-sensitive and function-based teaching can improve multilingual competence in educational contexts.
Sociolinguistic and policy applications: Investigate how inclusive approaches to dialectal and register variation can inform language planning, standardization, and literacy programs.
By integrating Construction Grammar, complex systems thinking, and cross-linguistic empiricism, future research can continue to illuminate how grammatical variation emerges, stabilizes, and adapts across languages and societies.
Summary
Section 16 synthesizes the comparative study, showing that variation in Urdu, Saraiki, and English is systematic, emergent, and typologically conditioned. Construction Grammar provides the theoretical scaffolding for understanding these dynamics at three levels (individual, dialectal, and register) and for bridging descriptive, cognitive, and applied perspectives.
Variation is not noise but a predictable feature of a complex linguistic system, shaped by exposure, generalization, function, and typology. Understanding this landscape enables both theoretical advancement and practical applications in linguistics, education, and policy.
Appendices
Appendix A: Construction Inventories for Urdu, Saraiki, and English
This appendix provides representative inventories of constructions across Urdu, Saraiki, and English, illustrating both core and peripheral constructions as well as register-specific patterns. Each inventory includes:
- Constructional schema (form + slot structure)
- Typical examples across registers and dialects
- Functional annotation (argument structure, discourse function)
- Frequency/usage notes based on corpus analysis
Example entries:
| Language | Construction | Slot Structure | Example | Function | Register Notes |
|---|---|---|---|---|---|
| Urdu | SOV | Subj + Obj + Verb | میں کتاب پڑھ رہا ہوں | Declarative | Stable across registers |
| Saraiki | Discourse particle marking | NP + particle | توں کے کریندا پیا؟ | Emphasis/focus | Oral-heavy, peripheral |
| English | Periphrastic future | Aux + V | I will read | Future tense | Core across dialects, register-sensitive |
This inventory is designed to support cross-linguistic comparison, computational modeling, and pedagogical applications.
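For readers who want to work with the inventory programmatically, the entry format above can be mirrored in a small record type. The following is a minimal sketch, not the scripts used for this study: the class name `ConstructionEntry`, its field names, and the helper functions are assumptions introduced here for illustration, and the three rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstructionEntry:
    """One row of the inventory table (field names are illustrative)."""
    language: str
    construction: str
    slot_structure: str
    example: str
    function: str
    register_notes: str


# The three example entries from the table above.
INVENTORY = [
    ConstructionEntry("Urdu", "SOV", "Subj + Obj + Verb",
                      "میں کتاب پڑھ رہا ہوں", "Declarative",
                      "Stable across registers"),
    ConstructionEntry("Saraiki", "Discourse particle marking", "NP + particle",
                      "توں کے کریندا پیا؟", "Emphasis/focus",
                      "Oral-heavy, peripheral"),
    ConstructionEntry("English", "Periphrastic future", "Aux + V",
                      "I will read", "Future tense",
                      "Core across dialects, register-sensitive"),
]


def by_language(inventory, language):
    """Return all entries recorded for one language."""
    return [entry for entry in inventory if entry.language == language]


def slots(entry):
    """Split a slot structure such as 'Subj + Obj + Verb' into its slots."""
    return [slot.strip() for slot in entry.slot_structure.split("+")]


if __name__ == "__main__":
    for entry in INVENTORY:
        print(entry.language, entry.construction, slots(entry))
```

Keeping the slot structure as a plain string and splitting it on demand preserves the table's notation exactly, while still allowing slot-level comparison across languages.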
Appendix B: Glossing Conventions (Leipzig)
All constructions are annotated following the Leipzig Glossing Rules to ensure transparency and reproducibility. Key abbreviations used in this post include the following (PERF and SUBJ here correspond to PFV and SBJV in the standard Leipzig list):
| Abbreviation | Meaning | Example |
|---|---|---|
| 1SG, 2SG, 3SG | Person/Number | ہوں → be.1SG |
| FUT | Future tense | کرے-گا → do-FUT |
| PERF | Perfective aspect | کھا-چکا → eat-PERF |
| SUBJ | Subjunctive mood | جائے |
| PL, SG | Plural, Singular | کتابیں → book-PL |
| AUX | Auxiliary verb | will read → FUT periphrastic |
| NEG | Negation | نہیں |
Glosses are included alongside each example in the main text and appendices to facilitate comparative analysis and computational modeling.
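Because the glosses are meant to feed computational work, it can help to check them mechanically against the abbreviation table. The sketch below is one possible way to do that and is not part of the original materials: the function names are hypothetical, the abbreviation set is copied from the table above, and the rule "an all-uppercase element is a grammatical label" is a simplifying heuristic (a lexical gloss such as "I" would be misread as a label).

```python
import re

# Abbreviations listed in the table above.
ABBREVIATIONS = {"1SG", "2SG", "3SG", "FUT", "PERF", "SUBJ",
                 "PL", "SG", "AUX", "NEG"}


def grammatical_labels(gloss):
    """Return the all-uppercase elements of a gloss line.

    Elements are split on the usual separators: hyphen (morpheme
    boundary), period (several meanings in one morph), equals sign
    (clitic boundary), and whitespace (word boundary).
    """
    parts = re.split(r"[-.=\s]+", gloss)
    return [p for p in parts
            if p and p == p.upper() and any(ch.isalpha() for ch in p)]


def unknown_labels(gloss):
    """Return labels in the gloss that are missing from the table."""
    return [label for label in grammatical_labels(gloss)
            if label not in ABBREVIATIONS]


if __name__ == "__main__":
    for gloss in ["book-PL", "do-FUT", "read-IPFV"]:
        print(gloss, unknown_labels(gloss))
```

A check of this kind flags any label that appears in an example but has no entry in the conventions table, which is the most common source of inconsistency in hand-glossed data.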
Appendix C: Corpus Metadata and Reproducibility
This appendix details the corpora used, including size, composition, dialectal coverage, and register annotation, enabling replication and computational experimentation.
Corpus Overview:
| Language | Corpus Type | Size (tokens) | Dialects / Regions | Registers | Notes |
|---|---|---|---|---|---|
| Urdu | Written + Spoken | 5M | Major cities + administrative | Literary, news, oral | Balanced across regions |
| Saraiki | Written + Oral | 1.2M | Multiregional cluster | Oral, written | Under-resourced, digitized from fieldwork |
| English | Global | 10M | NA + UK + South Asian English | Academic, media, digital | Standardized annotation |
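Since the three corpora differ considerably in size, raw construction counts are not directly comparable across languages. A common remedy is to normalize counts to a rate per million tokens. The sketch below assumes the corpus sizes given in the table above; the function name `per_million` and the raw counts in the usage example are invented for illustration and are not results from the study.

```python
# Corpus sizes in tokens, taken from the overview table above.
CORPUS_SIZE = {
    "Urdu": 5_000_000,
    "Saraiki": 1_200_000,
    "English": 10_000_000,
}


def per_million(raw_count, language):
    """Convert a raw count into occurrences per million tokens."""
    return raw_count * 1_000_000 / CORPUS_SIZE[language]


if __name__ == "__main__":
    # Hypothetical raw counts for some construction, for illustration only.
    hypothetical_counts = {"Urdu": 2_500, "Saraiki": 600, "English": 5_000}
    for language, count in hypothetical_counts.items():
        print(language, per_million(count, language))
```

With these made-up counts, all three languages come out at the same normalized rate even though the raw counts differ by almost an order of magnitude, which is exactly the distortion that normalization is meant to remove.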
Annotation Details:
- Part-of-speech tagging: Language-specific tagsets harmonized for cross-linguistic comparison
- Morphosyntactic parsing: Manual correction for Urdu and Saraiki; automated for English with validation
- Register labels: Assigned based on genre, modality, and functional use
Reproducibility Guidelines:
- All constructions, glosses, and frequency counts are linked to corpus IDs and line numbers
- Scripts for construction extraction, similarity measurement, and network modeling are provided
- Detailed metadata allows other researchers to replicate analyses and extend studies to additional languages or registers
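The first guideline, linking every count to corpus IDs and line numbers, can be sketched as follows. This is an illustrative design under stated assumptions rather than the provided scripts: the `Occurrence` record, the helper names, and the corpus IDs such as `eng-demo-001` are hypothetical placeholders.

```python
from collections import Counter
from typing import NamedTuple


class Occurrence(NamedTuple):
    """One attested instance of a construction (illustrative schema)."""
    construction: str
    corpus_id: str   # hypothetical ID format
    line_no: int


# Placeholder records, not real corpus data.
OCCURRENCES = [
    Occurrence("Periphrastic future", "eng-demo-001", 12),
    Occurrence("Periphrastic future", "eng-demo-001", 40),
    Occurrence("SOV", "urd-demo-001", 7),
]


def frequency_counts(occurrences):
    """Derive frequency counts from the occurrence records themselves."""
    return Counter(o.construction for o in occurrences)


def sources(occurrences, construction):
    """List the (corpus ID, line number) pairs behind one count."""
    return sorted((o.corpus_id, o.line_no)
                  for o in occurrences if o.construction == construction)


if __name__ == "__main__":
    print(frequency_counts(OCCURRENCES))
    print(sources(OCCURRENCES, "Periphrastic future"))
```

Deriving counts from the occurrence list, rather than storing them separately, means every reported frequency can be traced back to the exact lines that produced it.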
Appendix Summary
These appendices provide the empirical backbone for the post’s comparative analyses. They include:
- Comprehensive construction inventories to illustrate core-periphery dynamics
- Glossing conventions ensuring clarity and standardization
- Corpus metadata and reproducibility protocols to support computational modeling, cross-linguistic comparison, and educational applications
Together, they enable the reader to validate findings, replicate studies, and apply insights to related research or pedagogical contexts.
Acknowledgment: This post is inspired by the theoretical and computational framework presented in Dunn, J. (2026). Syntactic Variation from Individuals to Populations: Language as a Complex System. Cambridge: Cambridge University Press, which examines syntactic variation across individuals, populations, and contexts through the lens of Construction Grammar and complex systems.
Reference
