Data Analysis
Turning Linguistic Evidence into Scholarly Argument
Data analysis is the stage of a linguistics PhD where the research either becomes intellectually persuasive or collapses into descriptive reporting. Many theses present large amounts of linguistic data (transcripts, corpus extracts, sentence structures, discourse samples) but fail to transform that material into a coherent argument.
Examiners are not impressed by data alone. They are impressed by what the data explain.
A strong analysis chapter does not simply show linguistic evidence. It demonstrates what that evidence means within a theoretical and research-driven framework.
The Core Function of Data Analysis
The primary purpose of data analysis is not to display data but to convert data into knowledge.
This involves three intellectual operations:
identification of relevant patterns
interpretation of those patterns
integration of those interpretations into a broader argument
Without these steps, analysis remains at the level of description.
In linguistics, description answers:
“What linguistic forms are present?”
Analysis answers:
“What do these forms reveal about language use, structure, or meaning?”
Description vs Interpretation: The Central Divide
One of the most common weaknesses in PhD theses is descriptive analysis disguised as interpretation.
Description:
“The speaker uses modal verbs such as can, may, and might.”
Interpretation:
“The use of modal verbs reflects epistemic uncertainty and mitigates assertive force, indicating politeness strategies in institutional discourse.”
The first statement reports observation.
The second transforms observation into explanation.
Examiners consistently value the second approach because it demonstrates analytical reasoning rather than simple identification.
Data Do Not Speak for Themselves
A critical misconception among candidates is that data inherently contain meaning.
In reality, data are interpreted through theoretical and methodological frameworks.
The same linguistic feature can support different interpretations depending on the analytical lens.
For example, a pause in spoken discourse may indicate:
cognitive processing
turn-taking management
emotional hesitation
interactional politeness
Without theoretical grounding, interpretation becomes arbitrary.
Data analysis is therefore not discovery of meaning but construction of meaning through systematic reasoning.
The Role of the Theoretical Framework in Analysis
Data analysis is where the theoretical framework becomes operational.
A theory that remains unused in analysis is academically inert.
For example:
In Systemic Functional Linguistics, analysis may focus on ideational, interpersonal, and textual metafunctions.
In Critical Discourse Analysis, analysis may focus on ideology, power relations, and discursive strategies.
In Relevance Theory, analysis may focus on inferential processes and contextual effects.
In syntactic theory, analysis may focus on structural representation and rule-governed patterns.
The theoretical framework determines what counts as relevant evidence and how that evidence is interpreted.
Organising Data: From Raw Material to Analytical Structure
Raw data cannot be analysed in their initial form. They must be organised into meaningful categories.
This process may involve:
coding linguistic features
grouping discourse patterns
classifying syntactic structures
identifying pragmatic functions
segmenting corpus data
The purpose of organisation is not simplification but analytical clarity.
Well-organised data allow patterns to emerge more clearly and systematically.
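As an illustration of coding linguistic features, the following is a minimal Python sketch, not a prescribed procedure. The coding scheme, the category labels, and the function name code_utterances are hypothetical and chosen only for this example; in a real study the categories would be derived from the theoretical framework and documented in the methodology.

```python
import re
from collections import defaultdict

# Hypothetical coding scheme: category label -> word forms assigned to it.
# A real scheme would be justified by the study's theoretical framework.
CODING_SCHEME = {
    "modal_possibility": {"can", "could", "may", "might"},
    "modal_obligation": {"must", "should"},
}


def code_utterances(utterances):
    """Return {category: [(utterance_index, token), ...]} for each coded token."""
    coded = defaultdict(list)
    for index, utterance in enumerate(utterances):
        # Naive tokenisation: lowercase alphabetic strings (and apostrophes).
        for token in re.findall(r"[a-z']+", utterance.lower()):
            for category, forms in CODING_SCHEME.items():
                if token in forms:
                    coded[category].append((index, token))
    return dict(coded)


# Invented sample utterances, for illustration only.
sample = [
    "You may want to check the form.",
    "We must submit it today, but it might be late.",
]
coded = code_utterances(sample)
for category, hits in coded.items():
    print(category, hits)
```

Matching on word forms alone only locates candidate items. Whether a given "may" expresses possibility, permission, or politeness in context still has to be decided by the analyst, which is the interpretive step that coding prepares for.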
Inductive and Deductive Analysis
Linguistic analysis may follow two broad approaches.
Deductive analysis begins with theory and applies it to data.
Inductive analysis begins with data and allows patterns to inform interpretation.
Most strong PhD studies combine both approaches.
Deductive reasoning ensures theoretical consistency.
Inductive reasoning allows empirical discovery.
A purely deductive study risks forcing data into predetermined categories.
A purely inductive study risks theoretical fragmentation.
Balanced integration produces methodological strength.
From Patterns to Explanations
Identifying patterns is not sufficient for doctoral-level analysis. Each pattern must also be explained.
For example:
Pattern:
Frequent use of passive constructions in academic writing.
Explanation:
Passive constructions function to foreground processes over agents, thereby contributing to an impersonal and objective academic style.
The transition from pattern to explanation is the core intellectual movement in data analysis.
Without explanation, analysis remains incomplete.
Levels of Linguistic Analysis
Depending on the research focus, analysis may operate at different linguistic levels:
Phonological level (sound patterns)
Morphological level (word formation)
Syntactic level (sentence structure)
Semantic level (meaning)
Pragmatic level (contextual interpretation)
Discourse level (text and interaction)
A strong thesis maintains clarity about which level is being analysed and why that level is relevant.
Integrating Quantitative and Qualitative Analysis
Many linguistics studies involve both numerical and interpretive data.
Quantitative analysis may include:
frequency counts
statistical comparisons
distributional patterns
corpus-based measurements
Qualitative analysis may include:
interpretation of meaning
contextual analysis
discourse interpretation
functional explanation
The challenge is integration.
Numbers alone do not explain linguistic significance.
Interpretation alone may lack empirical grounding.
A strong analysis chapter allows both forms of evidence to reinforce each other.
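A small sketch of how a frequency count becomes comparable across sub-corpora may help here. It assumes the common corpus-linguistic convention of normalising raw counts to occurrences per 1,000 tokens; the function names, sub-corpus labels, and all figures are invented for illustration and do not come from any real dataset.

```python
def per_thousand(count, total_tokens):
    """Normalised frequency: occurrences per 1,000 tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return count / total_tokens * 1000


def compare_subcorpora(counts, sizes):
    """Return the normalised frequency for each sub-corpus, rounded to 2 places."""
    return {
        name: round(per_thousand(counts[name], sizes[name]), 2)
        for name in counts
    }


# Invented illustrative figures: raw counts of one feature and sub-corpus sizes.
counts = {"sub_a": 30, "sub_b": 45}
sizes = {"sub_a": 12000, "sub_b": 30000}

rates = compare_subcorpora(counts, sizes)
print(rates)
```

In these invented figures the second sub-corpus has the higher raw count but the lower normalised rate, so the two measures point in opposite directions. Even the normalised figure only establishes that a difference exists; explaining why the feature is denser in one sub-corpus remains a qualitative, theoretically grounded task.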
Avoiding Common Analytical Weaknesses
Several recurring problems weaken data analysis chapters:
Excessive description without interpretation
Unexplained analytical categories
Selective presentation of data without justification
Overgeneralisation from limited examples
Lack of theoretical integration
Absence of clear analytical procedure
Repetition of findings without synthesis
These weaknesses often result in chapters that look detailed but remain conceptually shallow.
Building an Argument Through Data
A strong analysis chapter is not a presentation of findings but a structured argument.
Each section of analysis should contribute to answering the research questions.
This means that data should be selected, organised, and interpreted in a way that progressively builds justification for the study’s claims.
Analysis is therefore not separate from argumentation. It is argumentation in empirical form.
The Relationship Between Findings and Analysis
In many theses, the boundary between findings and analysis is unclear.
A useful distinction is:
Findings present what the data show.
Analysis explains what the findings mean.
However, in practice, these stages often overlap.
What matters is not strict separation but clarity of function.
Examiners expect transparency in how interpretations are derived from data.
Ensuring Analytical Coherence
A coherent analysis maintains alignment with:
research questions
theoretical framework
methodological design
Without this alignment, analysis becomes fragmented and difficult to evaluate.
Coherence ensures that every interpretive claim is anchored in the broader structure of the thesis.
What Examiners Look For in Data Analysis
Examiners typically assess whether the candidate demonstrates:
systematic handling of data
clear analytical procedures
theoretical consistency
logical interpretation of patterns
relevance to research questions
evidence of critical thinking
integration of quantitative and qualitative insights
The key question is not “What data were presented?” but “What do these data demonstrate in relation to the research problem?”
Reflection
Data analysis is the intellectual transformation point of a linguistics thesis. It is where raw linguistic material becomes structured knowledge.
A strong analysis does not overwhelm the reader with examples. It selects, organises, and interprets data in a way that constructs a coherent scholarly argument.
When analysis remains descriptive, the thesis remains observational. When analysis becomes interpretive and theoretically informed, the thesis becomes explanatory.
The difference between description and explanation is the difference between reporting language and producing knowledge about language.

