Overcoming Challenges in Corpus Construction

Author: Robbie Love
Publisher: Routledge
ISBN: 0429771096
Category : Language Arts & Disciplines
Languages : en
Pages : 176

Book Description
This volume offers a critical examination of the construction of the Spoken British National Corpus 2014 (Spoken BNC2014) and points the way forward toward a more informed understanding of corpus linguistic methodology more broadly. The book begins by situating the creation of this second corpus, a compilation of new, publicly-accessible Spoken British English from the 2010s, within the context of the first, created in 1994, talking through the need to balance backward compatibility and optimal practice for today’s users. Chapters subsequently use the Spoken BNC2014 as a focal point around which to discuss the various considerations taken into account in corpus construction, including design, data collection, transcription, and annotation. The volume concludes by reflecting on the successes and limitations of the project, as well as the broader utility of the corpus in linguistic research, both in current examples and future possibilities. This exciting new contribution to the literature on linguistic methodology is a valuable resource for students and researchers in corpus linguistics, applied linguistics, and English language teaching.

Corpus Design and Construction in Minoritised Language Contexts - Cynllunio a Chreu Corpws mewn Cyd-destunau Ieithoedd Lleiafrifoledig

Author: Dawn Knight
Publisher: Springer Nature
ISBN: 3030724840
Category : Language Arts & Disciplines
Languages : en
Pages : 178

Book Description
This bilingual book provides a detailed overview of the project to construct a National Corpus of Contemporary Welsh (CorCenCC), addressing the conceptual and methodological challenges faced when developing language corpora for minoritised languages. A conceptual framework is presented for the user-driven design that underpinned the CorCenCC project, along with a detailed blueprint that can function as a scaffold for other researchers embarking on projects of this nature. This book will be of value to those working in language teaching, learning and assessment, language policy and planning, translation, corpus linguistics and language technology, and to anyone with an interest in Welsh and other minoritised languages. Mae'r llyfr dwyieithog hwn yn rhoi trosolwg manwl o'r prosiect i greu Corpws Cenedlaethol Cymraeg Cyfoes (CorCenCC), ac yn mynd i'r afael â'r heriau cysyniadol a methodolegol a wynebir wrth ddatblygu corpora iaith ar gyfer ieithoedd lleiafrifoledig. Cyflwynir fframwaith cysyniadol ar gyfer y cynllun wedi'i yrru gan ddefnyddwyr sy'n greiddiol i brosiect CorCenCC, ynghyd â glasbrint manwl a all weithredu fel sgaffald i ymchwilwyr eraill sy'n dechrau ar brosiectau o'r fath. Bydd y llyfr hwn o werth i'r rhai sy'n gweithio ym meysydd addysgu, dysgu ac asesu ieithoedd, polisi iaith a chynllunio ieithyddol, cyfieithu, ieithyddiaeth gorpws a thechnoleg iaith, ac unrhyw un â diddordeb yn y Gymraeg ac ieithoedd lleiafrifoledig eraill.

The Routledge Handbook of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
ISBN: 0429632649
Category : Language Arts & Disciplines
Languages : en
Pages : 684

Book Description
The Routledge Handbook of Corpus Linguistics 2e provides an updated overview of a dynamic and rapidly growing area with a widely applied methodology. Over a decade on from the first edition of the Handbook, this collection of 47 chapters from experts in key areas offers a comprehensive introduction to both the development and use of corpora as well as their ever-evolving applications to other areas, such as digital humanities, sociolinguistics, stylistics, translation studies, materials design, language teaching and teacher development, media discourse, discourse analysis, forensic linguistics, second language acquisition and testing. The new edition updates all core chapters and includes new chapters on corpus linguistics and statistics, digital humanities, translation, phonetics and phonology, second language acquisition, social media and theoretical perspectives. Chapters provide annotated further reading lists and step-by-step guides as well as detailed overviews across a wide range of themes. The Handbook also includes a wealth of case studies that draw on some of the many new corpora and corpus tools that have emerged in the last decade. Organised across four themes, moving from the basic start-up topics such as corpus building and design to analysis, application and reflection, this second edition remains a crucial point of reference for advanced undergraduates, postgraduates and scholars in applied linguistics.

Building a National Corpus

Author: Dawn Knight
Publisher: Springer Nature
ISBN: 3030818586
Category : Language Arts & Disciplines
Languages : en
Pages : 192

Book Description
This book aims to provide a micro-level, working model of a methodological approach and practical guidelines for building a corpus, informed by the work on the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes - the National Corpus of Contemporary Welsh). It focuses specifically on the development of detailed design frames for corpora across communicative modes (spoken, written and e-language), and the practical processes involved in the planning, collection, transcription, collation and (re)presentation of language data. The book is designed to be of significant value and relevance to those interested in critically engaging with corpus methodology. Although Welsh is the language under discussion, the processes and approaches discussed in the building of CorCenCC can be applied to a lesser or greater extent to other language contexts. This book provides a working model and an account of how to build a corpus dataset, from which step-by-step guidelines for creating other linguistic corpora in any language can be easily extrapolated. It will be of value to students and scholars of minority languages and corpus linguistics.

Web Corpus Construction

Author: Roland Schäfer
Publisher: Morgan & Claypool Publishers
ISBN: 1608459845
Category : Computers
Languages : en
Pages : 147

Book Description
The World Wide Web constitutes the largest existing source of texts written in a great variety of languages. A feasible and sound way of exploiting this data for linguistic research is to compile a static corpus for a given language. There are several advantages of this approach: (i) Working with such corpora obviates the problems encountered when using Internet search engines in quantitative linguistic research (such as non-transparent ranking algorithms). (ii) Creating a corpus from web data is virtually free. (iii) The size of corpora compiled from the WWW may exceed by several orders of magnitude the size of language resources offered elsewhere. (iv) The data is locally available to the user, and it can be linguistically post-processed and queried with the tools preferred by her/him. This book addresses the main practical tasks in the creation of web corpora up to giga-token size. Among these tasks are the sampling process (i.e., web crawling) and the usual cleanups including boilerplate removal and removal of duplicated content. Linguistic processing and problems with linguistic processing coming from the different kinds of noise in web corpora are also covered. Finally, the authors show how web corpora can be evaluated and compared to other corpora (such as traditionally compiled corpora). For additional material please visit the companion website: sites.morganclaypool.com/wcc Table of Contents: Preface / Acknowledgments / Web Corpora / Data Collection / Post-Processing / Linguistic Processing / Corpus Evaluation and Comparison / Bibliography / Authors' Biographies

Metaphor and Corpus Linguistics

Author: Rafael Alejo-González
Publisher: Taylor & Francis
ISBN: 1040002854
Category : Language Arts & Disciplines
Languages : en
Pages : 161

Book Description
Metaphor and Corpus Linguistics: Building and Investigating an English as a Medium of Instruction Corpus offers a model for building a corpus of oral EMI seminars. It demonstrates how incorporating metaphor into the process of corpus building affords a more comprehensive description of the role of metaphor in discourse. EMI is the specific context outlined in this volume, and as such it will be of particular interest to researchers in this area, though the design and model can be easily generalised and applied to other corpora focusing on metaphor. Alejo-González argues for the need to build such a corpus given the scarcity of corpora tagged for metaphor as well as the shortage of those dealing with the EMI phenomenon. This book will be of practical use and interest to researchers in corpus linguistics or related areas looking to explore metaphor through their corpus studies.

Analysing Representation

Author: Frazer Heritage
Publisher: Taylor & Francis
ISBN: 104001898X
Category : Language Arts & Disciplines
Languages : en
Pages : 316

Book Description
Analysing Representation: A Corpus and Discourse Textbook guides readers through the process of researching how people and phenomena are represented in discourse and introduces them to key tools they can use from corpus linguistics and (critical) discourse analysis. This book takes a step-by-step approach to introducing each concept and includes exercises and further reading to help readers check their progress and prepare for independent research. It is unique in introducing readers to a range of experts representing the full range of work in this area. This book is aimed at final-year undergraduate, taught postgraduate and doctoral level students. It will also be useful to scholars who are new to combining corpus and discourse methods in investigations of representation.

Broadening the Spectrum of Corpus Linguistics

Author: Susanne Flach
Publisher: John Benjamins Publishing Company
ISBN: 9027256985
Category : Language Arts & Disciplines
Languages : en
Pages : 329

Book Description
This volume presents a snapshot of the current state of the art of research in English corpus linguistics. It contains selected papers from the 40th ICAME conference in 2019 and features contributions from experts in synchronic, diachronic, and contrastive linguistics, as well as in sociolinguistics, phonetics, discourse analysis, and learner language. The volume showcases the particular strengths of research in the ICAME tradition. The papers in this volume offer new insights from the reanalysis of new data types, methodological refinements and advancements of quantitative analysis, and from taking new perspectives on ongoing debates in their respective fields.

Corpus-Assisted Discourse Studies

Author: Mathew Gillings
Publisher: Cambridge University Press
ISBN: 1009197878
Category : Language Arts & Disciplines
Languages : en
Pages : 132

Book Description
The breadth and spread of corpus-assisted discourse studies (CADS) indicate its usefulness for exploring language use within a social context. However, its theoretical foundations, limitations, and its epistemological implications must be considered so that we can adjust our research designs accordingly. This Element focuses on important meta-level questions around epistemology, while also offering a compact guide to which corpus linguistic tools are available and how they can contribute to finding out more about discourse. This Element will appeal to researchers both new and experienced, both within the CADS community and beyond.

Fundamental Principles of Corpus Linguistics

Author: Tony McEnery
Publisher: Cambridge University Press
ISBN: 1009428985
Category : Language Arts & Disciplines
Languages : en
Pages : 327

Book Description
How might evidence of language use – writing and speech – be used as a way of studying language? Corpus linguistics is the study of linguistic data from a particular language or set of languages. It is a fast-moving approach to studying language, and there is still a degree of divergence in how research questions are approached using corpus data. This book uses a framework, based on the work of Karl Popper, to explore a number of fundamental issues in corpus linguistics. It critically evaluates how these issues are tackled, and proposes a set of best practices for future research. It spells out why using corpus data is valuable, what we can learn from using it, and how we may most effectively progress our understanding of language by using such data. It is essential reading for researchers and students of language in general, and of applied linguistics and English language in particular.