New publication

Infoling 3.74 (2025)

Title: Linguistic Corpora and Big Data in Spanish and Portuguese

Authors: Calderón Campos, Miguel; Vaamonde, Gael

Year of publication: 2024

Place of publication: Berlin / Boston

Publisher: De Gruyter

URL (open access): https://www.degruyter.com/document/doi/1...

Description

In recent decades, corpus linguistics has experienced tremendous development in the Hispanic world, along two opposite but complementary approaches: an increase in corpus size (corpus linguistics as Big Data) and an improvement in document selection and data annotation (corpus linguistics as High Quality Data).

The first approach has led to the creation of massive corpora such as esTenTen; at the same time, it has promoted the use of the web and social networks as corpora. The second perspective gives rise to specialized corpora such as Post Scriptum or Oralia Diacrónica del español (ODE).

The contributions gathered in this volume combine both methods in order to exploit their advantages and overcome their possible limitations. On the one hand, the volume addresses the creation and design of small corpora focused on data quality; on the other hand, it offers case studies that make use of both specialized corpora and massive data extracted from the web.

Highlighting the complementary nature of both methods is the main idea of this book.

Subject areas: Digital humanities, Lexicography, Lexicology, Corpus linguistics

Table of contents

Miguel Calderón Campos & Gael Vaamonde
Introduction. Corpus Linguistics in the Era of Big and Rich Data: Methodological Perspectives on Spanish and Portuguese

I Small, Tidy and Rich Diachronic Corpora: The PS-ES and the ODE Corpora

Gael Vaamonde
Not so Big Data: Assessing Two Small Specialized Corpora for the Study of Historical Variation in Spanish

Inmaculada González Sopeña
Language Corpora and Lexical Arabisms in the Digital Age

Miguel Calderón Campos
Corpus Size and Tagging: Methodological Strategies for Research on the History of Diminutives -ito, -illo, and -ico

II The COSER Corpus and Newspaper Digital Libraries as Alternative Data Sources for Research on Rural and Informal Varieties

Miriam Bouzouita, Johnatan E. Bonilla & Rosa Lilia Segundo Díaz
Gaming for Dialects: Creating an Annotated and Parsed Corpus of European Spanish Dialects through GWAPs

María Teresa García-Godoy
Big Data and Lexical History: Digital Newspaper Libraries in Spanish Diachronic Research

III Exploiting Portuguese Reference Corpora: The CdP and the CRPC Corpora

Amália Mendes
The Reference Corpus of Contemporary Portuguese: Corpus Design and Case Study on Discourse Markers

Anton Granvik
On the Origins of the Shell Noun Construction in Portuguese

Katharina Gerhalter
Escrever não escrevo, mas ler um livro, ou um jornal, uns versos, leio. A Corpus-Based Approach to Topicalized Infinitives in Portuguese

Format: PDF / EPUB (open access)

Pages: 231

ISBN-13: 9783110781465

Format: printed book

Pages: 231

ISBN-13: 9783110781458

Price: 104.00 EUR

Online purchase: https://www.degruyter.com/document/doi/1...

Date published on Infoling: 31 March 2025

Submitted by:

Gael Vaamonde
Universidad de Granada (Spain)
<gaelvaamonde ugr.es>