Cross-Lingual Word Embeddings

Research output: Book/Report › Book › Research › peer-review

Standard

Cross-Lingual Word Embeddings. / Søgaard, Anders; Vulić, Ivan; Ruder, Sebastian; Faruqui, Manaal.

2 ed. Morgan & Claypool Publishers, 2019. 132 p. (Synthesis Lectures on Human Language Technologies).


Harvard

Søgaard, A, Vulić, I, Ruder, S & Faruqui, M 2019, Cross-Lingual Word Embeddings. Synthesis Lectures on Human Language Technologies, 2 edn, Morgan & Claypool Publishers. https://doi.org/10.2200/S00920ED2V01Y201904HLT042

APA

Søgaard, A., Vulić, I., Ruder, S., & Faruqui, M. (2019). Cross-Lingual Word Embeddings (2nd ed.). Morgan & Claypool Publishers. (Synthesis Lectures on Human Language Technologies). https://doi.org/10.2200/S00920ED2V01Y201904HLT042

Vancouver

Søgaard A, Vulić I, Ruder S, Faruqui M. Cross-Lingual Word Embeddings. 2 ed. Morgan & Claypool Publishers, 2019. 132 p. (Synthesis Lectures on Human Language Technologies). https://doi.org/10.2200/S00920ED2V01Y201904HLT042

Author

Søgaard, Anders ; Vulić, Ivan ; Ruder, Sebastian ; Faruqui, Manaal. / Cross-Lingual Word Embeddings. 2 ed. Morgan & Claypool Publishers, 2019. 132 p. (Synthesis Lectures on Human Language Technologies).

Bibtex

@book{4264a46fd9e846e4a704b2d13002e521,
title = "Cross-Lingual Word Embeddings",
abstract = "The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano, and most other languages, remains limited. Being able to bridge this digital divide is important for scientific and democratic reasons but also represents enormous growth potential. A key challenge for this to happen is learning to align basic meaning-bearing units of different languages. In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods in comparable form, making it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic. Table of Contents: Preface / Introduction / Monolingual Word Embedding Models / Cross-Lingual Word Embedding Models: Typology / A Brief History of Cross-Lingual Word Representations / Word-Level Alignment Models / Sentence-Level Alignment Methods / Document-Level Alignment Models / From Bilingual to Multilingual Training / Unsupervised Learning of Cross-Lingual Word Embeddings / Applications and Evaluation / Useful Data and Software / General Challenges and Future Directions / Bibliography / Authors' Biographies.",
keywords = "cross-lingual learning, machine learning, natural language processing, semantics",
author = "Anders S{\o}gaard and Ivan Vuli{\'c} and Sebastian Ruder and Manaal Faruqui",
year = "2019",
doi = "10.2200/S00920ED2V01Y201904HLT042",
language = "English",
series = "Synthesis Lectures on Human Language Technologies",
publisher = "Morgan & Claypool Publishers",
address = "United States",
edition = "2",

}

RIS

TY - BOOK

T1 - Cross-Lingual Word Embeddings

AU - Søgaard, Anders

AU - Vulić, Ivan

AU - Ruder, Sebastian

AU - Faruqui, Manaal

PY - 2019

Y1 - 2019

N2 - The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano, and most other languages, remains limited. Being able to bridge this digital divide is important for scientific and democratic reasons but also represents enormous growth potential. A key challenge for this to happen is learning to align basic meaning-bearing units of different languages. In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods in comparable form, making it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic. Table of Contents: Preface / Introduction / Monolingual Word Embedding Models / Cross-Lingual Word Embedding Models: Typology / A Brief History of Cross-Lingual Word Representations / Word-Level Alignment Models / Sentence-Level Alignment Methods / Document-Level Alignment Models / From Bilingual to Multilingual Training / Unsupervised Learning of Cross-Lingual Word Embeddings / Applications and Evaluation / Useful Data and Software / General Challenges and Future Directions / Bibliography / Authors' Biographies.

AB - The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano, and most other languages, remains limited. Being able to bridge this digital divide is important for scientific and democratic reasons but also represents enormous growth potential. A key challenge for this to happen is learning to align basic meaning-bearing units of different languages. In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods in comparable form, making it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic. Table of Contents: Preface / Introduction / Monolingual Word Embedding Models / Cross-Lingual Word Embedding Models: Typology / A Brief History of Cross-Lingual Word Representations / Word-Level Alignment Models / Sentence-Level Alignment Methods / Document-Level Alignment Models / From Bilingual to Multilingual Training / Unsupervised Learning of Cross-Lingual Word Embeddings / Applications and Evaluation / Useful Data and Software / General Challenges and Future Directions / Bibliography / Authors' Biographies.

KW - cross-lingual learning

KW - machine learning

KW - natural language processing

KW - semantics

UR - http://www.scopus.com/inward/record.url?scp=85066947466&partnerID=8YFLogxK

U2 - 10.2200/S00920ED2V01Y201904HLT042

DO - 10.2200/S00920ED2V01Y201904HLT042

M3 - Book

AN - SCOPUS:85066947466

T3 - Synthesis Lectures on Human Language Technologies

BT - Cross-Lingual Word Embeddings

PB - Morgan & Claypool Publishers

ER -