Computing and Language Variation: International Journal of Humanities and Arts Computing Volume 2

John Nerbonne and Charlotte Gooskens

Print publication date: 2009

Print ISBN-13: 9780748640300

Published to Edinburgh Scholarship Online: September 2012

DOI: 10.3366/edinburgh/9780748640300.001.0001



Recognising Groups among Dialects


Jelena Prokić

John Nerbonne

Edinburgh University Press

Dialectometry is a multidisciplinary field that applies quantitative methods to the analysis of dialect data. These methods very often include classification algorithms, such as hierarchical clustering, used to detect groups within a given dialect area. Although clustering algorithms are known to be unstable, they are often applied without evaluation or with only partial evaluation: very small differences in the input data can produce substantially different groupings of dialects. This chapter evaluates algorithms used to detect groups among dialect varieties measured at the aggregate level. The data consist of dialect pronunciations of 156 words collected all over Bulgaria. The distances between the pronunciations of each word are calculated using the Levenshtein algorithm, and these word distances in turn yield the distance between each pair of sites in the data set. Seven hierarchical clustering algorithms, as well as the k-means and neighbor-joining algorithms, are applied to the calculated distances.

Keywords: dialectometry, classification algorithms, hierarchical clustering algorithms, dialects, pronunciations, Bulgaria, Levenshtein algorithm, k-means, neighbor-joining algorithm
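
The distance computation described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation. The abstract does not say how pronunciations are segmented, whether operations are weighted, or how word distances are combined, so several choices here are assumptions: single characters stand in for phonetic segments, every edit operation costs 1, and the site distance is the plain mean over shared words. The names `levenshtein`, `site_distance`, `distance_matrix`, and the toy transcriptions are hypothetical.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two segment sequences, unit cost per operation."""
    previous = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        current = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def site_distance(site_a, site_b):
    """Mean Levenshtein distance over the words recorded at both sites.

    Assumption: a simple unweighted mean; the chapter may aggregate differently.
    """
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("sites share no words")
    return sum(levenshtein(site_a[w], site_b[w]) for w in shared) / len(shared)


def distance_matrix(sites):
    """Map each unordered pair of site names to its aggregate distance."""
    return {(p, q): site_distance(sites[p], sites[q])
            for p, q in combinations(sorted(sites), 2)}


# Invented toy transcriptions, NOT taken from the Bulgarian data set.
toy_sites = {
    "site_A": {"word1": "mleko", "word2": "hljab"},
    "site_B": {"word1": "mljako", "word2": "hljap"},
    "site_C": {"word1": "mleko", "word2": "hleb"},
}
toy_matrix = distance_matrix(toy_sites)
```

A site-by-site matrix of this kind is the sort of input to which clustering algorithms such as those named in the abstract are applied. Because the matrix is built from many small word-level distances, changing a single transcription shifts several entries at once, which is one way small differences in the input can reach the resulting groups.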
