DICL Language Database

About

The database contains 11 indexes of linguistic similarity covering 242 countries, measured both domestically and internationally. The domestic measures capture linguistic similarity among populations within a single country, while the international indexes capture linguistic similarity between two different countries. The indexes, which are based on 6,674 languages, reflect three dimensions of language: common official languages, common native and acquired spoken languages, and linguistic proximity across different languages. The database has many uses, such as the study of bilateral flows (including FDI, migration, and international trade) and regional or country-level analyses.

Download

Available for download from the Harvard Dataverse

Technical description 


Recommended Citation

Gurevich, T., Herman, P.R., Toubal, F., Yotov, Y. (2025) A Dataset on Linguistic Connectivity Across and Within Countries. Sci Data 12, 542. https://doi.org/10.1038/s41597-025-04692-8 


International Common Native Language (CNL) Linkages

Domestic Native Language Diversity