DICL Language Database
About
The database contains 11 indexes of linguistic similarity covering 242 countries, measured both domestically and internationally. The domestic indexes capture linguistic similarity among populations within a single country, while the international indexes capture linguistic similarity between two different countries. The indexes, which are based on 6,674 languages, reflect three dimensions of language: common official languages, common native and acquired spoken languages, and linguistic proximity across different languages. The database has many uses, including the study of bilateral flows (such as foreign direct investment (FDI), migration, and international trade) and regional or country-level analyses.
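As a rough illustration of the bilateral-flow use case, the sketch below attaches a country-pair similarity value to directed flow records. Everything in it is hypothetical: the country codes, the index values, the field names (`origin`, `destination`, `flow`, `similarity`), and the helper functions are invented for the example and are not taken from the DICL files. It also assumes, purely for illustration, that the index is symmetric in the two countries; check the dataset documentation before relying on that.

```python
# Toy illustration only: the codes, values, and field names below are invented
# and do not reflect the actual DICL file layout.

def pair_key(a, b):
    """Order-independent key for a country pair (assumes a symmetric index)."""
    return tuple(sorted((a, b)))

# Hypothetical international index values, keyed by unordered country pair.
similarity = {
    pair_key("AAA", "BBB"): 0.75,
    pair_key("AAA", "CCC"): 0.10,
}

# Hypothetical directed bilateral flows.
flows = [
    {"origin": "AAA", "destination": "BBB", "flow": 120.0},
    {"origin": "BBB", "destination": "AAA", "flow": 95.0},
    {"origin": "CCC", "destination": "AAA", "flow": 40.0},
    {"origin": "BBB", "destination": "CCC", "flow": 15.0},
]

def attach_similarity(flows, similarity):
    """Return copies of the flow records with a 'similarity' field added.

    Pairs without an index value get None instead of being dropped,
    so missing coverage stays visible downstream.
    """
    return [
        {**row, "similarity": similarity.get(pair_key(row["origin"], row["destination"]))}
        for row in flows
    ]

merged = attach_similarity(flows, similarity)
```

Keeping unmatched pairs with a `None` value, rather than silently dropping them, makes it easier to see how much of a flow dataset the index actually covers.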
Download
Available for download from the Harvard Dataverse.
Recommended Citation
Gurevich, T., Herman, P.R., Toubal, F., Yotov, Y. (2025) A Dataset on Linguistic Connectivity Across and Within Countries. Sci Data 12, 542. https://doi.org/10.1038/s41597-025-04692-8