A clean, structured, and production-ready dataset of countries and territories based on the ISO 3166-1 standard.
The Countries Dataset is designed for developers, researchers, analysts, educators, and organizations that need reliable country reference data in multiple machine-readable formats. The repository provides standardized country names, ISO country codes, regional classifications, and generated distribution files suitable for APIs, applications, analytics pipelines, forms, search systems, and educational projects.
The project was built with a focus on accuracy, usability, interoperability, and long-term maintainability. Rather than providing only a simple text list, the dataset includes structured canonical source data, generated outputs, validation tooling, schema documentation, and citation metadata.
View the repository on GitHub:
Included Data
The dataset currently includes:
- Country and territory names
- Official and common names
- ISO 3166-1 alpha-2 codes
- ISO 3166-1 alpha-3 codes
- ISO 3166-1 numeric codes
- Region and subregion classifications
- UTF-8 encoded, normalized text
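As a rough illustration of how the fields listed above might fit together in one record, here is a minimal Python sketch. The class name `Country` and every field name are assumptions made for this example, not the repository's actual schema; the schema documentation in the repository is the authority on the real names.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Country:
    # All field names are hypothetical; check the repository's schema
    # documentation for the real ones.
    name: str           # common name
    official_name: str  # official name
    alpha2: str         # ISO 3166-1 alpha-2 code
    alpha3: str         # ISO 3166-1 alpha-3 code
    numeric: str        # ISO 3166-1 numeric code, kept as text to preserve leading zeros
    region: str
    subregion: str


# One illustrative record.
france = Country(
    name="France",
    official_name="French Republic",
    alpha2="FR",
    alpha3="FRA",
    numeric="250",
    region="Europe",
    subregion="Western Europe",
)

# Convert to a plain dict, e.g. before serializing to JSON or YAML.
record = asdict(france)
print(record["alpha3"])
```

Storing the numeric code as a string rather than an integer is a deliberate choice in this sketch: codes such as 008 would otherwise lose their leading zeros.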
Available Formats
The repository includes:
- CSV
- JSON
- YAML
- Plain text
- Comma-separated text
- Typeahead/search-friendly JSON
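The CSV and JSON formats can be read with the Python standard library alone. The sketch below parses small in-memory samples instead of the real distribution files, because the actual file names and column names are not stated here; `CSV_TEXT`, `JSON_TEXT`, and the column names `name`, `alpha2`, `alpha3`, and `numeric` are assumptions for illustration.

```python
import csv
import io
import json

# Hypothetical sample content; the real files may use different column names.
CSV_TEXT = """name,alpha2,alpha3,numeric
Albania,AL,ALB,008
Japan,JP,JPN,392
"""

JSON_TEXT = '[{"name": "Albania", "alpha2": "AL", "alpha3": "ALB", "numeric": "008"}]'


def load_csv(text):
    """Parse CSV text into a list of dicts, one per country row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json(text):
    """Parse a JSON array of country objects."""
    return json.loads(text)


csv_rows = load_csv(CSV_TEXT)
json_rows = load_json(JSON_TEXT)

# Index the rows by alpha-2 code for direct lookups.
by_alpha2 = {row["alpha2"]: row for row in csv_rows}
print(by_alpha2["JP"]["name"])
```

To read a file from disk instead, open it with `encoding="utf-8"` and `newline=""` and pass the file object to `csv.DictReader` directly.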
Designed for Real-World Usage
The dataset is suitable for:
- Country dropdowns and forms
- APIs and backend services
- Search and autocomplete systems
- Data engineering pipelines
- Geographic analytics
- Educational and research projects
- Open-source software
- Travel and mobility platforms
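For the dropdown and autocomplete use cases, a simple prefix match over the records is often enough. The following is a minimal sketch, not the project's own search logic; the function `suggest`, the sample list `COUNTRIES`, and the keys `name`, `alpha2`, and `alpha3` are hypothetical.

```python
def suggest(countries, query, limit=5):
    """Return names that start with the query or whose ISO code equals it.

    Matching is case-insensitive. Results are sorted alphabetically and
    truncated to `limit` entries.
    """
    q = query.strip().casefold()
    if not q:
        return []
    matches = [
        c["name"]
        for c in countries
        if c["name"].casefold().startswith(q)
        or c["alpha2"].casefold() == q
        or c["alpha3"].casefold() == q
    ]
    return sorted(matches)[:limit]


# Small illustrative sample; a real application would load the full dataset.
COUNTRIES = [
    {"name": "Canada", "alpha2": "CA", "alpha3": "CAN"},
    {"name": "Cambodia", "alpha2": "KH", "alpha3": "KHM"},
    {"name": "Cameroon", "alpha2": "CM", "alpha3": "CMR"},
    {"name": "Chile", "alpha2": "CL", "alpha3": "CHL"},
]

print(suggest(COUNTRIES, "ca"))  # ['Cambodia', 'Cameroon', 'Canada']
print(suggest(COUNTRIES, "KH"))  # ['Cambodia']
```

A linear scan like this is adequate for a list of a few hundred entries; a prebuilt index only becomes worthwhile for much larger collections.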
Repository Features
The project also includes:
- Validation scripts
- Dataset generation tooling
- GitHub Actions validation workflow
- End-user documentation
- Citation metadata
- Zenodo-ready metadata
- Structured repository layout
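To give a sense of what a validation script for this kind of data might check, here is a minimal sketch that verifies the shape and uniqueness of the three ISO 3166-1 codes. It is not the repository's actual validation tooling; the function `validate` and the keys `alpha2`, `alpha3`, and `numeric` are assumptions for this example.

```python
import re

# Expected shapes of the ISO 3166-1 codes.
PATTERNS = {
    "alpha2": re.compile(r"[A-Z]{2}"),
    "alpha3": re.compile(r"[A-Z]{3}"),
    "numeric": re.compile(r"[0-9]{3}"),
}


def validate(rows):
    """Return a list of error messages; an empty list means the rows passed."""
    errors = []
    seen = set()  # (field, value) pairs already encountered
    for i, row in enumerate(rows):
        for field, pattern in PATTERNS.items():
            value = row.get(field) or ""
            if not pattern.fullmatch(value):
                errors.append(f"row {i}: invalid {field} {value!r}")
            elif (field, value) in seen:
                errors.append(f"row {i}: duplicate {field} {value!r}")
            else:
                seen.add((field, value))
    return errors


good = [{"alpha2": "BR", "alpha3": "BRA", "numeric": "076"}]
bad = good + [{"alpha2": "br", "alpha3": "BRA", "numeric": "76"}]

print(validate(good))  # []
for message in validate(bad):
    print(message)
```

A check like this only confirms format and uniqueness. It does not confirm that a code is actually assigned in ISO 3166-1, which would require comparing against an authoritative list.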
Open Source
The Countries Dataset is open source and designed to be transparent, reusable, and easy to integrate into modern applications and workflows.
Contributions, corrections, and improvements are welcome.