Comtrade Mirroring

Produces harmonized bilateral trade estimates by reconciling exporter- and importer-reported UN Comtrade data using a reliability-weighted method.

What This Does

Transforms raw UN Comtrade data into clean bilateral trade data through a mirroring process that reconciles discrepancies between exporter-reported and importer-reported values. This methodology underpins the bilateral trade data published in the Atlas of Economic Complexity.

Prerequisites

Python 3.10+
Poetry for managing dependencies
FRED API key (request one from the Federal Reserve Bank of St. Louis FRED website)
Comtrade data files (download from comtrade-downloader)

Installation

git clone https://github.com/harvard-growth-lab/comtrade-mirroring.git
cd comtrade-mirroring
poetry install && poetry shell

# Set up environment variables
export FRED_API_KEY="your_fred_key_here"
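
Before running the pipeline, it can help to confirm that the key is visible to Python. The following is a minimal sketch, assuming the pipeline reads the key from the `FRED_API_KEY` environment variable as set above. The helper name `require_env` is hypothetical and not part of the repository.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline")
    return value


if __name__ == "__main__":
    key = require_env("FRED_API_KEY")
    # Print only the length so the key itself never lands in logs.
    print(f"FRED_API_KEY found ({len(key)} characters)")
```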

Quick Start

Configure processing settings in user_config.py
Run the pipeline: python main.py
Find results in your configured output directory

Configuration

Edit user_config.py:

Select Classifications to Process

# Choose which trade classifications to process
PROCESS_SITC = False   # SITC data from 1962-END_YEAR
PROCESS_HS92 = True    # HS92 data from 1992-END_YEAR
PROCESS_HS12 = True    # HS12 data from 2012-END_YEAR

# Test mode - only process recent years
TEST_MODE = True
TEST_START_YEAR = 2020
END_YEAR = 2023
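
How these settings interact can be sketched as follows. This is an illustration only: the function `years_to_process` and the `START_YEARS` table are hypothetical, with start years taken from the comments above, and the sketch assumes that test mode simply raises the first processed year to `TEST_START_YEAR`.

```python
# Start years taken from the comments in user_config.py (hypothetical lookup table).
START_YEARS = {"SITC": 1962, "HS92": 1992, "HS12": 2012}


def years_to_process(classification, end_year, test_mode=False, test_start_year=None):
    """Return the years covered for one classification, inclusive of end_year."""
    start = START_YEARS[classification]
    if test_mode and test_start_year is not None:
        # Assumed behavior: test mode never starts earlier than the classification itself.
        start = max(start, test_start_year)
    return list(range(start, end_year + 1))


if __name__ == "__main__":
    print(years_to_process("HS92", 2023, test_mode=True, test_start_year=2020))
    print(len(years_to_process("HS12", 2023)))
```

Under these assumptions, the settings shown above would limit HS92 to 2020 through 2023, while a full HS12 run would cover 2012 through 2023.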

Set Data Paths

# Path to downloaded Comtrade files
DOWNLOADED_FILES_PATH = "/path/to/downloaded/comtrade/data"

# Results output directory
FINAL_OUTPUT_PATH = "/path/to/output/directory"

Processing Steps

PROCESSING_STEPS = {
    "run_cleaning": True,              # Main bilateral trade cleaning pipeline
    "delete_intermediate_files": True, # Clean up intermediate files
}

Supported Classifications

SITC (Standard International Trade Classification):

SITC Revision 1 (1962-current)
SITC Revision 2 (1976-current)
SITC Revision 3 (1988-current)

HS (Harmonized System):

HS1992 (1992-current)
HS1996 (1996-current)
HS2002 (2002-current)
HS2007 (2007-current)
HS2012 (2012-current)
HS2017 (2017-current)
HS2022 (2022-current)

Output

Mirrored trade data is saved as:

{FINAL_OUTPUT_PATH}/{DATA_VERSION}/mirrored_output/
├── H0/                    # HS92 bilateral trade data
│   ├── H0_2020.parquet
│   ├── H0_2021.parquet
│   └── ...

Each trade file contains: year, exporter, importer, commoditycode, value_final, value_exporter, value_importer
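
A small sketch of locating one output file and checking a table's columns against the list above. The helper names are hypothetical, `DATA_VERSION` is assumed to be a string defined elsewhere in the configuration (`"v1"` below is a placeholder), and the file naming follows the tree shown above.

```python
from pathlib import Path

EXPECTED_COLUMNS = [
    "year", "exporter", "importer", "commoditycode",
    "value_final", "value_exporter", "value_importer",
]


def mirrored_file(output_root, data_version, classification, year):
    """Build {root}/{version}/mirrored_output/{cls}/{cls}_{year}.parquet."""
    return (Path(output_root) / data_version / "mirrored_output"
            / classification / f"{classification}_{year}.parquet")


def missing_columns(columns):
    """Return the expected columns that are absent from a table's column names."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


if __name__ == "__main__":
    # "v1" is a placeholder for DATA_VERSION, not a value taken from the repository.
    path = mirrored_file("/path/to/output/directory", "v1", "H0", 2020)
    print(path)
    print(missing_columns(["year", "exporter", "importer", "value_final"]))
```

With pandas and a Parquet engine installed, the column names returned by `pandas.read_parquet(path).columns` could be passed to `missing_columns`.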

How It Works

The mirroring pipeline consists of five processing steps:

1. Preprocess and aggregate trade data

2. Adjust values from CIF to FOB

3. Compute country reporting reliability scores

4. Reconcile country-pair trade totals

5. Reconcile product-level trade values

The final output provides reconciled trade values that combine exporter and importer reports based on a country reporting reliability network.
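
To make steps 2 and 5 concrete, here is a minimal sketch of the general idea rather than the pipeline's actual implementation. It assumes a flat CIF margin (the 8% figure is purely illustrative) and a simple weighted average in which `weight` stands in for the exporter's relative reliability. The real pipeline derives its adjustments and reliability scores from the data, and the function names here are hypothetical.

```python
def cif_to_fob(cif_value, margin=0.08):
    """Remove an assumed proportional freight-and-insurance margin from a CIF value."""
    return cif_value / (1.0 + margin)


def reconcile(value_exporter, value_importer, weight):
    """Blend two reports of the same flow; weight=1 trusts the exporter fully.

    If only one side reported (the other is None), that report is used as-is.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    if value_exporter is None:
        return value_importer
    if value_importer is None:
        return value_exporter
    return weight * value_exporter + (1.0 - weight) * value_importer


if __name__ == "__main__":
    # Importers typically report CIF, so convert before comparing with the exporter's FOB value.
    importer_fob = cif_to_fob(108.0)
    print(reconcile(100.0, importer_fob, weight=0.75))
```

In terms of the output columns, `value_exporter` and `value_importer` would be the two inputs and `value_final` the blended result, under the assumptions stated above.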

Repository Structure

comtrade-mirroring/
├── mirror/
│   ├── main.py                    # Main entry point
│   ├── user_config.py             # Configuration
│   ├── src/
│   │   ├── objects/
│   │   ├── table_objects/
│   │   └── utils/
│   ├── logs/
│   ├── images/
│   └── data/
│       ├── static/
├── pyproject.toml                # Python dependencies
└── README.md                     # This file

Data Requirements

Input Data Structure

The pipeline expects downloaded Comtrade data in this structure:

{DOWNLOADED_FILES_PATH}/
├── H0/                    # HS92 classification
│   ├── H0_2020.parquet
│   ├── H0_2021.parquet
│   └── ...
├── H4/                    # HS12 classification  
│   ├── H4_2020.parquet
│   └── ...
└── SITC/                  # SITC classification
    ├── SITC_2020.parquet
    └── ...
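
Before a long run, the layout above can be checked up front. The following is a minimal sketch, assuming one Parquet file per classification and year named `{code}_{year}.parquet` as in the tree. The function name `missing_inputs` is hypothetical.

```python
from pathlib import Path


def missing_inputs(root, classification, years):
    """Return the expected input files that do not exist under root."""
    folder = Path(root) / classification
    expected = [folder / f"{classification}_{year}.parquet" for year in years]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    root = "/path/to/downloaded/comtrade/data"  # same placeholder as DOWNLOADED_FILES_PATH
    for path in missing_inputs(root, "H0", range(2020, 2024)):
        print(f"missing: {path}")
```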

Technical Details

System Requirements

Memory: 16 GB or more of RAM recommended for full processing
Storage: 30 GB or more of free disk space for input, intermediate, and output files

License

Apache License, Version 2.0 - see LICENSE file.

Citation

@Misc{comtrade_mirroring,
  author={Harvard Growth Lab},
  title={Comtrade Mirroring Pipeline},
  year={2025},
  howpublished = {\url{https://github.com/harvard-growth-lab/comtrade-mirroring}},
}
