raulcd/datanomy

Datanomy

Explore the anatomy of your columnar data files

Datanomy is a terminal-based tool for inspecting and understanding data files. It provides an interactive view of your data's structure, metadata, and internal organization.

Supported formats

  • Parquet (.parquet, .parq)
  • Arrow IPC (.arrow, .feather, .ipc)
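The extension list above can be read as a simple lookup table. The sketch below is illustrative only: `guess_format` and the format labels are hypothetical names, and nothing here is taken from datanomy's own code, which may detect formats differently.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the "Supported formats" list.
# The labels on the right are made up for this example.
FORMATS_BY_EXTENSION = {
    ".parquet": "parquet",
    ".parq": "parquet",
    ".arrow": "arrow-ipc",
    ".feather": "arrow-ipc",
    ".ipc": "arrow-ipc",
}


def guess_format(path: str) -> Optional[str]:
    """Return a format label for a supported extension, or None otherwise."""
    # Lower-case the suffix so "DATA.PARQUET" is treated like "data.parquet".
    return FORMATS_BY_EXTENSION.get(Path(path).suffix.lower())


if __name__ == "__main__":
    for name in ("data.parquet", "table.feather", "notes.txt"):
        print(name, "->", guess_format(name))
```

A lookup by extension is cheap but trusts the file name; checking the file's magic bytes would be a stricter alternative.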

Features for Parquet view

  • General Structure
  • Schema
  • Data
  • Metadata
  • Stats

Features for Arrow IPC view

Structure

File-level layout showing header, record batches, and footer.
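To make the footer part of that layout concrete, here is a minimal sketch of how the footer can be located in an Arrow IPC file. It assumes the layout described in the Arrow columnar format specification: the file starts and ends with the magic string `ARROW1`, and the 4 bytes before the trailing magic hold the footer length as a little-endian int32. The function name `read_footer_size` is hypothetical, and this is not how datanomy itself reads files (it is built on PyArrow).

```python
import struct

MAGIC = b"ARROW1"


def read_footer_size(data: bytes) -> int:
    """Return the footer length in bytes from the raw bytes of an IPC file."""
    # Smallest plausible file: leading magic + 2 padding bytes,
    # 4-byte footer size, trailing magic.
    if len(data) < len(MAGIC) + 2 + 4 + len(MAGIC):
        raise ValueError("too short to be an Arrow IPC file")
    if not (data.startswith(MAGIC) and data.endswith(MAGIC)):
        raise ValueError("missing ARROW1 magic at start or end")
    # The footer size sits directly before the trailing 6-byte magic.
    (size,) = struct.unpack("<i", data[-10:-6])
    return size


if __name__ == "__main__":
    # Synthetic bytes shaped like an IPC file; the body and footer are
    # placeholders, not real Arrow messages.
    fake_footer = b"\x00" * 8
    fake_file = (
        MAGIC + b"\x00\x00"
        + b"record-batches-go-here"
        + fake_footer
        + struct.pack("<i", len(fake_footer))
        + MAGIC
    )
    print(read_footer_size(fake_file))
```

With the footer size known, the footer itself occupies the bytes immediately before the size field; decoding it requires a Flatbuffers reader, which is outside this sketch.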

Schema

Arrow schema with per-column type and nullability details.

Data

Preview of the first 50 rows.

Metadata

File and schema-level metadata.

Buffers

Physical buffer layout for each column: validity bitmap bits (color-coded valid/null), hex preview of values, offsets, and data buffers. For nested types (list, struct, map, dictionary), child array buffers are shown recursively.
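As background for reading the validity bitmap bits, the sketch below decodes a bitmap by hand. It relies on the Arrow columnar format convention that bit `i` lives in byte `i // 8` at position `i % 8`, counting from the least significant bit, with a set bit meaning "valid". The function `decode_validity` is a hypothetical helper for illustration and is not part of datanomy or PyArrow.

```python
from typing import List


def decode_validity(bitmap: bytes, length: int, offset: int = 0) -> List[bool]:
    """Expand an Arrow validity bitmap into one bool per slot (True = valid)."""
    flags = []
    for i in range(offset, offset + length):
        byte = bitmap[i // 8]
        # Bits are numbered from the least significant bit of each byte.
        flags.append(bool((byte >> (i % 8)) & 1))
    return flags


if __name__ == "__main__":
    # 0b00011101: slots 0, 2, 3 and 4 are valid, slot 1 is null.
    print(decode_validity(bytes([0b00011101]), length=5))
    # → [True, False, True, True, True]
```

The `offset` parameter mirrors the fact that a sliced Arrow array can start partway into a shared bitmap, so the first logical slot is not always bit 0.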

Installation

# From PyPI
uv tool install datanomy
# Or with pip
pip install datanomy

# From source
uv tool install "datanomy @ git+https://github.com/raulcd/datanomy.git"
# Or clone the repo and sync dependencies
git clone https://github.com/raulcd/datanomy.git
cd datanomy
uv sync

Usage

# Run without installing using uvx
uvx datanomy data.parquet

# Inspect a Parquet file
datanomy data.parquet

# Inspect an Arrow IPC file
datanomy data.arrow

You can also run Datanomy straight from the Git repository with uvx. This uses the development version:

uvx "git+https://github.com/raulcd/datanomy.git" data.parquet
uvx "git+https://github.com/raulcd/datanomy.git" data.arrow

Keyboard Shortcuts

  • q - Quit the application

Development

# Install dependencies
uv sync

# Run from source
uv run datanomy path/to/file.parquet
uv run datanomy path/to/file.arrow

# Install dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Format code
uv run ruff format .

# Lint
uv run ruff check .

# Type check
uv run mypy .

License

Apache License 2.0

Contributing

Contributions welcome! Please open an issue or PR.


Built with Textual and PyArrow
