
Introduce S3SeekableFile as a S3 file wrapper #1523

Open
Flauschbaellchen wants to merge 1 commit into jschneier:master from Flauschbaellchen:s3-seekable


@Flauschbaellchen


Previously, whenever an S3 file was opened, it was immediately downloaded into a `SpooledTemporaryFile`, regardless of whether it was ever read. For large files, this reduced performance and increased traffic and cost in cloud environments.

This commit wraps access to the S3 file in an `S3SeekableFile` wrapper that reads the requested bytes on demand, without loading the whole file into memory or local storage.
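To illustrate the on-demand idea, here is a minimal sketch of a seekable, read-only wrapper that fetches only the requested byte range. It is not the `S3SeekableFile` from this PR: the names `RangeReadFile` and `FakeS3Object` are hypothetical, and the sketch only assumes the wrapped object exposes `content_length` and `get(Range=...)` returning a dict with a readable `Body`, as a boto3 `s3.Object` resource does. The in-memory stand-in is there so the example runs without network access.

```python
import io


class RangeReadFile(io.RawIOBase):
    """Read-only, seekable view that fetches bytes with ranged GETs (sketch)."""

    def __init__(self, obj):
        # Assumption: `obj` has `content_length` and `get(Range=...)`,
        # like a boto3 s3.Object resource.
        self._obj = obj
        self._size = obj.content_length
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Seeking only moves the cursor; no request is sent.
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence!r}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buffer):
        # RawIOBase.read() and readall() are built on top of readinto().
        if self._pos >= self._size or len(buffer) == 0:
            return 0
        # HTTP byte ranges are inclusive on both ends.
        end = min(self._pos + len(buffer), self._size) - 1
        response = self._obj.get(Range=f"bytes={self._pos}-{end}")
        data = response["Body"].read()
        buffer[: len(data)] = data
        self._pos += len(data)
        return len(data)


class FakeS3Object:
    """Hypothetical in-memory stand-in for an S3 object, for illustration only."""

    def __init__(self, payload):
        self._payload = payload
        self.content_length = len(payload)
        self.requests = []  # records every Range header that was asked for

    def get(self, Range):
        self.requests.append(Range)
        start, end = (int(part) for part in Range[len("bytes="):].split("-"))
        return {"Body": io.BytesIO(self._payload[start : end + 1])}


fake = FakeS3Object(b"0123456789" * 100)
f = RangeReadFile(fake)
f.seek(995)
tail = f.read(10)  # only the last 5 bytes are requested and returned
```

In this sketch every `read` call issues one ranged request, so many small reads mean many round trips. Wrapping the object in `io.BufferedReader` would be one way to batch them, at the cost of fetching somewhat more than was asked for.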

Comment thread storages/backends/s3.py
self._file, ExtraArgs=params, Config=self._storage.transfer_config
)
self._file.seek(0)
self._file = S3SeekableFile(self.obj, ExtraArgs=params)

Author


Please note that `Config=self._storage.transfer_config` was removed. I'm not sure whether it is still needed, since the method also changed from `download_fileobj` to `get`.

@Flauschbaellchen

Flauschbaellchen commented Jul 14, 2025

Author

@jschneier May I bump this PR? I would love to hear what you think about this change. It would help a lot when working with large files on S3 when only a subset of the data needs to be read. It also shortens the time between opening the file and the application processing the first chunk, because it no longer has to wait for the full download to complete.

@ben-xo

ben-xo commented Jun 12, 2026


following
