[Dataset] RSSCN7 and RESISC45 #32

Open
Hazel-Heejeong-Nam wants to merge 8 commits into galilai-group:main from Hazel-Heejeong-Nam:f/hazel/rsscn7-and-resisc45

Conversation

@Hazel-Heejeong-Nam

Contributor

What does this PR do?

This PR adds two new datasets to stable-datasets:

  • RSSCN7
  • RESISC45

Usage Examples

  • RSSCN7
from stable_datasets.images.rsscn7 import RSSCN7

print("Loading RSSCN7 dataset...")
rsscn7_train = RSSCN7(split="train")
# split=None loads every available split at once
rsscn7_all = RSSCN7(split=None)

print("\nDataset Metadata:")
print(f"  - Homepage: {rsscn7_train.info.homepage}")
print(f"  - Description: {rsscn7_train.info.description}")
print(f"  - Citation:\n{rsscn7_train.info.citation}")

label_feature = rsscn7_train.features["label"]

print("\nDataset Statistics:")
print(f"  - Train samples: {len(rsscn7_train)}")
print(f"  - Total splits: {len(rsscn7_all)}")
print(f"  - Number of classes: {label_feature.num_classes}")

sample = rsscn7_train[0]
print("\nSample Information:")
print(f"  - Keys: {list(sample.keys())}")
print(f"  - Image type: {type(sample['image'])}")
print(f"  - Image size: {sample['image'].size}")
print(f"  - Label (int): {sample['label']}")
print(f"  - Label (string): {label_feature.int2str(sample['label'])}")

# Iterate over the names directly instead of hard-coding the class count
print("\nAll class names:")
for i, name in enumerate(label_feature.names):
    print(f"  {i}: {name}")

print("\nRSSCN7 dataset loaded successfully!")
  • RESISC45
from stable_datasets.images.resisc45 import RESISC45

print("Loading RESISC45 dataset...")
resisc45_train = RESISC45(split="train")
# split=None loads every available split at once
resisc45_all = RESISC45(split=None)

print("\nDataset Metadata:")
print(f"  - Homepage: {resisc45_train.info.homepage}")
print(f"  - Description: {resisc45_train.info.description}")
print(f"  - Citation:\n{resisc45_train.info.citation}")

label_feature = resisc45_train.features["label"]

print("\nDataset Statistics:")
print(f"  - Train samples: {len(resisc45_train)}")
print(f"  - Total splits: {len(resisc45_all)}")
print(f"  - Number of classes: {label_feature.num_classes}")

sample = resisc45_train[0]
print("\nSample Information:")
print(f"  - Keys: {list(sample.keys())}")
print(f"  - Image type: {type(sample['image'])}")
print(f"  - Image size: {sample['image'].size}")
print(f"  - Label (int): {sample['label']}")
print(f"  - Label (string): {label_feature.int2str(sample['label'])}")

# Show only the first 10 of the class names
print("\nFirst 10 class names:")
for i, name in enumerate(label_feature.names[:10]):
    print(f"  {i}: {name}")

print("\nRESISC45 dataset loaded successfully!")

Who can review?

@RandallBalestriero

@Leon-Leyang

Collaborator

Could you run pre-commit run --all-files so that the pre-commit check passes?

@Leon-Leyang

Collaborator

My guess is that the tests fail because the download is unsuccessful. Could you also try the new download utility? That is, could you use _split_generators from BaseDatasetBuilder instead of the one currently defined in RESISC45?

Hazel-Heejeong-Nam force-pushed the f/hazel/rsscn7-and-resisc45 branch from 6ba7031 to 0abd50d on February 11, 2026 21:11
@Hazel-Heejeong-Nam

Contributor Author

@Leon-Leyang

  • Pre-commit checks are passing now.
  • The failing tests are unrelated to RESISC45; they come from other datasets, as you can check in the log.

  • FYI, I also changed the download link for RESISC45. The original Figshare URL used to work, but when I re-tested it today, the request was blocked as bot traffic. Running curl -IL on the original link returns:
x-amzn-waf-action: challenge
server: awselb/2.0
content-length: 0
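
The same check can be done from Python before trusting a download URL. The following is a minimal sketch, not part of this PR: `is_waf_challenge` and `probe_headers` are hypothetical helper names, and the detection rule simply assumes that an `x-amzn-waf-action: challenge` header (as in the output above) means the request was challenged as bot traffic.

```python
from urllib import error, request


def is_waf_challenge(headers):
    """Return True if the response headers look like a WAF challenge.

    Assumption: a header `x-amzn-waf-action` with the value `challenge`
    marks a blocked request, as in the curl output quoted above.
    """
    # Header names are case-insensitive, so normalize before comparing.
    normalized = {k.lower(): str(v).strip().lower() for k, v in headers.items()}
    return normalized.get("x-amzn-waf-action") == "challenge"


def probe_headers(url, timeout=10.0):
    """Send a HEAD request (following redirects) and return the final headers."""
    req = request.Request(url, method="HEAD")
    try:
        with request.urlopen(req, timeout=timeout) as resp:
            return dict(resp.headers.items())
    except error.HTTPError as exc:
        # Error responses still carry headers worth inspecting.
        return dict(exc.headers.items())


# Offline check against the headers reported in this comment.
observed = {
    "x-amzn-waf-action": "challenge",
    "server": "awselb/2.0",
    "content-length": "0",
}
print(is_waf_challenge(observed))
```

Calling `is_waf_challenge(probe_headers(url))` on a candidate link would flag this kind of block up front, rather than letting it surface later as an empty or failed download.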
