Skip to content

fix: recover dotted directory names#2188

Open
Jay2006sawant wants to merge 5 commits into
oscal-compass:developfrom
Jay2006sawant:fix/trash-recover-dotted-dirs
Open

fix: recover dotted directory names#2188
Jay2006sawant wants to merge 5 commits into
oscal-compass:developfrom
Jay2006sawant:fix/trash-recover-dotted-dirs

Conversation

@Jay2006sawant

Copy link
Copy Markdown
Contributor

Types of changes

  • Hot fix (emergency fix and release)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation (change which affects the documentation site)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Release (develop -> main)

Quality assurance (all should be covered).

  • My code follows the code style of this project.
  • Documentation for my change is up to date?
  • My PR meets testing requirements.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary

Fix trash.recover() so directories with dots in their names, such as policy.v1, are restored correctly instead of being treated as files.

The recovery logic now checks whether a trashed directory exists before falling back to file recovery, so the dotted-directory case restores successfully as expected.

Add a regression test covering dotted directory names to prevent this issue from returning.

Key links:

Before you merge

  • Ensure it is a 'squash commit' if not a release.
  • Ensure CI is currently passing
  • Check sonar. If you are working for a fork a maintainer will reach out, if required.

Signed-off-by: Jay2006sawant <jay242902@gmail.com>
@Jay2006sawant Jay2006sawant requested a review from a team as a code owner April 1, 2026 13:03
@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

Hey @degenaro ! Could you please approve the workflow runs for this PR? Thanks.

@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

@degenaro Can you please review this?

@degenaro

Copy link
Copy Markdown
Collaborator

PR Review: fix: recover dotted directory names ([#2188](#2188))

Repo: oscal-compass/compliance-trestle
Author: [@Jay2006sawant](https://github.com/Jay2006sawant)
Fixes: [#2187 — trash.recover() fails for directory names containing dots](#2187)
Files changed: trestle/common/trash.py, tests/trestle/utils/trash_test.py
Status: Open — awaiting review


Summary

This PR fixes a real and well-documented bug in trash.recover() where directories with dots in their names (e.g. policy.v1) were silently misrouted to recover_file() instead of recover_dir(), causing recovery to fail with an AssertionError.


The Bug (Issue #2187)

The original dispatch logic in recover() was:

if dest_content_path.suffix != '':
    return recover_file(dest_content_path, delete_trash)
return recover_dir(dest_content_path, delete_trash)

pathlib.Path('policy.v1').suffix == '.v1', so any dotted directory name is treated as a file regardless of what the filesystem actually contains. The issue author's debug output makes the failure unambiguous — the trashed directory exists at policy.v1__bk (the dir naming convention), but recover() was looking for policy.v1.bk (the file convention) and failing with:

EXCEPTION_TYPE: AssertionError
EXCEPTION_MSG: Specified path ".../policy.v1" could not be found in trash

The Fix

# Before
if dest_content_path.suffix != '':
    return recover_file(dest_content_path, delete_trash)
return recover_dir(dest_content_path, delete_trash)

# After
trash_dir_path = to_trash_dir_path(dest_content_path)
if trash_dir_path.exists() and trash_dir_path.is_dir():
    return recover_dir(dest_content_path, delete_trash)
return recover_file(dest_content_path, delete_trash)

This is the right approach. Instead of guessing type from the path name, it consults the trash store directly. The two suffix conventions (__bk for directories, .bk for files) ensure to_trash_dir_path and to_trash_file_path produce non-colliding paths, so the disambiguation is reliable.


What Works Well

  • Correct root cause fix. The old heuristic was fundamentally wrong — using Path.suffix to determine file-vs-directory is a category error. Checking what's actually in the trash is the only reliable approach.
  • Minimal, focused change. Four lines changed in production code. No unrelated churn.
  • Good regression test. The new test correctly: creates the trestle config dir, creates a dotted directory with a child file, trashes it, asserts it's gone, recovers it, and asserts both the directory and its contents are restored — exactly the scenario from the bug report.
  • Commits are signed-off and the branch was synced with develop before review.

Concerns & Suggestions

🔶 Missing test: dotted file names

The PR adds a test for a dotted directory name but does not add (or call out) a test for a dotted file name such as policy.v1.json. These are different code paths — a dotted file should still route to recover_file() under the new logic. It would be good to confirm this is covered by an existing test, or add it here explicitly.

Suggested test:

def test_trash_recover_dotted_file_name(tmp_path: pathlib.Path) -> None:
    """Test recover still handles dotted file names correctly."""
    test_utils.ensure_trestle_config_dir(tmp_path)
    dotted_file: pathlib.Path = tmp_path / 'policy.v1.json'
    dotted_file.write_text('{}')

    trash.store(dotted_file, True)
    assert dotted_file.exists() is False

    trash.recover(dotted_file)
    assert dotted_file.exists()

🔶 Missing coverage: delete_trash=True

The new test calls trash.recover(dotted_dir) without exercising the delete_trash=True path. That code path is unchanged, but since this is a regression test it would be more thorough to cover it.

🔵 Silent fallthrough when nothing is in trash

If recover() is called on a dotted-name path that was never trashed, the new code falls through to recover_file(), which raises an AssertionError with a "file not found in trash" message. This matches pre-fix behavior for undotted names, so it isn't a regression — but the error message would be slightly misleading for the directory case. Low-priority nit; could be addressed separately.

🔵 Docstring not updated

The recover() docstring remains thin:

"It recovers the latest path content from trash if exists"

A one-liner noting that dispatch is now based on what exists in the trash store (rather than the path suffix) would help future readers understand the intent at a glance.


Verdict

Approve with minor suggestions.

The fix is correct, minimal, and well-motivated. The regression test covers the primary failure scenario. The main thing worth addressing before merging is confirming dotted file names still dispatch to recover_file() correctly — either by pointing to an existing test or adding a new one. The remaining notes are low-priority and can be tracked separately.

@degenaro degenaro self-assigned this Apr 23, 2026
@degenaro

Copy link
Copy Markdown
Collaborator

@Jay2006sawant This PR looks pretty good. Please have a look at Claude's suggestions, thx.

@degenaro

Copy link
Copy Markdown
Collaborator

@Jay2006sawant Would like to merge this soon. Can you address the 4 issues Claude called out? Thx!

@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

@degenaro Yes, working on it

Signed-off-by: Jay2006sawant <jay242902@gmail.com>
@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

@degenaro PTAL,now!

@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

@degenaro Any updates on this?

@degenaro

degenaro commented Apr 29, 2026

Copy link
Copy Markdown
Collaborator

Redacted Claude review:

What's good: This is actually a stronger fix than the one in PR #2209. Rather than checking the source path's characteristics, it asks the trash filesystem directly — "is there a directory in the trash for this path?" — which is ground truth. The suffix heuristic is abandoned entirely, not merely deprioritized.

Concern — silent fallback when nothing is in trash: If nothing was ever trashed (i.e., to_trash_dir_path(path) doesn't exist), the code silently falls through to recover_file(), which will then presumably operate on a non-existent trash file. The old code had the same implicit assumption, so this is not a regression, but neither branch checks that anything actually exists in the trash before proceeding. If recover_file handles the missing-file case gracefully, this is fine — but it's worth a look at recover_file's error handling if not already reviewed.

Minor concern — ambiguous tiebreak: If both to_trash_dir_path(path) and to_trash_file_path(path) somehow exist (a degenerate but conceivable case), the directory silently wins. This is probably the right call but a brief comment would make the intent clear.

@Jay2006sawant

Copy link
Copy Markdown
Contributor Author

Thanks, agreed on both points!
Will start working on it now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

trash.recover() fails for directory names containing dots

2 participants