Suppress sysrq Kernel panics for Kdump tests #4473
base: main
Changes from 1 commit
3df958b
d7c960b
23714ee
eb15160
8848678
0e9db01
9bcb0cf
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -898,6 +898,32 @@ def kdump_test( | |
| # We should clean up the vmcore file since the test is passed | ||
| self.node.execute(f"rm -rf {kdump.dump_path}/*", shell=True, sudo=True) | ||
|
|
||
| # The sysrq-triggered panic is intentional for this test. Suppress the | ||
| # post-case panic check so it doesn't flag the expected crash as a | ||
| # failure when SerialConsole.check_panic re-scans the boot diagnostics. | ||
| self._suppress_expected_sysrq_panic() | ||
|
|
||
|
SRIKKANTH marked this conversation as resolved.
|
||
| def _suppress_expected_sysrq_panic(self) -> None: | ||
| from lisa.features import SerialConsole | ||
|
|
||
| if not self.node.features.is_supported(SerialConsole): | ||
| return | ||
| serial_console = self.node.features[SerialConsole] | ||
| # Patterns matching the panic this test intentionally triggered via sysrq. | ||
| expected_patterns: List[re.Pattern[str]] = [ | ||
|
|
||
| re.compile( | ||
| r"^(.*Kernel panic - not syncing: sysrq triggered crash.*)$", | ||
| re.MULTILINE, | ||
| ), | ||
| re.compile(r"^(.*sysrq: SysRq : Trigger a crash.*)$", re.MULTILINE), | ||
|
SRIKKANTH marked this conversation as resolved.
|
||
| # The RIP line accompanies the sysrq-triggered panic. | ||
| re.compile(r"^(.*RIP:.*)$", re.MULTILINE), | ||
|
Collaborator
@SRIKKANTH RIP is too broad.
Collaborator
Author
There is not much to compare after RIP.
Collaborator
RIP is too broad. kdump is run by many teams, and with this change we will start getting false panic detection. I had a discussion with Vijay/Tyle some months ago regarding kernel panic detection, and in that meeting it was decided that we should not add overly broad detection and that every kernel panic detection change should go through the kernel team for review.
Collaborator
You can do it like this: It does not ignore every RIP: line.
Collaborator
Author
@vyadavmsft, added '0033:', which restricts the regex to kernel panics generated from user space, such as 'echo c > /proc/sysrq-trigger'.
Collaborator
Author
@LiliDeng, @johnsongeorge-w, please approve this PR if there are no more questions.
Collaborator
@SRIKKANTH
Collaborator
Author
RIP: 0033:.* indicates that the crash was generated by a user-space application. Kernel- or hardware-generated crashes are not covered by it, so I think it's safe to keep this one.
Collaborator
Fixed the typo.
Collaborator
Author
Sample sysrq-generated kernel panic. There is not much to compare after RIP.
Collaborator
There was a typo in my comment. I have corrected it; please check now.
Collaborator
Please scope this to the sysrq crash handler:
Collaborator
Can you try changing it |
||
| ] | ||
| # Shadow the class attribute on the instance so other nodes are unaffected. | ||
| existing = list(serial_console.panic_ignorable_patterns) | ||
| existing.extend(expected_patterns) | ||
| serial_console.panic_ignorable_patterns = existing | ||
|
kanchansenlaskar marked this conversation as resolved.
|
||
|
|
||
| def trigger_kdump_on_specified_cpu(self, cpu_num: int, log_path: Path) -> None: | ||
| lscpu = self.node.tools[Lscpu] | ||
| thread_count = lscpu.get_thread_count() | ||
|
|
||
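To make the review thread's concern about the `RIP:` pattern concrete, here is a minimal sketch comparing the broad pattern from the diff with a narrowed one. The narrowed regex is an assumption based on the "Added '0033:'" comment; the exact expression that was committed is not shown on this page. The two console lines are hypothetical and chosen only for illustration: on x86-64 Linux, `0010` is the kernel code segment selector and `0033` is the user code segment selector.

```python
import re
from typing import List

# Hypothetical console lines for illustration only; not taken from the PR.
SAMPLE_LINES = [
    "RIP: 0010:some_kernel_function+0x10/0x20",  # kernel-mode code segment
    "RIP: 0033:0x7f0000000000",  # user-mode code segment
]

# Broad pattern as it appears in the diff: matches every line containing "RIP:".
broad = re.compile(r"^(.*RIP:.*)$", re.MULTILINE)

# Assumed narrowed pattern: only lines whose RIP is in the user code segment.
narrow = re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE)


def matched_lines(pattern: "re.Pattern[str]", text: str) -> List[str]:
    """Return every full line of `text` that the pattern matches."""
    return [m.group(1) for m in pattern.finditer(text)]


log = "\n".join(SAMPLE_LINES)
broad_hits = matched_lines(broad, log)
narrow_hits = matched_lines(narrow, log)

print("broad :", broad_hits)
print("narrow:", narrow_hits)
```

In this sketch the broad pattern would mark both lines as ignorable, including the kernel-mode one, while the narrowed pattern marks only the user-mode line. Whether that is narrow enough for the panic check is the question the reviewers are debating.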
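The "shadow the class attribute on the instance" comment in `_suppress_expected_sysrq_panic` relies on standard Python attribute lookup: assigning to an attribute on an instance creates an instance-level entry and leaves the class-level value alone. The sketch below illustrates this with a hypothetical `FakeConsole` class standing in for `SerialConsole`; it assumes `panic_ignorable_patterns` is a class-level list, as the diff's comment implies, and uses plain strings instead of compiled patterns.

```python
from typing import List


class FakeConsole:
    # Hypothetical stand-in for a class-level list shared by all instances.
    panic_ignorable_patterns: List[str] = ["baseline pattern"]


console_a = FakeConsole()
console_b = FakeConsole()

# Copy the shared list, extend the copy, then assign it on the instance.
# The assignment creates an instance attribute that shadows the class one.
existing = list(console_a.panic_ignorable_patterns)
existing.extend(["expected sysrq panic"])
console_a.panic_ignorable_patterns = existing

print(console_a.panic_ignorable_patterns)
print(console_b.panic_ignorable_patterns)
print(FakeConsole.panic_ignorable_patterns)
```

Calling `console_a.panic_ignorable_patterns.extend(...)` directly, without the copy, would instead mutate the shared class-level list and affect every instance, which is what the copy-then-assign approach avoids.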