Consider support for role assumption (SET ROLE) to narrow effective permissions #10894

bharos · 2026-04-28T16:55:32Z

bharos
Apr 28, 2026
Collaborator

Problem
When a user holds multiple roles, Gravitino evaluates all of them as a union for every authorization check. There is no way to restrict evaluation to a subset of roles for a given request or session.

Example: A user has broad_role (100 tables) and restricted_role (5 tables). Today the user always sees all 105 tables. We want the ability to operate under only restricted_role's permissions, equivalent to Snowflake's USE SECONDARY ROLES or Hive's SET ROLE.

Current behavior
The authorization layer loads all roles granted to the authenticated user and evaluates them as a union. There is no mechanism to specify which role(s) should be active for a given request. This applies to both the Iceberg REST API and native Gravitino API paths.
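To make the union behavior concrete, here is a minimal sketch of the 100 + 5 table example. It is a toy model, not Gravitino's authorization code: `ROLE_TABLES`, `visible_tables`, and the table names are hypothetical, and the two roles are assumed to cover disjoint tables.

```python
# Toy model only: role names follow the example above, table names are made up.
ROLE_TABLES = {
    "broad_role": {f"broad_table_{i}" for i in range(100)},
    "restricted_role": {f"restricted_table_{i}" for i in range(5)},
}


def visible_tables(granted_roles):
    """Union semantics: a table is visible if any granted role allows it."""
    visible = set()
    for role in granted_roles:
        visible |= ROLE_TABLES.get(role, set())
    return visible


# A user holding both roles sees the union of both table sets.
all_visible = visible_tables(["broad_role", "restricted_role"])
print(len(all_visible))
```

With union semantics there is no input the caller can supply to shrink the result, which is the gap this proposal addresses.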

Possible approach
Support an optional HTTP header (e.g., X-Gravitino-Active-Role) on Iceberg REST and native API requests:

1. The auth interceptor reads the header.
2. If the header is set, the server validates that the user actually holds the specified role (this prevents escalation).
3. If validation passes, only the specified role's policies are evaluated instead of the union of all roles.
4. If the header is absent, behavior is unchanged (fully backward compatible).
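The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the actual Gravitino interceptor: `resolve_effective_roles`, `RoleNotGrantedError`, and the shape of the `headers` mapping are all hypothetical, and only the header name comes from the proposal.

```python
ACTIVE_ROLE_HEADER = "X-Gravitino-Active-Role"  # header name proposed above


class RoleNotGrantedError(Exception):
    """Raised when a request names a role the user does not hold."""


def resolve_effective_roles(headers, granted_roles):
    """Return the set of roles to evaluate for this request.

    headers: mapping of HTTP header names to values (hypothetical shape).
    granted_roles: set of role names granted to the authenticated user.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    requested = normalized.get(ACTIVE_ROLE_HEADER.lower(), "").strip()

    if not requested:
        # No header: keep the current behavior and evaluate every granted role.
        return set(granted_roles)

    if requested not in granted_roles:
        # Reject instead of falling back to the union, so a mistyped or
        # unauthorized role name never results in broader access.
        raise RoleNotGrantedError(requested)

    return {requested}
```

Rejecting an unknown or ungranted role, rather than silently ignoring the header, is a design choice in this sketch; a silent fallback would quietly restore the full union and defeat the narrowing.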

This works with Iceberg clients (Trino, Spark) via RESTSessionCatalog's header.* catalog properties, so no client code changes are needed. However, the role is static per catalog configuration. Dynamic per-session role switching (like Snowflake's USE ROLE) would require either a server-side session API or client-side changes. The header approach can cover the case where different applications or contexts operate under different roles.

Alternative / longer-term direction
A header is a pragmatic first step. For full parity with Snowflake's USE ROLE — where users dynamically switch roles mid-session via SQL — the Iceberg REST catalog server would need session-aware role state, and engine connectors (Trino, Spark) would need to propagate SET ROLE to the server. That's a larger cross-project effort.

I'm interested in gathering the community's thoughts on this and hearing whether other approaches are being considered.

roryqi · 2026-04-29T02:26:10Z

roryqi
Apr 29, 2026
Collaborator

Thanks for starting this discussion. We have considered this question before. The compute engines (Spark and Flink) don't support similar operations, so I have not added similar support.


bharos · 2026-04-29T06:26:54Z

bharos
Apr 29, 2026
Collaborator Author

Thanks @roryqi. Your observation is correct, although I think the header approach I mentioned above specifically avoids needing compute engine support.
AFAIK, Iceberg's RESTSessionCatalog already forwards any header.* catalog property as an HTTP header on every request, so a user can configure something like:

spark.sql.catalog.x.header.X-Gravitino-Active-Role = restricted_role

No Spark/Flink/Trino code changes needed. The role is set at catalog configuration time, not via SQL.
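As a rough illustration of the forwarding behavior described above, the sketch below turns `header.*` catalog properties into HTTP headers. It is not Iceberg's code: `headers_from_catalog_properties` is a hypothetical helper, the `uri` value is a placeholder, and it assumes the engine has already stripped its own prefix (such as `spark.sql.catalog.x.`) before handing the properties to the catalog.

```python
HEADER_PREFIX = "header."


def headers_from_catalog_properties(properties):
    """Collect every 'header.<name>' property as an HTTP header '<name>'."""
    return {
        key[len(HEADER_PREFIX):]: value
        for key, value in properties.items()
        if key.startswith(HEADER_PREFIX)
    }


# Catalog properties as the catalog would see them (placeholder values).
catalog_properties = {
    "uri": "http://localhost:9001/iceberg",  # placeholder endpoint
    "header.X-Gravitino-Active-Role": "restricted_role",
}

# These headers would be attached to every REST request from this catalog.
request_headers = headers_from_catalog_properties(catalog_properties)
print(request_headers)
```

Because the headers are derived once from the catalog configuration, every request from that catalog instance carries the same role, which is why the approach is static rather than per-session.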

This covers the practical use case: different applications or service accounts operate under different roles (e.g., a reporting pipeline runs with read_only_role, while an ETL pipeline uses write_role). The drawback is that it's static per catalog instance, not dynamic per-session — but we can use that as a workaround for now, for at least some use-cases.


markhoerth · 2026-04-30T03:33:19Z

markhoerth
Apr 30, 2026
Collaborator

Thanks for raising this, @bharos. A few things would help the discussion reason about this more concretely:
Use case. Could you walk through the specific scenario driving this? In particular: (1) is this primarily for service-account workloads where one identity needs different effective permissions for different jobs, or for interactive users with multiple roles, or both? (2) what's the current workaround — separate catalogs per role, accepting the over-grant exposure, separate service identities? (3) is the narrowing requirement static (per-pipeline configuration) or dynamic (changes during a single workload's lifetime)? The right approach depends a lot on which of these is the dominant case.
Style of narrowing. Header-based per-request narrowing is one option; another is encoding the activated role set into the auth token itself, typically via OAuth scope at token issuance. Some other products in this space take the scope-based approach, so it's worth weighing alongside the header design. A few architectural properties to consider: the activated role set becomes a characteristic of the session/token rather than a per-request parameter, which means a token can't be widened by an attacker manipulating headers — it's bound to its narrowed set at issuance. It also opens the door to additional issuance-time checks (time-of-day, MFA, source-IP) where the IdP path supports them. Header-based narrowing is simpler to implement and may still be the right answer; both approaches are worth weighing.
Dynamic vs. static. True dynamic mid-session role switching (the SQL USE ROLE model) is a meaningfully larger effort — server-side session state, engine cooperation to propagate the switch — and would need a specific use case to justify. The connection-time / token-issuance-time narrowing covers most practical cases without those costs. If your scenario really needs mid-session switching rather than per-connection or per-token narrowing, it'd be worth hearing why explicitly.
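For comparison with the header design, the scope-based alternative mentioned above might look something like this on the server side. Everything here is an assumption for illustration: the `role:` scope prefix is a made-up convention, `roles_from_token_scope` is a hypothetical helper, and the token is assumed to have been validated already. The only standard detail is that an OAuth scope value is a space-delimited string.

```python
ROLE_SCOPE_PREFIX = "role:"  # hypothetical scope naming convention


def roles_from_token_scope(scope_claim, granted_roles):
    """Return roles activated by the token, limited to roles the user holds.

    scope_claim: space-delimited OAuth scope string from a validated token.
    granted_roles: set of role names granted to the authenticated user.
    """
    requested = {
        scope[len(ROLE_SCOPE_PREFIX):]
        for scope in scope_claim.split()
        if scope.startswith(ROLE_SCOPE_PREFIX)
    }
    # Intersect with granted roles so a scope can narrow access but never widen it.
    return requested & set(granted_roles)


active = roles_from_token_scope(
    "openid role:restricted_role role:admin",
    {"broad_role", "restricted_role"},
)
print(active)
```

In this sketch the narrowed set travels inside the token, so it is fixed at issuance and cannot be changed per request. How to treat a token with no role scopes at all (full union or no access) is left open here and would need an explicit policy decision.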


bharos · 2026-04-30T04:00:12Z

bharos
Apr 30, 2026
Collaborator Author

Thanks @markhoerth for looking into this.
The use case I want to solve is similar to what I mentioned above, i.e., broad-access-role and restricted-access-role.
Imagine a case where I want to expose a restricted set of tables (and nothing more) to a specific role, restricted-access-role.

For the second part of your question, I did consider passing the group via the OAuth token using specific scopes, but AFAICT the IdP (Azure in our case) doesn't allow this. It just sends all the groups (up to 200) that the user belongs to, and we can't choose a narrower set of groups based on scope. If this understanding is incorrect, then that would be a potential option as well.

2 replies

markhoerth Apr 30, 2026
Collaborator

Hi @bharos, could you say more? If broad-access-role has 100 tables, and restricted-access-role has 5, then the identity has access to all 105 tables. Why is it valuable to reduce this user's capability to just the 5?

bharos Apr 30, 2026
Collaborator Author

Least-privilege enforcement at runtime. A specific workload should only touch 5 tables even though the identity can touch 105. If that workload misbehaves (bug, misconfigured join, compromised credential), the blast radius is bounded. This is the core rationale behind Snowflake's USE ROLE and Hive's SET ROLE — both exist precisely for this scenario.
Audit and compliance. When a pipeline operates under restricted_role, any access outside those 5 tables is a hard deny. You get a clean audit signal: "this workload attempted out-of-scope access" vs. "it silently succeeded and we have to figure out intent from logs after the fact." Some data governance policies require explicit opt-in to elevated access.
Migration parity. For context — SET ROLE is an existing capability in Hive with SQL Standard Authorization. Teams migrating table formats to Iceberg (which goes through Gravitino's REST catalog) would lose this capability. Being able to narrow the active role via the header approach restores parity without requiring engine-side changes.

To directly answer your question: it's not about the user wanting less access, it's about the workload operating with only the access it needs.
