Thanks for this discussion. We have considered this question, but the compute engines (Spark and Flink) don't support similar operations, so I have not added support for it.
Thanks, @roryqi. Your observation is correct, although I think the header approach I mentioned above specifically avoids needing compute engine support: no Spark/Flink/Trino code changes are needed, because the role is set at catalog configuration time, not via SQL. This covers the practical use case where different applications or service accounts operate under different roles (e.g., a reporting pipeline runs with read_only_role, while an ETL pipeline uses write_role). The drawback is that the role is static per catalog instance, not dynamic per session, but we can use it as a workaround for now for at least some use cases.
Thanks for raising this, @bharos. A few things would help the discussion reason about this more concretely:
Thanks, @markhoerth, for looking into this. For the second part of your question: I did consider passing the group via the OAuth token using specific scopes, but AFAICT the IdP (Azure in our case) doesn't allow this. It sends all the groups (up to 200) that the user belongs to, and we can't narrow the set of groups based on scope. If this understanding is incorrect, then that would be a potential option as well.
Problem
When a user holds multiple roles, Gravitino evaluates all of them as a union for every authorization check. There is no way to restrict evaluation to a subset of roles for a given request or session.
Example: A user has broad_role (100 tables) and restricted_role (5 tables). Today the user always sees all 105 tables. We want the ability to operate under only restricted_role's permissions, similar to Snowflake's USE SECONDARY ROLES or Hive's SET ROLE.
Current behavior
The authorization layer loads all roles granted to the authenticated user and evaluates them as a union. There is no mechanism to specify which role(s) should be active for a given request. This applies to both the Iceberg REST API and native Gravitino API paths.
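To make the union behavior concrete, here is a minimal sketch of the example above. It is not Gravitino's actual authorization code: the `ROLE_TABLES` mapping, the table names, and the `visible_tables` helper are all hypothetical, and the two roles are assumed to grant disjoint sets of tables.

```python
# Hypothetical role-to-table grants mirroring the example in this post.
# Table names are made up; the two roles are assumed not to overlap.
ROLE_TABLES = {
    "broad_role": {f"sales.table_{i}" for i in range(100)},
    "restricted_role": {f"finance.table_{i}" for i in range(5)},
}


def visible_tables(granted_roles, role_tables=ROLE_TABLES):
    """Union the tables of every role passed in.

    This models the current behavior described above: all granted roles
    are always evaluated together.
    """
    visible = set()
    for role in granted_roles:
        visible |= role_tables.get(role, set())
    return visible


# The user holds both roles, so both are always evaluated.
all_visible = visible_tables(["broad_role", "restricted_role"])
print(len(all_visible))
```

With both roles evaluated the user sees all 105 tables. The request in this discussion is for a way to evaluate only `["restricted_role"]`, which in this sketch would yield 5.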
Possible approach
Support an optional HTTP header (e.g., X-Gravitino-Active-Role) on Iceberg REST and native API requests:
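As a rough illustration of how a server might interpret such a header, here is a sketch under assumed semantics that have not been agreed on in this discussion: an absent or empty header keeps today's behavior (all granted roles), the value is a comma-separated list of role names, and naming a role the user does not hold is rejected. The function name `resolve_active_roles` and the use of `PermissionError` are hypothetical.

```python
ACTIVE_ROLE_HEADER = "X-Gravitino-Active-Role"  # header name from the proposal


def resolve_active_roles(headers, granted_roles):
    """Return the set of roles to evaluate for one request.

    Assumed semantics (not confirmed by the project):
      - header absent or empty: evaluate all granted roles (current behavior)
      - header present: comma-separated role names, each must be granted
      - any requested role the user does not hold: raise PermissionError
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get(ACTIVE_ROLE_HEADER.lower(), "")
    requested = {part.strip() for part in raw.split(",") if part.strip()}

    granted = set(granted_roles)
    if not requested:
        return granted

    not_granted = requested - granted
    if not_granted:
        # A user must never gain privileges by naming a role they lack.
        raise PermissionError(f"roles not granted to user: {sorted(not_granted)}")
    return requested
```

The key design point is that the header can only narrow the evaluated set, never widen it, so it cannot be used to escalate privileges.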
This works with Iceberg clients (Trino, Spark) via RESTSessionCatalog's header.* catalog properties, so no client code changes are needed. However, the role is static per catalog configuration. Dynamic per-session role switching (like Snowflake's USE ROLE) would require either a server-side session API or client-side changes. The header approach can cover the case where different applications or contexts operate under different roles.
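For the Spark case, the static per-catalog configuration could look roughly like the sketch below, which builds the catalog properties for two pipelines using different roles. The `spark_catalog_conf` helper, the catalog names, and the URI are placeholders I made up; `read_only_role` and `write_role` are the example roles mentioned in this thread. The `header.` property prefix is the Iceberg REST client mechanism referred to above; the server honoring the header is the part being proposed.

```python
def spark_catalog_conf(catalog_name, rest_uri, active_role):
    """Build Spark properties for an Iceberg REST catalog pinned to one role.

    Iceberg REST catalog properties prefixed with "header." are sent as HTTP
    headers on every request, so the role is fixed when the catalog is
    configured rather than chosen per session.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
        f"{prefix}.header.X-Gravitino-Active-Role": active_role,
    }


# Placeholder endpoint; substitute the real Iceberg REST endpoint.
REST_URI = "http://gravitino.example.com/iceberg"

reporting_conf = spark_catalog_conf("reporting", REST_URI, "read_only_role")
etl_conf = spark_catalog_conf("etl", REST_URI, "write_role")

for key, value in reporting_conf.items():
    print(f"{key}={value}")
```

Each pipeline would pass its own property set when starting Spark (for example as `--conf key=value` pairs), which is why the role cannot change mid-session with this approach.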
Alternative / longer-term direction
A header is a pragmatic first step. For full parity with Snowflake's USE ROLE — where users dynamically switch roles mid-session via SQL — the Iceberg REST catalog server would need session-aware role state, and engine connectors (Trino, Spark) would need to propagate SET ROLE to the server. That's a larger cross-project effort.
I'm interested in gathering the community's thoughts on this and hearing whether other approaches are being considered.