OTel collector UDS socket permissions not configurable (datadog/agent_receiver created with 755) #49575

@shaikatzz

Summary

When the Datadog agent with embedded OTel collector is configured to use a Unix Domain Socket via the datadog/agent_receiver component with transport: unix, the socket is created with default permissions srwxr-xr-x (755). There is no way to configure these permissions in the receiver config or anywhere else in the agent/Helm chart.

Problem

In our Kubernetes setup, the OTel socket is exposed on a hostPath volume so that application pods running on the same node can send traces/metrics to the agent via UDS. For this to work, the socket must be writable by other users (mode 722 or similar), because on Linux connecting to a Unix socket requires write permission on the socket file. The receiver always creates it with 755 (readable and executable, but not writable, by group and others), so clients running as a different user fail with EACCES.
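To make the permission argument concrete, here is a minimal Go sketch that checks whether a given mode lets "other" users connect. The helper name `othersCanConnect` is hypothetical and only looks at the socket's own mode bits; it assumes the client is neither the owner nor in the owning group, and it ignores directory search permissions, ACLs, and capabilities.

```go
package main

import (
	"fmt"
	"os"
)

// othersCanConnect reports whether the "others" write bit is set.
// On Linux, connect() on a pathname Unix socket needs write permission
// on the socket file, so this bit decides access for unrelated users.
// Hypothetical helper for illustration only.
func othersCanConnect(mode os.FileMode) bool {
	return mode.Perm()&0o002 != 0
}

func main() {
	// 0755 is what the receiver creates today; 0722 is what we need.
	for _, m := range []os.FileMode{0o755, 0o722} {
		fmt.Printf("%04o others can connect: %v\n", uint32(m), othersCanConnect(m))
	}
}
```

With 755 the others-write bit is clear, which matches the EACCES we see from non-root client pods.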

Current receiver config (no permissions knob available):

datadog/agent_receiver_uds:
  endpoint: /var/run/datadog-uds/otel.socket
  transport: unix
  read_timeout: 60s
  # permissions: "0722"  <-- this field does not exist

As a workaround we are using a postStart lifecycle hook on the agent container that polls and re-applies chmod 722 in a background loop:

timeout 30 sh -c '
  # Nothing to do if the socket directory is not mounted.
  [ -d /var/run/datadog-uds ] || exit 0
  # Remove a stale socket file that no process is listening on.
  if [ -S /var/run/datadog-uds/otel.socket ] && [ -r /proc/net/unix ] && \
     ! grep -Fq "/var/run/datadog-uds/otel.socket" /proc/net/unix; then
    rm -f /var/run/datadog-uds/otel.socket
  fi
  # Wait for the receiver to create the socket, then open it up.
  while [ ! -S /var/run/datadog-uds/otel.socket ]; do sleep 1; done
  chmod 722 /var/run/datadog-uds/otel.socket
  # Re-apply the mode every 5s in case the socket is recreated.
  (while true; do
    sleep 5
    [ -S /var/run/datadog-uds/otel.socket ] && chmod 722 /var/run/datadog-uds/otel.socket 2>/dev/null
  done) &
' || true

This is fragile because:

  • There is a small window between socket creation and the first chmod where clients fail with EACCES.
  • The background loop re-applying chmod every 5s is a hack: if the agent restarts the OTel component and recreates the socket, consumers get EACCES errors for up to 5 seconds until the next chmod runs.
  • It requires privileged/root access in a postStart step just to fix permissions the receiver should own.

Feature Request

Please add a permissions (or socket_permissions) field to the datadog/agent_receiver component so the socket is created with the correct mode from the start:

datadog/agent_receiver_uds:
  endpoint: /var/run/datadog-uds/otel.socket
  transport: unix
  read_timeout: 60s
  permissions: "0722"

This would eliminate the need for any external chmod and make the socket usable by non-root client pods immediately upon creation.

Environment

  • Datadog Agent version: 7.77.0 (custom OTel image)
  • Deployment: Kubernetes via Helm chart (datadog/datadog)
  • OTel collector: embedded (useStandaloneImage: false)

Labels: oss/0, pending, team/opentelemetry-agent