
Track macOS platform support gaps in ADP #1718

@thieman


Context

ADP targets Linux as its first-class platform. macOS builds compile and run, but several features degrade silently or are disabled entirely. This issue catalogs the known gaps with enough detail to act on them.

Related issues: #580 (CI build/test), #625 (flaky process-memory test on macOS).


What already works on macOS

  • PlatformSettings (lib/datadog-agent-commons/src/platform/macos_impl.rs) exists: default config dir (/opt/datadog-agent/etc), log dir, and syslog URI (unixgram:///var/run/syslog) are all defined.
  • process-memory has a darwin module using Mach task APIs — compiles and works.
  • UDS (Unix domain sockets) compile and run via the #[cfg(unix)] path.
  • Syslog works via the #[cfg(unix)] path.
  • Containerd collector is not Linux-gated; it populates ContainerPid → Container alias entries in the tag store on macOS when containerd is running.

Gap 1: UDS origin detection — a platform limitation, not just a missing implementation

Files: lib/saluki-io/src/net/unix/non_linux.rs, lib/saluki-io/src/net/addr.rs, lib/saluki-components/src/sources/dogstatsd/mod.rs

The platform constraint

On Linux, ADP uses SO_PASSCRED + SCM_CREDENTIALS ancillary data: each recvmsg delivers the sender's {pid, uid, gid} as out-of-band data in the ucred struct. The PID is what the origin detection pipeline needs — it's what gets set via origin.set_process_id() and is used to resolve a container ID.

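On macOS, the analogous APIs are getsockopt(LOCAL_PEERCRED) (returns xucred) and getpeereid(). Both give uid and gid only; neither provides a PID. macOS also exposes getsockopt(LOCAL_PEERPID), but it reports the peer of a connected stream socket once per connection, and as far as we know macOS has no per-message equivalent of SCM_CREDENTIALS for datagram sockets. Per-packet, PID-based origin detection on datagram UDS is therefore not possible on macOS.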

What the code does today

  • enable_uds_socket_credentials is a no-op (returns Ok(()) without setting any socket option).
  • uds_recvmsg in non_linux.rs calls recvmsg via socket2 without an ancillary data buffer, so the returned ConnectionAddress always carries ProcessCredentialsError::UnsupportedPlatform.

The real symptom: misleading error telemetry

When dogstatsd_origin_detection: true is configured with a UDS listener on macOS, origin_detection_enabled is true. Every received packet has peer_addr.has_process_credential_error() == true (because the identity is ProcessIdentity::Error(UnsupportedPlatform)), so the dogstatsd_uds_origin_detection_errors counter increments on every single packet. This looks like a continuous failure stream in metrics even though there's no bug — ADP just can't get the PID on this platform.

Recommended fix

The UnsupportedPlatform case is semantically different from InvalidCredentials and ZeroPid: it's not a per-message failure, it's a static platform capability gap. The error counter should only fire for InvalidCredentials and ZeroPid. UnsupportedPlatform should be treated as Unavailable for telemetry purposes, or the counter should be gated to exclude it. Optionally, a one-time startup warning when dogstatsd_origin_detection: true is configured on a non-Linux platform would make this legible without spamming metrics.
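
The gating described above could look roughly like the following. This is a minimal sketch, not ADP's actual code: the three error variants come from this issue, but the exact shape of ProcessIdentity (in particular the Pid variant) and the helper names counts_as_origin_detection_error and should_warn_origin_detection_unsupported are hypothetical.

```rust
/// Error variants named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessCredentialsError {
    InvalidCredentials,
    ZeroPid,
    UnsupportedPlatform,
}

/// Assumed shape of the peer identity; the `Pid` variant is an illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessIdentity {
    Pid(u32),
    Unavailable,
    Error(ProcessCredentialsError),
}

impl ProcessIdentity {
    /// Hypothetical helper: true only for per-message failures.
    /// `UnsupportedPlatform` is a static capability gap, so it is excluded.
    pub fn counts_as_origin_detection_error(&self) -> bool {
        matches!(
            self,
            ProcessIdentity::Error(
                ProcessCredentialsError::InvalidCredentials | ProcessCredentialsError::ZeroPid
            )
        )
    }
}

/// Hypothetical startup check: warn once when origin detection is requested
/// on a platform that cannot deliver per-packet PIDs.
pub fn should_warn_origin_detection_unsupported(
    origin_detection_enabled: bool,
    pid_credentials_supported: bool,
) -> bool {
    origin_detection_enabled && !pid_credentials_supported
}
```

With a helper like this, the source would increment dogstatsd_uds_origin_detection_errors only when counts_as_origin_detection_error() returns true, and would evaluate the startup check once, for example with cfg!(target_os = "linux") as the second argument.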


Gap 2: On-demand PID resolver — intentionally noop, but the containerd alias path works

File: lib/saluki-env/src/workload/on_demand_pid.rs

The OnDemandPIDResolver has two implementations:

  • Linux (Inner::Linux): on-demand procfs/cgroups walk — given a PID, reads /proc/<pid>/cgroup to find the container ID. This is a fast fallback for processes that appeared between collector ticks.
  • Non-Linux (Inner::Noop): always returns None. This is intentional and correct — there is no procfs on macOS.
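
The two implementations can be pictured with a sketch like the one below. It is an illustration only: PidResolverSketch and container_id_from_cgroup are hypothetical names, and the parsing assumes container IDs appear as 64-character hex strings in the cgroup path (optionally wrapped as `<prefix>-<id>.scope`), which may not match every runtime.

```rust
use std::fs;

/// Hypothetical parser: find a 64-hex-character container ID in the contents
/// of a `/proc/<pid>/cgroup` file. Each line is `hierarchy:controllers:path`.
pub fn container_id_from_cgroup(contents: &str) -> Option<String> {
    for line in contents.lines() {
        let Some(path) = line.splitn(3, ':').nth(2) else {
            continue; // malformed line, skip it
        };
        for segment in path.split('/') {
            // Handle systemd-style names such as `docker-<id>.scope`.
            let trimmed = segment.strip_suffix(".scope").unwrap_or(segment);
            let candidate = trimmed.rsplit('-').next().unwrap_or(trimmed);
            if candidate.len() == 64 && candidate.chars().all(|c| c.is_ascii_hexdigit()) {
                return Some(candidate.to_string());
            }
        }
    }
    None
}

enum Inner {
    Linux,
    Noop,
}

/// Stand-in for the resolver described above, not the real type.
pub struct PidResolverSketch {
    inner: Inner,
}

impl PidResolverSketch {
    pub fn for_current_platform() -> Self {
        let inner = if cfg!(target_os = "linux") { Inner::Linux } else { Inner::Noop };
        Self { inner }
    }

    pub fn noop() -> Self {
        Self { inner: Inner::Noop }
    }

    /// Linux: read procfs on demand. Elsewhere: always `None`.
    pub fn resolve(&self, pid: u32) -> Option<String> {
        match self.inner {
            Inner::Linux => {
                let contents = fs::read_to_string(format!("/proc/{pid}/cgroup")).ok()?;
                container_id_from_cgroup(&contents)
            }
            Inner::Noop => None,
        }
    }
}
```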

However, the tag store has a separate, cross-platform alias mechanism. The containerd collector (which is not Linux-gated) builds ContainerPid → Container alias entries via MetadataOperation::add_alias as containers start. When get_tags_for_entity(ContainerPid(pid)) is called on the tag store, the alias lookup redirects to the container entity and returns its tags. This path works on macOS when containerd is running.
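
To make the alias redirect concrete, here is a small sketch of the lookup. The type TagStoreSketch, its fields, and the simplified EntityId enum are hypothetical stand-ins; only the ContainerPid → Container aliasing behavior and the method names add_alias and get_tags_for_entity are taken from the description above, and the real signatures likely differ.

```rust
use std::collections::HashMap;

/// Simplified entity identifiers, assumed for illustration.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EntityId {
    Container(String),
    ContainerPid(u32),
}

/// Hypothetical in-memory store: aliases redirect one entity to another.
#[derive(Default)]
pub struct TagStoreSketch {
    aliases: HashMap<EntityId, EntityId>,
    tags: HashMap<EntityId, Vec<String>>,
}

impl TagStoreSketch {
    /// Register `alias` as another name for `target`.
    pub fn add_alias(&mut self, alias: EntityId, target: EntityId) {
        self.aliases.insert(alias, target);
    }

    pub fn set_tags(&mut self, entity: EntityId, tags: Vec<String>) {
        self.tags.insert(entity, tags);
    }

    /// Follow one level of aliasing, then return the target's tags.
    pub fn get_tags_for_entity(&self, entity: &EntityId) -> Option<&[String]> {
        let resolved = self.aliases.get(entity).unwrap_or(entity);
        self.tags.get(resolved).map(Vec::as_slice)
    }
}
```

A lookup for a PID with no registered alias returns None, which is the window the Linux on-demand resolver exists to cover.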

The OnDemandPIDResolver noop on macOS is therefore acceptable: the alias path in the tag store covers the steady-state case. The only thing lost is the on-demand fallback for PIDs that arrived before the collector had a chance to register the alias. No code change is needed here, but this interaction should be documented.

The practical bottleneck remains Gap 1: we can't extract a PID from the socket on macOS in the first place.


Gap 3: Cgroups collector — correctly Linux-only

Files: lib/saluki-env/src/workload/collectors/mod.rs, bin/agent-data-plane/src/internal/env/workload/mod.rs

CgroupsMetadataCollector is #[cfg(target_os = "linux")] and absent on macOS. The cgroups collector maps cgroup controller inodes to container IDs — this is a Linux kernel concept with no macOS equivalent. On macOS, the containerd collector entirely covers the alias-mapping role. No action needed.


Gap 4: UDP autoscaling — minor, handled gracefully

File: lib/saluki-components/src/sources/dogstatsd/mod.rs:621

SO_REUSEPORT with kernel hash-based load balancing across multiple socket descriptors is Linux-specific. If dogstatsd_autoscale_udp_listeners: true is set, ADP logs a warning on macOS and falls back to a single UDP socket. The fallback is functional. socket_reuseport_supported() in non_linux.rs always returns false, which is correct. No action needed.
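
The fallback logic amounts to something like the sketch below. UdpListenerPlan, plan_udp_listeners, and the requested parameter are hypothetical; the real code at the location above may be structured differently, and socket_reuseport_supported is re-implemented here with cfg! purely for illustration.

```rust
/// Illustrative stand-in: kernel-balanced SO_REUSEPORT is treated as Linux-only.
pub fn socket_reuseport_supported() -> bool {
    cfg!(target_os = "linux")
}

/// Hypothetical result of deciding how many UDP sockets to bind.
#[derive(Debug, PartialEq, Eq)]
pub struct UdpListenerPlan {
    pub listeners: usize,
    pub warning: Option<&'static str>,
}

/// Decide the listener count. `requested` is the number of sockets the
/// autoscaler would like to open when autoscaling is usable.
pub fn plan_udp_listeners(
    autoscale: bool,
    reuseport_supported: bool,
    requested: usize,
) -> UdpListenerPlan {
    if !autoscale {
        return UdpListenerPlan { listeners: 1, warning: None };
    }
    if !reuseport_supported {
        // Graceful degradation: a single socket plus a logged warning.
        return UdpListenerPlan {
            listeners: 1,
            warning: Some("UDP listener autoscaling is unsupported on this platform; using one socket"),
        };
    }
    UdpListenerPlan { listeners: requested.max(1), warning: None }
}
```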


Gap 5: jemalloc — cosmetic

File: bin/agent-data-plane/src/main.rs

jemalloc (with background threads, stats, and the tracking allocator) is gated on target_os = "linux". macOS falls back to std::alloc::System with just the tracking wrapper. No memory stats from jemalloc on macOS. Not relevant for production; low priority.
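
For readers unfamiliar with the pattern, a tracking wrapper over an arbitrary allocator can be sketched as follows. This is a generic illustration using only the standard library, not ADP's tracking allocator; TrackingAllocator and allocated_bytes are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps any allocator and keeps a running count of live allocated bytes.
pub struct TrackingAllocator<A> {
    inner: A,
    allocated: AtomicUsize,
}

impl<A> TrackingAllocator<A> {
    pub const fn new(inner: A) -> Self {
        Self { inner, allocated: AtomicUsize::new(0) }
    }

    pub fn allocated_bytes(&self) -> usize {
        self.allocated.load(Ordering::Relaxed)
    }
}

unsafe impl<A: GlobalAlloc> GlobalAlloc for TrackingAllocator<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { self.inner.alloc(layout) };
        if !ptr.is_null() {
            self.allocated.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.inner.dealloc(ptr, layout) };
        self.allocated.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// On macOS the wrapped allocator would be the system one, for example:
//
//   #[global_allocator]
//   static ALLOC: TrackingAllocator<System> = TrackingAllocator::new(System);
//
// On Linux, a jemalloc-backed allocator would take the place of `System`
// behind a `#[cfg(target_os = "linux")]` gate.
```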


Actionable work summary

| Gap | Action |
|---|---|
| dogstatsd_uds_origin_detection_errors fires on every packet on macOS | Fix: don't count UnsupportedPlatform as an error; add optional startup warning |
| PID-based origin enrichment unavailable on macOS | Document: platform limitation, not a code bug |
| enable_uds_socket_credentials is a no-op | Harmless; leave as-is or add a comment |
| OnDemandPIDResolver is noop | Document: containerd alias path covers macOS steady-state |
| Cgroups collector absent | No action needed |
| UDP autoscaling falls back | No action needed |

The highest-value item is fixing the error counter so macOS users don't see spurious dogstatsd_uds_origin_detection_errors telemetry when dogstatsd_origin_detection is enabled.

Labels

  • area/components: Sources, transforms, and destinations.
  • area/io: General I/O and networking.
  • effort/intermediate: Involves changes that can be worked on by non-experts but might require guidance.
  • type/enhancement: An enhancement in functionality or support.