Senior ML Engineer at Red Hat AI Inference Engineering, focused on Kubernetes-native distributed LLM inference.
Working on llm-d · KServe · Gateway API Inference Extension · vLLM
Interests
P/D disaggregation · inference scheduling · KServe LLMInferenceService · EPP scorers · KEDA autoscaling


