About the Role
We are seeking an experienced Observability Engineer to build and enhance real-time monitoring and logging capabilities, starting with the Surveys and Marketplace Catalog-air services. The engineer will implement OpenTelemetry-based observability, design and extend SDKs for consistent instrumentation, stream telemetry data into AWS-native tools, and create real-time health dashboards. This role is part of a high-visibility initiative to ensure application health, operational transparency, and proactive alerting for mission-critical, customer-facing systems.
Responsibilities
- Instrument backend services using OpenTelemetry SDKs for logs, traces, and metrics.
- Develop and extend observability SDKs/libraries for consistent instrumentation across services.
- Integrate observability data pipelines with the FOCUS framework.
- Configure and manage Amazon OpenSearch Service, Amazon QuickSight, and Amazon Kinesis Data Streams / Kinesis Data Analytics (KDA).
- Build and deploy QuickSight dashboards for service health monitoring.
- Implement near-real-time alerting and automated escalation mechanisms.
- Extend monitoring to additional services (e.g., Catalog-air).
- Define performance baselines and set up anomaly detection rules.
- Collaborate with backend and DevOps teams to ensure secure and scalable observability pipelines.
- Document runbooks, observability architecture, and onboarding guides.