Staff Site Reliability Engineer (New York) Job at Altana, New York, NY

  • Altana
  • New York, NY

Job Description

Base pay range

$170,000.00/yr - $220,000.00/yr

AI can be a powerful tool for good in the world. At Altana, we apply AI to the world's largest organized body of supply chain data to power a more resilient, more secure, and more sustainable model of global commerce. Our customers connect to the Altana network to build resilience for critical industries and infrastructure, automate and safeguard cross-border trade, transform insurance underwriting, protect national security, combat modern slave labor, disrupt fentanyl trafficking, and ensure that their products are sustainable.

Altana is backed by leading investors and used by the world's most important organizations, including Lloyd's, Maersk, multiple government agencies across the US, UK, EU, Singapore, and Australia, General Atomics, Boston Scientific, and more. We are building a global platform connecting the public and private sectors into an AI-powered network for building trusted supply chains. We operate in accordance with our values: we focus on value creation, not capture; we foster diversity and embrace difference; we embrace reality; we get things done; we amaze our clients. When you join Altana, you'll be joining a vibrant, collaborative team working together to solve complex problems with the potential for global societal impact.

The Opportunity at Altana

At Altana, we believe that software that ships must be reliable and efficient. As a Staff Site Reliability Engineer, you will be instrumental in ensuring the availability, performance, and scalability of Altana's critical production services, with a strong focus on our cloud-native environments and data pipelines. You will apply Google-style SRE principles, embedding reliability into our architecture and operations through automation, proactive monitoring, and a commitment to reducing toil.

You will work hands-on with engineering teams, influencing system design for operability and contributing to the development of robust, self-healing infrastructure. This role emphasizes a deep understanding of observability practices to gain comprehensive insights into system behavior, proactive incident prevention, and efficient incident response. Success will be measured by the resilience of our production systems, the effectiveness of our observability stack, and our continuous improvement in operational efficiency and reliability.

Your Responsibilities

  • Reliability Engineering: Champion and implement SRE principles, including establishing and monitoring Service Level Objectives (SLOs) and error budgets for critical services. Drive initiatives to improve system reliability, availability, performance, and efficiency.
  • Observability & Monitoring: Design, implement, and maintain advanced monitoring, logging, and tracing solutions for our cloud-native applications and infrastructure (e.g., Kubernetes, microservices). Develop dashboards, alerts, and runbooks that provide deep insights into system health and behavior.
  • Automation & Toil Reduction: Identify and automate repetitive operational tasks and manual processes across our production environment. Develop tools and scripts to enhance system operations, deployment pipelines, and incident response.
  • Incident Management & Postmortems: Actively participate in the incident response lifecycle, including detection, triage, mitigation, and resolution of production issues. Lead thorough blameless postmortems to identify root causes and implement preventative measures and lasting improvements.
  • System Design & Optimization: Collaborate closely with development teams to influence the design of new services, ensuring they are built for operability, reliability, and cost-efficiency. Proactively identify and address performance bottlenecks and architectural weaknesses.
  • On-Call Rotation: Participate in a periodic on-call rotation, responding to critical alerts and ensuring rapid resolution of production incidents.
  • Data Reliability: Implement and maintain reliability and observability for critical data pipelines and data infrastructure, ensuring data integrity, availability, and timely processing.

About You

  • 5+ years of hands-on experience in a Site Reliability Engineering (SRE), DevOps, or equivalent role focusing on production system reliability and operations.
  • Strong understanding and practical application of Site Reliability Engineering (SRE) principles, including SLOs, error budgets, toil reduction, and blameless culture.
  • Expertise in designing, implementing, and managing observability platforms for cloud-native environments (e.g., Prometheus, Grafana, Datadog, ELK stack, OpenTelemetry, Jaeger).
  • Proficiency in at least one programming/scripting language (e.g., Python, Go) for automation and tool development.
  • Extensive hands-on experience with cloud platforms (AWS, Azure, or GCP), including their compute, networking, and database services.
  • Demonstrated experience with containerization technologies (Docker) and container orchestration platforms (Kubernetes).
  • Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, OpenTofu, CloudFormation) for managing cloud resources.
  • Proven experience participating in and improving incident management processes for critical systems.
  • Knowledge of modern software delivery paradigms, including microservices architectures and CI/CD pipelines.
  • Excellent problem-solving, analytical, and troubleshooting skills in complex distributed systems.
  • Strong communication and collaboration skills, with the ability to work effectively across engineering teams.
  • Experience with data engineering concepts, including building or operating reliable data pipelines, data streaming technologies, or managing large-scale data infrastructure.

This role can be based in New York City or the San Francisco Bay Area, with an expectation of occasional travel as needed.

US Salary Range And Benefits

$170,000 - $220,000

Benefits

  • Flexible Time Off: Altana operates with a Flexible Time Off (FTO) policy that gives you agency over your own time off so you can maximize your work-life balance.
  • Parental Leave: We offer industry-leading Paid Parental Leave (PPL), providing 14 weeks of leave for non-birthing, adoptive, and foster parents and up to 26 weeks of leave for birthing parents, all paid at 100% of your base salary.
  • Health Benefits: We have a full suite of medical, vision, and dental benefits with generous employer contributions, designed to give you flexibility and choice for your individual health situation.
  • Supplemental Benefits: Altana provides life, short- and long-term disability, and AD&D insurance coverage, all at no cost to you.
  • 401(k) Savings: Save for and invest in your future using our Guideline 401(k) retirement savings program.
  • Commuter Benefits: Save money on your commute by setting aside pre-tax funds for public transit or parking.
  • Wellness: Every Altana employee has access to a free premium subscription to Calm, the #1 app for meditation, sleep, and mindfulness.
  • Pet Insurance: Pets are family too! Keep them healthy with Wishbone insurance and/or our Total Pet vet service and telehealth discount plan.
  • Employee Assistance Program: Free access to confidential personal support.
  • Dependent Care FSA: You will have access to a Dependent Care FSA, which allows you to set aside pre-tax funds for childcare expenses.

The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process.

Equal Opportunity Statement

At Altana, we believe that a diverse workforce enables greater creativity, performance, and adaptability. We're proud to be an equal opportunity employer and welcome you to join us as you are. Our employment opportunities and decisions are based on business needs and individual qualifications, without regard to race, color, religious creed, national origin, ancestry, age, physical or mental disability, medical condition, marital status, sexual orientation, gender identity or expression, genetic information, family care or medical leave status, military or veteran status, or any other characteristic protected by the laws or regulations in the areas in which we operate. We prohibit discrimination and harassment of any type, in any situation.

Why it's great to work at Altana

  • We love to collaborate, and we win as a team!
  • We are committed to engineering excellence.
  • We value personal and professional development.
  • We learn from diverse backgrounds and perspectives.
  • We impact the world, from enabling developing countries to identifying drug traffickers.

Job Tags

Full time, Flexible hours
