Data Engineer

Design scalable data pipelines to support AI model training and evaluation
San Francisco, California, United States
Senior
$180,000 – $225,000 USD / year
1 week ago
Scale AI

A technology firm specializing in artificial intelligence and machine learning data annotation for various industries such as automotive and retail.

Software is eating the world, but AI is eating software. We live in unprecedented times: AI has the potential to exponentially augment human intelligence. Every person will have a personal tutor, coach, assistant, personal shopper, travel guide, and therapist throughout life. As the world adjusts to this new reality, leading platform companies are scrambling to build LLMs at billion scale, while large enterprises figure out how to add them to their products. To be safe, aligned, and actually useful, these models need human evaluation and reinforcement learning from human feedback (RLHF) during pre-training, fine-tuning, and production evaluations. This is the main innovation that has enabled ChatGPT to get such a large head start over the competition.

At Scale, our Generative AI Data Engine powers the most advanced LLMs and generative models in the world through world-class RLHF, human data generation, model evaluation, safety, and alignment. Producing this data is some of the most important work shaping how humanity will interact with AI.

The Data Analytics team is responsible for centralized data, experimentation, and reporting across all areas of Scale. We are building out the critical data pipelines, platforms, and reporting to support data-driven decision making and strategy for the company, including support for financial reporting, experimentation, and AI-enabled insights. Team members are strong relationship builders who work in close collaboration with delivery, operations, finance, and engineering. You'll be deeply involved in building flexible new systems to support experimentation across the company, and we are looking for engineers who are relentlessly curious and thrive on building systems amid ambiguity.

Responsibilities:

  • Provide critical input into the Data Engineering team's roadmap and technical direction
  • Continually improve ongoing data pipelines and simplify self-service support for business stakeholders
  • Perform regular system audits and create data quality tests to ensure complete and accurate reporting of data and metrics
  • Design, implement, and deploy data engineering frameworks
  • Manage and optimize data pipelines, warehouses and costs
  • Deliver at a high velocity and level of quality to engage our customers
  • Work across the entire product lifecycle from conceptualization through production
  • Be able and willing to multitask and learn new technologies quickly
  • Work closely with cross-functional partners such as finance, product, software engineering, and operations to identify opportunities for business impact and to understand, refine, and prioritize requirements for data engineering

Requirements:

  • 6+ years of relevant work experience in a role requiring application of data modeling, warehouse optimization, and automation skills
  • Ability to create extensible and scalable data schemas and pipelines that lay the foundation for downstream analysis using SQL and Python
  • Experience building a reliable transformation layer and pipelines from ambiguous business processes using tools such as DBT to create a foundation for data insights
  • Experience partnering with engineering and business stakeholders to automate manual data workflows
  • Experience with best practices for query and cost optimization in Snowflake
  • Strong written and verbal communication skills
  • Strong problem-solving skills and the ability to work independently or as part of a team

Nice to Haves:

  • Strong knowledge of software engineering best practices and CI/CD tooling (CircleCI)
  • Experience developing and deploying data engineering tooling
  • Excitement about working with AI technologies

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to, comprehensive health, dental, and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.

The base salary range for this full-time position in the location of San Francisco is: $180,000 - $225,000 USD
