Sep 9, 2025
Data Engineer (DBT + Spark + Argo)
LATAM
Remote
Full-time
Senior
Location: LATAM (Remote supporting US-based teams)
Job Type: Contractor (Full-Time, Remote)
Project: Data Platform Modernization (Healthcare Sector)
Time Zone: Aligned with GMT-3 (Argentina)
English Level: B2/C1
Get to Know Us
At Darwoft, we build software that drives real change. But we're more than just tech: we're people first. With a remote-first culture and a highly collaborative team spread across LATAM, we partner with global companies to co-create reliable, scalable, and impactful digital products.
We're currently working with a leading US-based healthtech platform on a major transformation of its data pipeline ecosystem, migrating legacy SQL logic into modern, scalable, cloud-based infrastructure using DBT, Spark, Argo, and AWS.
We're Looking for a Senior Data Engineer (DBT + Spark + Argo)
In this role, you will be at the core of a strategic data transformation initiative: converting monolithic SQL Server logic into a modular, testable DBT architecture, while integrating Spark for performance and Argo for orchestration. You will work with modern lakehouse table and file formats such as Apache Hudi, Apache Iceberg, and Parquet, and enable real-time analytics through Elasticsearch integration.
If you're passionate about modern data engineering and want to work in a data-driven, cloud-native, healthcare-focused environment, this is the role for you.
What You'll Be Doing
Translate legacy T-SQL logic into modular, scalable DBT models powered by Spark SQL
Build reusable and performant data transformation pipelines
Develop testing frameworks to ensure data accuracy and integrity in DBT workflows
Design and orchestrate workflows using Argo Workflows and CI/CD pipelines with Argo CD
Manage mock data and reference datasets (e.g., ICD-10, CPT), ensuring version control and governance
Implement efficient storage/query strategies using Apache Hudi, Parquet, and Iceberg
Integrate Elasticsearch for analytics by building APIs and pipelines that support indexing and querying
Collaborate with DevOps teams to optimize S3 usage, enforce data security, and ensure compliance
Work in Agile squads and participate in planning, estimation, and sprint reviews
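To give a flavor of the data-quality work described above, here is a minimal, hypothetical Python sketch of the kind of checks dbt expresses declaratively as `not_null` and `unique` tests. The table and column names (claims, ICD-10 codes) are illustrative assumptions, not drawn from a real project.

```python
# Sketch of dbt-style data tests (not_null, unique) in plain Python.
# Table/column names below are hypothetical examples.

def not_null(rows, column):
    """Return the rows where `column` is None -- failures of a not_null test."""
    return [r for r in rows if r.get(column) is None]

def unique(rows, column):
    """Return values of `column` appearing more than once -- failures of a unique test."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)

# Hypothetical claims data with one missing ICD-10 code and one duplicate claim ID.
claims = [
    {"claim_id": "A1", "icd10_code": "E11.9"},
    {"claim_id": "A2", "icd10_code": None},
    {"claim_id": "A2", "icd10_code": "I10"},
]

print(not_null(claims, "icd10_code"))  # one failing row
print(unique(claims, "claim_id"))      # ['A2']
```

In a dbt project the same rules would live in a schema YAML file and run as part of the pipeline, rather than as ad hoc scripts.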
What You Bring
Strong experience with DBT for data modeling, testing, and deployment
Hands-on proficiency in Spark SQL, including performance tuning
Solid programming skills in Python for automation and data manipulation
Familiarity with Jinja templating for building reusable DBT components
Practical experience with data lake formats: Apache Hudi, Parquet, Iceberg
Expertise in Argo Workflows and CI/CD integration with Argo CD
Deep understanding of AWS S3 data storage, performance tuning, and cost optimization
Strong command of Elasticsearch for indexing structured and unstructured data
Knowledge of ICD-10, CPT, and other healthcare data standards
Ability to work cross-functionally in Agile environments
Nice to Have
Experience with Docker, Kubernetes, and container orchestration
Familiarity with cloud-native data tools: AWS Glue, Databricks, EMR, or GCP equivalents
Prior work on CI/CD automation for data engineering workflows
Knowledge of data compliance standards: HIPAA, SOC2, etc.
Contributions to open-source projects in DBT, Spark, or data engineering frameworks
Perks & Benefits
Contractor agreement with payment in USD
100% remote work
Argentina's public holidays
English classes
Referral program
Access to learning platforms
Explore this and other opportunities at:
www.darwoft.com/careers