What We’re Looking For:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
1-3 years of experience in data engineering with a proven track record of designing large-scale, distributed data systems.
Strong expertise in Snowflake and other distributed analytical data stores.
Hands-on experience with Apache Spark, Flink, Airflow, and modern data lakehouse formats (Iceberg, Parquet).
Deep understanding of data modeling, schema design, query optimization, and partitioning strategies at scale.
Proficiency in Python, SQL, Scala, and Go or Node.js, with strong debugging and performance-tuning skills.
Experience with streaming architectures, change data capture (CDC) pipelines, and data observability frameworks.
Proficiency in deploying containerized applications (Docker, Kubernetes, ECS).
Familiarity with AI coding assistants such as Cursor, Claude Code, or GitHub Copilot.
Preferred Qualifications:
Exposure to CI/CD pipelines, automated testing, and infrastructure-as-code for data workflows.
Familiarity with streaming platforms (Kafka, Kinesis, Pulsar) and real-time analytics engines (Druid, Pinot, Rockset).
Understanding of data governance, lineage tracking, and compliance requirements in a multi-tenant SaaS platform.