Title: Data Engineer, APAC D2C (Databricks)
THAI CITIZENS ONLY - No Work Permit Provided
Long-term contract role
Hybrid, Bangkok City
We're hiring a hands-on Data Engineer to join an APAC D2C (eCommerce) data domain, building and maintaining robust pipelines and analytics-ready data models on an enterprise Databricks platform.
What you'll do
- Build API-based ingestion pipelines (OAuth, pagination, incremental loads, token refresh handling) for marketplace and digital commerce data sources.
- Develop and optimize pipelines in Databricks using PySpark + SQL (Delta Lake, Spark optimization, partitioning strategies).
- Implement and maintain a medallion architecture (bronze → silver → gold/curated layers).
- Design scalable fact/dimension data models aligned with enterprise standards to support multi-market reporting and insights.
- Own schema evolution, data quality validation, and performance tuning; troubleshoot pipeline failures and data freshness issues.
- Collaborate with the central enterprise data platform team (DXO) to meet governance and architecture standards.
- Contribute to CI/CD workflows and version-controlled development practices.
Key clarifications
1) Cloud platform (flexible)
- No strict requirement for a specific cloud provider; Databricks is the core platform for data engineering & analytics.
- Experience with Azure / AWS / GCP is beneficial but not mandatory, since most work happens inside Databricks on an enterprise platform managed by DXO.
2) Data volume & scale
- Supporting APAC D2C data across 8 markets in Asia.
- Data sources include marketplace APIs (e.g., Shopee, Lazada), digital commerce platforms, branded webstores, and related commercial datasets.
- The emphasis is not extreme-scale big data, but reliable ingestion and strong analytical data modelling for multi-market reporting.
3) Primary use cases (engineering foundations)
- Initial focus: foundational pipelines + analytics data models enabling eCommerce performance analysis.
- Typical use cases:
  - Commercial performance reporting
  - Marketplace sales analysis
  - Pricing & promotion insights
  - Category & product performance analytics
- Forecasting / advanced data science is not the primary scope (this is a data engineering foundations role).
4) Orchestration
- Current orchestration is handled using Airflow.
5) CI/CD & development workflow
- Standard Git-based workflows aligned to enterprise data platform practices managed by DXO.
- Candidates should be comfortable with version control (Git) and structured collaboration; implementation details may vary by platform standards.
What we're looking for
Must-haves
- Professional proficiency in English (speaking, reading, writing).
- 5–7 years in Data Engineering (or similar).
- Strong hands-on Databricks experience (Delta Lake, Spark optimization, partitioning).
- Proven ability building production-grade REST API ingestion pipelines.
- Strong SQL + PySpark.
- Solid experience designing fact/dimension models in enterprise environments.
- Comfortable with Git and collaborative development workflows; independent problem-solver.
Nice-to-haves
- eCommerce / marketplace data exposure; multi-country / multi-currency datasets.
- Familiarity with Power BI modelling and downstream BI needs; federated/data mesh environments.
What success looks like in this role
- Stable, automated ingestion pipelines for marketplace/digital commerce data.
- Clean, scalable silver-layer models ready for BI consumption and multi-market analytics.
- Improved freshness SLAs, fewer pipeline failures, strong cross-team collaboration (DXO + BI).