Databricks Developer
Job Description
About the Role
We are looking for a Databricks Developer to design, optimize, and maintain data systems built on Delta Lake, MLflow, and Unity Catalog. This role combines technical depth in Databricks with the ability to understand business context: you'll work with analysts, engineers, and stakeholders to ensure data is reliable, accessible, and useful. The ideal candidate has hands-on experience with PySpark and Spark SQL, can diagnose complex query performance issues, and understands both OLTP and OLAP patterns. You'll own data pipeline reliability, query performance, and schema evolution for systems handling millions of records.
Key Responsibilities
- Own Delta Lake implementation and optimization — configuration, customization, and ongoing enhancement based on business needs
- Manage MLflow workflows, including setup, user training, and continuous improvement of processes
- Implement and maintain Unity Catalog, ensuring seamless integration with existing systems and workflows
- Design and maintain Databricks schemas optimized for both operational and analytical workloads
- Write and optimize complex queries, stored procedures, and data transformation pipelines
- Monitor Databricks performance — query execution plans, resource utilization, and capacity planning
- Build automated ETL/ELT pipelines for data integration from multiple source systems
- Create dashboards and reporting solutions that enable data-driven decision making
- Implement data quality checks, validation rules, and monitoring for data pipeline reliability
- Plan and execute database migrations with zero-downtime cutover strategies
Must-Have Qualifications
- Hands-on experience with Delta Lake — configuration, customization, and troubleshooting in production environments
- Proficiency with PySpark as part of the Databricks development/operations workflow
- 3+ years of hands-on Databricks experience in production environments
- Strong SQL skills — complex queries, window functions, CTEs, and query optimization
- Experience with data modeling — star schemas, normalization, and denormalization trade-offs
- Understanding of ETL/ELT pipeline design and data quality management
- Ability to communicate data insights to both technical and non-technical stakeholders
Nice-to-Have Skills
- Databricks Certified Data Engineer Associate certification or equivalent validated credential
- Databricks Certified Machine Learning Associate certification or equivalent validated credential
- Experience with advanced Databricks features: MLflow, Unity Catalog, Databricks SQL (formerly SQL Analytics)
- Familiarity with the broader Databricks ecosystem including Spark SQL and Delta Live Tables
- Experience with real-time streaming systems (Kafka, Kinesis, or Flink)
- Knowledge of data governance frameworks and data catalog tools
Interview Tips
Technical Coding Exercise
Give a small, realistic Databricks coding challenge that tests fundamentals — clean code, edge case handling, and test writing. Time-box to 45-60 minutes.
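One possible shape for such a challenge, offered as a sketch rather than a prescribed exercise: ask the candidate to keep only the latest record per key from a feed containing duplicates and missing timestamps, then write tests for the edge cases. The example below is plain Python so it runs anywhere. The function name `latest_per_key` and the field names `id`, `updated_at`, and `v` are hypothetical, and the tie-breaking rule (first row seen wins) is an assumption you would state in the prompt. On Databricks, a candidate would more likely express the same logic with a PySpark window function or a Delta Lake merge.

```python
from typing import Dict, Iterable, List, Tuple


def _sort_key(row: Dict) -> Tuple[bool, str]:
    # Rows with a timestamp always outrank rows without one.
    # Timestamps are assumed to be ISO-8601 strings, which sort lexicographically.
    ts = row.get("updated_at")
    return (ts is not None, ts or "")


def latest_per_key(rows: Iterable[Dict]) -> List[Dict]:
    """Return one row per id: the row with the greatest updated_at.

    On ties (or when no row for an id has a timestamp), the first row seen wins.
    """
    best: Dict = {}
    for row in rows:
        key = row["id"]
        current = best.get(key)
        # Strict comparison keeps the earlier row when sort keys are equal.
        if current is None or _sort_key(row) > _sort_key(current):
            best[key] = row
    # Sort by id so the output order is deterministic.
    return sorted(best.values(), key=lambda r: r["id"])


# Hypothetical sample feed covering the edge cases a candidate should test.
sample = [
    {"id": 1, "updated_at": "2024-01-01", "v": "a"},
    {"id": 1, "updated_at": "2024-03-01", "v": "b"},  # newer duplicate
    {"id": 2, "updated_at": None, "v": "c"},          # missing timestamp
    {"id": 2, "updated_at": "2023-12-31", "v": "d"},
    {"id": 3, "updated_at": "2024-02-01", "v": "e"},
    {"id": 3, "updated_at": "2024-02-01", "v": "f"},  # exact tie
]

for row in latest_per_key(sample):
    print(row)
```

A task of this size leaves room within the time box to discuss how the candidate would handle ties, nulls, and empty input, which is usually more revealing than the happy path.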
Architecture Whiteboard
Present a system design problem relevant to Databricks. Evaluate the candidate's approach to scalability and data modeling, and how they reason about trade-offs.
Code Review Simulation
Show the candidate a Databricks pull request with both good patterns and subtle issues. Assess what they catch, how they communicate feedback, and what they prioritize.
Past Project Deep-Dive
Have the candidate walk through their most challenging Databricks project. Ask probing questions about architecture decisions, obstacles, and what they learned.
Typical Team Structure
Team Size
2-5 Databricks developers
Reports To
Engineering Manager, Tech Lead, or CTO
Collaborates With
Product Management, QA/Testing, DevOps, Design