Big Data Architect

We are looking for a Big Data Architect who will have the opportunity to develop and build a modern data quality platform that scales and processes large amounts of data of varying form, shape, and type to aid the business in assessments, strategy, and decision making. You will lead the design and architecture of advanced data engineering, pipeline, analysis, and analytical capabilities to support a variety of data quality processing needs.

Location – Pasadena, CA (Remote)

What you will do

• Hands-on design and architecture of a modern data platform built with the most advanced technologies.

• Implement methods and procedures for assessing the quality of large amounts of data in place, without moving or copying it.

• Create a streamlined data pipeline, workflow management, and a flexible metadata model for activities such as data cataloging, quality, and curation.

• Collaborate with our product and analytics teams to help further refine the platform.

• Build for scalability as we rapidly expand our data systems and technology.

• Act as a thought leader in the industry by creating written content and speaking at technical webinars and conferences.

• Be the expert on cloud and ML tools and platforms, database technologies, and data platforms used to design, build, operate, and scale our next-generation products.

• Collaborate with leadership and key stakeholders on a regular basis.


What you will need

• 8+ years of hands-on professional development experience architecting and implementing big data solutions is required.

• 5+ years of platform or product design and architecture experience using Apache Spark and Python is highly desired.

• 5+ years of cloud experience (AWS, Azure or GCP).

• Experience working in an onshore/offshore model is highly preferred.

• Strong hands-on experience with one or more ETL/pipeline tools such as Airflow or dbt.

• Experience with one or more on-prem or cloud databases such as Snowflake or Synapse, and with compute engines such as Spark or Databricks.

• Strong experience with big data tools and technologies, including Linux, Hadoop, Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce.

• Ability to architect solutions using real-time/stream processing systems (e.g., Kafka, Kafka Connect, Spark, Flink, AWS Kinesis).

• Expert-level proficiency in SQL and Scala.

• Strong experience implementing Data Quality best practices and frameworks, including MDM and Data Governance.

• Excellent verbal and written communication skills.

• This is a full-time, salaried position (no contractors please).


Apart from our competitive salary and employee stock options, our benefits include healthcare, retirement, life insurance, short- and long-term disability, unlimited paid time off, short-term incentive plans (annual bonus), and long-term incentive plans.

Apply to Join DQLabs Team

DQLabs in Action: Observe, Measure, Discover


The Modern Data Stack needs Modern Data Quality. Organizations deserve a better way to observe, measure and discover the data that matters. It’s time we eliminate the data silos created by legacy Data Observability, Data Quality and Data Discovery platforms by centralizing them into a single, agile solution. That is Modern Data Quality. That is DQLabs.

Join us for this webinar to learn how DQLabs, the Modern Data Quality Platform, eliminates critical data silos by centralizing Data Observability, Data Quality, and Data Discovery into a single, agile, AI-driven platform.


12:00 pm: Welcome & Introductions

12:05 pm: Industry Insights: Defining Modern Data Quality

12:15 pm: DQLabs in Action: Platform Showcase 

12:30 pm: Questions & Answers

12:45 pm: Close
