Big Data Architect

We are looking for a Big Data Architect who will have the opportunity to develop and build a modern data quality platform that scales and processes large amounts of data of varying form, shape, and type to aid the business in assessments, strategy, and decision making. You will lead the design and architecture of advanced data engineering, pipeline, and analytical capabilities to support a variety of data quality processing needs.

Location – Pasadena, CA (Remote)

What you will do

• Provide hands-on design and architecture for a modern data platform built with advanced technologies.

• Implement methods and procedures for assessing the quality of large amounts of data in place, without moving or copying it.

• Create streamlined data pipelines, workflow management, and a flexible metadata model for activities such as data cataloging, quality, and curation.

• Collaborate with our product and analytics teams to help further refine the platform.

• Build for scalability as we rapidly expand our data systems and technology.

• Act as a thought leader in the industry by creating written content and speaking at technical webinars and conferences.

• Be the expert on cloud, ML tools and platforms, database technologies, and data platforms, applying them to design, build, operate, and scale our next-generation products.

• Collaborate with leadership and key stakeholders on a regular basis.

Qualifications

• 8+ years of hands-on professional development experience in architecting and implementing big data solutions is required.

• 5+ years of platform or product design and architecture using Apache Spark and Python is highly desired.

• 5+ years of cloud experience (AWS, Azure or GCP).

• Experience working in an onshore/offshore model is highly preferred.

• Strong hands-on experience using one or more ETL and orchestration tools such as Airflow or dbt.

• Experience with one or more on-prem or cloud databases such as Snowflake or Synapse, and compute engines such as Spark or Databricks.

• Strong experience with big data tools and technologies, including Linux, Hadoop, Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce.

• Ability to architect solutions using real-time/stream processing systems (e.g., Kafka, Kafka Connect, Spark, Flink, AWS Kinesis).

• Expert-level proficiency with SQL and Scala.

• Strong experience implementing Data Quality best practices and frameworks, including MDM and Data Governance.

• Excellent verbal and written communication skills.

• This is a full-time, salaried position (no contractors please).

Benefits

Apart from our competitive salary and employee stock options, our benefits extend to healthcare, retirement, life insurance, short- and long-term disability, unlimited paid time off, short-term incentive plans (annual bonus), and long-term incentive plans.

Apply to Join the DQLabs Team




Data + AI Summit 2022

Come meet us at Data + AI Summit 2022, the world’s largest data and AI conference, taking place in San Francisco and virtually in a hybrid format on June 27-30, 2022. DQLabs is proud to be a sponsor of the event.

Explore the latest trends and innovations, technical sessions, and networking opportunities with AI thought leaders and data professionals from around the world.

If you want to explore or integrate a data management solution for your organization, be sure to meet DQLabs at the event.

DQLabs’ AI-augmented data quality platform gives organizations the ability to manage data smarter and realize ROI in weeks rather than months. With a data-quality-first approach powered by ML and self-learning capabilities, organizations can connect, discover, measure, monitor, remediate, and improve data quality across any type of data, all in one agile and innovative self-service platform.
