Big Data Architect
We are looking for a Big Data Architect who will have the opportunity to develop and build a modern data quality platform that scales and processes large amounts of data of varying form, shape, and type to aid the business in assessments, strategy, and decision making. You will lead and provide design and architecture for advanced data engineering, pipelines, analysis, and analytical capabilities to support a variety of data quality processing needs.
Location – Pasadena, CA (Remote)
What you will do
• Provide hands-on design and architecture for building a modern data platform using advanced technologies.
• Implement methods and procedures for assessing the quality of large volumes of data in place, without moving or copying it.
• Create streamlined data pipelines, workflow management, and a flexible metadata model for activities such as data cataloging, quality, and curation.
• Collaborate with our product and analytics teams to help further refine the platform.
• Build for scalability as we rapidly expand our data systems and technology.
• Act as a thought leader in the industry by creating written content and speaking at technical webinars and conferences.
• Be the expert on cloud and ML tools and platforms, database technologies, and data platforms used to design, build, operate, and scale our next-generation products.
• Collaborate with leadership and key stakeholders on a regular basis.
What you will bring
• 8+ years of hands-on professional experience architecting and implementing big data solutions is required.
• 5+ years of platform or product design and architecture experience using Apache Spark and Python is highly desired.
• 5+ years of cloud experience (AWS, Azure or GCP).
• Experience working in an onshore/offshore model is highly preferred.
• Strong hands-on experience with one or more ETL tools such as Airflow or dbt.
• Experience with one or more on-prem or cloud databases such as Snowflake and Synapse, and compute stores such as Spark and Databricks.
• Strong experience with big data tools and technologies, including Linux, Hadoop, Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce.
• Ability to architect solutions using real-time/stream processing systems (e.g., Kafka, Kafka Connect, Spark, Flink, AWS Kinesis).
• Expert-level proficiency in SQL and Scala.
• Strong experience implementing Data Quality best practices and frameworks, including MDM and Data Governance.
• Excellent verbal and written communication skills.
• This is a full-time, salaried position (no contractors please).
In addition to a competitive salary and employee stock options, our benefits include healthcare, retirement, life insurance, short- and long-term disability, unlimited paid time off, short-term incentive plans (annual bonus), and long-term incentive plans.