Principal data scientist – data engineering + discovery

Website Target

Description: The Data Engineering & Discovery team enables insight discovery from large amounts of data and presents those insights in a human-friendly, understandable way. The problem space we work on encompasses e-commerce search and ranking, insight discovery in big data lakes, query and general text understanding, business-domain chatbots, and other NLP tasks.

The search project presents guests with highly relevant products from Target’s inventory of several million items, based on a guest’s query expressed in English. Solr and deep learning models power our search technology.

The insight discovery project helps in understanding business phenomena using differential analysis of large data sets with a colossal number of features. For example, if one wants to compare sales between the Mountain View and Sunnyvale stores, our system returns an insight grounded in differential analysis of particular features such as demographics or time, e.g., “millennial males bought more apparel in Sunnyvale”. Insight discovery is enabled by high-speed algorithms for approximate traversal of a large feature lattice, implemented using C++ and RocksDB.

The business-domain chatbot answers questions formulated in English, e.g., “Can you tell me last week’s sales?”, based on data available in numerous database tables. The technology behind it involves complex algorithms for translating English into SQL queries.

Our team possesses a combination of deep expertise in machine learning approaches to data and text analysis as well as engineering techniques for their implementation, deployment, and maintenance in production.

Ultimately, we provide immediate, informed, relevant, personalized access to data and the insights drawn from it. Use your skills, experience and talents to be a part of groundbreaking thinking and visionary goals. You will be required to:

  • Understand and implement Natural Language Understanding systems that translate natural languages to SQL and other internal APIs/declarative languages
  • Have a background in creating signals for an NLP engine that matches customer inputs (such as e-commerce queries) to the data we have (such as the product catalog and clickstream data)
  • Design and implement algorithms/heuristics for complex optimization problems that remain high-performance as data scales, including taking ideas to production
  • Participate in sprints and design, develop, test and deploy code as part of a CI/CD environment
  • Understand how to trade off performance against quality of output when required

Qualifications:

  • PhD in math, advanced statistics, physics, operations research and/or computer science
  • 10+ years of experience deploying data science algorithms in a production environment
  • Experience designing algorithms for a relevance system such as a personalized tool, search, recommendations etc.
  • Proficient in one or more of C, C++, Java, Scala or Python
  • Working knowledge of one or more deep learning model types (RNN, CNN, etc.)
  • Experience with data wrangling, data cleansing, feature engineering, model selection, training and testing models.
  • Experience deploying production models in TensorFlow, PyTorch, etc.
  • Excellent written and verbal communication skills

To apply for this job please visit