Data Scientist – Search Engine
Search engines are a broad area of computer science, spanning data structures, distributed systems, and data science. ITHAKA’s products host content on a diverse set of topics, from science to literature, and in a range of content types, from research articles to images of artwork. Moreover, our audiences, from students to university faculty, have diverse expectations of search.
As a Data Scientist, you will help us overcome these challenges with data and models. Using your software engineering skills, you will both deliver solutions that fit within the constraints of our systems and build new systems.
Data Scientists working on Search at ITHAKA collaborate with software and quality assurance engineers and interact directly with ITHAKA business, product, and engineering leadership to identify opportunities to advance our mission.
The successful candidate will have applicable experience with billion-row datasets, a high degree of intellectual curiosity, and excellent soft skills.
Our organization and this role will provide you with opportunities few other companies can offer, including:
- Exercising your entrepreneurial muscles to solve problems for the scholarly community
- Leveraging technologies including AWS, Databricks, Spark, and the various ML stacks built on Python such as TensorFlow, PyTorch, or MXNet.
- Working on Agile teams that follow continuous deployment and test automation best practices, allowing for rapid application development and frequent deployments. We complete an average of 80 production deployments each week.
- Using the same architecture, technologies, and tools as companies like Netflix, Etsy, and Amazon.com.
- Being on the leading edge of building large-scale, cloud-delivered web applications that host hundreds of millions of sessions annually.
When joining the development team at JSTOR, you can expect to receive tool and product training. We have an excellent onboarding program, which enables new engineers to become productive very quickly. A lead will work closely with you as you begin engaging your assigned agile team. We will provide you with constant support as we work to make you comfortable in your new environment. Those in leadership roles will work tirelessly to set you up for success.
Specific Objectives and Responsibilities
The successful candidate will have applicable development experience, experience with large-scale distributed web applications and billion-row datasets, a high degree of intellectual curiosity, and excellent soft skills. We look for candidates who possess strong problem-recognition and problem-solving skills and a passion for innovation.
The primary responsibilities of our data scientists include:
- Understand the ITHAKA business as well as the businesses and organizations we engage with to advance access to scholarly materials.
- Develop relationships with teams that will consume your models. Understand their problems so that you can provide the solution they need but aren’t necessarily asking for directly.
- Help your team develop prioritized backlogs of work – enabling them to know what they are working on and to know how well they are meeting their commitments.
- Leverage language models, learning to rank, neural networks, and reinforcement learning to improve the experience of ITHAKA’s core search engine for our diverse users.
- Work with the ITHAKA team at large to develop high-quality datasets of relevance judgements for ITHAKA’s search engine.
- Manage the lifecycle of the models you create throughout the systems where they are used.
- Build frameworks and reusable code for the production of trained models.
- Work with your team, leveraging ITHAKA’s Data Pipeline and Data Warehouse, to provide evidence of the impact of your models to all levels of the organization.
Skills, Experience, and Characteristics
- Advanced degree in a quantitative discipline such as Statistics, Economics, Mathematics, Computer Science, or equivalent
- 3+ years of hands-on experience building statistical and machine learning models using common data science tools such as Python, Spark, NumPy, SciPy, scikit-learn, TensorFlow, and R
- Exceptional communication skills, with experience presenting complex results in easy-to-understand ways to diverse audiences
- Experience with multi-billion-row datasets and with leading projects that span the disciplines of data science and data engineering
- Experience with relational databases, data warehousing, and NoSQL technologies such as Postgres, MySQL, Oracle, AWS Redshift, AWS Athena, Elasticsearch, and DynamoDB
- Proficiency in query languages and data APIs such as SQL, Spark Datasets, pandas, and Lucene/Elasticsearch/Solr
- Experience building robust data ingestion pipelines
- Intellectual curiosity, entrepreneurial drive, innovative thinking, and problem-solving skills
- Ability to determine what is needed for a minimum viable dataset or minimum viable predictive model, and how to iterate on that dataset or model for continuous improvement
Work for ITHAKA
Our team is passionate about our mission and about supporting one another. We enjoy working together to create opportunities for people to learn and grow out in the world, and we bring that same commitment to helping our teammates develop in their careers and their lives. One of our core values is belonging. We embrace differences and believe that the things that make each of us unique are the things that help us see new insights and build better solutions.
Learn more about Working at ITHAKA.
Interested candidates can submit their resume, a detailed cover letter, and salary requirements.
We are proud to be an Equal Opportunity/Affirmative Action employer. All qualified applicants receive consideration for employment without regard to race, color, sex, national origin, gender identity, sexual orientation, age, religion, domestic violence victim status, veteran status, disability, history of disability or perceived disability, or other status protected by law.
To apply for this job please visit recruiting.ultipro.com.