Job title: Data Science Engineer
Company: OCLC, Inc.
Job description: You have a life. We like that about you.
At OCLC, we believe you’ll do the best work of your life when you’re living the best life possible.
We work hard to build the technology that connects thousands of today’s libraries. But we also work hard to make a job at OCLC a meaningful part of a balanced life, not a substitute for one.
Technology with a Purpose. OCLC supports thousands of libraries in making information more accessible and more useful to people around the world. OCLC provides shared technology services, original research and community programs that help libraries meet the ever-evolving needs of their users, institutions, and communities. With office locations around the globe, OCLC employees are dedicated to offering premier services and software to help libraries.
The Job Details are as follows:
OCLC’s Data Science team is finding new ways to make the world’s libraries more effective for their users. Established a year ago, the team is already at work on models that scan our data sources to identify duplicative data and on a model that transliterates records from Latin to Cyrillic. Our team has expertise in machine learning, math and statistics, and we are now hiring a Data Scientist with a strong background in data engineering and data management.
This role will help the team prepare the billions of rows of data that we often work with, as well as write and review the models that take advantage of that data. We’re standing up our team’s technology stack (Databricks and Snowflake) and need someone who is well versed in Spark and Python.
What you will do:
- Build pipelines to transform data
- Develop processes and tools to monitor and analyze model performance and data accuracy
- Mine and analyze data from our multi-petabyte data ecosystem to drive optimization and improvement of product development, marketing techniques and business strategies
- Assess the effectiveness and accuracy of new data sources and data gathering techniques
- Develop custom data models and algorithms to apply to data sets
- Use predictive modeling to improve and optimize customer experiences, revenue generation, ad targeting and other business outcomes
- Coordinate with different functional teams to implement models and monitor outcomes
Qualifications
- Master’s degree in Statistics, Mathematics, Computer Science or a similar field, or an equivalent combination of education and experience
- 4+ years of demonstrated experience in big data, statistics and model development, working with statistical packages and machine learning tools (Pandas, PyTorch, TensorFlow, etc.)
- Strong experience with ETL processes using either PySpark or Scala
- Experience building testable production pipelines
- Experience with: Databricks, Snowflake, Git, AWS, Agile, CI/CD
- Strong problem-solving skills with an emphasis on product development
- Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks
- Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience applying those techniques to build new product features
- Excellent written and verbal communication skills for coordinating across teams
Expected salary:
Location: Dublin, OH
Job date: Sat, 09 Jul 2022 22:22:37 GMT