Leidos

Data Scientist

2 days ago
Remote

Description

Leidos has an immediate opening for a Data Scientist to contribute to public health projects as part of a cross-functional team supporting the National Center for Immunization and Respiratory Diseases (NCIRD) at the Centers for Disease Control and Prevention (CDC). The Data Scientist will analyze structured and unstructured datasets to determine data relationships, model datasets for loading and storage in relational databases, and create data pipelines and ETL processes using the R environment and the Azure cloud stack. The Data Scientist may also contribute to data analysis and data visualization projects, working with the team to connect data streams and pipelines to Power BI, Tableau, and R Shiny, and to data migration projects that move on-premises SQL databases to Azure SQL Database or Azure Synapse. This role works closely with the Leidos Data Science team, CDC staff, and partners. The position requires an entrepreneurial mindset and strong communication skills to meet with customers and translate their requirements into working data solutions while operating within the government's guidelines and mandates. It offers the opportunity to work within a growing Data Science discipline at the CDC to solve tomorrow's public health challenges.



Day to Day Responsibilities/Duties:

  • Design, develop, test, and implement fully automated, event-triggered or scheduled production data pipelines using R and/or the Azure stack of tools and services, working collaboratively within a cross-functional team.
  • Write, test, and implement R code for cleaning, wrangling, manipulating, and transforming data to prepare it for downstream storage (such as inserting into or updating databases) and for downstream analytical processing, including statistical computation, visualization, and standardized reporting.
  • Develop data visualizations for public-facing Power BI dashboards.
  • Independently meet and communicate clearly with CDC subject-matter experts, CDC technical staff, and fellow NCIRD Data Science Team members to gather project requirements, translate them into technical implementation plans, and develop solutions that meet those requirements.
  • Create generalized functions for common data tasks and incorporate them into an R package maintained by the team.
  • Attend team planning meetings, backlog refinement, daily stand-ups, and customer demos.

Qualifications:

  • Bachelor's Degree in Statistics, Biostatistics, Data Science, Analytics, Mathematics, Computer Science or Computer Engineering, Computer or Management Information Systems (or a similar scientific degree), and 4+ years of experience designing, developing, and implementing data pipelines or analytical processes using RStudio and/or MS Azure.
  • High proficiency in R programming, including data cleaning, wrangling, manipulation, and transformation to prepare data for visualization and analysis.
  • Experience merging and integrating disparate datasets and outputting to various target destination databases or systems.
  • Experience using Application Programming Interfaces (APIs) to retrieve data or submit requests to online data systems.
  • Experience with Power BI and R Shiny Dashboards.
  • Experience in Statistics, Machine Learning, and/or AI techniques.
  • Ability to multi-task based on project priorities and deliver solutions on time with excellent quality.
  • Familiarity with GitLab (or GitHub) as a version-control code repository and as a project management and tracking system.
  • Understanding of enterprise data architectures and data quality controls.
  • Ability to write technical documentation and create system architecture diagrams.
  • Knowledge of Agile Development methodologies and the Software Development Lifecycle (SDLC).
  • Team player who thrives in a dynamic and sometimes fast-paced environment.

Plus, but not required:

  • Graduate-level university coursework in R, Statistical Computing, and/or Database Design and Administration.
  • Programming experience in Python, SAS, or SUDAAN.
  • Familiarity with database stored procedures, tables, views, triggers, and queries.
  • Work experience with Azure Data Factory, SQL Server or other Azure Databases.
  • Hands-on experience migrating complex data pipelines from on-premises environments into Azure cloud environments.
  • Any of the following relevant certifications: Azure Data Engineer Certification, Azure Data Scientist Associate, Azure Solution Architect, Microsoft Certified Solutions Associate, Solutions Expert or Database Administrator.
  • Experience using MS SQL Server Management Studio to write SQL and T-SQL Code for interacting with project databases.
  • Familiarity with, or willingness to learn, building and configuring data workflows using Azure Data Factory, Azure Stream Analytics, Azure SQL Database/Warehouse, Azure Databricks, Delta Lake, Lakehouse, PySpark, and Scala.
  • Familiarity with complex sample surveys and analysis.
  • Familiarity with continuous integration and continuous delivery or continuous deployment (CI/CD) methods and tools with Azure DevOps.

Pay Range:

$81,250.00 - $146,875.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.

Original Posting Date:

01/10/2024

While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Job ID: Leidos-R-00126209
Employment Type: Full Time