Working on the STEM-Away® Data Science portal will give students the opportunity to work with industry-standard infrastructure and tools. They will work with and develop real-world data science workflows, visualization tools, and reporting software. This portal will be part of Level 2 & Level 3 projects.
Many employers run training programs for junior-level data scientists because many new hires have only ever worked on their own laptops. Hands-on experience with industry-standard infrastructure will give students a significant edge when applying for jobs and internships.
Students will gain the following valuable skills:
- Ability to reason about physical resources (memory, CPU) for projects that take place on a single node.
- Ability to structure workflows for parallel computing libraries such as Dask and Spark.
- Experience communicating project outcomes in a professional setting to data scientists, researchers, investigators, etc., using JupyterHub notebooks and Sphinx documentation.
- The knowledge to integrate existing services (e.g. Label Studio, SageMaker, MLflow) for additional functionality and complex ML pipelines.
- Ability to create and manage their own customized software stacks.
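
As a rough illustration of the first skill, reasoning about memory on a single node, the sketch below estimates how much memory a numeric table needs and checks it against a node's capacity. The function names, the 8-bytes-per-value default, and the 50% headroom factor are assumptions made for this example, not part of the portal itself.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def estimate_table_bytes(n_rows: int, n_cols: int, bytes_per_value: int = 8) -> int:
    """Estimate the in-memory size of a dense numeric table.

    Assumes every value takes the same number of bytes
    (8 bytes matches a 64-bit float or integer).
    """
    return n_rows * n_cols * bytes_per_value


def fits_on_node(required_bytes: int, node_memory_bytes: int, headroom: float = 0.5) -> bool:
    """Return True if the data fits within a fraction of the node's memory.

    The headroom factor (an assumed rule of thumb) leaves room for
    intermediate copies made during processing.
    """
    return required_bytes <= node_memory_bytes * headroom


# Hypothetical dataset: 100 million rows, 20 numeric columns.
required = estimate_table_bytes(100_000_000, 20)
print(f"Estimated size: {required / GIB:.1f} GiB")
print("Fits on a 16 GiB node:", fits_on_node(required, 16 * GIB))
print("Fits on a 64 GiB node:", fits_on_node(required, 64 * GIB))
```

When an estimate like this does not fit on one node, that is typically the point at which a workflow is restructured for a parallel computing library such as Dask or Spark.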