Data Engineering

Training resources for Accelerating Data Engineering Pipelines on Baskerville.

About The Project

Training resources for Accelerating Data Engineering Pipelines on Baskerville HPC. This course covers:

  1. Setting up your environment on Baskerville
  2. Data on the Hardware Level with Pandas, cuDF and Dask
  3. Data Visualisation with Plotly
  4. Final Challenge

Getting Started

To take this course, you will need a registered account on Baskerville. Details for requesting access can be found here.

Prerequisites

This course is aimed at beginners; however, some familiarity with the following may be beneficial:

  • Python
  • Jupyter notebooks
  • Pandas

License

This work is licensed under the GNU General Public License v3.0. See LICENSE.md for more information.

Contact

Email us: [email protected]

Project Link: https://github.com/baskerville-hpc/data-engineering

Acknowledgments

Baskerville is funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1).
