* Develop Python code to access and gather a large variety of geological databases into a consistent data catalogue
* Develop data processing pipelines to organise, prepare and transform the data into a format suitable for direct integration within models
* Optimise data access for use within the context of repeated simulations
* Ensure code quality and set up continuous integration pipelines to maintain data catalogue integrity and accessibility
- Good experience with data manipulation, conversion and cleaning techniques
- Good experience with data formats (CSV, Excel, JSON, Parquet, XML, HDF5, NetCDF, etc.)
* Ability to handle messy, unstructured, or incomplete data efficiently
* Ability to clearly document and explain data transformation steps
* Experience with data cataloguing tools (e.g., Intake)
* Familiarity with cloud platforms (AWS, GCP, Azure) and cloud-native