Processing Scientific Data in Cloud
Monthly Tech Talk by ARDC and RDS Nodes
Location: Limited remote attendee spaces OR University of Technology Sydney: Room CB01.09.29, Level 9, UTS Tower, 15 Broadway, Ultimo
Cloud services have become very popular as a way to store, process and visualise scientific data. This October Tech Talk will feature two presentations on data services in the cloud: one on processing scientific data in Hierarchical Data Format version 5 (HDF5) and the other on Network Common Data Form (NetCDF).
1. John Readey, Senior Architect at The HDF Group
Title: Scientific Data in the Cloud
Description: The HDF5 file format has been used extensively in the HPC community for the storage of scientific data (e.g. multi-dimensional arrays). Unfortunately, the traditional HDF5 library doesn't work so well for applications running in the cloud. To address this, we've developed a service-based implementation of HDF5, HDF Kita. Kita uses object-based storage (e.g. AWS S3) and runs as a cluster of Docker containers. In combination with the service, JupyterHub enables users to easily run notebooks in the cloud that can use an unlimited amount of data and take advantage of the parallelization capabilities of the Kita Server.
Bio: John Readey has been a Senior Architect at The HDF Group since June 2014. His interests include web services related to HDF, applications that support the use of HDF, and data visualization. Prior to joining The HDF Group, John worked at Amazon.com from 2006 to 2014, where he developed service-based systems for eCommerce and AWS. John graduated from Ohio State University with Master's degrees in Mathematics and Computer Science.
2. Nigel Rees, Research Data Management Specialist at the National Computational Infrastructure (NCI)
Title: Using NetCDF in Jupyter notebooks
Description: The Network Common Data Form (NetCDF) software is widely used throughout the world as a mechanism for storing and accessing scientific data. This presentation will introduce what NetCDF is and where it is applied, show examples from disciplines that actually use NetCDF, and summarise the pros and cons drawn from those examples.
Bio: Nigel Rees is a Research Data Management Specialist at the National Computational Infrastructure (NCI) with a background in magnetotelluric geophysics. In his role at NCI, he supports research data needs and assists with the management, publishing and discovery of data.
Who should come along?
The general purpose of this virtual plus face-to-face meeting is twofold:
- Provide a national forum for developers working with research data and/or research data management to discuss topics of interest
- Enable a dialogue between developers and NCRIS facilities
This event is for anyone who wants to know more about tech aspects of data and NCRIS facilities:
- data scientists
- researchers who are building data tools
- data technologists
- data librarians
How do virtual and face-to-face attendance work?
Tell us which of the 12 hubs you'll be attending on the next page when you complete this Eventbrite registration.
We strongly recommend attending in person at one of the hubs where feasible. If you are not able to reach a hub, please select "Join Remotely" when registering (see links below). Remote attendee spaces are limited.
The first 30 remote attendees will be emailed the Zoom Meeting ID. Further attendees will be waitlisted and emailed the Zoom Meeting ID if space becomes available.
To guarantee attendance, join one of the physical meetings or set up your own location (minimum 4 attendees).
If you are unable to join remotely on the day, you can access the slides shortly after the event.
The talks will take about an hour (3:00 pm-4:00 pm AEST), followed by face-to-face networking and discussion at the 11 hub locations (4:00 pm-5:00 pm), facilitated by local hub facilitators:
Sydney: University of Technology Sydney: Room CB01.09.29, Level 9, UTS Tower, 15 Broadway, Ultimo, City Campus (starts 3:00 pm AEST)
For more info and to register, go to https://www.eventbrite.com.au/e/monthly-tech-talk-in-oct-2018-processing-structured-scientific-data-in-cloud-tickets-50342249022