
Jupyter for Science User Facilities and High Performance Computing


Jupyter is the “Google Docs” of data science. It provides that same kind of easy-to-use ecosystem, but for interactive data exploration, modeling, and analysis. Just as people have come to expect to be able to use Google Docs everywhere, scientists assume that Jupyter is there for them whenever and wherever they open their laptops.

But what if the data you want to interact with through Jupyter doesn’t fit on your laptop, or is excruciating to move? What if the model you want to build and test requires more computing power and storage than you have right in front of you? As a scientist, you want the same interactive experience and all the benefits of Jupyter, but you also need to “reach out” and put something big into your science process: a supercomputer, a telescope data archive, a beamline at a synchrotron. Can Jupyter help you do that big science? What efforts are already in motion to make this a reality, what work still needs to be done, and who needs to do it?

Doing this right will take a community: new collaborations between core Jupyter developers, engineers from high-performance computing (HPC) centers, staff from large-scale experimental and observational data (EOD) facilities, users, and other stakeholders. Many facilities have figured out how to deploy, manage, and customize Jupyter, but have done so while focused on their own unique requirements and capabilities. Others are just taking their first steps and want to avoid reinventing the wheel. With some initial critical mass, we can start contributing what we’ve learned separately into a shared body of knowledge, patterns, tools, and best practices.

40+ participants from universities, national labs, industry, and science user facilities. Credit: Fernando Perez.

In June, a Jupyter Community Workshop held at the National Energy Research Scientific Computing Center (NERSC) and the Berkeley Institute for Data Science (BIDS) brought about 40 members of this community together to start that distillation. Over three days of talks and breakout sessions, we addressed pain points and best practices in Jupyter deployment, infrastructure, and user support; securing Jupyter in multi-tenant environments; sharing notebooks; HPC/EOD-focused Jupyter extensions; and strategies for communicating with stakeholders.

Here are just a few highlights from the meeting:

  • Michael Milligan from the Minnesota Supercomputing Institute perfectly set the tone for the workshop with his keynote, “Jupyter is a One-Stop Shop for Interactive HPC Services.” Michael is the creator of BatchSpawner and WrapSpawner, JupyterHub Spawners that let HPC users run notebooks on compute nodes under a variety of batch queue systems. Contributors to both packages met in an afternoon-long breakout to build consensus around some technical issues, start managing development and support in a collaborative way, and gel as a team.
  • Securing Jupyter is a huge topic. Thomas Mendoza from Lawrence Livermore National Laboratory talked about his work to enable end-to-end SSL in JupyterHub and best practices for securing Jupyter. Outcomes from two breakouts on security include a plan to more prominently document security best practices, and a future meeting (perhaps another Jupyter Community Workshop?) focused specifically on security in Jupyter.
  • Speakers from Lawrence Livermore and Oak Ridge National Laboratories and the European Space Agency showed off a variety of beautiful JupyterLab extensions, integrations, and plug-ins for climate science, complex physical simulations, astronomical images and catalogs, and atmospheric monitoring. People at a variety of facilities are finding ways to adapt Jupyter to meet the specific needs of their scientists.

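To give a concrete flavor of the spawner and SSL work discussed above, here is an illustrative JupyterHub configuration sketch. It assumes the batchspawner package is installed and a Slurm scheduler is available; the partition, runtime, and memory values are placeholders a site would adapt, not recommendations from the workshop.

```python
# jupyterhub_config.py -- illustrative sketch, not a site-ready config.
# Assumes the batchspawner package and a Slurm cluster; the req_* values
# below are placeholders to replace with site-specific settings.
import batchspawner  # registers the BatchSpawner/SlurmSpawner classes

# Launch each user's notebook server as a Slurm batch job on a compute node.
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.req_partition = 'interactive'  # placeholder partition name
c.SlurmSpawner.req_runtime = '02:00:00'       # wall-clock limit for the job
c.SlurmSpawner.req_memory = '4gb'             # memory request for the server

# Encrypt hub <-> proxy <-> single-user-server traffic (JupyterHub >= 1.0),
# the end-to-end internal SSL support discussed in the security talks.
c.JupyterHub.internal_ssl = True
```

The hub would then be started with `jupyterhub -f jupyterhub_config.py`; WrapSpawner’s ProfilesSpawner can be layered on top so users pick between batch profiles like this one and a local spawner at login.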
Really, there’s just too much to pack into a blog post, so we encourage you to look at the talk slides and notes on Discourse; all the breakout notes have been posted under this topic. We’re working on getting videos of the presentations up on the workshop website as well. Watch Discourse for announcements of future meetings and new documentation.

Finally, we want to thank Project Jupyter, NumFOCUS, and Bloomberg for their help making this meeting happen. We all came away with a better sense of who is doing what in our community, and how we can work together on this new area of growth for the Jupyter community. The organizers also want to thank their respective institutions’ administrative staff (Seleste Rodriguez at NERSC and Stacy Dorton at BIDS) for helping with workshop logistics.


Jupyter for Science User Facilities and High Performance Computing was originally published in Jupyter Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

