
Diving into Leadership to Build Push-Button Code


“Hi everyone, I’m Sarah! I’m a Research Data Scientist at the Alan Turing Institute and I’m also an operator of mybinder.org. It’s really cool seeing how many people here are interested in BinderHub!”

And it is cool. Really cool! But also a bit scary as a room full of Research Software Engineers (each of them much further on in their careers than I am) suddenly turn to me, eager for the knowledge I was surely about to impart to them.

But let’s rewind a bit.

Joining the Binder community

It’s May 2019 and I’m attending the Research Software Reactor sprint, jointly hosted by Microsoft and Imperial College London. This is a 3-day hackathon where researchers from different areas (though all with a computing background) come together to collaboratively build cloud-based resources on Microsoft’s Azure platform.

As for me, I’m only 6 months into my role at the Turing, having graduated from my PhD in Astrophysics at the start of the year. Also, my role as a “Binder Operator” is barely 2 months old… which is why I’m starting to feel nervous about the amount of interest in the room! This also happens to be the second hackathon I’ve attended ever, so the adrenaline is high.

So, what is this “Binder” that we’re all so excited about?

Binder is an amazing resource that can host reproducible and interactive code in a web browser. Great… what does that mean? It means that if I have some scripts or notebooks in a repository (like this one) and I describe the packages in a configuration file (such as requirements.txt), then I can go to mybinder.org, copy the URL of the repository into the form and hit launch. This will begin a series of events culminating in my notebook appearing in a browser window with all of the packages installed, and the code will just run.
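For example (the package pins below are purely illustrative), a requirements.txt as small as this is enough for Binder to build the environment:

numpy==1.17.3
matplotlib==3.1.1

Once launched, mybinder.org gives you a shareable link of the form https://mybinder.org/v2/gh/&lt;user&gt;/&lt;repository&gt;/&lt;branch&gt;, and anyone opening that link gets the same environment with the notebooks ready to run.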

Sounds like magic, right? You can even combine Binder with Jupyter Books to create interactive documents! Below is a comic explaining how a scientist may use Binder.

The user experience of mybinder.org. Comic courtesy of Juliette Taka.

BinderHub is the technology powering Binder (or where the magic happens). It is a multi-user server that can create a custom, user-specified computing environment and make it accessible via a URL. It utilizes different tools to make this possible and can be deployed onto either a cloud provider or an on-premise compute cluster. These tools are depicted in the below illustration.

A pictorial representation of the different tools constituting BinderHub. This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence. Zenodo record.

Since BinderHub is a cloud-neutral technology (mybinder.org itself runs on Google Cloud and OVH.com), I previously worked on refining the documentation around deploying BinderHub on Azure and ran workshops teaching people how to use Binder and deploy BinderHub. It was this work that led to me being invited to join the Binder team. Being a maintainer for an open-source project involves checking that the service is still running smoothly and being an active, friendly member of the community — one who can answer questions when they arise.

Now that we’ve covered a bit of background, let’s get back to my slightly uncomfortable moment at the sprint.

Day 1: An unexpected leadership position

Everybody in the room is now trying to whittle down which projects we should work on — all of which are awesome! The magic of BinderHub is that it covers so many aspects of software engineering tools and practices, such as continuous integration/deployment and Kubernetes. The BinderHub idea is quickly amassing a lot of satellite projects!

“So Sarah, is it OK if I name you leader of ‘Team BinderHub’?” My second uncomfortable moment. Gerard Gorman, Senior Lecturer in the Department of Earth Sciences at Imperial and co-organizer of the sprint, has just elected me leader of my first ever hack project. If I’m honest, I’d intended to spend the sprint ironing out some niggles with a colleague. But I’ve always been a “learn by doing” person so I agree and desperately begin wracking my brain for a project to give my team for the next 3 days.

After a bit of shuffling (where for a while it seemed like most of the attendees would be joining ‘Team BinderHub’!), I finally assemble a team. My teammates are: Tania Allard, a Microsoft Developer Advocate; Tim Greaves and Diego Alonso Álvarez, Research Software Engineers at Imperial; and Gerard himself.

The project I decide to bring to the table is one that I’d started but had stalled. The process of deploying a BinderHub can be quite long (usually an hour) and consists mainly of typing stuff into the command line. So a couple of months prior, I’d begun designing a set of scripts that would install the relevant tools, deploy a BinderHub on Azure and connect it to a Docker Hub account, retrieve various pieces of information (IP addresses, logs, etc.) and then remove the BinderHub from the cloud. However, my bash scripting and operating system knowledge is not especially strong. Within the team, we have a bit of a discussion around the Deploy to Azure feature — a button that can be copy-pasted into the README of a GitHub project and facilitates a one-click deployment of the project to Azure. It seems like the perfect hack is to combine my scripts with the deploy button and make deploying BinderHub as easy as possible.

Tim is a bash scripting, containerizing pro and has a little experience with the Azure “blue button” from the OKPy project. I ask him to check my setup script that installs the required Command Line Interfaces (CLIs) and explain that I’d like it to be generalized for different operating systems.

Gerard is a BinderHub enthusiast and is keen to get one deployed as a teaching resource at Imperial. As a starter task, I ask him to read up on how the blue button works.

Diego has never heard of Binder or BinderHub before and is quite confused as to what the rest of the team are so excited about! I suggest that he works through my Zero to BinderHub workshop in order to bring him up to speed with the concepts. This is also the perfect opportunity to get feedback on my workshop if parts are not clear! I encourage him to open an issue describing any problems he comes across or extra information he’d like to see.

And just like that, I find I’ve delegated myself out of a job!

Now Imposter Syndrome is beginning to creep up on me. Not only is it taking four extra people to pull together my work and make it usable, but I am also unsure how I would assist them in these tasks I’d set them. Especially Tim, as the complexity of what I wanted the bash scripts to achieve was the reason this project stalled in the first place.

Except there’s one very important aspect of any software project I’ve not mentioned yet: Documentation!

Since starting my position at the Turing, I’ve come to fully appreciate how fundamental good documentation is to the success of a project. Clearly-written instructions covering the purpose of the project, how to install/run the code, and what kind of inputs/outputs to expect make it much easier for a new person to quickly get to grips with the project. And as a result, it’s far more likely to be referenced and reused. While my team begin familiarizing themselves with the infrastructure and goals of my project, I begin an overhaul of the documentation whilst keeping an eye on the repository to manage incoming issues and pull requests.

Take that, Imposter Syndrome!

Day 2: The team makes progress

It’s the second day of the sprint and ‘Team BinderHub’ gather again, our enthusiasm not diminished yet!

I very quickly set goals for the day. I want the button to work by the end of the day as I’m hoping the final day can be used to work on an idea Tania has to use Azure’s DevOps Pipelines to automatically update the deployed BinderHub as new commits come into the host repository. This would be a very handy feature for those (like me!) maintaining BinderHubs at their own institutions. Anything to automate and reduce the number of commands we have to type!

I ask Tim to begin working on the deploy script itself, to add tests where necessary and tidy up some of the parsing of the variables. The next steps are to build a Dockerfile that runs the deploy script and an ARM (Azure Resource Manager) template that controls the form that the blue button links to.

Again, I’m managing the documentation, keeping up with changes we make to the code-base and functionality. Similarly, I keep watch over the repository to manage incoming pull requests and merge conflicts and also make a start on the amendments to the BinderHub workshop Diego has compiled.

My Imposter Syndrome is definitely subsiding as I begin to feel more like I understand the skills of my team and we are all making the best use of our time.

Day 3: Document, document, document

For the third and final day of the sprint we are in a new venue: the Microsoft Reactor. This is a workspace in the Shoreditch area of London where we are offered free pizza and cookies and a DJ to provide the soundtrack to our code. (OK, less an actual DJ, more of a software engineer with Spotify Premium. 😜) What I will say though, Microsoft’s beverage-making facilities have nothing on the Turing’s iPad coffee maker! 😉

Our goals for the last day? Consolidate and document! Ideally get the button working if possible, but the top priority is to make it as easy as possible for the team (or someone new!) to come along to the repository and finish what we have started.

We decide that there isn’t enough time left or infrastructure in place to implement the Azure DevOps Pipeline for automatic upgrades; instead, Tania begins working on a tutorial so we can implement it later.

Tim and I end up in a bit of GitHub hell as we realize that the code to create the Kubernetes cluster is in the setup script, not the deploy script. We refactor some parts so that all of the resources are deployed from the deploy script, and this also means that the Dockerfile only needs to find one script to execute. However, this refactoring causes a complicated merge conflict and some of the bug fixes Tim implemented disappear in a squash merge. (This sounded difficult enough to resolve that I’ve been put off learning about squash merges since!) As team leader, I try to orchestrate whose pull request should be merged first so that all the right code ends up in master.

What did we learn and do?

By the end of the sprint, we don’t quite have a working button deployment, but we do have:

  • a set of streamlined scripts for auto-deployment based on the contents of a JSON file,
  • an ARM template and Dockerfile that will provide the backend to the blue button after some further debugging,
  • a plan to move the project forwards after the sprint.

So what did I learn from those three days?

Coding with other people is fun!

Paired programming is something we try to achieve at the Turing, but it can be difficult to do depending on the project and the time constraints of those working on it.

This was the first time I’d experienced true collaboration: I had an idea that I wasn’t sure how to implement, and people with the skills helped me shape and realize it. It gave me a sense of community and belonging. I hope I’ve forged connections with ‘Team BinderHub’ that will last the duration of our careers, and that we can work together again.

There’s a role for everyone in a team and these are equally important

Whether it’s reviewing code or writing documentation, these contributions are as important as (arguably, more important than) the code itself. Code that does what you think it’s doing and is well explained will have a much longer lifespan than code that is difficult to follow and poorly documented, regardless of how clever it is.

I found my strength as a leader

You might remember at the beginning of this blog post I mentioned that I’d never led a hack project before, so I also learned a lot about my leadership capabilities.

I think I did well. I identified the strengths of each of my team members and gave them a task suited to them whilst letting them explore new concepts, offering my own insight and opinions when required.

I also think that it was a good choice for me to not be too involved in the coding aspect of this project. Managing the flow into the repository and updating the documentation as new code came in meant that I managed to maintain an overall perspective of the project and could switch gears as questions came in from different areas. I don’t think I could have maintained such a view if I’d been buried in code, and I can always learn from the scripts that we’ve developed at a later point.

Hackathons are not where projects end

While the first 80% of a project to get the infrastructure in place can be achieved in a short amount of time like a hackathon, the last 20% is hard and often takes longer. But this extra effort is necessary for the software to be taken up by others.

‘Team BinderHub’ continued working on this project over Slack and GitHub to finally make our version 1 release on June 11th — almost 3 weeks after the end of the sprint!

If you’d like to try the button to deploy your own BinderHub (or contribute a new feature!), the repo can be found here 👉 github.com/alan-turing-institute/binderhub-deploy. Look for the button below!

Thank You! 💖

I’d like to thank a few people who made this possible:

  • Gerard from Imperial College, and Tania and Lee Stott from Microsoft, for organising such an inspiring event;
  • ‘Team BinderHub’ for coming along on this wild ride with me;
  • The Binder Team for accepting me into the community and giving me the space to develop such projects;
  • and The Turing Way team who introduced me to Binder and the value of community (and documentation!).

Note: this is cross-posted with the Turing Institute blog.




ipycanvas: A Python Canvas for Jupyter


As you may already know, the Jupyter Notebook and JupyterLab are browser-based applications. Browsers are incredibly powerful: they allow you to embed rich and interactive graphical interfaces containing buttons, sliders, maps, 2D and 3D plots, and even video games in your webpages!

All this power is readily made available to the Python ecosystem by Jupyter interactive widget libraries. Whether you want to create simple controls using ipywidgets, display interactive data on a 2D map with ipyleaflet, plot 2D data using bqplot, or plot volumetric data with ipyvolume, all of this is made possible thanks to the open-source community.

One powerful tool in the browser is the HTML5 Canvas element, which allows you to draw 2D or 3D graphics on the webpage. There are two available APIs for the Canvas: the Canvas API, which focuses on 2D graphics, and the WebGL API, which uses hardware acceleration for 3D graphics.

After some discussions with my work colleague Wolf Vollprecht, we came to the conclusion that it would be a great idea to directly expose the Canvas API to IPython, without making any modification to it. And that’s how we came up with ipycanvas!

ipycanvas: Exposing the Canvas API to IPython

ipycanvas exposes the Canvas API to IPython, making it possible to draw anything you want in a Jupyter Notebook, directly from Python! Anything is possible: you can draw custom heatmaps from NumPy arrays, you can implement your own 2D video game, or you can create yet another IPython plotting library!

ipycanvas provides a low-level API that allows you to draw simple primitives like lines, polygons, arcs, text, images… Once you’re familiar with the API, you’re only limited by your own imagination!
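To give a flavour of the API, here is a minimal sketch (the method and attribute names follow the ipycanvas documentation, which mirrors the HTML5 Canvas API; exact signatures may vary between ipycanvas releases):

from ipycanvas import Canvas
import numpy as np

canvas = Canvas(width=300, height=200)

# Styling attributes and drawing methods mirror the HTML5 Canvas API
canvas.fill_style = '#3b6ea5'
canvas.fill_rect(10, 10, 100, 60)      # filled rectangle
canvas.stroke_rect(130, 10, 100, 60)   # outlined rectangle
canvas.fill_text('Hello, ipycanvas!', 10, 120)

# Blit a NumPy array onto the canvas as an RGB image
image = np.random.randint(0, 255, (50, 50, 3), dtype=np.uint8)
canvas.put_image_data(image, 10, 140)

canvas  # the canvas is a widget: displaying it renders it in the notebook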

Draw image from NumPy array (left), implementation of the Game Of Life (right)
Draw millions of particles (left), draw custom sprites (right)
Make your own plotting library for Jupyter fully in Python!

Using Matt Craig’s ipyevents library, you can add mouse and key events to the Canvas and react to user interactions.
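As a rough sketch of how that wiring looks (Event and on_dom_event are the documented ipyevents entry points; the handler body here is just an illustration):

from IPython.display import display
from ipycanvas import Canvas
from ipyevents import Event
from ipywidgets import Output

canvas = Canvas(width=300, height=200)
out = Output()

# Watch a few DOM events on the canvas widget
events = Event(source=canvas, watched_events=['click', 'keydown'])

def handle_event(event):
    # 'event' is a dict describing the DOM event (type, coordinates, key, ...)
    with out:
        print(event['type'], event.get('relativeX'), event.get('relativeY'))

events.on_dom_event(handle_event)
display(canvas, out)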

If you have a GamePad around, you can also use the built-in Controller widget and make your own video-game in a Jupyter Notebook!

Documentation

Check out the ipycanvas documentation for more information: ipycanvas.readthedocs.io

GitHub repository

Give it a star on GitHub if you like it! github.com/martinRenou/ipycanvas

Try it online!

You can try it online, without installing anything on your computer, just by clicking on the image below:

Installation

Note that you first need to have Jupyter installed on your computer. You can install ipycanvas using pip:

pip install ipycanvas

Or using conda:

conda install -c conda-forge ipycanvas

If you use JupyterLab, you would need to install the JupyterLab extension for ipycanvas (this requires nodejs to be installed):

jupyter labextension install ipycanvas

About the author

My name is Martin Renou, I am a Scientific Software Engineer at QuantStack. Before joining QuantStack, I studied at the aerospace engineering school SUPAERO in Toulouse, France. I also worked at Logilab in Paris and Enthought in Cambridge, UK. As an open-source developer at QuantStack, I worked on a variety of projects, from xtensor and xeus-python in C++ to ipyleaflet and ipywebrtc in Python and Javascript.



A slideshow template for Voilà apps


Voilà can now serve your interactive dashboards in a slideshow format.

Last June, QuantStack announced the first release of Voilà, a solution to turn Jupyter notebooks into standalone web applications. Voilà enforces security (preventing arbitrary code execution) while preserving interactivity (supporting interactive widgets for Jupyter notebooks, including roundtrips to the kernel). A recent addition to the ever-teeming Jupyter ecosystem, Voilà is flexible, extensible, and language-agnostic (running any Jupyter kernel, such as Python, R, Julia, C++).

Voilà logo.
A dashboarding solution based on Jupyter.

Getting started with Voilà

Voilà is available as a Python package on conda-forge and PyPI. After installing Voilà in their environment, Jupyter notebook users will see a new button in the toolbar, a button reading “Voila” with a display icon. Clicking this button will take you to a Voilà web app served with the notebook server.

Screenshot of Jupyter notebook interface showing Voila button in toolbar.
Voilà use case as a Jupyter server extension.

Alternatively, you can use Voilà to create a standalone Tornado application. From the terminal, run $ voila index.ipynb to turn the notebook index.ipynb into a web app. Note that you don’t launch or run the Jupyter notebook yourself. At this point, Voilà serves the app locally.

Screencast of Voilà app (default template).
Jupyter notebook turned Voilà app (source).

Of course, the full value of sharing these apps (typically analytics web apps, data dashboards) comes from deploying them. Voilà apps can come into play at different steps of a data science workflow, from the initial step of exploring data all the way to the final step of communicating results.

Styling Voilà apps with layout templates

Now, you may want to customize the layout of your app, especially if it is somewhat complex. For example, you can make results more readable by splitting them and using different tab panes or boxes. This you can already achieve with the voila-gridstack template, also available from either conda-forge or PyPI (still in beta).

Templates are written in Jinja and use the metadata field of the notebook cells. Practically, a Voilà template is a folder which lives under PREFIX/share/jupyter/voila/templates/. The system of custom templates is actually where the extensibility of Voilà shines most. Here, we introduce voila-reveal, a slideshow template for Voilà. It builds off of RISE, which itself builds off of reveal.js.

Rendering Voilà apps as slideshows

With RISE, you can instantly turn your Jupyter notebook into a slideshow. Besides, if you share it within a Binder, collaborators can readily view it in their web browser, with no need for a local setup. They can enjoy the interactive controls, if any. They are expected to run code themselves, though (example), which may not be suitable if they are non-technical. And, even if they are, you may want to prevent arbitrary code execution.

With Voilà and its new reveal template, you can achieve this by sharing your RISE slideshow as a standalone web application. How so? Ever since Jupyter notebook slides, it has been possible to author or edit a Jupyter notebook with slideshow-related information on each cell. In principle, we could also add (or edit) these cell metadata manually (or automatically) by post-processing the JSON.

Our custom slideshow template voila-reveal leverages these very cell metadata (namely, the slideshow subfield of the metadata field). It handles them the exact same way nbconvert does when generating slideshow HTML from a notebook. It also passes default values to specific resources required by reveal. These reveal-required resources are: scroll, theme, and transition.
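For reference, a cell that starts a new slide carries metadata of the following shape (this is the standard nbconvert slideshow field; other slide_type values include subslide, fragment, skip, and notes):

{
  "slideshow": {
    "slide_type": "slide"
  }
}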

To use the slideshow template with Voilà, install the voila-reveal package: $ conda install voila-reveal or $ pip install voila-reveal. This will create and populate the PREFIX/share/jupyter/voila/templates/reveal/ folder. At the command line, serve the index.ipynb notebook as a standalone app with the following command:

$ voila index.ipynb --template=reveal

… and voilà!

Screencast of Voilà app in slideshow format showing zoom transition (configured).
Voilà app rendered as a slideshow with slide transitions configured to zoom in and out (source).

Configuring templates at the command line

You can overwrite the above-mentioned resource defaults by passing additional options. For instance, the default value of transition is "fade". To get the "zoom" behaviour, we could use the following command:

$ voila index.ipynb --template=reveal --VoilaConfiguration.resources="{'reveal': {'transition': 'zoom'}}"

Admittedly, it is verbose and cumbersome. Another possibility is to specify (here, reveal-specific) resources in a configuration file.

Configuring templates with a JSON file

Write your configuration file, named conf.json, with the following structure:

{
  "traitlet_configuration": {
    "base_template": "reveal",
    "resources": {
      "reveal": {
        "scroll": false,
        "theme": "simple",
        "transition": "zoom"
      }
    }
  }
}

Then, it is enough to run $ voila index.ipynb --template=reveal to get slide transitions zoomed in and out; Voilà picks up the config file, as long as it lives under PREFIX/share/jupyter/voila/templates/reveal/.

In the above demo screencast, we showcase a scatter plot and a scatter matrix of the “iris” dataset made with Plotly Express and customizable with ipywidgets dropdowns. To display the Python code for these plots, run

$ voila index.ipynb --template=reveal --strip_sources=False

To turn on scrollbars, so you can view the plots entirely, run

$ voila index.ipynb --template=reveal --strip_sources=False --VoilaConfiguration.resources="{'reveal': {'scroll': True}}"

or edit the scroll value in the configuration file!

Coming next

At the moment, you must specify the template upon launching Voilà. We would like to be able to toggle between different templates on the fly (without restarting the app). To this end, we shall support template specification as a URL parameter. We shall make template selection available from the Jupyter interface as well.

Acknowledgments

The development of voila-reveal is entirely supported by QuantStack. The author would like to thank Jeremy Tuloup, Johan Mabille, and Sylvain Corlay for their valuable feedback on this piece.

About the author

Marianne Corvellec is an independent scientific software developer. She is also an independent researcher affiliated with IGDORE. She holds a PhD in statistical physics from Ecole Normale Supérieure de Lyon, France.



Configure your dashboards with Voilà gridstack template


Voilà is a new dashboarding solution from the Jupyter ecosystem. It provides an easy-to-use tool to convert your Jupyter notebooks into standalone web applications. If you have not used it before, you can learn more about Voilà from this blog post.

And voilà… the dashboard templates

To create interactive and engaging dashboards, you can add graphs, interactive widgets, maps, etc. to your notebook. Voilà will turn them into interactive applications by stripping any code and displaying the outputs in the order they appear in the notebook. If you need more flexibility over the position of the cell outputs, you can use one of the widget layout templates defined in the ipywidgets library. If this is not enough, with Voilà you can even turn your notebooks into interactive presentations. Would you also like to re-configure your dashboards the drag-and-drop way? Et voilà, the gridstack template.

Creating ad-hoc dashboards with gridstack

If you have never used layout templates and just want to use Voilà with your existing notebooks, you can consider the new gridstack template for Voilà dashboards. Simply run the following command with the path to your notebook:

voila --template=gridstack my_notebook.ipynb --VoilaConfiguration.resources='{"gridstack": {"show_handles": True}}'

This will open a dashboard created from your notebook in a browser. By default, the output cells of the notebook are laid out vertically, but you can move and resize them freely by dragging one of the handles in the corners of the cells.

An example notebook rendered with Voilà gridstack template. The layout was configured by dragging and resizing the cells of the notebook. The notebook was downloaded from LIGO project: https://github.com/losc-tutorial/Data_Guide

Positioning widgets with metadata

When you are done with configuring your dashboard, Voilà enables you to persist it and create a static layout of the widgets. To achieve that you will need to edit the notebook metadata manually, but we are also planning to release a tool that will simplify the process.

For example, you can add the following attributes to one of the cells (to edit the cell metadata, you need to activate the “Edit metadata” button from the View -> Cell toolbar menu of your notebook):
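(The exact values below are illustrative; the structure follows the legacy jupyter-dashboards metadata specification discussed later in this post. A cell pinned to the top-left corner of the grid, 6 columns wide and 4 rows tall, would look like this.)

{
  "extensions": {
    "jupyter_dashboards": {
      "version": 1,
      "views": {
        "grid_default": {
          "hidden": false,
          "row": 0,
          "col": 0,
          "width": 6,
          "height": 4
        }
      }
    }
  }
}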

In JupyterLab ≥ 1.0 you can edit the metadata using the “Advanced Tools” section of the “Notebook tools” sidebar (wrench icon).

Then you can start Voilà with the following command:

voila --template=gridstack my_notebook.ipynb

This should open your dashboard with cells in the specified positions and of specified sizes.

Final layout of the dashboard configured with cell metadata. The cells are not movable in this dashboard.

Supporting legacy notebooks

The metadata follow the specification of the legacy jupyter-dashboards project, which was an earlier solution for creating interactive dashboards. Unfortunately, the project is not maintained any more and it won’t work with recent installations of Jupyter. However, you can open notebooks created with jupyter-dashboards, without changes, with the Voilà gridstack template to achieve identical rendering and give a second life to your Jupyter dashboards.

(To compare the outputs, you can open the notebook with a binder that provisions the jupyter-dashboards ecosystem installed in an old (2017) version of Anaconda. You can also use its design tool to lay out your widgets visually and save the cell metadata usable with the Voilà gridstack template.)

Design tool implemented in legacy jupyter-dashboards project.

How to install

If you want to try out the template yourself, please install it now with:

pip install voila-gridstack (for pip users) or

conda install voila-gridstack (for conda/anaconda users).

You can also try out the interactive examples with our gridstack binder.

And if you have any questions or want to share your experience please reach out on our Gitter chat.

Credits

The development of Voilà and the gridstack template was initiated by the amazing team at QuantStack, which also provided financial and brain-power support.

About the author

Bartosz Telenczuk is a seasoned Python developer and a data scientist. He is an ardent user of Jupyter ecosystem and frequent contributor to open source software; among his projects is the svgutils library for composing SVG files in Python.



Jupyter Community Workshops: Call for Proposals for Jan-Aug 2020

Teaching and Learning with Jupyter community workshop. Credit: Richard West.

We are pleased to announce the third call for proposals for Jupyter Community Workshops is now open!

The majority of Jupyter’s work is accomplished through remote, online collaboration; yet, over the years, we have found deep value in focused in-person workshops. In-person events are particularly useful for tackling challenging development and design projects, growing the community of contributors, and strengthening collaborations. Jupyter Community Workshops is a series of community-organized events to enable such gatherings. For examples of recent workshops, see the proposals funded in the last round.

We are grateful for the initial and continuing financial support by Bloomberg that makes these workshops possible. We are also excited to announce that the program is expanding with additional financial support from Amazon Web Services. If your organization would like to support this program, please contact NumFOCUS.

The third call for proposals for Jupyter Community Workshops is open through Sunday, December 15, 2019.

Jupyter Community Workshops bring together small groups of Jupyter community members and core contributors for high-impact strategic work and community engagement on focused topics. Our vision is that the events funded in this round would occur no later than August of 2020.

We are particularly interested in workshops that explore and address topics of strategic importance for the future of Jupyter. We expect the workshops to involve up to about two dozen participants over a 2–4 day period, and to have a total Jupyter-funded budget of up to $20,000, which may help cover expenses such as travel, lodging, meals, or event space. It is our intent for the workshops to include core Jupyter contributors as well as stakeholders and potential contributors within the larger Jupyter ecosystem. While not the primary focus of the workshops, it would be highly beneficial to couple the workshop with broader community outreach events, such as sprints, talks, or tutorials at local meetings or conferences.


Proposal Process Highlights:

  1. Submit initial proposal using this form by Sunday, December 15 (Anywhere on Earth).
  2. Initial Steering Council review (up to a week). Proposal goes to Steering Council for initial review and feedback. Proposal is either approved or declined.
  3. Budget and Logistics Development (up to four weeks). The Operations Manager will support the workshop organizers, who will develop a venue/date proposal, detailed budget, event plan, and proposed list of participants.
  4. Final Steering Council review (up to a week). Proposal presented to the Steering Council for final approval, including final budget, event details, and an estimate of the potential impact of the event. Assuming the budget included in the initial proposal is fully developed and no major changes are proposed, this period may be waived.

The proposal process for these workshops is managed by the Director of Jupyter Cal Poly, Ana Ruvalcaba (jupyterops@gmail.com), NumFOCUS and the Jupyter Steering Council. Applications can be completed using the online form and are due by Sunday, December 15 (Anywhere on Earth). Events should be hosted no later than August of 2020.


This initiative is organized by Jason Grout, Paul Ivanov, Brian Granger, and Ana Ruvalcaba.



The JupyterHub and Binder Contributor in Residence!

The Chan Zuckerberg Initiative and Project Jupyter are teaming up to pilot a new role in the Jupyter community: the JupyterHub Contributor in Residence.

Thanks to funding from the CZI Essential Open Source Software for Science initiative we are welcoming Georgiana Elena as our first Contributor in Residence!

Managing a codebase that spans many repositories and sub-projects as well as operating mybinder.org is a significant effort. Over time we have identified a need for dedicated support to ensure that activity and communication within these repositories is efficient and productive. Driven by this need and inspired by the Django Fellowship model, we shaped the idea of a JupyterHub/Binder Contributor in Residence (CIR).

Thanks to the Chan Zuckerberg Initiative, a Contributor in Residence within the community is now possible. Starting this year, CZI launched the Essential Open Source Software for Science grant program. It is aimed at helping open source communities that make up the core foundation of the scientific stack. We applaud CZI in providing support for core infrastructure across the sciences, and we’re honored to be part of CZI’s vision of funding maintenance, growth, development, and community engagement for open-source projects.

In this post we’ll introduce the community to our first Contributor in Residence (Georgiana) and her mentor (Tim)! The rest of this post is in the form of a conversation between Georgiana and Tim. We hope you enjoy it and that it lets you get to know the team a bit better.

  • Tim: Hello Georgiana. Congrats on becoming the inaugural CIR. We’ve been lucky to have you already as a contributor on JupyterHub, especially your work adding support for using Traefik as a proxy.
  • Geo: Hey Tim! Thank you, it’s been a really amazing experience to work on JupyterHub. I’m really excited for this next year’s journey.
  • Tim: You are the very first Contributor in Residence. Can you explain a little what this means?
  • Geo: The CIR role is about helping make the little things around the projects be amazing. Little things like answering newly opened issues, guiding people towards the right place for discussion, improving bits of documentation or code, helping with releases, and giving people the right tools to make great contributions. And I have a great team of existing contributors to work with.
  • Tim: That’s right! My role in all this is to be a mentor and guide. JupyterHub and Binder have a small team that does a good job looking after the main repositories, but the universe of JupyterHub and Binder repositories is huge, so sometimes things get overlooked or missed. For example, Pull Requests or Issues in one of the less-popular repositories can go for a long time without getting attention. This may hold up progress somewhere else, and give people a bad experience contributing to the project.
  • Geo: Hopefully we can improve on this a bit so that more people have a great experience when contributing.
  • Tim: Let’s find out a bit more about you. What did you do before getting involved with JupyterHub?
  • Geo: I studied computer science in Bucharest, spending the last semester developing my bachelor’s project as an Erasmus exchange student in Madrid. During summers I worked in industry, doing a few internships abroad. I tried to gain as much practical experience as possible and to get in touch with different people and cultures, and interning seemed like the right move.
  • Tim: How did you get involved with JupyterHub?
  • Geo: I got involved with JupyterHub thanks to the amazing Outreachy community. The Outreachy program introduced me to the world of open source and guided me through my first steps. I now feel very lucky to have applied for the Outreachy internship at the same time that JupyterHub came up with two amazing projects. I chose JupyterHub because I “clicked” with the TraefikProxy project and I was happy to find a very welcoming community that made contributing a very pleasant and not at all scary experience.
  • Tim: Have you turned into someone who spends all their time contributing or do you still have other hobbies?
  • Georgiana: Because I currently live in the busiest/loudest city of Romania, I love to escape it at the end of the week and spend some time in nature. My favorite place to be is the village I grew up in, where I recharge my batteries walking my dogs and cooking with my mother and sister. I like painting and photography and sometimes I do both using the photos I take as an inspiration for my paintings.
  • Tim: Moving on to the things you will be doing as CIR, have you already picked out something to get started with?
  • Geo: At first, I will be focusing on trying to get an overview of which issues are feature requests, which are support questions, and which have been solved already. I will also spend some time getting familiar with more of the JH/Binder projects. And what better way to learn about a project than to try and use it, and to be a first-time contributor.
  • Tim: If you had to name the top three goals you hope to achieve while being contributor in residence, what would those be?
  • Geo: From the perspective of someone with still a lot to learn and figure out, I would be extremely happy if during this year I grow to be the CIR this community needs. I would love to help improve the contributing experience, increase the overall productivity of the projects, and become a better contributor myself.
  • Tim: How can people interact with you?
  • Geo: People can use the usual community communication channels, GitHub, Gitter and Discourse to provide feedback or ask questions and I’ll do my best to help out.
  • Tim: Now that we have a CIR, do people still need to contribute to the projects?
  • Geo: Definitely! People are always welcome to contribute. The CIR role wasn’t created to replace anything, but to be an extra pair of hands helping the daily activities around the community become more efficient.
  • Tim: How can people stay engaged with the CIR program or the broader JupyterHub community moving forward?
  • Geo: As the CIR, I am a team member just like anyone else, and the best way to stay connected is through our team channels — we have team discussions in the Jupyter Community Forum, and team information in the JupyterHub Team Compass repository (including monthly team meetings that anyone is welcome to join).


Connect to a JupyterHub from Visual Studio Code


Visual Studio Code has pretty good support for running Jupyter Notebooks. But what if your organization has a JupyterHub running remotely, with more compute resources & access to large amounts of data? How can you access that from Visual Studio Code running on your local machine?

It’s pretty easy to do, and this blog post will guide you through it.

jupyterhub and vscode logos

Step 1: Get a JupyterHub access token

JupyterHub lets you create tokens for yourself for use by third party applications. These tokens can be used anywhere a Jupyter Notebook access token is needed. Since this is what Visual Studio Code needs, let’s acquire one.

  1. Log in to your JupyterHub
  2. Access your Control Panel. In classic notebook, there is a ‘Control Panel’ button on the top right. In JupyterLab, you can access it under ‘File -> Hub Control Panel’
Top Right ‘Control Panel’ button in classic Jupyter Notebook
File -> Hub Control Panel in JupyterLab

3. Go to the ‘Token’ page by clicking ‘Token’ in the top bar

Token link in the top bar to go to the token page

4. Type in a description for the new token you want, and click ‘Request new API Token’

Type in a description for what this token will be used for

5. Copy your token and keep it somewhere safe. You should treat this like a password to your JupyterHub. You can (and should!) revoke it (as I have done) from the same page when you are no longer using it.

Copy your token
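Incidentally, this token works for any HTTP client talking to your single-user server, not just VS Code. For instance (the hostname, username, and token below are placeholders), a quick sketch of authenticating a request against the server’s REST API:

import requests

# Query the single-user server's status endpoint, authenticating with the hub-issued token
response = requests.get(
    "https://hub.example.org/user/jovyan/api/status",
    headers={"Authorization": "token <your-token>"},
)
print(response.json())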

This is all the information you need from JupyterHub! Now let’s go to VS Code.

Step 2: Connect VS Code to your JupyterHub

Visual Studio Code supports connecting to a remote notebook server, and we can use that to connect to our JupyterHub. You must perform these steps before opening your notebook.

  1. Open the command palette in Visual Studio Code (‘Cmd+Shift+P’ on MacOS, ‘Ctrl+Shift+P’ elsewhere)
  2. Select ‘Python: Specify local or remote Jupyter server for connections’
vscode command palette

3. Construct your notebook server URL with the following template: https://<your-hub-url>/user/<your-hub-user-name>/?token=<your-token>. Note that your hub username might be an escaped version of whatever you used to actually log in, if it contains special characters. You can verify this by looking at the URL you get once you log in to your JupyterHub — it should have the right one after ‘user’.

Enter your notebook server URL
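For example, with a hub at hub.example.org and the username jovyan (both placeholders, as is the token value), the full URL would look like:

https://hub.example.org/user/jovyan/?token=abc123def456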

4. Create or open a new notebook. The kernel for this should now live on your JupyterHub! You can verify this by running !hostname, which should return the hostname of your remote JupyterHub server instead of your local hostname. You can also try importing libraries that are in the remote JupyterHub server, but not your local file system.

Tada! That wasn’t so hard, was it?

Limitations

  1. Your JupyterHub notebook server must be running already when you try to open your notebook — Visual Studio Code will not automatically start it. If it has stopped, you need to log in to your JupyterHub & start it again. You do not need a new token though.
  2. Watch out for filesystem access. When you call open() (or a helper function that eventually accesses a file) to read a file, it is going to be read from your remote JupyterHub server’s home directory, not your local system’s current directory. So if you add a new file next to your Jupyter Notebook locally, it is not automatically going to be available for the Jupyter Notebook to read.
  3. Installing pip or conda packages locally will have no effect, since your Python kernel is running on your JupyterHub. Use the %pip or %conda magics to install packages in the correct environment, as shown below.
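For example, running the following in a notebook cell installs pandas into the remote kernel’s environment rather than onto your local machine:

%pip install pandas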


GESIS Joins the Binder Federation


This is an invited post from the GESIS Institute, collaborators on the Binder project and new members of the Binder federation.

In 2015, Jeremy Freeman and Andrew Osheroff posted on the Jupyter mailing list to “let the community know that [they]’ve been prototyping a way to make it really easy to turn a GitHub repo with Jupyter notebooks into a tmpnb deployment.” The temporary notebook service (tmpnb) already allowed the on-demand launch of a docker container and access to executable notebooks. However, containers had to be built and deployed manually to be used in tmpnb. First presented at the PyData NYC 2015 event, Binder simplified the way that repositories with notebooks inside are built and launched. This democratized the publication of reproducible and executable research designs far beyond the computer sciences.

The Binder Federation

Since 2015, Binder has come a long way. The system has evolved into a collection of open-source tools such as repo2docker for building, and BinderHub for executing and serving, reproducible computational environments. The deployment at mybinder.org is a free public service for the scientific community. While from a user’s point of view, mybinder.org looks like a single web application, it is in fact a federation of independent organizations and groups that provide the service. This collaboration meets the growing demand of researchers, with currently 100,000 launches per week over a total of 8,000 different repositories. The Binder federation started earlier this year and we are happy to announce that it is welcoming its third member, GESIS.

About GESIS

GESIS — Leibniz Institute for the Social Sciences is the largest German infrastructure provider for the social sciences. The institute is headquartered in Mannheim, with a location in Cologne. With research-based services and consulting covering all steps in the scientific process, we increasingly recognize the need of researchers to work with new forms of digital behavioral data. To address this need, we started the GESIS Notebooks project. This project aims to support researchers in the area of Computational Social Science with accessible analytics and publication services for data-driven research designs. As part of the project, GESIS has become a regular contributor to the development of the Binder and Jupyter open-source toolbox.

Why Join?

We all recognize the need for international collaboration in the development of software systems demanded by modern research lifecycles. However, such collaboration cannot end with the development of the software itself. We believe that cooperation needs to extend to the provisioning and integration of the services in a reliable and transparent way. Joining the federation has clear advantages to social scientists, as well as Binder users from other disciplines. Our community at GESIS can enjoy the reliability of the service, at times of lectures and workshops, and the larger Binder ecosystem benefits from resources that might otherwise be idle. More importantly, researchers are not locked into proprietary and closed solutions whose availability depends on a single entity. Joining the federation guarantees the long-term availability of hard-won research outputs in our discipline, irrespective of future changes to funding for individual groups or institutions.

Get Involved

To join the federation, a combination of two resources is needed: computing power and the expertise to operate the deployment. Organizations can donate computational resources over which the mybinder.org team has full control, or host and maintain a BinderHub instance for their community. If this caught your attention, consider joining the Binder federation.

Thanks to Kenan Erdogan, Tim Head, Chris Holdgraf and Lisa Posch. Furthermore, we thank the German Research Foundation DFG (project number 324867496) for their support.




Voilà is now a Jupyter subproject

It is a great pleasure to announce that the Voilà project has been incorporated as a Jupyter subproject. Voilà will now be subject to the Jupyter governance and code of conduct.

For reference, the Jupyter Enhancement Proposal (JEP) for the Voilà incorporation is available here.

What is Voilà?

Voilà helps you communicate insights by transforming a Jupyter Notebook into a stand-alone web application you can share. It gives you control over what your readers experience in a secure and customizable interactive dashboard.

The easiest way to get started with Voilà is to install it via pip or conda and type voila some_notebook.ipynb to turn that notebook into a dashboard.

In addition, Voilà includes a templating system that allows you to override the behavior of the front-end. Using this templating system, Voilà can be used to create:

  • slideshows (with voila-reveal)
A Voilà slideshow created with the voila-reveal template.
  • dashboards (with voila-gridstack)
A Voilà Dashboard based on the voila-gridstack template.

Why move Voilà under Jupyter governance?

While the project was initially started by QuantStack, the team now comprises developers from Bloomberg, UC Berkeley, JP Morgan, and Cal Poly San Luis Obispo. OVH has been supportive of the project by kindly providing the free hosting of the gallery on their infrastructure.

We believe that the multi-stakeholder nature of the Voilà project is well-suited for the Jupyter organization.

The Voilà project is largely built upon Jupyter subprojects and standards.

  • The standard notebook file format is the main entry point to Voilà.
  • nbconvert is used for the conversion to progressively-rendered HTML.
  • jupyter_client is naturally used for handling the execution of notebook cells.
  • jupyter_server is the default back-end.
  • JupyterHub is at the foundation of the voila-gallery project.
  • JupyterLab components (mime renderers, input and output areas) are used in the front-end implementation. Voilà also includes a preview JupyterLab extension.
  • ipywidgets and custom Jupyter widget libraries such as bqplot, ipyvolume, and ipyleaflet provide the bulk of the interactivity of Voilà applications.

Voilà is more a remix of existing Jupyter components (with changes to enable that use case) than a completely new application.

Resources

Should you be interested in Voilà, feel free to try it on Binder or locally! You can also engage with the developer community during our public team meetings and the various GitHub repositories of the project:

Acknowledgements

  • Voilà was started by the team of open-source developers at QuantStack as a separate project, but with the full intent to incorporate it into Jupyter. The initial project development at QuantStack was funded by Bloomberg.
  • Now, Voilà contributors work in many institutions, including UC Berkeley, Cal Poly San Luis Obispo, JP Morgan, and Faculty (formerly ASI Data Science).
  • The Voilà Gallery is kindly hosted by OVH.


A 2019 retrospective from the Binder Project


2019 was a busy year for the Binder and JupyterHub projects — each saw growth in both their community and technology. Now that the year has wrapped up, it is a good time to reflect on some of the highlights from the year.

Overall, 2019 was about improving the robustness, stability, and team dynamics around the JupyterHub and Binder projects, as well as connecting these projects with other tools and services in the open source community. Here are a few things that we are most excited about.

More people are using mybinder.org

The Binder Federation is a collection of BinderHubs accessible from mybinder.org. This deployment is run as a public service and a demonstration of BinderHub, the underlying technology of the Binder project. It is operated on a volunteer basis by Binder community members, and is supported through grants and donations in infrastructure from project stakeholders. In 2019, the user base of mybinder.org grew from around 70,000 users per week to around 100,000 (a growth of over 40%). mybinder.org is being used for teaching classes, sharing reproducible analyses, creating interactive documentation and narratives, and much more. We’re astonished at the rapid growth of Binder-ready repositories, and we’re excited to see what the community creates next.

Weekly user sessions at mybinder.org. Here you can see a typical pattern of activity over the course of a year. There are dips in activity over the summer and winter months, reflecting reduced activity from academic institutions.

JupyterHub reaches 1.0

JupyterHub, the underlying technology that provides interactive computing sessions to multiple users, is now at 1.0 status. The JupyterHub Python application was written several years ago, and reaching 1.0 reflects the work of dozens of open source contributors over time. JupyterHub is now a robust and stable application, having been used at smaller scales (think 5–10 people running on a single VM) as well as much larger scales (think 5,000 students running Jupyter sessions for a class).

The JupyterHub logo

BinderHub is out of beta

BinderHub, the Kubernetes-based technology that powers mybinder.org, also came out of beta this year. This reflects the fact that BinderHub is a battle-hardened application that can provide a stable service over time. mybinder.org runs nearly 100,000 sessions a week, and requires minimal maintenance time from the Binder project’s team of volunteer operators.

The BinderHub federation is launched

The Binder Project envisions a world in which technology can be used in vendor-agnostic and decentralized ways. BinderHub runs on Kubernetes, which can be deployed on a variety of cloud and local infrastructure. While the Binder team runs one BinderHub deployment at mybinder.org, our goal has always been to see other organizations running their own BinderHubs. This year, we went one step beyond this by launching the BinderHub Federation. This is a collection of research and technology organizations that combine their expertise and computational resources to power mybinder.org. When users visit mybinder.org, they are now directed to one of several BinderHub instances. This makes mybinder.org more robust, and grows the number of organizations that utilize the project’s technology for their communities. A BIG THANKS goes out to Google, OVH, GESIS, and the Turing Institute for supporting the Binder Federation.

The (rough) location of each BinderHub deployment in the mybinder.org federation

Binder connects with open science services

Another goal of Binder is to be a part of the solution to more transparent, sharable, reproducible computational work. This means plugging in to other ecosystems and projects in order to leverage the broader open science community. This year we saw a number of new connections with other services. BinderHub now supports links that point directly to Zenodo and Dataverse repositories, and we are working on a few other integrations in the coming months. This means that projects utilizing these resources will be able to share reproducible and interactive links to their work out-of-the-box.

Binder now works with Zenodo repositories!

The Binder community grows

The Binder project’s most important asset is its people — this is a collection of volunteers spread across the world and from a variety of organizations. Binder community members do a variety of things — from working on technology, to teaching others how to make their work more reproducible, to participating in community discussions, to maintaining and debugging Binder tech. There is also a “core team” of Binder members that dedicates a significant part of their time to supporting the project. In 2019, we saw several new members join the core team, as well as a general growth in the Binder community. Welcome to all of our new team members! You can find a list of our current team members here.

We welcome our first Contributor in Residence

One challenge with running large, open projects is that resources tend to be scarce. The Binder Project has no formal project funding, and must find ways to both grow its technology as well as run mybinder.org on resources that are donated from its community. One thing that often suffers as a result is the maintenance and general improvement of our open source technology. This work is often under-appreciated, difficult, and unlikely to happen with purely volunteer labor.

For this reason, the Binder project decided to apply for the CZI Essential Open Source grant series. We proposed the creation of the “Binder Contributor in Residence” position — an annual contractor position that pays a member of the Binder community to do many of the daily things that are crucial for the project’s growth. We are excited to have Georgiana Dolocan as our first contributor in residence, and look forward to where this project will go in 2020.

Many thanks to CZI for their support of the Binder and JupyterHub projects in 2020!

There are more BinderHub deployments

The Binder Project aims to create technology that is deployable anywhere so that other organizations can support their communities with Binder infrastructure. In 2019 we ran a number of training sessions for how other groups can run their own BinderHub. In particular, the Turing Institute ran several workshops that had attendees up-and-running with their own functioning BinderHubs.

Thanks to our community

As you can see, 2019 was a busy and exciting year for the Binder community. As a final note, we want to say thanks to all of you who have supported Binder in one form or another over the years. Binder is a project run by the community, for the community. It wouldn’t be possible without all of your hard work and friendly faces, thanks! We look forward to what’s coming next in 2020!



Xeus is now a Jupyter subproject

It is a great pleasure to announce that the Xeus project has been incorporated as a Jupyter subproject. Xeus will now be subject to the Jupyter governance and code of conduct.

For reference, the Jupyter Enhancement Proposal (JEP) for the Xeus incorporation is available here.

What is Xeus?

The Xeus project is a C++ implementation of the Jupyter kernel protocol. Xeus is not a kernel, but a library meant to facilitate the authoring of kernels.
Several Jupyter kernels have been created with Xeus:

  • xeus-cling, a kernel for the C++ programming language, based on the Cling C++ interpreter. The cling project comes from CERN and is at the foundation of the ROOT project.
The xeus-cling Jupyter kernel for the C++ programming language.
  • xeus-python, a kernel for the Python programming language, embedding the Python interpreter.
The xeus-python Jupyter kernel for the Python programming language
  • xeus-calc, a calculator kernel, meant as an educational example on how to make Jupyter kernels with Xeus.

Beyond these three kernels built on top of Xeus by the Xeus maintainers, third parties have developed other Jupyter kernels with Xeus:

  • JuniperKernel, a kernel for the R programming language by Spencer Aiello.
  • xeus-fift, a kernel for the Fift programming language by Michael Zaikin. The Fift programming language was developed by Telegram to create TON blockchain contracts.
  • SlicerJupyter, a kernel for the Python programming language by Kitware which integrates into the Qt event loop of the Kitware “Slicer” project.

Finally, the xeus-python kernel includes a first implementation of the Jupyter debugger protocol used by the Jupyter debugger project. xeus-python enables the Debug Adapter Protocol over the Control channel through new debug request/reply and debug event messages.

Why move Xeus under Jupyter governance?

While Xeus started as a side project for QuantStack engineers, the project now has several stakeholders who depend on it. We think that moving the project to an open governance organization may be a better way to reflect this situation.

Acknowledgements

  • Xeus was started by the team of open-source developers at QuantStack as a separate project, but with the full intent to incorporate it into Jupyter. The initial project development at QuantStack was funded by Bloomberg.
  • Now, Xeus contributors work in many institutions, including Université Paris Sud and École Polytechnique.

About the Author

Johan Mabille is a Scientific Software Developer at QuantStack, specializing in high-performance computing in C++. He holds a master's degree in computer science from Centrale-Supelec.

As an open source developer, Johan coauthored xtensor, xeus, and xsimd. He also made major contributions to the JupyterLab debugger project and bqplot.



JupyterCon 2020 is a go!

JupyterCon 2020 will be brought to you by Project Jupyter and NumFOCUS. Photo: O’Reilly Media CC BY-NC.

Just over a year ago, Project Jupyter announced it was reevaluating its annual community conference. An advisory committee of volunteers recommended a JupyterCon 2020 with an emphasis on access and leadership. We are now thrilled to announce a global Jupyter conference:

JupyterCon 2020 will be held August 10–14 in the historic city of Berlin, Germany.

The call for proposals (tutorials, talks, sprints) will open soon. In the meantime, we invite you to sign up as a proposal reviewer now.

Interested in sponsoring JupyterCon? Email us at jupytercon-sponsor@numfocus.org to receive a prospectus.

What is JupyterCon? Why do we do this?

Open-source software projects are fueled by the labor of many people — most of them volunteers — who often interact only online for years. It's amazing how much value we create together, coordinating our work in a distributed fashion. Even though we spend most of our time interacting through digital media, we see their limits, and we come together in person to overcome them. At the community conference, we confirm in person our virtual social links and our common ethical and professional aims.

At JupyterCon, we celebrate the distributed nature of our community, face-to-face. We enhance the year-long remote collaborations with in-person interactions for just a few days. This helps sustain a community, building and strengthening relationships among Jupyter aficionados and developers, newbies and veterans. JupyterCon inspires keen appreciation for the collective labor that creates high-quality technology for anyone in the world to use freely.

We come together to share, hack, eat, play, bond, and imbue with meaning our work, whether paid or volunteer. JupyterCon is a place where we feel free to express our intellectual aspirations, doubts, and pleasures. For the conference to unify us and develop our group solidarity, we welcome and include everyone. JupyterCon 2020 makes a strong public commitment to diversity and inclusion, with a rich program of actions to support it.

Message from the JupyterCon Diversity Chair:

I’m excited to join the JupyterCon 2020 leadership as Diversity Chair. My goals are two-fold: to drive the conference’s diversity and inclusion actions, and to move the needle on long-term diversity achievement so that this position becomes redundant in the future. An inclusive environment at JupyterCon is a common resource. It is like air and water. By using Jupyter, within its extensive ecosystem, we are all participants and contributors and have the potential to positively impact our environment, both within the conference and the greater community.
I invite you to share any thoughts, concerns, or suggestions related to diversity by writing to jupytercon-diversity@numfocus.org.
— Reshama Shaikh, Diversity Chair

The JupyterCon program—with tutorials, talks, social events, and more—will be full of opportunities for participants to share knowledge with each other and to gain professional skills. Save the date and look out for coming announcements!

We want to thank the fabulous team of volunteers that is planning JupyterCon (listed below), as well as the many community members who participated on the JupyterCon advisory committee last year (co-chaired by Safia Abdalla, Paige Bailey, and Doug Blank), and the Jupyter Events working group (Lorena Barba, Sylvain Corlay, Brian Granger, Jason Grout, Ana Ruvalcaba).

Stay tuned for more details coming soon!

Lorena A Barba, JupyterCon 2020 General Chair

JupyterCon 2020 Venue and dates

Berlin Conference Center, 10–13 August 2020
(sprints on Aug. 14, venue to be announced)

JupyterCon 2020 Committee

  • Lorena A. Barba, General Chair
  • Sylvain Corlay, Vice Chair
  • Jason Grout, Technical Program Chair
  • Reshama Shaikh, Diversity Chair
  • Paco Nathan, Sponsorships Chair
  • Rosie Pongracz, Finance Chair
  • Paige Bailey, Communications/Marketing Chair
  • Joshua Patterson, Exhibits Chair
  • Tania Allard and Gerard Gorman, Tutorials Co-Chairs
  • Safia Abdalla and Kirstie Whitaker, Sprints Co-Chairs
  • Jeremy Tuloup, Local Organizing Committee Chair
  • Wolf Vollprecht, Local Sponsorships Liaison
  • Laura Norén, Eco-friendly Event Chair
  • Amanda Casari, Speaker Management
  • Jim Weiss (NumFOCUS), Logistics Chair
  • Carol Willing, Special Advisor to the General Chair
JupyterCon 2020 is an event brought to you in partnership by Project Jupyter and NumFOCUS.


Report on the Jupyter Community Workshop on Dashboarding


From June 3rd to June 6th 2019, thirty-five developers from the Jupyter community met in Paris for a four-day workshop on dashboarding with Project Jupyter.

Attendees at the Jupyter Community Workshop on Dashboarding (Photo credit: Lindsey Heagy)

For four days, attendees worked full time on the Jupyter project, with hacking sessions and discussions on improvements to Jupyter components and new development. We were lucky to have a large number of the project's core developers in the group.

Beyond the hacking sessions, each day concluded with a series of presentations and demos of the progress made during the workshop. In partnership with the PyData Paris team, we held a special installment of the PyData Paris Meetup with

  • an invited presentation by Emmanuelle Gouillart on Plotly Dash,
  • a series of lightning talks by workshop attendees on their achievements, including a talk by Philipp Rudiger on the first release of Panel, and an announcement of the first releases of Voilà and the Voilà Gallery.

We ended the week with a social evening at the QuantStack offices in Paris.

Why a workshop on dashboarding with Jupyter?

The Jupyter ecosystem is used extensively in scientific computing both in academia and industry, and a rich ecosystem of data visualization tools has been developed around the Jupyter widgets frameworks, from geographical data visualization to protein folding simulation.

However, at the time the Jupyter ecosystem did not provide a means for developers to transition from notebooks to stand-alone web applications that can be accessed by multiple users.

This has been a longstanding request from the community: provide better tools built upon the Jupyter stack to share results with students, peers, or the general public.

These are the challenges that we decided to tackle during that week.

Highlights of the week

Many of the developers spent the week working on the Voilà and Panel projects. Both projects had their first public releases during that week (see the first public announcement of Panel and Voilà).

  • During the week, a team of participants including Yuvi Panda, Pascal Bugnion, and Jeremy Tuloup iterated on the first version of the Voilà Gallery. Several first-time contributors to the widget framework authored example dashboards for the gallery, showcasing their existing work. Yuvi also produced the first deployment scenarios for Voilà on Heroku.
  • Cheryl Quah, from Bloomberg, MC'ed a panel on dashboarding in the Jupyter ecosystem, with lots of questions and comparisons with Dash.
  • Philipp Rudiger started working on a Bokeh/ipywidgets integration for better interoperability between the two frameworks.
  • Other contributors iterated on new Voilà templates, such as voila-vuetify, adding the ability to position Jupyter widgets and outputs in arbitrary locations in the dashboard template. Grant Nestor created visual mockups for a UI for creating dashboard layouts in JupyterLab. Grant also helped iterate on logos for the project, and gave a presentation on dynamically loading JavaScript modules in the browser.

Acknowledgments

This event would not have been possible without the generous support of Bloomberg, the sponsor of the Jupyter Community Workshop series.

We are grateful to Société Générale for funding the catering for the workshop.

The hosting of the workshop at CRI was paid for by QuantStack.

The public meetup was organized in partnership with the PyData Paris team.

Finally, we especially thank Ana Ruvalcaba from Project Jupyter for her incredible work on the logistics and finances of the Jupyter Community Workshop series.



Jupyter Community Workshops 2019 Year in Review

Top Image: Jupyter for Scientific User Facilities and High-Performance Computing, (Photo Credit, Fernando Perez). Bottom Left Image: Intro to Python for Kids, Parents & Teachers Workshop Series. Bottom Middle Image: Dashboarding in the Jupyter Ecosystem Workshop (Photo Credit, Lindsey Heagy) Bottom Right Image: South America Jupyter Community Workshop

2019 was a busy, successful, and exciting time for Jupyter Community Workshops! Led by community members, this global event series brought together small groups of people to strengthen the Jupyter community and to do high-impact, strategic work on focused topics.

Thank you to all those in our community who made these workshops a success. It’s important to recognize the organizers who proposed, planned, and led the workshops, as well as those who supported their efforts and those who attended the gatherings. It is truly inspiring to see everyone come together to strengthen our community and to make the future of Jupyter brighter and more connected.

Our third call for proposals was announced in November of 2019. We’re happy to share that planning efforts are already under way! Stay tuned to this blog for an announcement of the Jupyter Community Workshops to be hosted in the first round of 2020.

Jupyter Community Workshops 2019

Many thanks to all those who led a workshop in 2019, including:

  • Building Upon the Jupyter Protocol Workshop

A special thanks to the financial sponsors of this event series, Bloomberg and Amazon Web Services. If your organization would like to support this program in the future, please contact NumFOCUS.

Jupyter Community Workshops are managed by Jupyter and NumFOCUS contributors: Ana Ruvalcaba, Jason Grout and Walker Chabbott.


nbviewer has a new host: OVHcloud


For several years, nbviewer has been generously hosted by Rackspace. That sponsorship program appears to be ending, so nbviewer needed a new home; it has found one in OVHcloud. We are extremely grateful to OVHcloud for their support in keeping nbviewer running, building on their existing participation in the Binder Federation (literally—nbviewer is now running on the same kubernetes cluster as ovh.mybinder.org).

How we moved

Where were we before?

nbviewer was previously deployed from a private repo (because it contained credentials) and various commands using invoke. It was a mixture of custom steps, using the OpenStack Python API and docker-machine to:

  • allocate two VMs
  • build docker images
  • deploy two nbviewer instances per node
  • deploy memcached via nbcache on each node
  • deploy the statuspage publisher as a separate step
  • update Fastly to point to the running nbviewer instances

The upside was that we had a repo that could immediately deploy nbviewer to anywhere with docker. With this repo, we moved our deployment strategy from a CoreOS cluster to Rackspace’s short-lived Carina service to deploying VMs ourselves with docker-machine. Migrating to a new source of VMs would not have been hard, but it wouldn’t have solved any of our challenges.

Known downsides of this deployment that we’ve experienced over the years:

  • only a few folks ever knew how to use it and could thus deploy updates to nbviewer
  • it was private, which contributed to the point above. It’s hard to onboard folks in an open community to a private repo!
  • independent machines meant cache was not shared (minor, but contributes to our consumption of the GitHub API rate limit)
  • no automatic recovery based on health monitoring, so when a container had issues, some humans got automated emails but no action was automatically taken. Extra frustrating because the fix was ~always to restart the container.

The credentials used by this repo have been revoked and an archived version of the repo is available (with credentials redacted from history) in the new, public nbviewer.org-deploy repo.

What have we learned?

We’ve learned a lot about open, automatic, and sustainable deployments, and can now comfortably address all of the downsides above. Most of this has been learned from the communities participating in the JupyterHub and Binder projects, as seen in the mybinder.org-deploy repo. mybinder.org-deploy is a public repo that automatically deploys and tests updates to at least four different Kubernetes clusters at the push of a button (the Big Green Merge Button, to be precise). Some things we have learned in the years since we set up our nbviewer deployment:

  • kubernetes and helm are great :)
  • git-crypt allows us to have public repos with some secret contents, so deployment repos don’t need to be fully private just to protect a couple api keys.
  • Continuous Deployment via services like Travis or Circle lowers the bar to adding maintainers on a given deployment since all that’s needed is to press the Big Green Button.

The OVH sponsorship came in the form of a slice of a kubernetes cluster; that meant we had to migrate the nbviewer deployment tools from using docker-machine to kubernetes, which for us means helm.

Step 1: helm chart for nbviewer

The first step was to create a helm chart for nbviewer, which is done here. Before, nbviewer was two docker containers running nbviewer and one running memcached per server. To turn this into a helm chart we need:

  • a dependency on the memcached helm chart for easy deployment of the cache. All the nbviewer instances will talk to this memcached, which should improve our GitHub rate limit consumption, since the cache will be shared. This means the deprecation of our own nbcache repo.
  • a Deployment for nbviewer. This is (one of) the kubernetes wrappers around containers. It has a nice ‘replicas’ field to easily scale nbviewer up and down. We are currently running with 3 replicas.
  • a Service to expose nbviewer to the Internet (we will likely also need an Ingress in the future)
  • a Deployment for the statuspage publisher, which updates https://status.jupyter.org with the remaining GitHub rate limit available

We also needed helm charts and docker images for some other components, such as cdn.jupyter.org and the nbviewer statuspage data source.

Step 1b: helm chart for cdn.jupyter.org

nbviewer and some other services interact with cdn.jupyter.org, a lightweight nginx configuration that serves the classic notebook’s javascript and css as static files (this is no longer needed for npm-based jupyterlab, which can use unpkg as a CDN). This used to run on a multipurpose server VM, but that is also being retired for the same reason. The result was creating a docker image and helm chart for serving the contents of cdn.jupyter.org. The scripts used to run the existing CDN were already in a repo, so it was a small amount of work to adapt them to run in a container instead of on a server. We may retire cdn.jupyter.org in the future, so please don’t rely on it :)

This is where we are right now — nbviewer.jupyter.org and cdn.jupyter.org are being served by OVH and the Rackspace machines are being retired.

Step 2: automatic deployment

It would have been ideal for this to be step 1, but that’s not how it happened. Sometimes you need to get it done quickly before you get it done right. The task here is to make the new, public nbviewer.org-deploy repo follow the patterns we have learned from mybinder.org-deploy. That will mean:

  • creating a public nbviewer.org-deploy repo with configuration and git-crypt encrypted secrets for deploying nbviewer on the OVH cluster (done)
  • adopting chartpress to publish our helm charts for nbviewer and version-tagged images from the nbviewer repo
  • configuring Travis-CI or another CI service to automatically deploy updates with helm, so that merging a PR is all we need to do to deploy updates to nbviewer

Thanks to OVHcloud for their continued support of Jupyter and Binder, and to the JupyterHub and Binder communities for teaching me how to operate services in the open with kubernetes and helm.




The superheroes and the magic wand



This is the first of a series of posts describing tools in the JupyterHub ecosystem, written by our wonderful Contributor in Residence, Georgiana. For our first post, we’ll share some lore of JupyterHub, and tell you a story of how it all began…

In a place far, far away, on a planet called Jupyter, magic happens every day. This land is special because it’s full of magical tools with all kinds of powers, devoted to one common purpose: to help people. Everyone sympathizing with this goal either advocates, uses, or cares for these tools. So, in the blink of an eye, these people, sometimes with nothing more in common than the same drive to help others, gathered together and formed a community.

This special group is known in the galaxy as “The Jovyans” and new recruits join every day. 🚀
This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence. Zenodo record.

But somebody needed to take care of these magical tools. So, the first Jovyans decided that from then on, they would become tool-keepers. Because their greatest responsibility was to teach new Jovyans how to use the magic, they soon created a set of guiding laws. Some call this “the Documentation”.

The magic wand

One greatly cherished tool on planet Jupyter is the magical wand. This wand’s very special power is to help people work together as a team and find solutions to important problems.

The wand is called “JupyterHub”.

But the wand is of vast complexity, and so are the guiding laws that control it.

As a consequence, people trying to benefit from the magic of JupyterHub spent a lot of time reading and understanding the instructions.

The superheroes

The tool-keepers noticed that most of the people coming to planet Jupyter to become Jovyans were from a big city in the cloud 🌤 called Kubernetes. So they gathered together and debated the best way to help the people of Kubernetes.

After 3 days and 3 nights of intense discussions (also lots of pizza breaks of course) they decided that one of them needed to get special training, rent a house in Kubernetes and teach the people there the wonders of the JupyterHub magic wand.

The Chosen One gained the people’s trust, and it got better and better at anticipating and understanding their needs. So the locals started seeing it as a superhero and they even gave it a name, “Z2JH”. 👓

The news of these events started to spread far and wide, and people living in little towns outside of Kubernetes felt that they deserved the support of a superhero too. They also had great ideas and little time and needed to use the wand’s magic to do good.

The tool-keepers were inspired by the magnificent accomplishments of these little towns, taking place even without a superhero around. So the littlest of them all volunteered to go into superhero training to help the people of the little towns do good, faster. Though it was little, in no time it grew to become the helper these small groups needed.

The residents called it “The Littlest JupyterHub”, or “TLJH”, because of its stature. But everybody knew that its stature didn’t reflect its enormous tenacity, speed, and ability to do great things. This ended up inspiring people all over to believe that not all superheroes wear capes, nor do they need to know how to fly in the cloud to be cool.

Nowadays, people from all over join forces and help the superheroes learn new skills. Thanks to them, TLJH and Z2JH get stronger and better and have more special powers than they had when first created by the tool-keepers. Join them, become a Jovyan! ツ


A visual debugger for Jupyter


Most of the progress made in software projects comes from incrementalism. The ability to quickly see the outcome of an execution and iterate has been one of the main reasons for the success of Jupyter, especially in scientific exploratory workflows.

Jupyter users like to experiment in the notebook, and to use the notebook as an interactive communication tool. However, for more classical software development tasks such as the refactoring of a large codebase, they often switch to general-purpose IDEs.

The JupyterLab environment.

The Jupyter project has made strides in the past few years towards filling that gap, notably with the JupyterLab project, which enables a richer UI including a file browser, text editors, consoles, notebooks, and a rich layout system.

However, a missing piece (which has remained one of the main reasons for users to switch to a different tool) is a visual debugger. This feature has long been requested by users, especially those accustomed to general-purpose development environments.

A debugger for Jupyter

Today, after several months of development, we are glad to announce the first public release of the Jupyter visual debugger!

This is just the first release, but we can already set breakpoints in notebook cells and source files, inspect variables, navigate the call stack and more.

Screencast of the JupyterLab visual debugger in action

Try the debugger on binder

You can also try the debugger online with binder. Just click on the binder link:

Click on the binder link to launch the demo

Installation

  • The debugger front-end can be installed as a JupyterLab extension.
jupyter labextension install @jupyterlab/debugger

The debugger front-end will be included in JupyterLab by default in a future release.

  • On the back-end, a kernel implementing the Jupyter Debug Protocol (detailed in the next section) is required. The only kernel implementing this protocol so far is xeus-python, a new Jupyter kernel for the Python programming language. (Support for the debugger protocol in ipykernel is also on the roadmap.)
conda install xeus-python -c conda-forge

Once xeus-python and the debugger extension are installed, you should be all set to use the Jupyter visual debugger!

Note: Depending on the platform, PyPI wheels are available for xeus-python, but they are still experimental.

The Jupyter Debug Protocol

New message types for the Control and IOPub channels

Jupyter kernels (the part of the infrastructure that executes the user's code) communicate with the rest of the infrastructure with a well-specified inter-process communication protocol.

Several communication channels exist, such as

  • the Shell channel, which is a request/reply channel for e.g. execution requests
  • the IOPub channel, which is a one-directional communication channel from the kernel to the client, and is used e.g. to forward the content of the standard output streams (stdout and stderr).

The Control channel is similar to Shell but operates on a separate socket so that messages are not queued behind execution requests, and have a higher priority. Control was already used for Interrupt and Shutdown requests, and we decided to use the same channel for the commands sent to the debugger.

Two message types were added to the protocol:

  • the debug_[request/reply] messages, used to request specific actions from the debugger, such as adding a breakpoint or stepping into code; these are sent over the Control channel.
  • the debug_event uni-directional message, used by debugging kernels to send debugging events to the front-end; debug events are sent over the IOPub channel. A usage sketch follows below.
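To make this concrete, here is a minimal sketch of driving the debugger by hand with jupyter_client, assuming a recent jupyter_client and xeus-python installed under the xpython kernel name (the DAP payload values are illustrative):

from jupyter_client.manager import start_new_kernel

# Start an xeus-python kernel and a blocking client connected to it.
km, kc = start_new_kernel(kernel_name="xpython")

# Wrap a standard DAP "initialize" command in a debug_request message.
msg = kc.session.msg("debug_request", {
    "type": "request",
    "seq": 1,
    "command": "initialize",
    "arguments": {"clientID": "demo", "adapterID": "xeus-python", "pathFormat": "path"},
})
kc.control_channel.send(msg)                   # Control, not Shell: no queueing behind executions
reply = kc.control_channel.get_msg(timeout=5)  # the debug_reply carrying the DAP response
print(reply["content"])

kc.stop_channels()
km.shutdown_kernel()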

Extending the Debug Adapter Protocol

A key principle of the Jupyter design is agnosticism to the programming language. It is therefore important for the Jupyter debug protocol to be adaptable to other kernel implementations.

A popular standard for debugging is Microsoft's "Debug Adapter Protocol" (DAP), a JSON-based protocol underlying the debugger of Visual Studio Code, for which multiple language back-ends already exist.

It was therefore natural for us to carry the DAP messages in the debug_[request/reply] and debug_event messages that we just added.

However, DAP alone was not quite sufficient in the case of Jupyter. Indeed:

  • In order to support page reloading, or a client connecting at a later stage, Jupyter kernels must store the state of the debugger (such as breakpoints, and whether the debugger is currently stopped). The front-end can request that state with a debug_request message.
  • In order to support the debugging of notebook cells and of Jupyter consoles, which are not based on source files, we also needed messages to submit code to the debugger to which breakpoints can be added.

Besides these two differences, the content of the debug requests and replies corresponds to the debug adapter protocol.
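For illustration, here is roughly what the content of these extended debug_request messages looks like (a sketch using the dumpCell and debugInfo command names from the proposed JEP):

# Content of a debug_request submitting cell code to the debugger;
# the reply tells the front-end under which source path the code was registered.
dump_cell = {
    "type": "request",
    "seq": 10,
    "command": "dumpCell",
    "arguments": {"code": "def f(x):\n    return 2 * x\n"},
}

# Content of a debug_request asking the kernel for the stored debugger state
# (breakpoints, whether the debugger is currently stopped, ...).
debug_info = {
    "type": "request",
    "seq": 11,
    "command": "debugInfo",
    "arguments": {},
}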

All these extensions to the Jupyter kernel protocol have been proposed for inclusion in the official specification. The JEP (Jupyter Enhancement Proposal) can be found here.

Xeus-python, the first Jupyter Kernel to support debugging

Xeus is a C++ implementation of the Jupyter kernel protocol. It is not a kernel by itself but a library that helps kernel authoring. Xeus is useful when developing a kernel for a language that has a C or a C++ API (like Python, Lua, or SQL). It takes over the cumbersome task of implementing the Jupyter messaging protocol, letting the kernel author focus on the core interpreter tasks: executing code, inspecting, etc.

Several kernels have been developed with xeus, including the popular xeus-cling kernel for the C++ programming language, based on the cling C++ interpreter from CERN. The xeus-python kernel is an alternative Python kernel to ipykernel, based on xeus. The first release of the xeus-python kernel was announced on this blog earlier this year: https://blog.jupyter.org/a-new-python-kernel-for-jupyter-fcdf211e30a8

Xeus-python was an appropriate choice for this first implementation of the debugging protocol because

  • it has a pluggable concurrency model, which allowed running the processing of the Control channel in a different thread.
  • it has a lighter-weight codebase which made it a convenient sandbox to iterate upon. Implementing the first version of the protocol in ipykernel would have required more significant refactoring and consensus building at an early stage.

The roadmap of Xeus-python

The short-term roadmap for xeus-python includes

  • adding support for IPython magics in xeus-python, which is the main missing feature with respect to ipykernel.
  • improving the PyPI wheels of xeus-python.

What about other kernels?

The work in the front-end is valid for any kernel implementing the extended kernel protocol.

We will be working in 2020 to enable debugging with as many kernels as possible.

This will soon be the case for other xeus-based kernels which share a large part of the implementation with xeus-python, such as xeus-cling.

Diving into the debugger front-end architecture

The debugger extension for JupyterLab provides what users would typically expect from an IDE:

  • a sidebar with a variable explorer, a list of breakpoints, a source preview and the possibility to navigate the call stack
  • the ability to set breakpoints directly next to the code, namely in code cells and code consoles
  • visual markers to indicate where the current execution has stopped

When working with Jupyter notebooks, the state of the execution is kept in the kernel. But a cell can be executed and then deleted from the notebook. What should happen when a user wants to step into deleted code?

The extension supports that particular use case and enables retrieving a read-only view of the previously executed cell.

Stepping into a deleted cell

Consoles and files also have support for debugging.

Debugging code consoles in JupyterLab
Debugging files in JupyterLab

Debugging can be enabled at the notebook level, which lets users debug one notebook and work on a different one at the same time.

Debugging multiple notebooks simultaneously

Variables can be inspected using a tree viewer and a table viewer:

The variable explorer

The debugger extension for JupyterLab has been designed to work with any kernel that supports debugging.

By relying on the Debug Adapter Protocol, the debugger extension abstracts away language-specific features and provides a consistent debugging interface to the user.

The following diagram shows how the debug messages flow between the user, the JupyterLab extension and the kernel during a debugging session.

Using the Debug Adapter Protocol in the debugger extension (source)

Future developments

In 2020, we plan on making major improvements to the debugger experience:

  • Support for rich mime type rendering in the variable explorer.
  • Support for conditional breakpoints in the UI.
  • General improvements of the debugger user experience.
  • Enable the debugging of Voilà dashboards, from the JupyterLab Voilà preview extension.

Acknowledgements

The JupyterLab debugger is the result of the collaboration and coordination of developers from several institutions, including QuantStack, Two Sigma, and Bloomberg.

About the developers

Jeremy Tuloup

Jeremy Tuloup is a Scientific Software developer at QuantStack. He authored a large part of the front-end of the JupyterLab debugger.

Borys Palka

Borys Palka is a software developer at Codete. He authored a large part of the front-end of the JupyterLab debugger.

Johan Mabille

Johan Mabille is a scientific software developer at QuantStack. Johan is a co-author of xeus, and developed the debugger extension to xeus-python. He also authored a large part of the debugger front-end.

Martin Renou

Martin Renou is a scientific software developer at QuantStack. He is the original author of xeus-python, the xeus-based Python kernel, and contributed to the new concurrency model used for the debugger.

Afshin Darian

Afshin Darian is a software developer at Two Sigma, and one of the authors of JupyterLab.

Sylvain Corlay

Sylvain Corlay is the founder and CEO of QuantStack, and a core Jupyter developer. He co-authored xeus and xeus-python.



Interactive Graph Visualization in Jupyter with ipycytoscape


The Jupyter widgets ecosystem offers a broad variety of data visualization tools for exploratory analysis in the notebook. However, we lack a good story for exploratory graph visualization.

Cytoscape is an open-source software platform for visualizing complex networks and integrating them with any type of attribute data. While it comes from the computational biology community, Cytoscape is a fully-fledged, general-purpose tool for graph visualization and analytics. It now includes a modern web front-end (Cytoscape.js), which makes it a great candidate for integration with Project Jupyter. This is the raison d’être of ipycytoscape.

Gene visualization in ipycytoscape

The goal of ipycytoscape is to enable users of well-established libraries of the Python ecosystem, like Pandas, NetworkX, and NumPy, to visualize their graph data in the Jupyter notebook, and to let them modify the visual outcome, programmatically or graphically, with a simple API and user interface.

Fortunately, Cytoscape offers a broad enough API that ipycytoscape can be used to tackle any problem modeled as a graph. Examples include analyzing interactions between substances when developing new chemicals in the pharmaceutical industry, building attack graphs that reveal possible vulnerabilities in security systems, modeling human behavior to understand how people interact with businesses, and even studying complex phenomena like the current COVID-19 crisis. There is an ongoing effort to make ipycytoscape an accessible tool for researchers trying to understand and mitigate the pandemic; you can find more information about this initiative on the COVID OSS Help website, and the discussion is happening in this repository if you’re interested in joining it.

Current state

ipycytoscape is part of the PLASMA project (in French, Plateforme d'eLearning pour l'Analyse de données Scientifiques MAssives). This project aims at creating an interactive tool to teach computational analysis of massive scientific data. Its first instance, PlasmaBio, is designed for the needs of teachers and students of the European Master of Genetics at Université de Paris. PlasmaBio provides an authentic experience of the actual genomic and bioinformatic analyses performed in research labs. For that purpose, a custom JupyterHub-based system to control many different Jupyter instances is being specially developed by Jeremy Tuloup at QuantStack.

In this first version of ipycytoscape there are still some limitations to what you can do, but some integrations with the Python world will just work out of the box. ipycytoscape offers integration with Pandas DataFrames and NetworkX, meaning that you can get a graph visualization of the data you already have with minimal or no adjustments and just a few lines of code.
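Here is a minimal sketch of what this looks like with NetworkX (assuming the API of this first release):

import networkx as nx
from ipycytoscape import CytoscapeWidget

G = nx.complete_graph(5)                 # any existing NetworkX graph will do

widget = CytoscapeWidget()
widget.graph.add_graph_from_networkx(G)  # convert nodes and edges to Cytoscape elements
widget                                   # display the interactive graph in the notebook

An analogous add_graph_from_df method covers the Pandas DataFrame side.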

Usage with NetworkX and DataFrame

ipycytoscape supports all of the built-in CytoscapeJS layouts, including cola, grid, breadthfirst, circular, concentric, and dagre, as well as the random, null, and preset options, so you can build a graph visualization that best fits your data. Additionally, ipycytoscape supports the PopperJS and TippyJS extensions, which allow you to create customizable tooltips for your nodes and edges.
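Switching between these layouts is a one-liner; reusing the widget from the sketch above:

widget.set_layout(name='concentric')  # any of the layout names listed above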

cola, concentric, dagre and grid layouts

You can also use a variety of labels for a quick visualization of your nodes’ and edges’ contents.

Labels on nodes

Like most Jupyter interactive widgets, ipycytoscape relies on the traitlets library to synchronize data between the back-end and the front-end model.

Unfortunately, traitlets has a limitation when it comes to container objects and other mutable structures: synchronization is only triggered upon assignment of the container, not when individual elements are modified.

To work around this limitation, we make use of the excellent Spectate library by Ryan Morshead, which triggers observers upon individual element changes in containers.
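The limitation, and why Spectate is needed, is easy to demonstrate with plain traitlets (a self-contained sketch):

from traitlets import HasTraits, List, observe

class Model(HasTraits):
    elements = List()

    @observe("elements")
    def _sync(self, change):
        print("would sync to the front-end:", change["new"])

m = Model()
m.elements = [1, 2]   # assignment of the container: the observer fires
m.elements.append(3)  # in-place mutation: traitlets does not notice, nothing syncs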

Interaction between ipywidgets and ipycytoscape

Try it online!

You can try it without installing anything on your computer, just by clicking on the image below:

https://mybinder.org/v2/gh/QuantStack/ipycytoscape/stable?filepath=examples

Installation

Note that you first need to have Jupyter installed on your computer. You can install ipycytoscape using pip:

pip install ipycytoscape

Or using conda:

conda install -c conda-forge ipycytoscape

If you use JupyterLab, you also need to install the JupyterLab extension for ipycytoscape (this requires nodejs to be installed):

jupyter labextension install @jupyter-widgets/jupyterlab-manager jupyter-cytoscape

About the author

My name is Mariana Meireles and I’m a software developer working for QuantStack. I care deeply about the impact that technology has on the world, and I try my best to be the change I want to see by contributing to open-source projects that stand upon libre and diverse standards.

Prior to QuantStack I worked as a developer on the PySide team at the Qt Company and as a web performance developer at Mozilla.

Currently I’m working on expanding the Jupyter ecosystem with new libraries and functionalities, like an experimental SQLite kernel.

Acknowledgements

The development of ipycytoscape at QuantStack was funded as part of the PLASMA project, led by Claire Vandiedonck, Pierre Poulain, and Sandrine Caburet, associate professors at Université de Paris.

Sponsors of the PLASMA initiative include:



CIR Report I


The Jupyter Contributor In Residence, update 1

This year Jupyter was awarded a one-year “Essential Open Source Software for Science” grant from the Chan Zuckerberg Initiative, which we are using to bring on our first “Contributor in Residence” (many thanks, CZI!). This is a short report from Jupyter’s first Contributor in Residence, Georgiana Dolocan. We are now about one third of the way through her tenure as CIR. Editor’s note: Georgiana has also made all of her own GIFs, because she’s just awesome like that ;-)

The JupyterHub and Binder Contributor in Residence. The first keepalive message.

At the end of last year, I found out that the CZI grant (so nicely put together by Chris Holdgraf) had been accepted. I was very happy to get the chance to continue working alongside an amazing community that I had grown very fond of from the day I started contributing.

Plan 1.0

Since this was the first project of its kind, we had millions of ideas about what the Contributor in Residence (CIR) project should be about. We discussed and converged towards a common plan, then decided to continuously iterate once we got some real-life feedback.

See this GitHub issue with the end result of the brainstorming, and with various milestones set in place. In short, the purpose of the CIR is to work on the little day-to-day tasks that occur across a project (respond to questions, review PRs, fix bugs, write documentation, maintain infrastructure), nothing fancy, no big projects.

The biggest little achievements

Among the little day-to-day tasks, a few achievements are more noticeable and worth mentioning, as they will potentially have a bigger impact than the others.

🚩 The first blog post about The-Littlest-JupyterHub is out

The superheroes and the magic wand is the first blog post about TLJH on the Jupyter Blog. It started out as a technical blog post, but while writing it, and constantly describing TLJH as having superpowers to explain how it works, it rapidly turned into a superhero story.

TLJH is an important project in the Jupyter ecosystem and one that I care about a lot, so stay tuned: there are more blog posts to come about it, and although they will be technical this time, I won’t restrain myself from talking about its superpowers.

🚩 The TLJH CI tests were refactored and an upgrade test was added

Are you familiar with that feeling when a pull request looks OK, it passes all tests and all the bots are happy, so you hit that merge button? Yet, soon after, the unexpected happens and bug reports start showing up. 🙈

Even though you manage to find the issue and solve it, one question remains: “Why did the checks pass?” The answer isn’t easy to hear: “The tests need to be improved”.

TLJH operates on a “rolling release” model, where people just install from master, so it’s very important to have strong tests. However, our issue wasn’t exactly with the tests themselves: our CI didn’t match the production installation workflow, and that small difference between the two workflows left TLJH in an uninstallable state. Finding the point where the workflows diverged was the key to solving the problem.

Bonus point: whenever someone wants to upgrade TLJH, they do it by running the installer again. Thus, the changes we add by merging a PR don’t just need to pass the tests; they must also not break the upgrade process. So we added an upgrade test to our CI to catch any potential future upgrade problems.

🚩 Three repositories in the JupyterHub organization are now being watched by a support bot

The Jupyter community currently uses GitHub, Gitter, and Discourse (the newest of them) to connect with one another. Having three communication channels can be confusing if there isn’t a clear distinction between their scopes. So we needed to find a way to organize our discussions to make the contributing experience better for everyone (users, core developers, and everyone in between).

The Discourse forum was already the place where most of our general discussions happened. So why not make this place even more popular and encourage everybody to share their ideas and questions there? This way, the forum becomes the place where we can help and inspire each other, and GitHub issues are where we discuss cool new features, report bugs, and figure out how to solve them.

The support bot will help spread the word about the forum and will make the transition easier. Currently, it is plugged into the jupyterhub, binderhub and tljh repositories and will act each time the “support” label is added to a GitHub issue that should be on Discourse.

We kindly encourage everybody to use Discourse to start or participate in awesome discussions or ask questions of any kind. A great place to start is the Introduce yourself thread where you can just stop by and say “Hi”.

Struggles and solutions

Working on little tasks sounds simpler and easier than starting a big, complex project from scratch. Well, this is not entirely true… Those small tasks happen around big projects that are used by a lot of people and are constantly evolving. So even a small, not very complicated task can be hard when you’re still figuring out the “hows” and “wheres” of a project.

Struggle: distributing focus across multiple repositories and issues

When working on a bigger project, I used to switch between tasks as a means to unwind when running out of ideas and things seemed impossible. I guess one could call this multitasking. However, switching between tasks of different projects that you know little about is challenging enough to cause stress, resuscitate that impostor syndrome, and ultimately turn into burnout.

Solution: Although I don’t have a solution to this and I’m still having trouble mastering this skill, I find inspiration and hope among the people in the Jupyter community. Even while maintaining lots of projects at once, they manage to juggle tasks successfully, sometimes even when this isn’t their main job, so they deserve a huge shout-out ❤️

Struggle: keeping track of little things

The difference between working on a big project and solving small tasks here and there (besides the obvious size distinction) is that solving smaller tasks makes you prone to losing a sense of progress. Though not a project in the traditional sense, the CIR is a concept that would be worth extending if successful, so evaluating its progress is a must: it can help identify and improve weak points, and even point out when it’s time to stop. But keeping track of little things is tricky.

Solution: GitHub issues, comments, and PRs can be retrieved automatically through GitHub’s API (Chris Holdgraf has a great tool for this called github-activity). However, to get a sense of accomplishment and progress at the end of the day, I tried to keep my own list of todos and tasks done. I experimented with two approaches: keeping a HackMD “journal” and creating a GitHub project. After these three months, I think I like the GitHub project better, because it lets me keep a todo list of small enhancements I’d like to work on when that notification bell doesn’t ring so often, and it involves less copying and pasting of links.

Struggle: getting out of the comfort zone

Implementing that cool new feature, or solving that annoying bug when feeling adventurous, is part of the fun of software development. However, being a CIR is more than writing code. Things like writing blog posts and reviewing PRs are core responsibilities as well. But these are all new territory, so much so that, up until last year, I could count on the fingers of one hand the number of blog posts I had written or the number of PRs I had reviewed. So, the only thing left to say is “Hello new challenges! Farewell comfort zone 👋”.

Solution: Accept the challenge and practice every day until things don’t feel that scary anymore. Patience is key.

Plan 2.0

Version 1 of the plan was solid enough that it didn’t need to be modified completely. Things like tracking down bugs and failing tests won’t be halted. Instead, during the next two months, the plan will only get some small focus-related adjustments.

Specifically, I want to focus more on:
- Getting more familiar with some of the repositories that didn’t get enough attention during the first phase of the project (zero-to-jupyterhub-k8s, jupyterhub-deploy-docker, binderhub, dockerspawner, kubespawner, repo2docker), potentially improving the documentation along the way. No more dipping the toes in the water, just dive in 🌊.
- Writing more blog posts about the superpowers of TLJH.
- Having more bots 🤖 helping the community.

In the process, I hope I can check the boxes for some of the items on my “learn and improve” wish-list:
- technical writing
- Kubernetes
- UI/UX
- courage

Stay tuned, because the next blog post about the JupyterHub and Binder Contributor in Residence adventures should be out in two months’ time.



Plasma: A learning platform powered by Jupyter


Jupyter has been a great choice for education for many years. The Jupyter Notebook has become one of the most popular tools to conduct workshops, tutorials, and teach online classes.

Recently we have seen the emergence and adoption of JupyterHub distributions to facilitate the deployment of Jupyter-based platforms, both on private servers and in the cloud.

We would like to share with you an open-source learning platform called Plasma, built with Jupyter at its core.

JupyterHub Distributions

JupyterHub is a highly customizable and modular framework. To simplify its adoption, JupyterHub distributions target specific deployment scenarios with opinionated defaults. They make it easier to deploy JupyterHub on a single server and in the cloud.

There are currently two popular JupyterHub distributions:

  • The Littlest JupyterHub (TLJH), for deployments on a single server
  • Zero to JupyterHub with Kubernetes (Z2JH), for deployments on Kubernetes clusters

Although not an official distribution, jupyterhub-deploy-docker is also a good resource for a full Docker-based setup. It runs JupyterHub itself in a Docker container and orchestrates the stack with Docker Compose.

The Plasma Stack

Plasma stands for PLateforme d’e-Learning pour l’Analyse de données Scientifiques MAssives, which can be translated to “An e-learning platform for massive scientific data analysis”.

The platform is typically meant to be deployed on high-end machines with many cores and a large amount of RAM.

Some of the requirements for the Plasma project fall somewhere between the TLJH and Z2JH distributions:

  • The deployment should be on a single server, and reproducible on other machines too (running on Ubuntu 18.04+)
  • It should support multiple user environments with different sets of dependencies
  • Users should authenticate as system users and their data should be persisted in their home directories on the host machine

Although TLJH doesn’t officially support container technology, its plugin system opens the door to many other use cases.

Because of this, and to foster the TLJH plugin ecosystem, we decided to develop the Plasma stack as a plugin for TLJH and to consolidate the deployment story with Ansible playbooks.

Plasma is an opinionated JupyterHub stack powered by The Littlest JupyterHub, with the following defaults:

  • PAMAuthenticator to authenticate JupyterHub as users existing on the host machine
  • SystemUserSpawner to start single-user servers in Docker containers, using the system user home directories for data persistence, as sketched below
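In plain JupyterHub terms, these defaults correspond roughly to the following configuration (a sketch of the resulting jupyterhub_config.py, assuming dockerspawner is installed; the actual values are set by the TLJH plugin):

c.JupyterHub.authenticator_class = "jupyterhub.auth.PAMAuthenticator"
c.JupyterHub.spawner_class = "dockerspawner.SystemUserSpawner"

# Mount each system user's home directory into their container so that
# notebooks and data persist on the host machine.
c.SystemUserSpawner.host_homedir_format_string = "/home/{username}"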

To enable extra functionalities, the Plasma stack relies on:

The Plasma stack also contains tools to monitor the system, create and configure users, and add hub admins. It can be visualized with the diagram below:

The Plasma Overview Diagram

The stack is defined in the following repository: https://github.com/plasmabio/plasma

There is also extensive documentation with detailed explanations on how to deploy the stack on a new server: https://docs.plasmabio.org

A repo2docker plugin for The Littlest JupyterHub

The tljh-repo2docker plugin lets JupyterHub admins create new user environments using repo2docker. This plugin starts a JupyterHub service to manage user environments from the JupyterHub UI.

For those already using Binder, the idea will sound very familiar. Under the hood, the tljh-repo2docker plugin also uses repo2docker to build the Docker images. It follows the same patterns and naming conventions as Binder, which makes it easier and more natural to test the environments on Binder before adding them to JupyterHub.

New environments can be added by clicking on the Add New button and providing a URL to the repository. Optional names, memory, and CPU limits can also be set for the environment:

Adding a new environment

The Environments page shows the list of built environments, as well as the ones currently being built:

Building a new environment

The status of the environment changes once the underlying Docker image has been built:

The list of available user environments

Once ready, the environments can be selected from the JupyterHub spawn page:

Selecting an environment

Because it is separate from the Plasma stack, this plugin can also be used for other temporary TLJH deployments on a virtual machine. For example, the organizer of a workshop can prepare a list of environments before the event, just like they would with Binder.

Automating deployments with Ansible

To minimize the number of manual steps involved in the setup of the stack, Plasma also provides a list of Ansible playbooks.

Ansible is an open-source tool to automate the provisioning of servers, configuration management, and application deployment.

Playbooks define a list of tasks that should be executed and declare the desired state of the server.

The list of playbooks and instructions on how to use them are provided in the Installation section of the documentation.

Overall, the playbooks make it easier to perform upgrades, to automate the deployment process, and to replicate the setup at other institutions and universities.

Here is an example of what an upgrade looks like:

Upgrading the stack with an Ansible playbook

This playbook defines the tasks to:

  • download the TLJH installer
  • execute the TLJH installer to perform the upgrade
  • update the TLJH plugins
  • set the idle culler timeout
  • set the default memory and CPU limits
  • reload JupyterHub
  • pull the latest jupyter/repo2docker Docker image

Further reading

Acknowledgements

The development of the Plasma stack at QuantStack was funded as part of the Plasma project, led by Claire Vandiedonck, Pierre Poulain, and Sandrine Caburet, associate professors at Université de Paris.

Sponsors of the Plasma initiative include:


