Feature Proposal: xarray.interactive module #3709
Comments
Difficulties with method chaining
Arbitrarily long method chaining would be great, i.e. da.interactive.isel(time=10).mean('time').plot() but I think it will be considerably more complicated. The problem is that the way the I've found a way to get around this, but I'd like some feedback on the approach because it might be needlessly complicated. I would like to do it by subclassing to create an ida = da.interactive.isel(time=10) This class would store the widgets and decorate its inherited methods to either propagate them (e.g. through To allow for the final method to recompute all the previous steps, each inherited computation method would be wrapped by a decorator which records the function used and its arguments. I've got a very rough example of this working, but as I said there might be a much easier way... |
(also I realise that the suggestion at the end is similar to a task graph of |
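To make the "record each call, then replay the whole chain" idea concrete, here is a minimal pure-Python sketch. It is not the rough example mentioned above and it does not use xarray or any widget library: `Recorder`, `replay` and the toy `Toy` class are hypothetical names invented for illustration, and keying overrides by keyword name is a simplifying assumption.

```python
class Recorder:
    """Records a chain of method calls and replays it on demand (hypothetical sketch)."""

    def __init__(self, source, steps=()):
        self._source = source        # the original, unmodified object
        self._steps = tuple(steps)   # (method_name, args, kwargs) triples

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the wrapped object's methods.
        if name.startswith("_"):
            raise AttributeError(name)

        def record(*args, **kwargs):
            # Return a new recorder so earlier, shorter chains stay reusable.
            return Recorder(self._source, self._steps + ((name, args, kwargs),))

        return record

    def replay(self, **overrides):
        """Re-run every recorded step, substituting overridden keyword values."""
        result = self._source
        for name, args, kwargs in self._steps:
            merged = {key: overrides.get(key, value) for key, value in kwargs.items()}
            result = getattr(result, name)(*args, **merged)
        return result


class Toy:
    """Stand-in for a data container with chainable methods (not an xarray object)."""

    def __init__(self, values):
        self.values = list(values)

    def tail(self, start=0):
        return Toy(self.values[start:])

    def scale(self, factor=1):
        return Toy([v * factor for v in self.values])

    def total(self):
        return sum(self.values)


chain = Recorder(Toy([1, 2, 3, 4])).tail(start=1).scale(factor=2).total()
print(chain.replay())          # 18: (2 + 3 + 4) * 2
print(chain.replay(start=2))   # 14: (3 + 4) * 2
```

In a notebook, a widget callback would presumably call `replay` with the widget's new value, so nothing is computed until the final step is requested.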
This looks fantastic @TomNicholas!! IMHO, I would rather see this maintained in a separate project (something like |
related: #2034 |
Yeah that's a fair point. I think this is another case where the ecosystem of packages orbiting xarray could do with being more explicitly organised. Reasons for direct integration in xarray:
Reasons for a separate
I guess either way I could just write it in a separate repo and if in future we decided to include it in xarray master then move it. @philippjfr @rabernat would be interested in your perspectives as developers/users of these downstream libraries? Would this be useful or not really? |
This looks really cool and I like the API! I'll have to give it a try to give more detailed feedback. Note that I'm not a core developer of xarray but I also think this is best managed as an external project. Just wanted to ask some clarification on some of your comments.
What do you mean by this? hvPlot does let you explore n-dimensional data using widgets, what is the limitation you were seeing there?
This is a good point, but I guess I'm not yet entirely clear on how your proposed APIs would deal with this. |
Great!
Thanks, but it's definitely not ready for that yet, I'll post here and tag you when it is.
I had a go with hvPlot's gridded data classes and although it worked well for plotting variation along one dimension with a single slider, I got some errors when I tried to plot N-D data with multiple slider widgets along more than one dimension. It looks like that might have been user error though... I'll compare more closely and raise issues if necessary.
I'm referring to the discussion on method chaining: that proposed API (using an ida = da.interactive.isel(lat=50, lon=60) before specifying the analysis to perform on it ida = (ida - ida.mean('time')).std(dim='time') and an |
Thanks @dcherian , I hadn't seen those. I think the difference between what I'm proposing here and what already exists (e.g. in holoviews, xrviz, etc.) is considering interactivity as something that is useful independent of plotting. The aim would be to allow interactive parameterization of arbitrary functions, which could (and often would) be plotting functions, but could actually be anything. That way analysis can be interactively parameterized, and the plotting can be handled by any library. (Plotting libraries could also choose to reuse these interactivity functions, but wouldn't have to.) I think that approach would integrate well with being able to change plotting backends too (#3553). |
That would be awesome! I have a strong interest in that with xarray-simlab, i.e., setting-up model parameters and running simulations interactively. |
The interactive widgets in holoviews and xrviz are obtained from Panel, which is a separate library that is already explicitly designed for specifying and constructing interactivity independent of plotting. E.g. we often use Panel widgets with no plotting to set up simulations or analyses interactively, then run whatever we specified. The It sounds like you're hoping for something that is independent of plotting (like Panel) and provides interactive widgets (like Panel) but also has specific support for multidimensional arrays (like HoloViews)? I don't think that's much code, but it could be useful to provide for Xarray in a convenient API. |
I think the real power in this proposal is in the ability to chain operations on interactive components using an API that will be familiar to xarray users. We have a similar concept in HoloViews which allows you to build complex processing and visualization pipelines. I'll work through some examples in the HoloViz ecosystem to show what is possible there and maybe provide some ideas or approaches that might work here. Let's work with a relatively contrived but simple example and load the air_temperature sample dataset:
import holoviews as hv
import numpy as np
import xarray as xr
airtemps = xr.tutorial.open_dataset('air_temperature')
ds = hv.Dataset(airtemps)
In this example you explode your dataset into individual chunks for each longitude, then apply a reduction along the latitude and finally cast the output to a Curve, giving us a Curve of the mean temperature at each longitude:
curves = ds.groupby('lon', dynamic=True).apply.reduce(lat=np.mean).apply(hv.Curve).opts(width=600, framewise=True)
Now we decide we want to resample the data too, so we import the resample operation and apply it to our existing pipeline:
But really we don't just want to compute the mean; we want to pick the reduce function, and we also want to be able to set the resampling frequency and pick a color. By combining Panel and HoloViews you can inject widget parameters at every stage:
import panel as pn
function = pn.widgets.Select(name='Function', options={'mean': np.mean, 'min': np.min, 'max': np.max})
color = pn.widgets.ColorPicker(name='Color', value='#000000')
rule = pn.widgets.TextInput(name='Rule', value='7d')
obj = (ds.groupby('lon', dynamic=True)
       .apply.reduce(lat=function)
       .apply(hv.Curve)
       .apply.opts(width=600, color=color, framewise=True)
       .apply(resample, rule=rule)
)
hv_pane = pn.pane.HoloViews(obj)
pn.Row(
    hv_pane[0],
    pn.Column(*hv_pane[1][0], function, color, rule)
)
So this shows pretty clearly how useful this kind of chaining/pipeline building can be, especially when built on top of an API like xarray which allows for very powerful data manipulation. I don't have enough of a perspective to say how feasible it would be to implement something like this that comprehensively wraps xarray's API but I'd certainly love to see it, whether it is built on Panel (which I am of course partial to as the author), on ipywidgets, or even supporting both. My main comments therefore are about the API: it is not clear to me based on what you have said so far which parts of the API are actually interactive, e.g. in this case:
ida = da.interactive.isel(lat=50, lon=60)
ida = (ida - ida.mean('time')).std(dim='time')
Is only interactive.isel(da, plot_mean_over_time, time=slice(100, 500)) but expanded to include support for discrete lists of items, explicit widgets, and so on. Hope that's at all helpful! I think the idea is really neat and it could be very powerful indeed. |
One thing I didn't mention above is that in the pipeline I showed HoloViews will cache the intermediate changes so that if you change the color or change the resampling frequency it only executes the part of the pipeline downstream from where the parameter changed. |
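The caching behaviour described above can be illustrated with a small pure-Python sketch. This is not how HoloViews implements it. `Pipeline`, `add`, `update`, `reduce_rows` and `scale` are hypothetical names, and the sketch assumes a strictly linear pipeline with at most one parameter per stage.

```python
class Pipeline:
    """Linear pipeline that recomputes only the stages downstream of a change."""

    def __init__(self, source):
        self._source = source
        self._stages = []   # list of (function, parameter name or None)
        self._params = {}   # current parameter values
        self._cache = []    # _cache[i] holds the output of stage i
        self.calls = []     # names of executed stage functions, for inspection

    def add(self, func, param=None, value=None):
        self._stages.append((func, param))
        if param is not None:
            self._params[param] = value
        return self

    def update(self, param, value):
        """Change a parameter and drop cached results from its stage onwards."""
        self._params[param] = value
        for index, (_, name) in enumerate(self._stages):
            if name == param:
                del self._cache[index:]
                break

    def result(self):
        # Resume from the last stage whose output is still cached.
        data = self._cache[-1] if self._cache else self._source
        for func, name in self._stages[len(self._cache):]:
            data = func(data) if name is None else func(data, self._params[name])
            self.calls.append(func.__name__)
            self._cache.append(data)
        return data


def reduce_rows(rows, function):
    return [function(row) for row in rows]


def scale(values, factor):
    return [value * factor for value in values]


pipe = (Pipeline([[1, 2], [3, 4]])
        .add(reduce_rows, "function", max)
        .add(scale, "factor", 10))
print(pipe.result())        # [20, 40], both stages run
pipe.update("factor", 100)
print(pipe.result())        # [200, 400], only scale runs again
print(pipe.calls)           # ['reduce_rows', 'scale', 'scale']
```

Changing `function` instead would invalidate the cache from the first stage, so both stages would run again.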
Thanks @jbednar , I think that's a good summary of most of what I was imagining.
Yes exactly. There will be a lot of users who do their work in xarray and being able to achieve interactivity in their existing workflows with almost exactly the same API would improve their experience without presenting much of a barrier to adoption. Thanks for the (impressive) example @philippjfr !
I was imagining that functions/methods following the I didn't appreciate exactly how much of this panels/holoviews can already do - I think I need to go away and experiment with using/wrapping them but aiming for an xarray-like syntax. |
Maybe wait until early next week when I anticipate new Panel and HoloViews releases to be out which smooth out some issues with these workflows. |
On the one hand, yes, HoloViews + Panel is quite powerful and clean for what it can already do. But just so everyone is on the same page, the workflow @philippjfr shows above is only possible for the operations that HoloViews has implemented internally. The operations available in HoloViews are only a small subset of what can be done with the native Xarray or Pandas APIs, and adding new capability like that to HoloViews is difficult because HoloViews supports many different underlying data formats (lists, dictionaries, NumPy, Pandas, Xarray, etc.). So while there are advantages to what's already available in HoloViews:
there are also major disadvantages:
Note that hvPlot injects the plotting capability from HoloViews into Xarray and Pandas, letting you use the native data APIs for plotting, but it doesn't give you the control over lazy/interactive/reactive pipelines that HoloViews' native API offers. So to me what this issue's proposal would entail is taking the idea of hvPlot further, making Xarray (and Pandas) natively act like HoloViews already does -- with lazy operations where interactive controls can be inserted at every stage, letting people stay in their preferred rich, native data API while having the power to easily make anything interactive and to easily make anything visualizable. |
Having taken the ideas presented here as inspiration, the latest HoloViews release actually extends what we had described above and provides the capability to use arbitrary xarray methods to transform the data and control the parameters of those transforms using Panel-based widgets. The HoloViews docs show one such example built on xarray which is built around so-called
import holoviews as hv
import panel as pn
import xarray as xr
air_temp = xr.tutorial.load_dataset('air_temperature')
# We declare a dim expression which uses the `quantile` method from the `xr` namespace
# and provides a panel FloatSlider as the argument to the expression
q = pn.widgets.FloatSlider(name='quantile')
quantile_expr = hv.dim('air').xr.quantile(q, dim='time')
# We now wrap the xarray Dataset in a HoloViews one, apply the dim expression and cast the result to an image
temp_ds = hv.Dataset(air_temp, ['lon', 'lat'])
transformed = temp_ds.apply.transform(air=quantile_expr).apply(hv.Image)
# Now we display the resulting transformation pipeline alongside the widget
pn.Column(q, transformed.opts(colorbar=True, width=400))
I am likely to integrate this capability with hvPlot with a more intuitive API, e.g. in this case I'd expect to be able to spell this something like this:
xrds = xr.tutorial.load_dataset('air_temperature')
q = pn.widgets.FloatSlider(name='quantile')
quantile_expr = hv.dim('air').xr.quantile(q, dim='time')
xrds.hvplot.image(transforms={'air': quantile_expr}) |
Thanks, @philippjfr! What Philipp outlines above addresses the key limitation that I pointed out previously:
As of HoloViews release 1.13.2 that limitation is now completely gone, because a HoloViews interactive operation pipeline can now invoke arbitrary Xarray or Pandas API calls. So you're no longer limited to what has been encapsulated in HoloViews, and you can use the native Xarray method syntax that you're used to. Thus it's now possible to achieve most (all?) of the functionality discussed above, i.e. easily constructing arbitrarily deep Xarray-method pipelines with interactive widgets controlling any step along the way, replaying only that portion of the pipeline when that widget is changed. So, what's left? As Philipp suggests, we can make the syntax for working with this functionality simpler in hvPlot. At that point we should probably show the syntax required for each of the interactive pipelines demonstrated or suggested in this issue, and see if there's any change to Xarray that would help make the syntax easier or more natural for Xarray users. Either way, the power is now there already! |
This looks absolutely great @philippjfr ! I would be keen to help you and @jbednar with making the syntax as intuitive and familiar as possible for xarray users. If you have any relevant issues/PR's in holoviews or here then please tag me :) |
@TomNicholas I've been playing around with an |
That is so cool! I think the syntax is already as good as I can imagine. |
@philippjfr that looks incredible! The accessor syntax is exactly what I was imagining too, great job.
I would love to have a go, plus I had a few other ideas I would like to try out - is there a branch somewhere I could check out to get it going locally? |
This is very cool, nice work @philippjfr ! |
That's amazing. This would single-handedly turn xarray from "nice to have, pretty useful" to "I recommend it to all my friends". I would absolutely love to be able to use it. |
hvPlot's .interactive() support for xarray and pandas was released in hvPlot 0.7.0 (installable with There are a few things I think we can still improve (listed at holoviz/panel#1824, holoviz/panel#1826, holoviz/hvplot#531, holoviz/hvplot#533), but it's already really fun to use -- just take your xarray or pandas pipeline You can use this with the native |
Update: hvPlot's .interactive support has been greatly improved and expanded in the new hvPlot 0.7.3 release. It is now showcased at holoviz.org, which introduces how to use hvPlot to build plots, then how to use xarray .interactive and pandas .interactive to add widgets (whether to hvPlot plots or to anything else, including .plot output or tables or xarray reprs). There are still plenty of improvements to make, but apart from documenting .interactive in xarray's docs, I would think this issue can now be closed. |
@jbednar that all looks amazing! Can't wait to properly try it out. Given that much of what I imagined is now available in holoviews, I will close this issue now. But if you would like to raise a PR pointing towards this functionality somewhere in xarray's docs (maybe either as a more detailed description in the Ecosystem page or as a note in the plotting page of the user guide) then that would be welcome! |
Just for completeness. You can find @philippjfr PyData 2021 Inspired by that I've created a hvplot-interactive-speedup15.mp4 |
Oh awesome! Can I watch this talk anywhere? That link just seems to have a summary. |
I'm not sure if this link will expire, but until it's on youtube, you can watch the talk at https://zoom.us/rec/play/DzaWjz_hMBP23Vqv7T5jPcY1zU4fps2ZL-yAi8MyM5-lbYq-ZQS4ejWMzwxRW53vGu2F1DybYiKSb8M.mYwmkdDSK6ECc8Ux?startTime=1635508803000&_x_zm_rtaid=hMxhM6kwS-ae1hLStT7UXA.1635955310424.1ade0b45b8e3297ff743d3acc0aa08e1&_x_zm_rhtaid=397 |
I meant to add this link to the PyData Talk on .interactive including video https://discourse.holoviz.org/t/pydata-2021-build-polished-data-driven-applications-directly-from-your-pandas-or-xarray-pipelines/3017/4 |
Sophia Yang and I wrote a blog post about hvplot interactive. It's based on Pandas dataframes but it works the same way for Xarray. Check it out https://towardsdatascience.com/the-easiest-way-to-create-an-interactive-dashboard-in-python-77440f2511d1 You can also find the repo and links to binder+colab here https://github.com/sophiamyang/hvplot_interactive |
Just been sent a link to this discussion after having worked on something very similar for our project (which resembles Xarray in many ways): scipp/scipp#2573 @philippjfr how much work would it be to implement an |
We'd probably have to write a so called HoloViews |
Great, I'll look at that implementation. Thanks! |
FYI. This concept has now been generalized further by @philippjfr into Reactive Expressions, which are now a part of Param. See https://param.holoviz.org/user_guide/Reactive_Expressions.html Here are a couple of examples with Panel |
Feature proposal: xarray.interactive module
I've been experimenting with ipython widgets in jupyter notebooks, and I've been working on how we might use them to make xarray more interactive.
Motivation:
For most users who are exploring their data, it will be common to find themselves rerunning the same cells repeatedly but with slightly different values.
In xarray's case that will often be in an .isel() or .sel() call, or selecting variables from a dataset.
IPython widgets allow you to interact with your functions in a very intuitive way, which we could exploit.
There are lots of tutorials on how to interact with pandas data (e.g. this great one), but I haven't seen any for interacting with xarray objects.
Relationship to other libraries:
Some downstream plotting libraries (such as @hvplot) already use widgets when interactively plotting xarray-derived data structures, but they don't seem to go the full N dimensions.
This also isn't something that should be confined to plotting functions - you often choose slices or variables at the start of analysis, not just at the end.
I'll come back to this idea later.
The default ipython widgets are pretty good, but we could write an xarray.interactive module in such a way that downstream developers can easily replace them with their own widgets.
Usage examples:
Plotting against multiple dimensions interactively
Interactively select a range from a dimension
Animate over one dimension
API ideas:
We can write a function like this
which could also be used as a decorator something like this
It would be nicer to be able to do this
but Guido forbade it.
But we can attach these functions to an accessor to get
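As a rough illustration of the function-plus-decorator form, the following pure-Python sketch turns a dimension selection such as time=slice(100, 500) into integer-slider settings and wraps a function so that it can be called with any chosen index. It is not the proposed implementation: `slider_spec`, `interactive_isel`, the `sizes` argument and `value_at` are hypothetical names, and a plain list stands in for the data array.

```python
import functools


def slider_spec(size, selection):
    """Translate a slice over a dimension of length `size` into slider settings."""
    positions = range(*selection.indices(size))  # normalises None and negative bounds
    if not positions:
        raise ValueError("selection is empty")
    return {"min": positions[0], "max": positions[-1],
            "step": positions.step, "value": positions[0]}


def interactive_isel(sizes, **selections):
    """Decorator factory: expose the named dimensions of a function as slider-driven indices."""
    def decorate(func):
        specs = {dim: slider_spec(sizes[dim], sel) for dim, sel in selections.items()}

        @functools.wraps(func)
        def call(data, **indices):
            # Fall back to each slider's starting value when no index is supplied.
            chosen = {dim: indices.get(dim, spec["value"]) for dim, spec in specs.items()}
            return func(data, **chosen)

        call.specs = specs  # a widget layer could build one slider per entry
        return call
    return decorate


@interactive_isel({"time": 600}, time=slice(100, 500))
def value_at(data, time):
    return data[time]


series = list(range(600))
print(value_at.specs["time"])      # {'min': 100, 'max': 499, 'step': 1, 'value': 100}
print(value_at(series))            # 100
print(value_at(series, time=250))  # 250
```

With ipywidgets installed, each spec could presumably be passed on as `ipywidgets.IntSlider(**spec)` and handed to `ipywidgets.interact`, which is where the actual interactivity would come from.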
Other ideas
Select variables from datasets
Choose dimensions to apply functions over
General interactive.explore() method to see variation over any number of dimensions, the default being all of them.
What do people think about this? Is it something that makes sense to include within xarray itself? (Dependencies aren't a problem because it's fine to have ipywidgets as an optional dependency just for this module.)