UserWarning when computing chunked pint arrays #116

Closed
TomNicholas opened this issue Jul 1, 2021 · 10 comments
Labels: bug, upstream issue

Comments

@TomNicholas
Member

import xarray as xr
import pint_xarray  # registers the .pint accessor on xarray objects

da = xr.DataArray([1, 2, 3], dims=['x'], attrs={'units': 'metres'})

chunked = da.pint.quantify().pint.chunk(1)
# chunked2 = da.chunk(1).pint.quantify()  # also happens if I do it in this order instead

Everything looks fine here, excellent...

[Image: Screenshot from 2021-07-01 12-31-28]

but when I go to compute, I get a UserWarning, even though it returns the correct answer:

chunked.mean().compute()
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/array/core.py:3139: UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(

Even if this is working fine, ideally we don't want to be showing warnings to the user.
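One way to pin down "no warnings reach the user" is to record every warning raised during a call and assert on the list. The sketch below uses only the standard library `warnings` module. `noisy_mean` is a hypothetical stand-in for `chunked.mean()` that emits the same message, and `call_and_record` is an illustrative helper, not part of xarray, pint-xarray, or dask.

```python
import warnings
from typing import Any, Callable, List, Tuple

DASK_MESSAGE = (
    "Passing an object to dask.array.from_array which is already a "
    "Dask collection. This can lead to unexpected behavior."
)


def noisy_mean(values: List[float]) -> float:
    """Hypothetical stand-in for chunked.mean(): warns, then returns the mean."""
    warnings.warn(DASK_MESSAGE, UserWarning)
    return sum(values) / len(values)


def call_and_record(func: Callable[..., Any], *args: Any) -> Tuple[Any, List[str]]:
    """Run func and return its result plus the text of each UserWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let earlier filters hide anything
        result = func(*args)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, UserWarning)
    ]
    return result, messages


result, messages = call_and_record(noisy_mean, [1, 2, 3])
```

In a test suite, asserting that `messages` is empty for the real `chunked.mean()` call would flag this behaviour if it came back.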

@keewis
Collaborator

keewis commented Jul 1, 2021

you don't even need the compute to get the warning:

In [3]: chunked.mean()
.../lib/python3.8/site-packages/dask/array/core.py:3113: UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(
Out[3]: 
<xarray.DataArray ()>
dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>

is enough, and computing returns

<xarray.DataArray ()>
<Quantity(dask.array<true_divide, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>, 'meter')>

Note that there's no units in the result of .mean(), that the return value of compute is a dask array (wrapped by pint) and that we need to compute twice to get the actual result.

In conclusion: this is a pretty serious bug (in xarray, I think?) and the warning should actually be an error in this case.
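Until a fix lands, the warning can be escalated to an error locally with a standard library warning filter. This is a minimal sketch: `from_array_stub` is a hypothetical stand-in that emits the same message as dask, and `strict_call` is an illustrative helper name. The `message` argument of `warnings.filterwarnings` is a regular expression matched against the start of the warning text.

```python
import warnings
from typing import Any, Callable

# Regex matched against the beginning of the warning message.
DASK_WARNING_PATTERN = r"Passing an object to dask\.array\.from_array"


def from_array_stub(obj: Any) -> Any:
    """Hypothetical stand-in that emits the same UserWarning as in this issue."""
    warnings.warn(
        "Passing an object to dask.array.from_array which is already a "
        "Dask collection. This can lead to unexpected behavior.",
        UserWarning,
    )
    return obj


def strict_call(func: Callable[..., Any], *args: Any) -> Any:
    """Call func, turning the matching UserWarning into a raised exception."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "error", message=DASK_WARNING_PATTERN, category=UserWarning
        )
        return func(*args)


try:
    strict_call(from_array_stub, [1, 2, 3])
    outcome = "no error"
except UserWarning as err:
    outcome = f"raised: {err}"
```

The filter only lives inside the `with` block, so other code keeps the default warning behaviour.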

@TomNicholas
Member Author

Oh dear. Does the other order (da.chunk(1).pint.quantify()) behave any differently?

@keewis
Collaborator

keewis commented Jul 1, 2021

no, it doesn't, which is why I believe this is a bug in xarray

keewis added the bug label on Jul 1, 2021
@TomNicholas
Member Author

It would be really nice to get this working before we publish #114 (not that there is any time limit). I have time now and am keen to help if I can. Should I re-raise this issue on xarray?

keewis added the upstream issue label on Jul 1, 2021
@keewis
Collaborator

keewis commented Jul 1, 2021

yes, that would be good.

I haven't tested xarray(pint(dask)) thoroughly yet, so I guess we can expect more to fail. I really hope pydata/xarray#4972 would have caught something like this, which I guess means I should try to finalize that as soon as possible.

@TomNicholas
Member Author

TomNicholas commented Jul 1, 2021

> Note that there's no units in the result of .mean(), that the return value of compute is a dask array (wrapped by pint) and that we need to compute twice to get the actual result.

Are we definitely seeing the same behaviour as each other? When I do print(chunked.compute()) (after chunking in either way) I get

<xarray.DataArray (dim_0: 3)>
<Quantity([1 2 3], 'meter')>
Dimensions without coordinates: dim_0

which seems right to me?

Conda env

# packages in environment at /home/tegn500/miniconda3/envs/py38-mamba:
#
# Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
alsa-lib 1.2.3 h516909a_0 conda-forge
anyio 3.1.0 py38h578d9bd_0 conda-forge
appdirs 1.4.4 pyh9f0ad1d_0 conda-forge
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
async_generator 1.10 py_0 conda-forge
atk-1.0 2.36.0 h3371d22_4 conda-forge
attrs 21.2.0 pyhd8ed1ab_0 conda-forge
babel 2.9.1 pyh44b312d_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
black 21.5b0 pyhd8ed1ab_0 conda-forge
bleach 3.3.0 pyh44b312d_0 conda-forge
bokeh 2.3.2 py38h578d9bd_0 conda-forge
bottleneck 1.3.2 py38h5c078b8_3 conda-forge
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h7f98852_1 conda-forge
ca-certificates 2021.5.30 ha878542_0 conda-forge
cairo 1.16.0 h6cf1ce9_1008 conda-forge
certifi 2021.5.30 py38h578d9bd_0 conda-forge
cffi 1.14.5 py38ha65f79e_0 conda-forge
cfgv 3.2.0 py_0 conda-forge
cftime 1.4.1 py38h5c078b8_0 conda-forge
chardet 4.0.0 py38h578d9bd_1 conda-forge
click 8.0.1 py38h578d9bd_0 conda-forge
cloudpickle 1.6.0 py_0 conda-forge
conda 4.10.1 py38h578d9bd_0 conda-forge
conda-package-handling 1.7.3 py38h497a2fe_0 conda-forge
cryptography 3.4.7 py38ha5dfef3_0 conda-forge
curl 7.76.1 hea6ffbf_2 conda-forge
cycler 0.10.0 py_2 conda-forge
cytoolz 0.11.0 py38h497a2fe_3 conda-forge
dask 2021.5.0 pyhd8ed1ab_0 conda-forge
dask-core 2021.5.0 pyhd8ed1ab_0 conda-forge
dataclasses 0.8 pyhc8e2a94_1 conda-forge
dbus 1.13.6 h48d8840_2 conda-forge
decorator 5.0.9 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
distlib 0.3.1 pyh9f0ad1d_0 conda-forge
distributed 2021.5.0 py38h578d9bd_0 conda-forge
editdistance-s 1.0.0 py38h1fd1430_1 conda-forge
entrypoints 0.3 py38h32f6830_1002 conda-forge
expat 2.3.0 h9c3ff4c_0 conda-forge
filelock 3.0.12 pyh9f0ad1d_0 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.13.1 hba837de_1005 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fribidi 1.0.10 h516909a_0 conda-forge
fsspec 2021.5.0 pyhd8ed1ab_0 conda-forge
gdk-pixbuf 2.42.6 h04a7f16_0 conda-forge
gettext 0.19.8.1 h0b5b191_1005 conda-forge
giflib 5.2.1 h516909a_2 conda-forge
glib 2.68.2 h9c3ff4c_0 conda-forge
glib-tools 2.68.2 h9c3ff4c_0 conda-forge
graphite2 1.3.13 he1b5a44_1001 conda-forge
graphviz 2.47.1 h85b4f2f_1 conda-forge
gst-plugins-base 1.18.4 hf529b03_2 conda-forge
gstreamer 1.18.4 h76c114f_2 conda-forge
gtk2 2.24.33 h539f30e_1 conda-forge
gts 0.7.6 h64030ff_2 conda-forge
harfbuzz 2.8.1 h83ec7ef_0 conda-forge
hdf4 4.2.15 h10796ff_3 conda-forge
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
heapdict 1.0.1 py_0 conda-forge
hypothesis 6.13.0 pyhd8ed1ab_0 conda-forge
icu 68.1 h58526e2_0 conda-forge
identify 2.2.6 pyhd8ed1ab_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 4.0.1 py38h578d9bd_0 conda-forge
importlib_metadata 4.0.1 hd8ed1ab_0 conda-forge
importlib_resources 5.2.0 pyhd8ed1ab_0 conda-forge
iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge
ipykernel 5.5.5 py38hd0cf306_0 conda-forge
ipytest 0.9.1 pypi_0 pypi
ipython 7.23.1 py38hd0cf306_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
jedi 0.18.0 py38h578d9bd_2 conda-forge
jinja2 3.0.1 pyhd8ed1ab_0 conda-forge
jpeg 9d h516909a_0 conda-forge
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py38h32f6830_1 conda-forge
jupyter_client 6.1.12 pyhd8ed1ab_0 conda-forge
jupyter_core 4.7.1 py38h578d9bd_0 conda-forge
jupyter_server 1.8.0 pyhd8ed1ab_0 conda-forge
jupyterlab 3.0.16 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 2.5.2 pyhd8ed1ab_0 conda-forge
kiwisolver 1.3.1 py38h1fd1430_1 conda-forge
krb5 1.19.1 hcc1bbae_0 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
libarchive 3.5.1 h3f442fb_1 conda-forge
libblas 3.9.0 9_openblas conda-forge
libcblas 3.9.0 9_openblas conda-forge
libclang 11.1.0 default_ha53f305_1 conda-forge
libcurl 7.76.1 h2574ce0_2 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 hcdb4288_3 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_19 conda-forge
libgd 2.3.2 h78a0170_0 conda-forge
libgfortran-ng 9.3.0 hff62375_19 conda-forge
libgfortran5 9.3.0 hff62375_19 conda-forge
libglib 2.68.2 h3e27bee_0 conda-forge
libgomp 9.3.0 h2828fa1_19 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 9_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libllvm11 11.1.0 hf817b99_2 conda-forge
libnetcdf 4.8.0 nompi_hcd642e3_103 conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libogg 1.3.4 h7f98852_1 conda-forge
libopenblas 0.3.15 pthreads_h8fe5266_1 conda-forge
libopus 1.3.1 h7f98852_1 conda-forge
libpng 1.6.37 hed695b0_2 conda-forge
libpq 13.3 hd57d9b9_0 conda-forge
librsvg 2.50.5 hc3c00ef_0 conda-forge
libsodium 1.0.18 h516909a_1 conda-forge
libsolv 0.7.18 h780b84a_0 conda-forge
libssh2 1.9.0 ha56f1ee_6 conda-forge
libstdcxx-ng 9.3.0 h6de172a_19 conda-forge
libtiff 4.2.0 hbd63e13_2 conda-forge
libtool 2.4.6 h58526e2_1007 conda-forge
libuuid 2.32.1 h14c3975_1000 conda-forge
libvorbis 1.3.7 he1b5a44_0 conda-forge
libwebp 1.2.0 h3452ae3_0 conda-forge
libwebp-base 1.2.0 h7f98852_2 conda-forge
libxcb 1.13 h7f98852_1003 conda-forge
libxkbcommon 1.0.3 he3ba5ed_0 conda-forge
libxml2 2.9.12 h72842e0_0 conda-forge
libzip 1.7.3 he9f05b3_0 conda-forge
llvmlite 0.36.0 py38h4630a5e_0 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.9.3 h9c3ff4c_0 conda-forge
lzo 2.10 h516909a_1000 conda-forge
mamba 0.13.0 py38h2aa5da1_0 conda-forge
markupsafe 2.0.1 py38h497a2fe_0 conda-forge
matplotlib 3.4.2 py38h578d9bd_0 conda-forge
matplotlib-base 3.4.2 py38hcc49a3a_0 conda-forge
matplotlib-inline 0.1.2 pyhd8ed1ab_2 conda-forge
mistune 0.8.4 py38h497a2fe_1003 conda-forge
more-itertools 8.7.0 pyhd8ed1ab_1 conda-forge
msgpack-python 1.0.2 py38h1fd1430_1 conda-forge
mypy 0.812 py38h497a2fe_2 conda-forge
mypy_extensions 0.4.3 py38h578d9bd_3 conda-forge
mysql-common 8.0.23 ha770c72_2 conda-forge
mysql-libs 8.0.23 h935591d_2 conda-forge
nbclassic 0.3.1 pyhd8ed1ab_1 conda-forge
nbclient 0.5.3 pyhd8ed1ab_0 conda-forge
nbconvert 6.0.7 py38h578d9bd_3 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge
netcdf4 1.5.6 nompi_py38h5e9db54_103 conda-forge
nodeenv 1.6.0 pyhd8ed1ab_0 conda-forge
notebook 6.4.0 pyha770c72_0 conda-forge
nspr 4.30 h9c3ff4c_0 conda-forge
nss 3.65 hb5efdd6_0 conda-forge
numba 0.53.1 py38h0e12cce_0 conda-forge
numpy 1.20.3 py38h9894fe3_0 conda-forge
numpy_groupies 0.9.13 pyh9f0ad1d_1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1k h7f98852_0 conda-forge
packaging 20.9 pyh44b312d_0 conda-forge
pandas 1.2.4 py38h1abd341_0 conda-forge
pandoc 2.13 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
pango 1.48.5 hb8ff022_0 conda-forge
parso 0.8.2 pyhd8ed1ab_0 conda-forge
partd 1.2.0 pyhd8ed1ab_0 conda-forge
pathspec 0.8.1 pyhd3deb0d_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
pexpect 4.8.0 py38h32f6830_1 conda-forge
pickleshare 0.7.5 py38h32f6830_1002 conda-forge
pillow 8.2.0 py38ha0e1e83_1 conda-forge
pint 0.17 pyhd8ed1ab_0 conda-forge
pint-xarray 0.2 pyhd8ed1ab_0 conda-forge
pip 21.1.1 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
pluggy 0.13.1 py38h578d9bd_4 conda-forge
pooch 1.4.0 pyhd8ed1ab_0 conda-forge
pre-commit 2.12.1 py38h578d9bd_0 conda-forge
prometheus_client 0.10.1 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.18 pyha770c72_0 conda-forge
psutil 5.8.0 py38h497a2fe_1 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
py 1.10.0 pyhd3deb0d_0 conda-forge
pycosat 0.6.3 py38h497a2fe_1006 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pygments 2.9.0 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyqt 5.12.3 py38h578d9bd_7 conda-forge
pyqt-impl 5.12.3 py38h7400c14_7 conda-forge
pyqt5-sip 4.19.18 py38h709712a_7 conda-forge
pyqtchart 5.12 py38h7400c14_7 conda-forge
pyqtwebengine 5.12.1 py38h7400c14_7 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pysocks 1.7.1 py38h578d9bd_3 conda-forge
pytest 6.2.4 py38h578d9bd_0 conda-forge
pytest-repeat 0.9.1 pypi_0 pypi
python 3.8.10 h49503c6_1_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-graphviz 0.16 pyh243d235_2 conda-forge
python_abi 3.8 1_cp38 conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
pyyaml 5.4.1 py38h497a2fe_0 conda-forge
pyzmq 22.1.0 py38h2035c66_0 conda-forge
qt 5.12.9 hda022c4_4 conda-forge
readline 8.1 h46c0cb4_0 conda-forge
regex 2021.4.4 py38h497a2fe_0 conda-forge
reproc 14.2.1 h36c2ea0_0 conda-forge
reproc-cpp 14.2.1 h58526e2_0 conda-forge
requests 2.25.1 pyhd3deb0d_0 conda-forge
ruamel_yaml 0.15.80 py38h497a2fe_1004 conda-forge
scipy 1.6.3 py38h7b17777_0 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 49.6.0 py38h578d9bd_3 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sniffio 1.2.0 py38h578d9bd_1 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
sqlite 3.35.5 h74cdb3f_0 conda-forge
tblib 1.7.0 pyhd8ed1ab_0 conda-forge
terminado 0.10.0 py38h578d9bd_0 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
toolz 0.11.1 py_0 conda-forge
tornado 6.1 py38h497a2fe_1 conda-forge
tqdm 4.60.0 pyhd8ed1ab_0 conda-forge
traitlets 5.0.5 py_0 conda-forge
typed-ast 1.4.3 py38h497a2fe_0 conda-forge
typing-extensions 3.10.0.0 hd8ed1ab_0 conda-forge
typing_extensions 3.10.0.0 pyha770c72_0 conda-forge
tzdata 2021a he74cb21_0 conda-forge
urllib3 1.26.4 pyhd8ed1ab_0 conda-forge
virtualenv 20.4.7 py38h578d9bd_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
websocket-client 0.57.0 py38h578d9bd_4 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
xarray 0.18.2 pyhd8ed1ab_0 conda-forge
xhistogram 0.1.3+40.g9f20e95.dirty dev_0
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge
xorg-libice 1.0.10 h516909a_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.1 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge
xorg-xproto 7.0.31 h14c3975_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zeromq 4.3.4 h9c3ff4c_0 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.4.1 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.9 ha95c52a_0 conda-forge

@keewis
Collaborator

keewis commented Jul 1, 2021

it is correct and I get the same result (which means .pint.chunk does not have a bug), but chunked.mean() is definitely wrong (I checked both master and v0.2)

@keewis
Collaborator

keewis commented Jul 1, 2021

with .compute I meant that chunked.mean().compute().compute() is required to get the result for the mean
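To illustrate why two calls are needed, here is a toy model of the nesting described above: a lazy object whose computed value is a unit wrapper that still holds another lazy object. `Lazy`, `Quantity`, and `compute_fully` are hypothetical stand-ins written for this sketch. They are not dask's, pint's, or xarray's classes and do not reproduce their internals.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Lazy:
    """Toy lazy value: holds a thunk and evaluates it on compute()."""
    thunk: Callable[[], Any]

    def compute(self) -> Any:
        return self.thunk()


@dataclass
class Quantity:
    """Toy unit wrapper around a magnitude that may itself be lazy."""
    magnitude: Any
    units: str

    def compute(self) -> "Quantity":
        # Evaluate a lazy magnitude once, keeping the units attached.
        if isinstance(self.magnitude, Lazy):
            return Quantity(self.magnitude.compute(), self.units)
        return self


def is_lazy(obj: Any) -> bool:
    """True if obj is lazy or is a Quantity wrapping something lazy."""
    if isinstance(obj, Lazy):
        return True
    return isinstance(obj, Quantity) and isinstance(obj.magnitude, Lazy)


def compute_fully(obj: Any, max_rounds: int = 10) -> Any:
    """Hypothetical workaround: keep calling compute() until nothing is lazy."""
    for _ in range(max_rounds):
        if not is_lazy(obj):
            return obj
        obj = obj.compute()
    if is_lazy(obj):
        raise RuntimeError("object is still lazy after max_rounds")
    return obj


inner = Lazy(lambda: 2.0)
nested = Lazy(lambda: Quantity(inner, "meter"))  # lazy(quantity(lazy))

once = nested.compute()   # Quantity whose magnitude is still a Lazy
twice = once.compute()    # Quantity(2.0, "meter")
```

A helper like `compute_fully` would only paper over the problem; the nesting itself is the bug.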

@TomNicholas
Member Author

Right sorry, I had left out the call to mean.

@TomNicholas
Member Author

This was fixed by pydata/xarray#5559

In [4]: da = xr.DataArray([1,2,3], dims=['x'], attrs={'units': 'metres'})

In [5]: chunked = da.pint.quantify().pint.chunk(1)

In [6]: chunked
Out[6]: 
<xarray.DataArray (x: 3)>
<Quantity(dask.array<xarray-<this-array>, shape=(3,), dtype=int64, chunksize=(1,), chunktype=numpy.ndarray>, 'meter')>
Dimensions without coordinates: x

In [7]: chunked.mean().compute()
Out[7]: 
<xarray.DataArray ()>
<Quantity(2.0, 'meter')>
