After a long collaboration with my colleagues in the PyScript group, and after connecting together a deep stack of technologies, PyData workflows are now possible in a browser, with both partial loading of data and concurrency. Here is an image of loading one year's worth of conda package download data from several parquet files in about 6s in my browser.

Anaconda has historically catered particularly to data-oriented python practitioners; just see the packages listed at https://www.anaconda.com/open-source . This is no surprise, given that Anaconda's founders created scipy and numpy, just the kind of software that proved difficult to install across a wide variety of hardware and operating systems.
However, python is used for so much more! In particular, the learning and web-dev spheres were not really in scope. With the emergence of PyScript, the acquisition of edublocks and the sponsorship of beeware, there is now a more conscious effort to cater to the "99%": people who only need a little programming and don't need heavy-duty installations and IDEs.
Turning to PyScript in particular, it allows python code to run alongside browser front-end code for easy interaction in a browser, without any python installation at all. The browser is the sandbox. BUT: these users are not doing PyData workloads; they are a completely different community. Still, data now powers so much of python (and the world) that we need to at least allow for cross-over, so that PyData is accessible in a browser.
Popular packages such as pandas, however, make a bunch of assumptions about the
system they are running on, particularly when loading remote data (and in a
browser, everything is remote).
There are many pieces of technology involved in getting data into a usable format. In the example here, the following are important:
pandas, the most widely used table library in python. It uses fsspec to load data from remote locations.
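To make "loading from remote locations" a little more concrete: a parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`, so a reader can locate the file metadata with one small ranged read rather than downloading everything. Below is a minimal, stdlib-only sketch of that idea. The `fetch_range` callable and the `footer_byte_range` helper are hypothetical names made up for illustration, standing in for whatever ranged read the filesystem layer provides; this is not the code that pandas or fsspec actually run.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def footer_byte_range(fetch_range, file_size):
    """Locate the parquet footer metadata using a single 8-byte ranged read.

    fetch_range(start, end) is a hypothetical callable returning bytes
    [start, end) of the remote file, e.g. via an HTTP Range request.
    """
    tail = fetch_range(file_size - 8, file_size)
    if tail[4:] != PARQUET_MAGIC:
        raise ValueError("not a parquet file: trailing magic bytes missing")
    (footer_len,) = struct.unpack("<I", tail[:4])
    footer_end = file_size - 8
    return footer_end - footer_len, footer_end


# Fake "remote" file: magic, 100 bytes of column data, 20 bytes of footer,
# then the footer length and the trailing magic.
blob = b"PAR1" + b"d" * 100 + b"m" * 20 + struct.pack("<I", 20) + b"PAR1"
requests = []


def fetch_range(start, end):
    requests.append((start, end))  # record what would go over the network
    return blob[start:end]


start, end = footer_byte_range(fetch_range, len(blob))
metadata = fetch_range(start, end)
```

In this toy case only the final 28 bytes of the file are ever requested; the 100 bytes of "column data" stay on the server until a reader decides it needs them.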
![]()
In this blog post, I will show you how I glued them all together!
Running python in a browser imposes the following tricky conditions.
conda packages are hosted by Anaconda, and package download stats have been accumulated
in parquet files for several years. Access to these has been provided for some time
via an intake catalogue and the condastats CLI tool.
As a forerunner of the work here, a pyscript app exists which does indeed load parquet
data (the same as above) into a browser page with pure python and a proxy hosted by
pyscript.com: app. The
frontend was written by Philipp Rudiger. Note how long it takes to load just 30 days of
data for a given package (if it completes at all). The maximum lookback foreseen was 90 days.
Also worth mentioning in this space, PyPI collects download stats, as surfaced by, e.g., https://pypistats.org/ ; but they also have an upper limit on lookback time, in this case 120 days.
This was a big effort just to make one graph! Although this is all a proof of concept right now and might not be used by many, I am proud to have made it this far.
The final results can be found as code at https://github.com/fsspec/fsspec-proxy , with subdirectories:
Additionally, work was required in the following areas:
So you see, many pieces! However, the whole thing does work, and full 6x concurrency was achieved in Chrome. I think this saturated my bandwidth, but I don't know how you would increase the number of connections the browser makes anyway.
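For readers who want to picture what "6x concurrency" looks like from the python side, here is a small asyncio sketch that caps the number of in-flight requests at six and records the peak. Everything in it is an illustrative assumption: `fake_fetch`, the file names and the `MAX_CONNECTIONS` constant are invented for this example, and the sleep stands in for a real download. In the browser the connection cap is enforced by the browser's own networking, not by code like this.

```python
import asyncio

MAX_CONNECTIONS = 6  # assumption: mirrors the 6x concurrency seen in Chrome


async def fake_fetch(name, limiter, stats):
    """Pretend to download one file while holding a connection slot."""
    async with limiter:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for network latency
        stats["active"] -= 1
    return name


async def fetch_all(names):
    limiter = asyncio.Semaphore(MAX_CONNECTIONS)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_fetch(n, limiter, stats) for n in names)
    )
    return results, stats["peak"]


# Twelve hypothetical monthly files, fetched at most six at a time.
files = [f"month-{i:02d}.parquet" for i in range(1, 13)]
results, peak = asyncio.run(fetch_all(files))
```

With twelve files and six slots, the downloads complete in two "waves", which is roughly the behaviour you would expect when a year of monthly files is requested from a single host.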