My wish list for Rust 2021

2020-09-14 · 1969
Tags: opinion, rust

TL;DR

TL;DR: higher-level ergonomic integration and entry points, please.

Introduction

Before I launch into a more detailed discussion of what I'd like to see from Rust, I want to say that I know I'm asking for a lot. Consider everything I mention here a wish list in the loosest sense. These are huge asks, and I am not skilled enough in Rust or in project and community management to assess whether they are attainable. I present them simply as things that would make me very happy to see.

Secondly, I think it's a good idea to give a little background about myself for the context of this blog post. If that doesn't interest you, you can safely skip to the next heading.

I am primarily a data engineer and data analyst, but I also love other forms of programming in my spare time. Sadly I don't get to use Rust at my day job, which already takes up a lot of my creative energy. That means that when I code outside of work, I want to stay pragmatic. I don't have time to develop large, robust codebases, to actively maintain the things I build for very long, or (in my opinion most importantly) to spend much time setting up complex development environments that use multiple languages. That has already caused me to give up on a few projects, which I find a shame.

The next point ties into the last. While I love learning new concepts and improving my craft, there are a few things I have very little interest in learning: systems programming, other languages like JavaScript, and frameworks in other languages, like Qt. That is not because I see no value in them, but because I don't have the time. I have little enough time left for my own projects, and taking on things like new frameworks in other languages makes everything too complex for me to understand effectively. I want to focus on Rust, not on other languages.

Finally, I want to tell you a bit about my preferences. I come to Rust primarily for its correctness guarantees and helpful error messages. The Rust compiler is unlike anything I've seen elsewhere, and in my opinion that is what makes the language great. Things like performance, low run-time overhead and low-level access when needed are nice, but they are not why I'm here. I'm here because Rust helps me write better code from the get-go.

With that preamble out of the way, here are the three points I'd most love to see addressed in Rust.

Native GUIs

I use Linux, and a lot of the things I make are used via the command line. Clap is excellent, and Rust's support for making things available via the command line is good. However, some tasks, such as those that require visualisation or image manipulation, aren't very well suited to command-line interaction. Not only that, but sometimes I want to make tools for people who aren't as technically inclined. More specifically, people for whom the idea of using a command line induces severe anxiety. Being able to build tools for them with a simple GUI, rolled into a single binary that doesn't require extra setup, would be incredibly powerful to me.

Now I'm aware that Rust bindings exist for a lot of the big GUI frameworks like Qt, but these have a few dealbreakers for me. First of all, I don't know those frameworks, and spending time learning them takes time away from learning Rust, which I don't want for obvious reasons. Secondly, and more important to me, installing, setting up and compiling projects with multiple languages is COMPLICATED. As good as Rust's FFIs are, they are still complicated, and even if they were perfect, I would still have to deal with the tools on the other side. I have tried and failed many times to get a project with Qt or webview going. The setup and the amount of extra tooling those kinds of projects need make them not worth it at my (small) scale. I love that Rust has such robust integration tools and good FFIs. The cost of this, unfortunately, is that the native Rust solutions have been neglected somewhat. For someone who isn't versed in the standard options like JavaScript and doesn't want to spend the time to learn them as well, there are few options.

Next to the bindings, there are also a handful of GUI libraries written in pure Rust, such as Azul, orb-tk, druid, conrod and iced, but these all have problems of their own. The main one is that none of these libraries gives the impression that it will keep being developed. I don't mind unstable APIs very much; most of the tools I use myself are installed from some dev branch. However, missing features that I consider essential and incomplete documentation are bigger problems than that.

Another problem is that GUI libraries and frameworks, by their nature, tend to be very complex because of their circular flow of information. Learning a complex library is not a problem for me in itself, as long as it does the job properly. Having to learn several complex libraries just to assess which one does the job well enough is.

More high-level computation and data wrangling

A lot of data science involves a Jupyter notebook and some Python libraries for exploration and prototyping. That is unlikely to change, simply because of the nature of interpreted versus compiled languages, and I don't think it has to change. However, I would like Rust to become a more significant part of my data science toolkit. Not because I want one language for everything, but because I think Rust has some genuine strengths to offer.

It comes back again to correctness. Rust has good tools for data wrangling that I would love to use. Algebraic enums, pattern matching and robust mechanisms like Option and Result would significantly improve the quality of data wrangling. They would help flesh out edge cases and force me to consider what is in scope, what isn't, and how to handle things when they go wrong, just like in regular code. However, to make these an option I think we'd need more high-level file manipulation for various file types like CSV. I'd imagine a kind of syntax like

let data_set =
  wrangling::read_csv(&file_path, |rec| validate(&rec))?
  .map_over_successes(|valid_rec| valid_rec.preprocess())
  .map_over_errors(|invalid_rec| log!("invalid record: {}", &invalid_rec))
  .collect();

I'm glossing over many details here, but I'm taking some inspiration from something like HashMap's entry API and or_insert_with. I hope the idea is clear: read something from a file and check whether it's valid according to your logic. If it is, pre-process it; if not, do whatever error handling is appropriate. This kind of defensive programming would be golden for data wrangling.
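To make the idea a bit more concrete, here is a minimal sketch of the same validate, pre-process, or log flow using only the standard library. Everything in it is an assumption made for illustration: the two-column `name,age` schema, the `Record` and `RowError` types, and the `validate`, `preprocess` and `wrangle` functions are all hypothetical, and the naive `split(',')` parsing ignores quoting, which a real CSV parser (for example the csv crate) would handle.

```rust
/// A validated record. The `name,age` schema is hypothetical.
#[derive(Debug, PartialEq)]
struct Record {
    name: String,
    age: u32,
}

/// The reasons a row can be rejected, spelled out as an enum.
#[derive(Debug, PartialEq)]
enum RowError {
    WrongFieldCount(usize),
    EmptyName,
    BadAge(String),
}

/// Parse and validate one comma-separated line (no quoting support).
fn validate(line: &str) -> Result<Record, RowError> {
    let fields: Vec<&str> = line.split(',').map(str::trim).collect();
    if fields.len() != 2 {
        return Err(RowError::WrongFieldCount(fields.len()));
    }
    if fields[0].is_empty() {
        return Err(RowError::EmptyName);
    }
    let age = fields[1]
        .parse::<u32>()
        .map_err(|_| RowError::BadAge(fields[1].to_string()))?;
    Ok(Record { name: fields[0].to_string(), age })
}

/// Pre-processing for valid records: normalise the name to lowercase.
fn preprocess(mut rec: Record) -> Record {
    rec.name = rec.name.to_lowercase();
    rec
}

/// Validate every line, pre-process the good rows, and keep the bad rows
/// (with their 1-based line number) instead of silently dropping them.
fn wrangle(input: &str) -> (Vec<Record>, Vec<(usize, RowError)>) {
    let mut good = Vec::new();
    let mut bad = Vec::new();
    for (idx, line) in input.lines().enumerate() {
        match validate(line) {
            Ok(rec) => good.push(preprocess(rec)),
            Err(err) => bad.push((idx + 1, err)),
        }
    }
    (good, bad)
}

fn main() {
    let input = "Alice,34\nBob,not-a-number\n,12\nCarol,29,extra";
    let (good, bad) = wrangle(input);
    for (line_no, err) in &bad {
        eprintln!("invalid record on line {}: {:?}", line_no, err);
    }
    println!("{} valid, {} invalid", good.len(), bad.len());
}
```

The `match` on `Result` forces both the success and the failure path to be written down, which is the defensive style described above. What is still missing is the high-level, chainable API around it.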

Similarly, I would love to do my simulations in Rust. One of the most frustrating things in Python, when I move from prototyping to a more robust exploration or simulation, is a program that runs for a long time and then dies with AttributeError: 'int' object has no attribute 'append', something that in Rust would have been caught by a quick cargo check. Some tools, like mypy, help with that sort of thing. However, I feel like that's a band-aid rather than a real solution.
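As a small illustration of that difference, here is a toy sketch (the `simulate` function and its update rule are made up for this example). The commented-out line is the Rust equivalent of calling `append` on an integer: with it enabled the program does not compile, so the mistake surfaces before the simulation runs at all.

```rust
/// Run a toy simulation and record the state after every step.
/// The update rule (double and add one) is arbitrary.
fn simulate(steps: u32) -> Vec<i64> {
    let mut history: Vec<i64> = Vec::new();
    let mut state: i64 = 1;
    for _ in 0..steps {
        state = state * 2 + 1;
        history.push(state);
        // The slip from the Python example would be pushing onto the number
        // instead of the list:
        //     state.push(history);
        // The compiler rejects that line because `i64` has no `push` method,
        // so `cargo check` reports it without running anything.
    }
    history
}

fn main() {
    println!("{:?}", simulate(4));
}
```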

Additionally, it is when running longer simulations or analyses, with large models that take up a lot of memory, that I think Rust's strengths start to show. Rust helps me write my code so that it uses less memory than it would in Python. Therefore I can run more elaborate simulations on my machine than I could with Python, which is a huge win.

Sadly, numerics in Rust are... let's say, awkward. I think the lack of const generics has been a significant barrier to bigger libraries being written. Writing optimised code for fast Fourier transforms is simply outside my ability, and having to find a new library for every such thing is a large barrier.
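For readers who haven't run into the term, this is roughly what const generics make possible for numeric code. It is only a sketch, and it assumes a toolchain on which const generics are available; `dot` and `mat_vec` are hypothetical helper names, not functions from any existing library.

```rust
/// Dot product of two fixed-size vectors. The length `N` is part of the
/// type, so passing arrays of different lengths is a compile-time error.
fn dot<const N: usize>(a: &[f64; N], b: &[f64; N]) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Multiply an R x C matrix by a length-C vector, giving a length-R vector.
/// The dimensions are checked by the compiler rather than at run time.
fn mat_vec<const R: usize, const C: usize>(m: &[[f64; C]; R], v: &[f64; C]) -> [f64; R] {
    let mut out = [0.0; R];
    for (row, slot) in m.iter().zip(out.iter_mut()) {
        *slot = dot(row, v);
    }
    out
}

fn main() {
    let m = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]];
    let v = [1.0, 0.0, -1.0];
    println!("{:?}", mat_vec(&m, &v));
}
```

Without const generics, a library has to either fall back to run-time length checks or provide separate implementations for each array size it wants to support.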

Now I don't want this post to turn into the reveal for arewenumpyyet.com, or for similar sites for libraries like pandas, scipy or sklearn. I'm not saying that I want something that emulates those libraries. They have their flaws (cough pandas' naming scheme cough), and some of the ways they are designed would map onto the Rust paradigm quite poorly. But they do represent a standard, robust (enough) toolset that I am currently missing in Rust. I think that solidifying a reliable default option for this kind of work would be huge for data science in Rust.

Picking libraries

Let me first say I love crates.io and how cargo works with it. Especially compared to something like C++, it's great, in terms of both discoverability and accessibility. From both a consumer and a publisher standpoint, crates.io is a good platform as far as I am concerned.

However, I do occasionally miss better curation tools. On crates.io I often have a lot of trouble telling the difference between an obscure library and an unmaintained one. Another problem that irks me is the amount of name-squatting on crates.io. These are by no means problems unique to Rust. I realise that a lot of what I'm talking about here is a "problem" that stems from Rust being an open and somewhat young platform. That is not something I want to change. Rust being accessible to everyone is a good thing. However, I would like some better mechanisms to assess whether a library is still usable.

A big part of the problem is my relative lack of skill with Rust. When I am looking for a new library in Python, the well-maintained ones usually have enough documentation and are simple enough that I can pick one up and take it for a spin in a few minutes. The complexity of using Rust means that it's much harder for me to assess whether a library does not fit my use case or whether I am misusing it. That will solve itself over time for me specifically, but it remains a problem for many newcomers to the language and ecosystem. Sadly I don't have a solution to this one, but I imagine it is something other people struggle with as well, so I hope that by bringing it to the attention of people smarter than me we can further this cause somewhat.

Conclusion

If I had to summarise this post, it would be "I need more high-level libraries for the things I want to do". I love Rust, but I don't have the time or energy to become enough of an expert in all the things I want to do to use the low-level libraries that are out there. I would say that Rust, at the moment, has a solid foundation, but it is time to start building something beautiful upon that foundation.