fs: allocate backing storage once in Fs::load (#9020)
Piotr Osiewicz
`futures_lite::AsyncReadExt::read_to_string` (which we use in
`RealFs::load`) explicitly does not allocate memory for the String contents
up front, which leads to excessive reallocations. That reallocation time
is a significant contributor to the time we spend loading files (especially
large ones). For example, out of the ~1s that it takes to open a 650MB
ASCII buffer on my machine (after the fingerprinting changes from
#9007), 350ms is spent in `RealFs::load`.
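To illustrate why growing the buffer on demand is costly, here is a minimal standalone sketch. It is not code from this PR: `count_reallocations` is a hypothetical helper, and a plain `Vec<u8>` filled in chunks stands in for what `read_to_string` does internally. It counts how often the buffer's capacity changes when it starts empty versus when it is sized up front.

```rust
/// Hypothetical helper: appends `total` bytes to `buf` in `chunk`-sized
/// pieces and counts how many times the Vec's capacity changes, which is
/// a proxy for how many reallocations happened.
/// Assumes `chunk > 0`.
fn count_reallocations(mut buf: Vec<u8>, total: usize, chunk: usize) -> usize {
    let piece = vec![b'a'; chunk];
    let mut reallocations = 0;
    let mut written = 0;
    while written < total {
        let n = chunk.min(total - written);
        let before = buf.capacity();
        buf.extend_from_slice(&piece[..n]);
        if buf.capacity() != before {
            reallocations += 1;
        }
        written += n;
    }
    reallocations
}

/// Compares a buffer that grows on demand with one allocated once.
/// Returns (reallocations when growing, reallocations when preallocated).
fn compare_strategies(total: usize, chunk: usize) -> (usize, usize) {
    let growing = count_reallocations(Vec::new(), total, chunk);
    let preallocated = count_reallocations(Vec::with_capacity(total), total, chunk);
    (growing, preallocated)
}
```

Calling `compare_strategies(1 << 20, 4096)` should report several capacity changes for the growing buffer and none for the preallocated one. Each capacity change may copy everything read so far, which is the cost that adds up on large files.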
This change cuts that figure to ~110ms, which is still *a lot*. About
60ms of the remaining 110ms is spent zeroing memory. Sadly, the
`AsyncReadExt` API forces us to zero the buffer we're reading into
(whether via `read_to_string` or `read_exact`), but at the very least
this commit eliminates the unnecessary reallocations.
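The approach can be sketched in isolation as follows. This is a synchronous stand-in that uses `std::fs` instead of `smol`, and `load_preallocated` is a hypothetical name, so treat it as an illustration of the technique rather than the code in this PR.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical synchronous stand-in for `RealFs::load`: allocate the
/// backing storage once, fill it, then validate it as UTF-8.
fn load_preallocated(path: &Path) -> io::Result<String> {
    let mut file = File::open(path)?;
    // Size the buffer from the file's metadata. `vec![0; len]` zeroes the
    // memory, which is the remaining cost mentioned above.
    let len = file.metadata()?.len() as usize;
    let mut storage = vec![0u8; len];
    // `read_exact` fills the whole buffer or returns an error.
    file.read_exact(&mut storage)?;
    // Reuses the allocation; only UTF-8 validation happens here.
    String::from_utf8(storage).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

One trade-off of this sketch is that it trusts the size reported by the metadata. If the file shrinks between the `metadata` call and the read, `read_exact` fails with `UnexpectedEof`; if it grows, the extra bytes are not read.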
We could probably use something like
[simdutf8](https://docs.rs/simdutf8/latest/simdutf8/) to speed up UTF-8
validation in this method as well, though that takes only about 18ms
out of 110ms, so while it is significant, I've left that out for now.
Memory zeroing is a bigger problem at this point.
Before: [image not preserved]
After: [image not preserved]
/cc @as-cii
Release Notes:
- Improved performance when loading large files.
@@ -223,9 +223,11 @@ impl Fs for RealFs {
async fn load(&self, path: &Path) -> Result<String> {
let mut file = smol::fs::File::open(path).await?;
- let mut text = String::new();
- file.read_to_string(&mut text).await?;
- Ok(text)
+ // We use `read_exact` here instead of `read_to_string` as the latter is *very*
+ // happy to reallocate often, which comes into play when we're loading large files.
+ let mut storage = vec![0; file.metadata().await?.len() as usize];
+ file.read_exact(&mut storage).await?;
+ Ok(String::from_utf8(storage)?)
}
async fn atomic_write(&self, path: PathBuf, data: String) -> Result<()> {