Changes to RandBLAS

This page reviews changes made to RandBLAS in reverse chronological order. We have a tentative policy of providing bugfix support for any release of RandBLAS upon request, no matter how old. With any luck, this project will grow enough that we’ll have to change this policy.

RandBLAS follows Semantic Versioning. Any function documented on this website is part of the public API. There are many functions which are not part of our public API, but could be added to it if there is user interest.

RandBLAS 1.1

Original release date: November 2, 2025. Release manager: Riley Murray.

This is RandBLAS’ first new feature release. Functionality for working with sparse data matrices has been significantly expanded.

Overview of changes

Sparse matrix kernels

The most important new addition is sparse triangular solves for CSC and CSR matrices. These kernels are dispatched from RandBLAS::sparse_data::trsm(). Having sparse triangular solves in RandBLAS dramatically expands the algorithms that RandLAPACK can implement using only LAPACK and RandBLAS as dependencies.
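For readers who have not met sparse triangular solves before, here is a minimal sketch of forward substitution with a lower-triangular matrix in CSR format. It only illustrates the idea. The function name, argument list, and single right-hand side are assumptions made for this example, and it is not the calling convention or implementation of RandBLAS::sparse_data::trsm().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Solve L x = b by forward substitution, where L is n-by-n, lower triangular,
// and stored in CSR format (rowptr, colidxs, vals).
// Hypothetical helper for illustration; not part of RandBLAS.
template <typename T>
std::vector<T> toy_csr_lower_solve(
    int64_t n,
    const std::vector<int64_t> &rowptr,
    const std::vector<int64_t> &colidxs,
    const std::vector<T> &vals,
    const std::vector<T> &b
) {
    std::vector<T> x(b);
    for (int64_t i = 0; i < n; ++i) {
        T diag = 0;
        for (int64_t p = rowptr[i]; p < rowptr[i + 1]; ++p) {
            int64_t j = colidxs[p];
            if (j < i) {
                x[i] -= vals[p] * x[j];  // x[j] was finalized in an earlier iteration
            } else if (j == i) {
                diag += vals[p];         // accumulate in case of duplicate entries
            }
            // Entries with j > i are ignored; they should not occur in a
            // lower-triangular matrix.
        }
        assert(diag != 0 && "zero or missing diagonal entry");
        x[i] /= diag;
    }
    return x;
}
```

Each row only reads entries of x that were finalized earlier, which is why the row-oriented CSR layout suits a lower-triangular solve without any reordering.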

We resolved some inefficiencies in sparse matrix multiplication kernels with COOMatrix objects. Performance should be significantly improved when left-sketching row-major data (or right-sketching column-major data) with a SparseSkOp.

Instance methods for all SparseMatrix types

We added several instance methods for manipulating sparse matrices. Implementations of these methods are deferred to free functions in RandBLAS/sparse_data/base.hh.

For an object \(\texttt{M}\) whose type conforms to RandBLAS’ SparseMatrix concept …

\(\texttt{M.reserve(nnz)}\) allocates internal storage for \(\texttt{M}\) to hold \(\texttt{nnz}\) structural nonzeros.

\(\texttt{M.deepcopy()}\) returns a memory-owning deep copy of \(\texttt{M}.\)

\(\texttt{M.transpose()}\) returns a const view of \(\texttt{M}\)’s transpose, possibly of a type different from that of \(\texttt{M}\) (CSR↔CSC, COO↔COO).

RandBLAS’ SparseMatrix objects DO NOT have C++ copy constructors, but they DO have C++ move constructors. The lack of a copy constructor means they must be passed by reference:

template <SparseMatrix SpMat>
int64_t get_n_rows_broken(SpMat M) { return M.n_rows; } // compiler error!

template <SparseMatrix SpMat>
int64_t get_n_rows_works(SpMat &M) { return M.n_rows; } // works

The presence of a move constructor means you can return SparseMatrix objects from functions. For example, here’s a function that accepts a const matrix (passed by reference) and returns a memory-owning version of that matrix where all nonzeros have been replaced by 1:

template <SparseMatrix SpMat>
SpMat with_nonzeros_as_ones(const SpMat &M) {
  auto M_out = M.deepcopy();
  std::fill(M_out.vals, M_out.vals + M_out.nnz, 1);
  return M_out;
}
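The copy/move rules above can be tried without RandBLAS installed. The following sketch uses a hypothetical stand-in type, ToyMatrix, that deletes its copy constructor, keeps a move constructor, and exposes an explicit deepcopy(). It is an illustration of the ownership pattern only, not a RandBLAS class.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <type_traits>

// Hypothetical move-only matrix-like type; not part of RandBLAS.
struct ToyMatrix {
    int64_t nnz;
    std::unique_ptr<double[]> vals;

    explicit ToyMatrix(int64_t nnz_) : nnz(nnz_), vals(new double[nnz_]()) {}
    ToyMatrix(const ToyMatrix &) = delete;             // no copy constructor
    ToyMatrix &operator=(const ToyMatrix &) = delete;  // no copy assignment
    ToyMatrix(ToyMatrix &&) noexcept = default;        // move constructor

    // Copies are always explicit, so accidental deep copies cannot happen.
    ToyMatrix deepcopy() const {
        ToyMatrix out(nnz);
        std::copy(vals.get(), vals.get() + nnz, out.vals.get());
        return out;
    }
};

// Passing by value would not compile; the type is not copy constructible.
static_assert(!std::is_copy_constructible<ToyMatrix>::value, "ToyMatrix is move-only");

// Returning by value works because the result is moved (or elided), never copied.
ToyMatrix toy_with_nonzeros_as_ones(const ToyMatrix &M) {
    ToyMatrix M_out = M.deepcopy();
    std::fill(M_out.vals.get(), M_out.vals.get() + M_out.nnz, 1.0);
    return M_out;
}
```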

Instance methods for specific SparseMatrix types

COOMatrix::symperm_inplace applies a permutation to the rows and columns of a square COOMatrix. This is useful when writing programs that compute a reordering with a third-party library (e.g., approximate minimum degree ordering from SuiteSparse) while using RandBLAS datastructures and sparse matrix kernels for all other computations.
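As a rough picture of what a symmetric permutation does to coordinate-format data, the sketch below relabels the row and column indices of each stored entry and leaves the values untouched. The function name, the use of std::vector, and the convention that new_index_of[i] gives the new position of old index i are all assumptions for this example; consult the COOMatrix documentation for the convention that symperm_inplace actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Apply the same relabeling to the rows and columns of a square matrix whose
// nonzeros are stored as (rows[k], cols[k]) pairs. Values are unaffected.
// Assumed convention: old index i moves to position new_index_of[i].
// Hypothetical helper for illustration; not part of RandBLAS.
void toy_coo_symperm_inplace(
    std::vector<int64_t> &rows,
    std::vector<int64_t> &cols,
    const std::vector<int64_t> &new_index_of
) {
    for (std::size_t k = 0; k < rows.size(); ++k) {
        rows[k] = new_index_of[rows[k]];
        cols[k] = new_index_of[cols[k]];
    }
}
```

Because coordinate format places no ordering requirement on its entries in this sketch, the permutation costs one pass over the index arrays and needs no extra storage.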

Sparse matrix types have instance methods to construct equivalent matrices in different representations. Objects returned from these functions own their attached memory.

Use COOMatrix::as_owning_csr or COOMatrix::as_owning_csc to get a compressed sparse matrix from a COOMatrix.

Use CSCMatrix::as_owning_coo or CSRMatrix::as_owning_coo to get a COOMatrix from a compressed sparse matrix.

One compressed format can be converted to another by chaining two conversion calls. For example,

template <typename T, SignedInteger sint_t>
auto csc_as_csr( const CSCMatrix<T,sint_t> &M_csc ) {
    return M_csc.as_owning_coo().as_owning_csr();
}

constructs a memory-owning CSR representation of a CSCMatrix.
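To show what a COO-to-CSR conversion involves, here is a self-contained sketch using a two-pass counting approach. ToyCSR and toy_coo_to_csr are hypothetical names, and the code makes no claim about how as_owning_csr is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CSR container for illustration; not a RandBLAS type.
struct ToyCSR {
    int64_t n_rows, n_cols;
    std::vector<int64_t> rowptr;   // length n_rows + 1
    std::vector<int64_t> colidxs;  // length nnz
    std::vector<double>  vals;     // length nnz
};

// Build a CSR matrix from coordinate-format triplets (rows[k], cols[k], vals[k]).
ToyCSR toy_coo_to_csr(
    int64_t n_rows, int64_t n_cols,
    const std::vector<int64_t> &rows,
    const std::vector<int64_t> &cols,
    const std::vector<double>  &vals
) {
    const std::size_t nnz = vals.size();
    ToyCSR out{n_rows, n_cols, std::vector<int64_t>(n_rows + 1, 0),
               std::vector<int64_t>(nnz), std::vector<double>(nnz)};
    // Pass 1: count the entries in each row, then prefix-sum into row offsets.
    for (std::size_t k = 0; k < nnz; ++k) out.rowptr[rows[k] + 1] += 1;
    for (int64_t i = 0; i < n_rows; ++i) out.rowptr[i + 1] += out.rowptr[i];
    // Pass 2: scatter each entry into the next free slot of its row.
    std::vector<int64_t> next(out.rowptr.begin(), out.rowptr.end() - 1);
    for (std::size_t k = 0; k < nnz; ++k) {
        int64_t dest = next[rows[k]]++;
        out.colidxs[dest] = cols[k];
        out.vals[dest] = vals[k];
    }
    return out;
}
```

Within a row, entries keep the order they had in the input; this sketch does not sort column indices or merge duplicates.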

Sampling from sketching distributions

Sketching operators can be constructed with DenseSkOp::sample and SparseSkOp::sample. This makes it possible to sample from a templated SketchingDistribution variable \(\texttt{D}\) with code like

RNGState seed_state(8675309);
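auto S = D.sample<double>(seed_state);
// If the type of D is itself a template parameter, C++ requires the
// disambiguator: D.template sample<double>(seed_state).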

In RandBLAS 1.0 it was necessary to construct a sketching operator by calling a DenseSkOp or SparseSkOp constructor.

Contributors and Acknowledgements

Parth Nobel (PR 126) was supported in part by the NSF Graduate Research Fellowship Program under Grant No. DGE-1656518.

Contributions by Tanya Tafolla (PR 133) were made at UCLA’s Institute for Pure and Applied Mathematics, with support from NSF Grant No. DMS-1925919.

Contributions from Riley Murray (PRs 124, 127, and 137) were made at Sandia National Laboratories, with support from the US Army Engineer Research and Development Center.

RandBLAS 1.0

Original release date: September 12, 2024. Release manager: Riley Murray.

Today marks RandBLAS’ second-ever release, its first stable release, and its first release featuring the contributions of someone who showed up entirely out of the blue (shoutout to Rylie Weaver)!

Overview of changes

New features for core functionality

The semantics of RandBLAS::SparseDist::major_axis have changed in RandBLAS 1.0. As a result of this change, SparseSkOps can represent LESS-Uniform operators and operators for plain row or column sampling with replacement. (This is in addition to hashing-style operators like CountSketch, which we’ve supported since version 0.2.)
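For readers unfamiliar with hashing-style operators, the sketch below applies a CountSketch-like operator S to a row-major matrix A. Each column of S has exactly one nonzero, equal to +1 or -1, so S*A is computed by adding or subtracting each row of A into one output row. The function and its arguments are hypothetical and take the bucket and sign choices as inputs instead of sampling them; it does not use SparseDist or SparseSkOp and says nothing about how major_axis maps onto this structure.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Compute B = S * A, where S is d-by-m with exactly one nonzero per column:
// S(bucket[i], i) = sign[i], with sign[i] in {+1, -1}.
// A is m-by-n and B is d-by-n, both stored row-major.
// Hypothetical helper for illustration; not part of RandBLAS.
std::vector<double> toy_countsketch_apply(
    int64_t d, int64_t m, int64_t n,
    const std::vector<int64_t> &bucket,
    const std::vector<int> &sign,
    const std::vector<double> &A
) {
    std::vector<double> B(static_cast<std::size_t>(d * n), 0.0);
    for (int64_t i = 0; i < m; ++i) {
        const double s = static_cast<double>(sign[i]);
        double *b_row = B.data() + bucket[i] * n;
        const double *a_row = A.data() + i * n;
        for (int64_t j = 0; j < n; ++j) b_row[j] += s * a_row[j];
    }
    return B;
}
```

The cost is one pass over A regardless of d, which is the main appeal of this family of operators.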

We have four new functions for sampling from index sets.

RandBLAS::weights_to_cdf()

RandBLAS::sample_indices_iid()

RandBLAS::sample_indices_iid_uniform()

RandBLAS::repeated_fisher_yates()
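The ideas behind these functions can be sketched with the C++ standard library alone. The code below normalizes weights into a CDF, draws i.i.d. indices by inverting that CDF, and draws k distinct indices with a partial Fisher-Yates shuffle. All names and signatures here are hypothetical, and std::mt19937_64 stands in for RandBLAS' RNGState and counter-based generators, so outputs will not match RandBLAS.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Turn nonnegative weights (nonempty, positive sum) into a normalized CDF.
std::vector<double> toy_weights_to_cdf(const std::vector<double> &w) {
    std::vector<double> cdf(w.size());
    std::partial_sum(w.begin(), w.end(), cdf.begin());
    const double total = cdf.back();
    for (double &c : cdf) c /= total;
    cdf.back() = 1.0;  // guard against rounding in the final entry
    return cdf;
}

// Draw k i.i.d. indices with replacement by inverting the CDF.
std::vector<int64_t> toy_sample_indices_iid(
    const std::vector<double> &cdf, int64_t k, std::mt19937_64 &gen
) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<int64_t> out(k);
    for (auto &s : out) {
        // Clamp strictly below 1 so the search always finds an index.
        double u = std::min(unif(gen), std::nextafter(1.0, 0.0));
        // First index whose CDF value exceeds u; zero-weight indices are skipped.
        s = std::upper_bound(cdf.begin(), cdf.end(), u) - cdf.begin();
    }
    return out;
}

// Draw k distinct indices from {0, ..., n-1} with a partial Fisher-Yates shuffle.
std::vector<int64_t> toy_fisher_yates(int64_t n, int64_t k, std::mt19937_64 &gen) {
    std::vector<int64_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    for (int64_t i = 0; i < k; ++i) {
        std::uniform_int_distribution<int64_t> pick(i, n - 1);
        std::swap(idx[i], idx[pick(gen)]);
    }
    idx.resize(k);
    return idx;
}
```

Calling toy_fisher_yates repeatedly with the same generator gives a sequence of independent without-replacement samples, which is the sense of "repeated" assumed in this sketch.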

We have two new functions for getting low-level data for a sketching operator’s explicit representation: RandBLAS::fill_dense_unpacked() and RandBLAS::fill_sparse_unpacked_nosub(). These are useful if you want to incorporate RandBLAS’ sketching functionality into other frameworks, like Kokkos, cuBLAS, or MKL.

Finally, there’s RandBLAS::sketch_symmetric(), overloaded for sketching from the left or right.

Quality-of-life improvements

We’ve significantly expanded the tutorial part of our web docs. It now has details on updating sketches and some advice on choosing parameters for sketching distributions.

RandBLAS::Error is now in the public API.

RandBLAS::print_buff_to_stream() is for writing MATLAB-style or NumPy-style string representations of matrices to a provided stream, like std::cout.

We settled on a unified memory-management / memory-ownership policy. The same policy applies to DenseSkOp, SparseSkOp, and all of the sparse matrix types. The abstract policy is described in our web documentation. The consequences of the policy for each of the aforementioned types are documented in source code and on our website.

We added a few utility functions for working with dense matrices: symmetrize, overwrite_triangle, and transpose_square.
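To give a sense of what utilities of this kind do, here is a sketch of two in-place operations on a square column-major buffer with leading dimension lda: a transpose, and a symmetrization that copies the strict upper triangle into the strict lower triangle. The names and signatures are hypothetical and are not those of the RandBLAS functions listed above.

```cpp
#include <cstdint>
#include <utility>

// Entry (i, j) of a column-major buffer lives at A[i + j*lda].
// Hypothetical helpers for illustration; not part of RandBLAS.

// Transpose an n-by-n matrix in place by swapping mirrored entries.
void toy_transpose_square(double *A, int64_t n, int64_t lda) {
    for (int64_t j = 0; j < n; ++j)
        for (int64_t i = j + 1; i < n; ++i)
            std::swap(A[i + j * lda], A[j + i * lda]);
}

// Make an n-by-n matrix symmetric by copying its strict upper triangle
// into its strict lower triangle.
void toy_symmetrize_from_upper(double *A, int64_t n, int64_t lda) {
    for (int64_t j = 0; j < n; ++j)
        for (int64_t i = j + 1; i < n; ++i)
            A[i + j * lda] = A[j + i * lda];
}
```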

Significantly revised APIs for sketching distributions and operators

Added new RandBLAS::SketchingDistribution and RandBLAS::SketchingOperator C++20 concepts.

API revisions to DenseDist/DenseSkOp and SparseDist/SparseSkOp were mostly about taking quantities which we would compute from an object’s const members with free-functions, and instead making those quantities const members themselves. Good examples of this are RandBLAS::DenseDist::isometry_scale and RandBLAS::SparseDist::isometry_scale, whose meanings are explained in the SketchingDistribution docs.

RandBLAS::DenseSkOp::next_state and RandBLAS::SparseSkOp::next_state are computed at construction time, without actually performing any random sampling. This means that one can define a sequence of independent sketching operators without changing an RNGState’s “key” and without realizing any of them explicitly.

New statistical tests

Kolmogorov–Smirnov tests for distributional correctness of sample_indices_iid, sample_indices_iid_uniform, repeated_fisher_yates, and the scalar distributions that can be used with DenseSkOp (standard-normal and uniform over [-1,1]).

Tests for subspace embedding properties of DenseSkOp. A forthcoming paper will describe how these tests cover a wide range of relevant parameter values at very mild computational cost.

We’ve incorporated select tests from Random123 into our testing framework.
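As background on the Kolmogorov–Smirnov tests mentioned above, the one-sample KS statistic is the largest gap between the empirical CDF of a sample and a reference CDF. The sketch below computes it for a continuous reference distribution, with the uniform distribution over [-1,1] as an example. It is a generic illustration with hypothetical names, not the code in RandBLAS' test suite, and it does not cover the discrete case needed for index-sampling functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// One-sample Kolmogorov-Smirnov statistic: sup_x |F_n(x) - F(x)|, where F_n
// is the empirical CDF of the samples and F is a continuous reference CDF.
// Hypothetical helper for illustration; not part of RandBLAS.
double toy_ks_statistic(
    std::vector<double> samples,
    const std::function<double(double)> &cdf
) {
    std::sort(samples.begin(), samples.end());
    const double n = static_cast<double>(samples.size());
    double d = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double f = cdf(samples[i]);
        // F_n jumps from i/n to (i+1)/n at the i-th sorted sample,
        // so check the gap on both sides of the jump.
        d = std::max(d, std::max((i + 1) / n - f, f - i / n));
    }
    return d;
}

// CDF of the uniform distribution over [-1, 1].
double toy_cdf_uniform_pm1(double x) {
    return std::min(1.0, std::max(0.0, (x + 1.0) / 2.0));
}
```

A test would compare the returned statistic against a critical value for the chosen sample size and significance level.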

Contributors

I’d like to start by acknowledging the contributions of Parth Nobel to RandBLAS’ development. Parth and I have worked on-and-off on several projects involving RandNLA algorithms. None of these projects has been published yet, but they’ve had a significant role in uncovering bugs and setting development priorities for RandBLAS. (As a recent example in the latter category, I probably wouldn’t have added the “sample_indices_iid” function were it not for its relevance to one of our projects.) This led me to be quite surprised when I noticed that Parth technically hasn’t made a commit to the RandBLAS repository! Let this statement set the record straight: Parth has made very real contributions to RandBLAS, even if the commit history doesn’t currently reflect that.

Rylie Weaver, the aforementioned out-of-the-blue contributor, helped write our Kolmogorov–Smirnov tests for repeated Fisher–Yates.

I wrote a lot of code (as one might imagine).

Funding acknowledgements

This work was wholly supported by LDRD funding from Sandia National Laboratories.

Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA-0003525.

Patch releases in series 1.0.x

Version 1.0.1 (September 29, 2024). This patches bugs in values of RNGStates returned from functions for sampling from index sets. See GitHub for more details.

RandBLAS 0.2

Released June 5, 2024.

Today marks the first formal release of RandBLAS. We’ve been working on it for over three years, so we couldn’t possibly describe all of its capabilities in just this changelog. Instead, we’ll repurpose some text that’s featured prominently in our documentation at the time of this release.

A quote from the README, describing the aims of this project:

RandBLAS supports high-level randomized linear algebra algorithms (like randomized low-rank SVD) that might be implemented in other libraries. Our goal is for RandBLAS to become a standard like the BLAS, in that hardware vendors might release their own optimized implementations of algorithms which conform to the RandBLAS API.

A quote from the website, describing our current capabilities:

RandBLAS is efficient, flexible, and reliable. It uses CPU-based OpenMP acceleration to apply its sketching operators to dense or sparse data matrices stored in main memory. All sketches produced by RandBLAS are dense. As such, dense data matrices can be sketched with dense or sparse operators, while sparse data matrices can only be sketched with dense operators. RandBLAS can be used in distributed environments through its ability to (reproducibly) compute products with submatrices of sketching operators.

There’s a ton of documentation besides those snippets. In fact, we have three separate categories of documentation!

Traditional source code comments.

Web documentation (i.e., this entire website)

Developer notes; one for RandBLAS as a whole, another for our sparse matrix functionality, and a third for this website.

Contributors and Acknowledgements

Since this is our first release, many acknowledgements are in order. We’ll start with contributors to the RandBLAS codebase as indicated by the repository commit history.

Riley Murray, Burlen Loring, Kaiwen He, Maksim Melnichenko, Tianyu Liang, and Vivek Bharadwaj.

In addition to code contributors, we had the benefit of supervision and input from the following established principal investigators:

James Demmel, Michael Mahoney, Jack Dongarra, Piotr Luszczek, Mark Gates, and Julien Langou.

We would also like to thank Weslley da Silva Pereira, who gave valuable feedback at the earliest stages of this project, and all of the individuals who gave feedback on our RandNLA monograph.

The work that led to this release of RandBLAS was funded by the U.S. National Science Foundation and the U.S. Department of Energy, and was conducted at the International Computer Science Institute, the University of California at Berkeley, the University of Tennessee at Knoxville, Lawrence Berkeley National Laboratory, and Sandia National Laboratories.

What happened to RandBLAS 0.1?

We tagged a commit on the RandBLAS repository with version 0.1.0 almost two years ago. However, we hadn’t maintained version numbers or a dedicated changelog since then. RandBLAS 0.2.0 is our first formal release. We opted not to release under version 0.1.0 since that could ambiguously refer to anything from the now-very-old 0.1.0 tag up to the present.