Graph coloring is a tough problem but useful in a wide range of applications. I'm interested in multi-color Gauss-Seidel (MCGS), in which solving for the unknowns in Ax=b can be accelerated with parallelism. The general idea is that a graph (the matrix A) is partitioned into independent nodes (equations/rows) as a part of a set (color). We loop over the colors and perform an x-update on each node in the color in parallel.
Recently, Fratarcangeli et al.  used this approach to accelerate projective dynamics (PD)  with a solver called Vivace. In essence, the global step of PD (a pre-factored linear solve) is replaced with GPU-based MCGS. Aside from the performance increase, this also conquered the biggest limitation in PD: reliance on a constant constraint manifold.
In PD, energy terms used to simulate elasticity were assumed constant. So when we want stuff like tearing or remeshing, the matrix that represents these energy terms had to be refactored (a costly operation). The bigger problem with this was dealing with collisions, which are represented as repulsion-springs. There are some hacks to avoid this overhead, but they have their own set of limitations that I won't get into. Using MCGS, adding and removing repulsion springs for collisions only requires a recoloring of the matrix. So, if we can recolor fast, we can have nice, fast, dynamic simulations.
Fratarcangeli et al. compare several graph coloring algorithms in their paper. They found a method by Grable and Panconesi  to perform best given certain considerations (run time and number of colors). Wanting to try out this Vivace solver myself, I've implemented the graph coloring algorithm with a few minor tweaks. It runs on a directed graph, and several of the steps are parallelized. You can also choose a "stride" allowing you to skip rows/cols of the matrix. The motivation for a stride is when your matrix has redundant rows/cols (arising from x, y, z constraint independence in PD), but wanting to store your vertices as a single column: (x,y,z)^T. The recent paper by Liu et al. explains it further (the text following Eq. 3) in which they avoid these redundant coefficients.
You can see my graph coloring implementation here:
It seems to work pretty well and passes all my unit tests, but it isn't optimized and I can't guarantee it's bug free. I decided to put it up early as it's a bit tricky to implement correctly and efficiently, and maybe other people might want to make use of it. I hope to have the MCGS solver up soon, along side the rest of the code used in our TVCG publication.
 Marco Fratarcangeli, Valentina Tibaldo, and Fabio Pellacini. 2016.
Vivace: a practical gauss-seidel method for stable soft body dynamics. ACM Trans. Graph. 35, 6, Article 214 (November 2016), 9 pages. DOI: https://doi.org/10.1145/2980179.2982437
 Sofien Bouaziz, Sebastian Martin, Tiantian Liu, Ladislav Kavan, and Mark
Pauly. 2014. Projective dynamics: fusing constraint projections for
fast simulation. ACM Trans. Graph. 33, 4, Article 154 (July 2014), 11 pages. DOI: https://doi.org/10.1145/2601097.2601116
 David A. Grable and Alessandro Panconesi. 1998. Fast distributed algorithms for Brooks-Vizing colourings. In Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms (SODA '98). Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 473-480.