Rubik’s Cube configurations
The Rubik’s Cube is a popular puzzle toy invented by Ernő Rubik in 1974. Starting with a 3-by-3 grid of squares on each face, all one color per face, you can twist each face in any order to scramble the colors. To solve it, you have to get it back to its original configuration.
Here is the solved cube, flattened:
Here it is in a more intuitive 3D perspective, showing two opposite corners of the same cube so that every square is visible:
One clockwise twist of the yellow face:
And a second, clockwise twist of the blue face:
And so on until it’s scrambled:
Unsolving the cube
How many twists does it take to scramble a solved cube? We can simulate this process many times, measure properties of the configurations along the way, and see when those properties stop changing. Here are the results of 1000 trials.
First, count how many of the eight surrounding squares match the color of the center squares:
At first, 100% of the surrounding squares on each face match the center. After one twist, four faces have only five matches and the other two faces are unchanged (as you can see in the single-twist cube shown earlier), resulting in \(4/6 * 5/8 + 2/6 * 8/8 = 0.75\). After that it gets more complex, but since each surrounding square can be one of six colors, this should converge to an average of \(1/6 = 0.1666...\). This seems to happen around the 30th twist.
For a more detailed test, we can count how many of each color appear on each other color’s face. Even after an average of 5/6 of a color’s movable squares have migrated from their original face, perhaps they are more likely to be on adjacent faces than on the opposite face.
At first, all surrounding squares match the center color. After one twist, they can be on one of the four adjacent faces, but not the opposite one. The fractions steadily converge, and by around the 30th twist, each color of movable square is about equally likely to be on any face.
There are two types of movable square: edge and corner. Do they get shuffled at the same rate?
Corner squares actually get shuffled faster than edge squares. This is because, while there are 24 of each, each twist moves eight corner squares (\(1/3\) of them), but only four edge squares (\(1/6\) of them), to a new face.
Solving the cube
If we record the result of a random sequence of twists, we can solve the cube from any of those configurations by reversing the twists.
Can we build a general solver that gives a sequence of instructions for any starting configuration? We could do something like the above sequence but more methodically. Starting with the solved cube, try every possible twist sequence, keeping track of which move produced each encountered configuration. Then, reverse each path to get the sequence back to the original solved cube.
Specifically, we can use this algorithm:
Initialize an empty directed graph or hash table to store the solution tree
Start a queue with the solved configuration
While the queue is not empty:
Take configuration \(C\) from queue
For each state \(D\) resulting from a single twist \(T\) from \(C\):
If \(D\) has not been seen before:
Add an edge/entry to the solution tree pointing from \(D\) to \(C\), and include the move \(T\) with direction reversed
Add \(D\) to the queue
This results in a tree with every reachable configuration. To solve any Rubik’s Cube in the fewest steps, look up the state and follow the tree up to the root, the solved configuration.
Here is a diagram for any configuration that requires only three moves to solve, starting on the outside and following the moves to the center:
You can twist the solved cube 12 different ways (six faces, each in two directions), and each of those configurations can lead to 12 more configurations with another twist. Some of these will have been seen before and ignored, but we still see approximately exponential growth in configurations with each increment in twists-to-solve distance:
Eventually this will taper off and stop once every potentially new configuration has been seen before. But it turns out there are around 43 quintillion reachable configurations, so in practice we can’t build a complete solver with this algorithm. That’s why real solving strategies generally involve more targeted algorithms that solve different parts of the cube step by step. There are some interesting strategies and optimal solution proofs out there that are worth checking out. There’s a lot of math packed into this one cube!
Acknowledgements
I used Python to simulate Rubik’s Cubes. I used R with tidyverse, gganimate, tidygraph, and ggraph libraries for visualizations. My code is here.