Note the larger screen size, significantly more balls (slightly more hoops and many more balls per hoop, for 1605 spheres total), and the higher frame rate, all on the same computer.
The percentage breakdown from a detailed profile: 8% setup (computing and normalizing the rays); 40% raycasting primary ray bundles (I can speed this up a lot by using larger bundles); 12% raycasting individual primary rays (only against the things that intersected their bundle, so this gets slightly worse with larger bundles); 25% raycasting shadow rays (difficult to speed up further, since that's been the focus of this round, which means multiple light sources are pretty unlikely); 15% per-pixel computations (partial lighting computations, final intersection resolution, shadow raycast prep).
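To get a feel for what larger bundles might buy, here is a rough back-of-the-envelope sketch. It treats the profile shares above as fractions of one frame and rescales individual stages. The function name and the multipliers (bundle tests twice as cheap, per-ray tests 25% more expensive) are guesses made up for illustration, not measurements.

```python
# Profile shares from the breakdown above, as fractions of one frame.
PROFILE = {
    "setup": 0.08,
    "primary_bundles": 0.40,
    "primary_rays": 0.12,
    "shadow_rays": 0.25,
    "per_pixel": 0.15,
}


def relative_frame_time(profile, scale):
    """Return the new frame time as a fraction of the old one.

    `scale` maps a stage name to a cost multiplier (0.5 = half the time);
    stages not listed keep their current cost.
    """
    return sum(share * scale.get(stage, 1.0) for stage, share in profile.items())


# Hypothetical multipliers, not measurements: bundle tests get 2x cheaper,
# per-ray tests get 25% more expensive.
new_time = relative_frame_time(
    PROFILE, {"primary_bundles": 0.5, "primary_rays": 1.25}
)
print(f"frame time: {new_time:.2f}x, frame rate: {1 / new_time:.2f}x")
```

Under those guessed multipliers the frame would take about 0.83 of its current time, or roughly a 1.2x frame rate. The shadow rays and per-pixel work together put a floor of 40% under the frame time no matter how cheap the primary rays get.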
Of course, I bailed on this originally for a reason: somebody else has already done it. In fact, his demos run at 1024x768 at a pretty reasonable frame rate. They don't seem to have very many spheres, but they do have lighting and texture mapping, and adding those to mine is just going to slow it down more. (There are certainly possible optimizations, but these are all strictly per-pixel costs, so you can't avoid them.)