# XKCD 2585 Implemented: Round the world ⏹ ⏺

Pulling any number out of any other number by repeatedly cramming one into a bunch of units and rounding until it starts looking like the other.

Try it out at https://rounding.lam.io!

## Technical Overview

This is a fairly straightforward use of the unit database from Wikipedia's Convert module, which I reused from a previous project (phrase2unit: Implementing XKCD 2312). All it does is a greedy linear search over all units of the same kind as the input, and that set is small enough (<100k entries) that the search isn't very expensive. In fact, sparsity is a big problem with the original dataset: its units don't cover the real line densely enough to converge for most pairs of numbers and units. Since units can be arbitrarily combined, the live implementation uses a cross product of all units in the base Wikipedia database (and their reciprocals), which gives decent performance and coverage at the expense of some very weird-looking but valid units.
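As a rough illustration of how such a cross product can be built, here is a minimal Python sketch. It assumes each unit is stored as a conversion factor to SI plus integer exponents for metres, seconds, and kilograms; the `Unit` type and the `reciprocal`, `combine`, and `cross_product` helpers are hypothetical names for this example, not the project's actual schema or code.

```python
from itertools import product
from typing import NamedTuple


class Unit(NamedTuple):
    name: str
    factor: float   # size of one unit, expressed in SI base units
    dims: tuple     # exponents of (m, s, kg); hypothetical layout


def reciprocal(u: Unit) -> Unit:
    # Inverting a unit inverts its factor and negates every exponent.
    return Unit(f"1/{u.name}", 1.0 / u.factor, tuple(-d for d in u.dims))


def combine(a: Unit, b: Unit) -> Unit:
    # Multiplying units multiplies their factors and adds their exponents.
    dims = tuple(x + y for x, y in zip(a.dims, b.dims))
    return Unit(f"{a.name}·{b.name}", a.factor * b.factor, dims)


def cross_product(base: list) -> list:
    # Pair every base unit or reciprocal with every other one.
    pool = base + [reciprocal(u) for u in base]
    return [combine(a, b) for a, b in product(pool, repeat=2)]


base = [Unit("ft", 0.3048, (1, 0, 0)), Unit("min", 60.0, (0, 1, 0))]
combined = cross_product(base)
```

With two base units this yields 16 combinations, including a speed-like `ft·1/min` and oddities such as `min·min`, which is where the weird-looking but valid units come from.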

Most of the heavy lifting is done by Postgres, specifically this linear search of all valid unit conversions post-rounding:

```sql
SELECT * FROM (
  SELECT *, ABS(LOG(st0.vto / <target quantity>)) AS diff FROM (
    SELECT long_name AS vto_unit_long,
           name AS vto_unit,
           factor,
           ROUND(<starting quantity> / factor) * factor AS vto
    FROM units
    WHERE pool = %s AND factor > 0
      AND si_m=%s AND si_s=%s AND ...
  ) st0 WHERE st0.vto > 0
) st1 ORDER BY st1.diff ASC LIMIT 1;
```


which is really as brute force as it looks: a sequential scan without any indexing, after narrowing the list to the units whose dimensions match the queried unit. As noted above, the number of matching units stays under 100k even in the worst case, so this is quite tractable.
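For readers who prefer to see the step outside SQL, the same selection can be sketched in Python. This is an illustrative equivalent, not the project's code: `best_step` and the `factors` mapping are hypothetical, and Python's built-in `round` resolves exact .5 ties to the nearest even integer, which may differ from the database's `ROUND`.

```python
import math


def best_step(start, target, factors):
    """Pick the unit whose rounded value lands closest to target in log space.

    factors maps a unit name to its size in the same base unit as start.
    Returns (unit_name, rounded_value), or None if no unit qualifies.
    """
    best = None
    for name, factor in factors.items():
        if factor <= 0:
            continue  # mirrors `factor > 0` in the query
        vto = round(start / factor) * factor
        if vto <= 0:
            continue  # mirrors `st0.vto > 0`
        diff = abs(math.log10(vto / target))
        if best is None or diff < best[0]:
            best = (diff, name, vto)
    return None if best is None else (best[1], best[2])


# Lengths in metres: start from 1 m and try to approach 2 m.
lengths = {"m": 1.0, "ft": 0.3048, "yd": 0.9144, "fathom": 1.8288}
step = best_step(1.0, 2.0, lengths)
```

Here 1 m is about 0.55 fathoms, which rounds up to one whole fathom (1.8288 m), so that unit gives the closest approach to 2 m in a single step.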

### Unit stats

The base units that Wikipedia provides have an interesting distribution, with energy having the most diverse offerings:

| Unit type | Count |
| --- | ---: |
| Energy | 205 |
| Volume | 143 |
| Length | 90 |
| Area | 60 |
| Mass | 58 |
| Flow (volume/s) | 54 |
| Force | 50 |
| Time | 44 |
| Pressure/Energy per unit volume | 36 |
| Density | 34 |
| Speed | 34 |
| Per unit area | 31 |
| Molar rate | 27 |
| Power | 27 |
| Mass per unit area | 16 |
| Unitless | 15 |
| Linear density | 15 |
| Temperature | 14 |
| Power per unit mass | 10 |
| Acceleration | 9 |
| Magnetic field strength | 6 |
| Chemical amount | 6 |
| Energy per chemical amount | 3 |
| Per unit time | 3 |
| Per unit volume | 3 |
| Charge | 3 |
| Mass per unit power | 2 |
| Mass per unit time | 2 |
| Pressure per unit distance | 2 |
| Force per unit distance | 2 |
| Voltage | 1 |
| Electrical current | 1 |
| Luminous intensity | 1 |

The combined units naturally have a different mix of unit types, although fortunately many of the ones near the top are common, named kinds of units that people like to try:

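| Unit | Unit type | Count |
| --- | --- | ---: |
| | Unitless | 97268 |
| m<sup>1</sup> | Length | 38232 |
| m<sup>-1</sup>·s<sup>-2</sup>·kg<sup>1</sup> | Pressure/Energy per unit volume | 38131 |
| m<sup>-5</sup>·s<sup>2</sup>·kg<sup>-1</sup> | ? | 29930 |
| s<sup>-2</sup>·kg<sup>1</sup> | Force per unit distance | 27111 |
| m<sup>-4</sup>·s<sup>2</sup>·kg<sup>-1</sup> | ? | 25955 |
| m<sup>1</sup>·s<sup>-2</sup>·kg<sup>1</sup> | Force | 25761 |
| m<sup>2</sup> | Area | 25636 |
| m<sup>3</sup> | Volume | 24836 |
| m<sup>3</sup>·s<sup>-2</sup>·kg<sup>1</sup> | ? | 23400 |
| m<sup>-4</sup>·s<sup>4</sup>·kg<sup>-2</sup> | ? | 20882 |
| s<sup>1</sup> | Time | 18702 |
| m<sup>2</sup>·s<sup>-2</sup>·kg<sup>1</sup> | Energy | 18597 |
| m<sup>4</sup> | ? | 17645 |
| m<sup>-5</sup>·s<sup>3</sup>·kg<sup>-1</sup> | ? | 15012 |
| m<sup>2</sup>·s<sup>-3</sup>·kg<sup>1</sup> | Power | 14360 |
| m<sup>1</sup>·s<sup>1</sup>·kg<sup>-1</sup> | ? | 14106 |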

## Next steps

Further optimizations are difficult, especially within the constraints of an RDBMS. Since where the 0.5 cliffs land depends so heavily on the input, it's pretty much impossible to build any indexing structure ahead of time that will serve all inputs. It is possible to search ranges progressively (e.g. when incrementing, to look for units in which the quantity lands in 0.5-1, then 1.5-2, etc.), but quantities that land in the lower ranges won't necessarily advance the fastest, since landing at, say, 0.99 gains much less than landing at 20.5. With much larger inputs and search spaces it might be worth doing this progressive search and finding a reasonable threshold, but for now, with this set of units, brute force is actually not bad and has minimal overhead.
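To make the point about uneven gains concrete, here is a small Python sketch of how much a single rounding step multiplies a quantity, as a function of where it lands in a given unit. `rounding_gain` is a hypothetical helper for this illustration, and it assumes exact .5 ties round up.

```python
import math


def rounding_gain(q):
    """Multiplicative change from rounding q units to the nearest whole unit.

    Assumes ties round up, so 0.5 becomes 1 and 20.5 becomes 21.
    """
    return math.floor(q + 0.5) / q


gain_low = rounding_gain(0.99)   # rounds to 1: roughly a 1% increase
gain_mid = rounding_gain(20.5)   # rounds to 21: roughly a 2.4% increase
gain_edge = rounding_gain(0.5)   # rounds to 1: the quantity doubles
```

The 0.5-1 range therefore holds both the best case (a doubling right at the cliff) and nearly the worst (about 1% at 0.99), so scanning ranges in ascending order does not visit candidates in order of how far they advance.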

Use it: https://rounding.lam.io

GitHub: acrylic-origami/unit_rounding