Compute Exchange
One idea I’ve had kicking around for a bit is something like a stock exchange but for compute. Buyers (users) submit a computation with resource requirements, a container hash, and a price; sellers (providers) submit their computational capacities and prices; the exchange does some matching and does the money things.
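A minimal sketch of what the matching could look like, assuming a simple cheapest-feasible-ask model (all the type and field names here are invented for illustration):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bid:                     # buyer: "run this container, at most this price"
    container_hash: str        # content hash of the image to run
    cpus: int                  # resource requirements
    memory_gb: int
    max_price_per_hour: float  # willingness to pay

@dataclass
class Ask:                     # seller: "I have this capacity, at this price"
    provider_id: str
    cpus: int
    memory_gb: int
    price_per_hour: float

def match(bid: Bid, asks: list[Ask]) -> Optional[Ask]:
    """Return the cheapest ask that fits the bid and clears its price."""
    feasible = [a for a in asks
                if a.cpus >= bid.cpus
                and a.memory_gb >= bid.memory_gb
                and a.price_per_hour <= bid.max_price_per_hour]
    return min(feasible, key=lambda a: a.price_per_hour, default=None)
```

A real exchange would run something closer to a continuous double auction with time priority; this only shows the shape of the data.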
This is one possible future for the stuff I’m doing with Program Explorer. I’m slogging through the last 10% until I have a v0 public release for that so maybe writing down some of these ideas will help me get there.
I don’t have a ton of prose to elaborate in a straightforward fashion (as if anything I write here is straightforward), so I’ll start with bullet points and that will either turn into a bigger post someday or stay a list forever.
- cloud providers each have things like this for running containers, for example AWS ECS or Lambda, Google Cloud Run
- they don’t have a uniform API
- they don’t support all the hardware configurations (like memory size caps) they offer for dedicated instances
- cloud providers each have spot instances with a bid price
- your instance can be interrupted (with warning) so you need logic to support being interrupted
- you can’t place one bid between two providers and take the lowest/first (I think there are 3rd party versions of this)
- there are only a handful of cloud providers, is this useful? would they adopt it?
- they wouldn’t initially because they are happy having lock-in
- I think there could be way more cloud providers than the big few, and having a way for buyers to use them without even knowing who they are makes it possible for them to start getting work
- how can I trust that the provider matched with my job is trustworthy?
- idk hard problem
- part of exchange duties would be to vet providers
- lean on combination of TPM, attestations, SEV, etc.
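To make the vetting idea slightly more concrete, here’s a toy shape for it. Real SEV/TPM attestation means verifying a vendor certificate chain over measured boot state; the HMAC below is a stand-in so the sketch runs, not how any of those systems actually work:

```python
import hashlib
import hmac
from dataclasses import dataclass

@dataclass
class Attestation:
    provider_id: str
    measurement: bytes  # e.g. hash of the booted OS image
    signature: bytes    # in reality: signed by a hardware root of trust

# Hypothetical: the exchange keeps an allow-list of known-good measurements
TRUSTED_MEASUREMENTS: set[bytes] = set()

def vet_provider(att: Attestation, key: bytes) -> bool:
    # Stand-in check; real systems verify a vendor cert chain, not an HMAC.
    expected = hmac.new(key, att.measurement, hashlib.sha256).digest()
    return (hmac.compare_digest(expected, att.signature)
            and att.measurement in TRUSTED_MEASUREMENTS)
```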
- right now the price of computation is set by some finance people at big cloud providers in a static way
- this seems incredibly hard to get correct, or incredibly easy to over-price since you don’t want to lose money
- maybe they rely on bandwidth costs and stuff; wish I could look at their numbers…
- the price of computation should be a fluctuating thing, just like a market
- the price of electricity can change and will continue to change
- the price of a FLOP/INSN (instruction count if we’re not AI-focused on floating point) is changing all the time
- the geopolitics and business politics of whether/when each next gen of chips arrives, and how expensive it is, put uncertainty on future pricing
- there is a huge amount of old/older compute on the secondhand market
- hard to price the value of that equipment currently because there’s no market signal for the compute it can produce
- with exchange data, could price the equipment based on compute/watt
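A back-of-envelope version of that pricing, with every number invented purely for illustration:

```python
# All numbers made up; the point is the shape of the calculation.
clearing_price = 0.50   # $/hour for this machine class, from exchange data
electricity    = 0.12   # $/kWh
power_draw_kw  = 0.4    # what the old box pulls under load
utilization    = 0.7    # fraction of hours it actually earns

hourly_margin = utilization * clearing_price - power_draw_kw * electricity
yearly_margin = hourly_margin * 24 * 365
print(f"implied value at a 2-year payback: ${2 * yearly_margin:,.0f}")
```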
- users care about some combination of price and time
- estimating resource requirements, even for known containers, is a hard problem for users and developers
- don’t want to overestimate memory or CPU, otherwise we pay too much
- users can’t be expected to know when they create a job how much they need
- developers don’t always know a good rule of thumb or close overestimate, but sometimes they do
- one possibility is to have some metadata / standard on how to invoke a container with the input files in “resource estimation mode” (sketch after this list)
- another is to have a mode of “elastic” (though would AWS sue you?) execution where providers are expected to be able to migrate you around if you need more memory (up to some limit)
- doesn’t exactly help with cpus, but you could also imagine an API for “hey please give me 10 more CPU cores” which would be kinda cool
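One hypothetical shape for that estimation-mode metadata, as container labels the exchange could standardize (none of these label names exist anywhere today):

```python
# Hypothetical label convention a container could ship with.
ESTIMATION_LABELS = {
    # How to invoke the image against the input files so it prints a
    # resource estimate instead of doing the real work:
    "exchange.estimate.cmd": "/app/run --estimate-only",
    # Fallback rule of thumb the developer can provide if they have one:
    "exchange.estimate.memory": "2x-input-size",
    "exchange.estimate.cpus": "4",
}

def estimate_memory(labels: dict[str, str], input_bytes: int) -> int:
    """Toy interpreter for the fallback memory rule above."""
    rule = labels.get("exchange.estimate.memory", "")
    if rule.endswith("x-input-size"):
        return int(float(rule.removesuffix("x-input-size")) * input_bytes)
    raise ValueError(f"unknown rule: {rule!r}")
```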
- some containers are public and providers could easily pull/cache them
- I think cloud providers like to make you put any containers you want to run in your own registry; this does make sense but is also a pain
- others are private and would require access keys to a registry (goes back to trust of provider)
- where does the input to the program come from and where do the outputs go?
- presigned urls from object stores (hello semi-standardized API for storage, where are you for compute?) are nice, but are per-object for reading or writing, so they don’t really scale well
- ideally there would be a presigned url for reading from a list of objects and/or a prefix, and for writing to a prefix (sketch below)
- one presigned url per output object is again bad UX b/c the user and/or developer needs to know upfront how many outputs are produced
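No object store offers this today as far as I know, but the wished-for API might look something like this (all names invented):

```python
import time
import urllib.request
from dataclasses import dataclass

@dataclass
class PresignedPrefix:
    """Hypothetical: one signed grant covering a whole prefix, not one object."""
    url: str           # e.g. "https://bucket.example/outputs/job-123/"
    mode: str          # "read" or "write"
    expires_at: float  # unix timestamp

def put_output(grant: PresignedPrefix, relative_key: str, data: bytes) -> None:
    """Write one of arbitrarily many outputs; no per-object URL needed upfront."""
    assert grant.mode == "write" and time.time() < grant.expires_at
    req = urllib.request.Request(grant.url + relative_key, data=data, method="PUT")
    urllib.request.urlopen(req)
```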
- where can you view the status of your computation?
- maybe exchange manages this
- will need to keep records for the money stuff anyway
- this is all with the idea of batch-type computation; can it work for services?
- much much harder
- if my data is in S3 but my job gets matched to azure, will I pay a fortune in egress?
- yes
- buyers likely need to be able to specify constraints, but having the exchange do something smart here too would be nice
- hopefully we could somehow do away with egress costs one day
- in some ways they are legit b/c it costs X pJ/bit/mile or whatever, but again if it’s a fixed price, then it is not getting priced accurately
- what about GPUs?
- yes those are important
- this is where it seems like there are actually more successful small providers already
- what about batch jobs that expect to communicate and care about locality?
- ideally the hard problems of scheduling with constraints could be centralized in the exchange
- providers could ideally just netboot machines from an exchange-provided OS (or custom if they prefer) and start collecting money
- this all resembles some kind of mega job/cluster scheduler plus job/workflow/DAG scheduler
- maybe exchange would support DAG definitions (see earlier post on container build systems); never really seen one I’ve liked
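A minimal shape for a DAG definition the exchange could accept (purely illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Stage:
    name: str
    container_hash: str
    depends_on: list[str] = field(default_factory=list)

# Each stage becomes schedulable once its dependencies finish; the
# exchange, not the user, decides where each one runs.
pipeline = [
    Stage("preprocess", "sha256:aaaa..."),
    Stage("train",      "sha256:bbbb...", depends_on=["preprocess"]),
    Stage("report",     "sha256:cccc...", depends_on=["train"]),
]

def ready(done: set[str]) -> list[Stage]:
    return [s for s in pipeline
            if s.name not in done and set(s.depends_on) <= done]
```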
- how should a user decide whether to run on aarch64 or amd64?
- multi-arch containers should signify you don’t care and can run on the cheaper one
- non-obvious tradeoffs in compute time and cost though
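The tradeoff in (entirely made-up) numbers: the cheaper per-second arch can still cost more if it runs slower, and vice versa.

```python
# Invented prices and runtimes for one job on each architecture.
options = {
    "aarch64": {"price_per_sec": 0.000020, "expected_secs": 5000},
    "amd64":   {"price_per_sec": 0.000028, "expected_secs": 3200},
}
for arch, o in options.items():
    print(arch, "costs", round(o["price_per_sec"] * o["expected_secs"], 4))
# aarch64 costs 0.1, amd64 costs 0.0896 -- the pricier rate wins here
```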
- if a job fails 90% of the way through a 1-hour run, who pays?
- if it’s a hardware failure, should be the provider
- if it’s a bug in the program, should be the developer (joke)
- if it’s a bug by the user, should be the user
- would be nice to have checkpointing either at the VM/container level or some metadata on how to checkpoint a given container (what signal to send and what file(s) are in your checkpoint)
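That per-container checkpoint metadata could be as small as a few labels (hypothetical convention, nothing standard):

```python
# Hypothetical labels telling the provider how to checkpoint this container.
CHECKPOINT_LABELS = {
    "exchange.checkpoint.signal": "SIGUSR1",        # what to send before migration
    "exchange.checkpoint.paths": "/state/ckpt.db",  # file(s) that make up the checkpoint
    "exchange.checkpoint.grace_seconds": "30",      # how long to wait after the signal
}
```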
- how much batch computation market is there?
- scientists
- companies
- lower than there could be because of how much friction there is to just run a damn thing
- tool use by AI agents
- is a container enough to specify the working environment?
- something might require a certain kernel version
- support bare metal? need really secure root of trust to make sure user doesn’t flash your mobo with a rootkit or whatever
- some things like benchmarking would benefit from specifying specific CPU requirements
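If a container alone isn’t enough, the job spec could carry explicit environment constraints, something like this (field names invented):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EnvironmentConstraints:
    min_kernel: Optional[str] = None  # e.g. "5.15" if you need a kernel feature
    cpu_model: Optional[str] = None   # e.g. "AMD EPYC 7763" for benchmarking runs
    bare_metal: bool = False          # needs the root-of-trust story from above
```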
- what about disk space for scratch?
- can the provider cache files?
- would be so nice to lean on a content-addressable system here
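Content addressing makes the caching question easy to state, assuming inputs are referenced by hash:

```python
import hashlib
from pathlib import Path

CACHE = Path("/var/cache/exchange")  # hypothetical provider-local cache dir

def fetch(content_hash: str, download) -> bytes:
    """Same hash -> same bytes, so any provider can cache inputs safely."""
    path = CACHE / content_hash
    if path.exists():
        return path.read_bytes()
    data = download(content_hash)  # caller supplies the actual transfer
    assert hashlib.sha256(data).hexdigest() == content_hash  # verify before caching
    CACHE.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```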
- what is the exchange’s cut?
- fixed fee?
- percentage?
- what kinds of fairness can you provide, or need to provide?