The God Endpoints will continue until morale improves
GraphCMS (now Hygraph) calls it a Federated Content Platform.
Gatsby calls it a Content Mesh.
Apollo comes right out and calls it a Supergraph.
This isn’t new; Matt Slotnick perceptively called it the hardest working graphic in software, and the GraphQL devotees have their own data engineering parallels in Airbyte and Fivetran.
Thoughts
The value proposition is evident - IF ONLY we can get all these messy things to conform to one interface, developers will be able to query them and resolve across them faster, improving developer experience, blah blah blah. This problem scales nicely with company size: nobody wants to do it, yet it’s essential for operational and analytical needs, making it a nice meaty problem for startups to go after. Strategy 101, or rather, Strategy Letter V.
But there’s a lot in that IF:
- Maintenance cost: Interfaces break all the time and run into edge cases/perf issues, averaging maybe 1-3 hours a week of maintenance (it’s very spiky, so averages are kind of meaningless)
- Lock-in: People are locking themselves into your SDKs and APIs, and given that none of these God Endpoints have yet stood the test of time (even GitHub has not managed to make GraphQL the default/easy path), it is a risky proposition.
  - In this sense the data eng companies have it easier: they integrate data rather than code, and data outlasts code.
- Incentives: Why should the data silos want to let you extract their data? Why can’t THEY create a God Endpoint for their users too? Obligatory https://xkcd.com/927/ reference (bonus points if you know which xkcd that is without clicking).
Standards as God Endpoint
Each year hundreds of millions of dollars are thrown at solving this stuff, both in-house and via vendors. It is probably necessary work, it is messy work, and it is unrewarding in the small - it only lasts until the next Big Bang Rewrite for the New God Endpoint flavor of the decade.
It feels inelegant, though. We are brute-forcing this problem by throwing endless bodies and time and money at it, but that doesn’t solve it the way email and terminal outputs and HTML have been “solved”.
What needs to happen is standards: ones that data producers, data consumers, and all the tooling in between can optimize for, and that increase users’ trust in betting on them. In terms of recent examples, I am inspired by OpenTelemetry, which, although it did not produce a rocketship outcome for Lightstep, seems to be successfully defining a telemetry interface that all producers and consumers are now accepting.
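For a concrete sense of what “one interface that all producers and consumers accept” looks like, here is a minimal sketch against the vendor-neutral `@opentelemetry/api` package; the service and span names are illustrative:

```ts
import { trace } from '@opentelemetry/api';

// Instrumentation is written against the neutral API; which backend
// consumes the spans (Jaeger, Honeycomb, a vendor...) is configured
// elsewhere via an exporter - the producer code never changes.
const tracer = trace.getTracer('checkout-service'); // illustrative name

function chargeCustomer(userId: string) {
  const span = tracer.startSpan('chargeCustomer');
  span.setAttribute('user.id', userId);
  try {
    // ... call the payments API here ...
  } finally {
    span.end(); // any OTel-compatible consumer can now receive this span
  }
}
```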
Sometimes standards are designed by committee (in JS, WinterCG is particularly interesting right now, but lacks teeth), and sometimes they are decided by an extremely dominant player that has essentially “won” (JSX, S3, OCI, and Postgres are examples from various domains). Of course this raises a question - since it isn’t necessary to establish a common standard before “winning”, is it even helpful to try? My sense is that winning without a standard, versus winning with one, is “winning the battle but losing the war”.
Language as God Endpoint
Of course, this being the Age of LLMs, no blogpost would be complete without considering the first interface evolved by humanity: natural language.
After all, even standards have problems: they require a learning curve, they may have design flaws, and they aren’t flexible (almost by definition - the more flexible a standard, the less useful/reliable it is). Standards optimize for machine communication, but languages optimize for human communication.
What does this look like, potentially?
Instead of:

```sql
SELECT COUNT(*) FROM users INNER JOIN charges ON users.id = charges.user_id WHERE users.email = 'joe@freshpizza.com'
```

we might write/speak/think:

“how many payments has Joe from freshpizza.com made?”
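A minimal sketch of that translation layer, assuming a hypothetical `completeWithLLM` helper that wraps whichever LLM API you use (the prompt format and schema string are illustrative, not any particular product’s API):

```ts
// Hypothetical LLM call - swap in your provider's SDK here.
declare function completeWithLLM(prompt: string): Promise<string>;

// Translate a natural language question into SQL, given a schema description.
async function naturalLanguageToSQL(question: string, schema: string): Promise<string> {
  const prompt = [
    'Translate the question into a single SQL query.',
    `Schema: ${schema}`,
    `Question: ${question}`,
    'SQL:',
  ].join('\n');
  return completeWithLLM(prompt);
}

// e.g. naturalLanguageToSQL(
//   'how many payments has Joe from freshpizza.com made?',
//   'users(id, email), charges(id, user_id)'
// )
```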
If you are implementing a system like this, please also implement partial information resolution:
- Q: how many payments were made?
- A: Insufficient information - how many payments by whom? over what time period?
- Q: oh sorry - by 'joe@freshpizza.com'
- A: ok, looking for the count of payments made by 'joe@freshpizza.com'… over what time period?
- Q: oh sorry again - over the last 3 months!
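A minimal sketch of that resolution loop, assuming a hypothetical `extractSlots` LLM helper that pulls structured fields out of each user message:

```ts
type Slots = { payer?: string; timePeriod?: string };

// Hypothetical LLM helper: merges fields found in the message into the slots.
declare function extractSlots(message: string, current: Slots): Slots;

// Decide whether we can run the query yet, or need a follow-up question.
function nextStep(slots: Slots): { done: boolean; followUp?: string } {
  if (!slots.payer) return { done: false, followUp: 'how many payments by whom?' };
  if (!slots.timePeriod) return { done: false, followUp: 'over what time period?' };
  return { done: true };
}

// One turn of the conversation: absorb new info, then ask or answer.
function handleTurn(message: string, slots: Slots): Slots {
  const updated = extractSlots(message, slots);
  const step = nextStep(updated);
  if (!step.done) console.log(step.followUp);
  return updated;
}
```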
That kind of loop, but for API tokens, logins, and other missing info (I covered some of this in my 2019 Adaptive Intent-based CLI State Machines talk, informed by experience with my 2016 Alexa skill). Because this is a human-in-the-loop process, you’ll want a Temporal.
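Humans can take minutes or days to answer, so the wait needs to be durable. A minimal sketch using Temporal’s TypeScript SDK, where the `provideInfo` signal name and slot fields are illustrative:

```ts
import { defineSignal, setHandler, condition } from '@temporalio/workflow';

// Signal carrying one human answer, e.g. { field: 'timePeriod', value: 'last 3 months' }.
export const provideInfo = defineSignal<[{ field: string; value: string }]>('provideInfo');

// Workflow that durably waits - across process restarts, for as long as it
// takes - until a human has filled every required slot, then returns them.
export async function clarifyQuery(required: string[]): Promise<Record<string, string>> {
  const slots: Record<string, string> = {};
  setHandler(provideInfo, ({ field, value }) => {
    slots[field] = value;
  });
  await condition(() => required.every((f) => slots[f] !== undefined));
  return slots;
}
```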
Isn’t this just chatbots? I’ve now been in tech long enough to remember the previous time conversational commerce was hyped and went nowhere (arguably), and AI assistants were going to take over the world. So yes, there’s a risk that we go through a whole bunch of ceremony just to reinvent Clippy, 2022 Edition. But the volume of data on both ends, and the new use cases unlocked by better natural language understanding, perhaps make it worth another shot.
Yes, this is a query that is ~1000x more expensive to parse, but that cost could come down over time, and it could remain a last-mile thing for humans. You could even imagine a distant future where LLMs are cheap enough that systems talk to other systems this same way - forever solving the problem of APIs or standards breaking.