At this point, the internet has a fully global reach. If you create a successful web or mobile app, you could have users on every continent (maybe not Antarctica). When they load up your app, they want it to be fast and relevant to them. That’s where edge computing comes in—it runs code and serves data from servers (points-of-presence) as close as possible to the client.
Companies like Vercel, Netlify, and Supabase have taken that a step further and created edge functions. These are bits of code that, when you deploy your site to these providers, get syndicated around the world to be executed as close and as fast as possible for local users who hit the site or app. This not only maximizes web performance for users worldwide but also allows other just-in-time modifications that customize the web app for the local viewer.
It can make the world feel like your data center, but it’s an extension of content delivery networks: instead of serving heavy assets like images or video, they execute code. “There’s these other traditional network companies that help connect the world’s data transmission,” said Dana Lawson, Senior Vice President of Engineering at Netlify, “but there’s this new abstraction of that where you have the ability to execute code.”
This article will talk about that abstraction layer and the hardware it runs on, as well as dive into the use cases for code that runs as local as possible for your users. For information on how it all works, I spoke with Malte Ubl, CTO at Vercel, Chris Daley, Principal Software Engineer at Akamai Technologies, and Lawson. The folks at Deno also gave me a brief statement and a link to a blog post that covered the same ground as this article.
Building on the shoulders of tech giants
When I was initially looking into this, I was interested in the infrastructure behind edge functions. To be able to call a function and have it execute wherever in the world the user is feels like a bit of magic. And all computing magic in the end is supported by silicon physically located in the world. But it turns out that the silicon that these edge functions run on doesn’t belong to the companies that run them.
As mentioned in the intro, CDNs have been around for a while. Now with cloud companies covering the world with cheap compute, building server racks in every time zone seems redundant, especially when someone else has already handled the hard work of deploying physical infrastructure. “We’re always thinking about scalability and climate change and how we serve the world and be good citizens,” said Lawson. “If you’re trying to do it yourself, you’re gonna miss out on some of those important details. You’re gonna spend a lot of time, energy, and effort on stuff that’s already been done. Innovate. That’s why you piggyback on these behemoths that have already done that hard work.”
Netlify and Supabase both run their edge functions on Deno Deploy as an extra abstraction layer (Supabase has even open-sourced their edge runtime if you want to give it a go yourself). According to Deno creator Ryan Dahl, Deno “runs on the public clouds (GCP and AWS) but otherwise is built from scratch. The system is meant to be as user friendly as possible, so users shouldn’t need to think about regions when writing their code.” Vercel runs on Cloudflare’s edge worker infrastructure.
But edge functions end up being pretty different from what the underlying hosting providers offer. “Cloudflare’s worker product is terminating traffic. Its primary role is the reverse proxy,” said Ubl. “We use them as a backend because we are terminating traffic in our own infrastructure. So we use them actually very similar to a serverless function implementing a route.”
Most IP traffic uses the unicast routing scheme. DNS resolves a domain name to an IP address, which takes you to a particular server. However, Deno Deploy and Cloudflare both use anycast, in which an IP address maps to a pool of computers. The network (at least in a WAN, aka the internet) then resolves the address to whichever computer is closest.
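To make the difference concrete, here is a toy model of the two schemes. It is only a sketch: the addresses, region names, and routing costs are made up for illustration, and on the real internet the "closest" decision is made by network routing (BGP), not by application code like this.

```typescript
// Toy model of unicast vs. anycast delivery.
// All addresses, names, and costs are hypothetical.

interface PointOfPresence {
  name: string;
  // Routing cost from each client region to this PoP (lower is closer).
  costFrom: Record<string, number>;
}

// Unicast: one address maps to exactly one server, wherever the client is.
const unicastTable: Record<string, string> = {
  "203.0.113.10": "us-east",
};

// Anycast: one address is announced by a whole pool of PoPs.
const anycastTable: Record<string, PointOfPresence[]> = {
  "198.51.100.1": [
    { name: "us-east", costFrom: { paris: 90, tokyo: 160, boston: 10 } },
    { name: "eu-west", costFrom: { paris: 8, tokyo: 220, boston: 85 } },
    { name: "ap-northeast", costFrom: { paris: 230, tokyo: 5, boston: 170 } },
  ],
};

function routeUnicast(address: string): string | undefined {
  return unicastTable[address];
}

function routeAnycast(address: string, clientRegion: string): string | undefined {
  const pool = anycastTable[address];
  if (!pool) return undefined;
  let best: PointOfPresence | undefined;
  let bestCost = Infinity;
  for (const pop of pool) {
    // Unknown regions are treated as infinitely far away.
    const cost = pop.costFrom[clientRegion] ?? Infinity;
    if (best === undefined || cost < bestCost) {
      best = pop;
      bestCost = cost;
    }
  }
  return best?.name;
}
```

With these made-up costs, a client in Paris and a client in Tokyo send traffic to the same anycast address but land on different machines, while the unicast address always lands on the same one.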
While Daley says Akamai uses unicast for most routing, they do offer anycasting for edge DNS resolution. More importantly, they have a bit of mathematical magic that speeds traffic through the internal network to the fastest server. That magic is an extension of the algorithms that brought the company to prominence over 25 years ago.
In general, when a client requests something from an edge worker, whether through an edge function or in a deployed code bundle, it hits a thin reverse proxy server. That proxy routes it to a server close (close in this case means fastest for that location) to the client and executes the requested function. That server where the code actually executes is known as the origin. There it can provide typical server-side functions: pull data from databases, fill in dynamic information, and render portions as static HTML to avoid taxing the client with heavy JavaScript loads. “Turn the thing that worked on your local machine and wrap it such that when we deploy to the infrastructure that we use behaves exactly the same way,” said Ubl. “That makes our edge functions product this more abstract notion because you don’t use it so concretely.”
How you use it depends on the provider. Netlify seems to be pretty straightforward: deploy the function, then call it the same as you would any other server code. It does provide a number of server events on which to hang functions. Vercel offers the standard server-side version as well as a middleware option that executes before a request is processed. Akamai, as a provider of an underlying edge worker network, offers a number of events along the request path in which to execute code:
- When the client first requests an edge worker (`onClientRequest`)
- When the request first reaches the origin server (`onOriginRequest`)
- When the origin responds after running the code bundle (`onOriginResponse`)
- Right before the response payload reaches the client (`onClientResponse`)
This allows apps to do some complex trickery on the backend. “We allow you to do something like go to a different origin or rewrite the path that I’m going to talk to the origin,” said Daley. “You might say no, I don’t actually want that website, I want you to serve something else completely instead. You can remove headers from it. You could add new headers. You could look at the headers that are there, manipulate them, and send that back. Then right before you go to `onClientResponse` again, you could do some more edits. When you think about what we call a code bundle, there’s a good chance it’s not all running on the same machine.”
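Here is a rough sketch of how hooks along the request path can rewrite a request and edit a response. The four hook names are borrowed from the event list above, but everything else (the `EdgeRequest` and `EdgeResponse` shapes, `runPipeline`, the origin table) is hypothetical and is not Akamai's actual API. It also runs everything in one process, which, as Daley notes, a real code bundle likely would not.

```typescript
// Simplified, hypothetical request/response shapes (not a real provider API).
interface EdgeRequest {
  path: string;
  origin: string;
  headers: Record<string, string>;
}

interface EdgeResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

interface EdgeHandlers {
  onClientRequest?: (req: EdgeRequest) => void;
  onOriginRequest?: (req: EdgeRequest) => void;
  onOriginResponse?: (req: EdgeRequest, res: EdgeResponse) => void;
  onClientResponse?: (req: EdgeRequest, res: EdgeResponse) => void;
}

type Origin = (req: EdgeRequest) => EdgeResponse;

// Runs the four hooks in request-path order around a call to the origin.
function runPipeline(
  req: EdgeRequest,
  hooks: EdgeHandlers,
  originTable: Record<string, Origin>,
): EdgeResponse {
  if (hooks.onClientRequest) hooks.onClientRequest(req);
  if (hooks.onOriginRequest) hooks.onOriginRequest(req);
  const target = originTable[req.origin];
  const res: EdgeResponse = target
    ? target(req)
    : { status: 502, headers: {}, body: "unknown origin" };
  if (hooks.onOriginResponse) hooks.onOriginResponse(req, res);
  if (hooks.onClientResponse) hooks.onClientResponse(req, res);
  return res;
}

const handlers: EdgeHandlers = {
  // Send old blog URLs to a different origin and rewrite the path.
  onClientRequest: (req) => {
    if (req.path.indexOf("/old-blog/") === 0) {
      req.origin = "blog";
      req.path = "/blog/" + req.path.slice("/old-blog/".length);
    }
  },
  // Add a header before the request reaches the origin.
  onOriginRequest: (req) => {
    req.headers["x-forwarded-by"] = "edge-sketch";
  },
  // Strip a header the origin added.
  onOriginResponse: (_req, res) => {
    delete res.headers["x-powered-by"];
  },
  // Last-moment edit before the client sees the response.
  onClientResponse: (_req, res) => {
    res.headers["x-served-by"] = "edge";
  },
};

const origins: Record<string, Origin> = {
  default: () => ({ status: 404, headers: {}, body: "not found" }),
  blog: (req) => ({
    status: 200,
    headers: { "content-type": "text/html", "x-powered-by": "demo" },
    body: "blog:" + req.path,
  }),
};
```

Calling `runPipeline({ path: "/old-blog/post", origin: "default", headers: {} }, handlers, origins)` sends the request to the `blog` origin with the rewritten path, and the response comes back with one header removed and one added.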
Regardless of whether the edge function performs a simple retrieval or a series of complex steps, it’s all about maximizing performance. Each extra second a site takes to load can cost a business money. “It’s time to first byte,” said Lawson. “With some of these applications, they’re completely being manifested on the served assets and origins. Whole websites are being created right there on the edge.”
As anyone working on high-traffic websites knows, there’s one thing that can greatly speed up your time to first byte.
Cache rules everything around me
One of the ironies of edge functions is that the abstraction layers built on top of these global server networks slow things down. “We’re adding a little bit of latencies, right?” said Lawson. “We have our traditional content delivery network. We have proxies that are taking those small little requests, shipping them over to these run times.” Each of these stops adds a little time, as does the code execution on the origin.
How do edge function networks minimize this latency so that getting the request to the edge doesn’t cancel out the gains made by executing it there? “The fair answer is many layers of caching,” said Ubl. “And lots of Redis. There’s three primary layers involved. One does TLS termination and IP layer firewall that looks agnostically at traffic and tries to filter out the bad stuff without paying the price of knowing really what’s going on. Going one layer down is the layer that has the biggest footprint. That one understands who the customers are, what their deployments are, and so forth. That’s driven by substantial caching.”
“It’s so fast and it’s just amazing how quickly we’re transmitting. It’s almost like a no op.” Dana Lawson
This makes getting from the client to the origin server extremely fast. “There is some overhead right between when you get the request and then you have to now deal with the JavaScript instead of hard coded things,” said Daley, “but it’s zero copy shared memory. There is overhead, but it’s extremely, extremely low to go in there. I think it’s less than microseconds. The bigger overhead is usually whatever problem they’re trying to solve.”
That’s the final layer: the origin server, where the code gets executed. That code, depending on what it is, is probably going to be the biggest source of latency overhead. But caching can help mitigate that as well. “If we’ve seen your stuff before, you’re in memory as best we can within memory limits,” said Daley. “That overhead will be fairly low depending on how you structured your code—we have some best practices about things to avoid.”
Once a client has completed their first request, the origin server has the response for that request cached. There’s a cost to replicating that cached response to other servers in the edge network, so maintaining a link between that server and the client can shave precious milliseconds off of requests. “Our edge function invocation service primarily acts as a load balancer,” said Ubl. “We basically emulate the same behavior of the Cloudflare workers, where we load balance as ourselves and see a worker that can take a little bit more traffic and then multiplex another request on the same connection. It’s basically just HTTP `Keep-Alive`. That’s really fast.”
Another spot where the backend can slow down is in accessing your databases. Most edge function and edge worker providers also have fast “serverless” key-value store databases that you can add (or you can use other serverless database providers). But if you have a DB-heavy workload, you can use the routing and caching features of the network to speed things up. “From a latency perspective, once you talk to your database twice, it’s always cheaper to cache data,” said Ubl. “It comes with the trade-offs of caching: you have to invalidate things. The other thing that users can opt into through our internal proxy and infrastructure, you can say, invoke the code next to my database.”
Caching can cause queuing issues for sites in less common languages, especially in functions and code bundles with multiple requests. “We changed how we were doing queuing at one point,” said Daley, “because when a subrequest goes off and it’s cacheable, it’s going to look to execute on the individual machine. Certain machines tend to be busier with certain customers, so their content is going to be on those machines often. If you have a lot of those stacking up, and you’re waiting on all these subrequests to finish, requests can fail when they hit resource limits. Most of the time, it takes ten milliseconds to run. We did a lot of work dealing with the outliers. I think it was like 900% improvement in people not hitting a resource limit.”
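The trade-off Ubl describes (cache after the first read, but be ready to invalidate) can be sketched with a small read-through cache. This is a minimal in-memory illustration, not how any of these providers implement it; `TtlCache`, `getOrLoad`, and the injectable clock are names invented for the example, and a real edge deployment would more likely sit on the provider's key-value store.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

// Minimal read-through cache with a time-to-live and explicit invalidation.
class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value if it is still fresh; otherwise calls `load`
  // (standing in for the database read) and caches the result.
  getOrLoad(key: string, load: (key: string) => V): V {
    const currentTime = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > currentTime) {
      return hit.value;
    }
    const value = load(key);
    this.entries.set(key, { value, expiresAt: currentTime + this.ttlMs });
    return value;
  }

  // Call this when the underlying row changes so stale data is not served.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

The second read of the same key never touches the database. The price is that any write path has to remember to call `invalidate`, or accept data that is stale for up to the TTL.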
These systems are built for speed and repeatability—a CDN for code, essentially—so not every use case is a good fit. But those that are can see big gains.
Custom websites on the fly
Not all applications will benefit from functions that run on the edge. Of those that do, not all of their code will need to be executed on the edge. The functions that benefit will be I/O-bound, not CPU-bound. They’ll still use CPUs, obviously, but they provide some logic around moving more static assets or calling APIs and transforming the returned data. Said Daley, “It’s not general purpose compute as much as it is shaping traffic.”
This means a lot of conditional logic on pieces of websites, even on whole pages. You could serve language-specific pages based on the region. You could A/B test portions of sites. You can automatically redirect broken bookmarks. You can implement incremental static regeneration. You could inject a GDPR warning if the site didn’t see a cookie. You could geofence users and serve different content based on their location—sale on berets only in Paris, for example. “If you’re very sophisticated, you can create an entire visual experience that’s been separated and globally distributed,” said Lawson.
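A few of those ideas fit into one small decision function. The sketch below is purely illustrative: the `Visitor` shape, the partial country list, the redirect table, and the banner names are all assumptions, and a real edge function would read this information from whatever geolocation and cookie data its provider exposes.

```typescript
// Hypothetical view of what an edge function might know about a visitor.
interface Visitor {
  id: string;
  country: string;
  city?: string;
  language: string;
  cookies: Record<string, string>;
}

interface Decision {
  path: string;
  banners: string[];
  variant: "A" | "B";
}

// Partial, illustrative list; not a complete set of GDPR jurisdictions.
const EU_COUNTRIES = ["FR", "DE", "ES", "IT", "NL"];

// Old bookmarked paths mapped to their current locations.
const REDIRECTS: Record<string, string> = { "/old-sale": "/sale" };

// Deterministic bucket in [0, 100) so a visitor always sees the same variant.
function hashToBucket(id: string): number {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    hash = (hash * 31 + id.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

function decide(visitor: Visitor, requestedPath: string): Decision {
  // Fix broken bookmarks first, then pick a language-specific page.
  let path = REDIRECTS[requestedPath] ?? requestedPath;
  if (visitor.language.toLowerCase().indexOf("fr") === 0) {
    path = "/fr" + path;
  }

  const banners: string[] = [];
  // Inject a consent banner if an EU visitor has no consent cookie yet.
  if (
    EU_COUNTRIES.indexOf(visitor.country) !== -1 &&
    visitor.cookies["consent"] === undefined
  ) {
    banners.push("gdpr-consent");
  }
  // Geofenced promotion: berets on sale only in Paris.
  if (visitor.city === "Paris") {
    banners.push("beret-sale");
  }

  // 50/50 A/B split keyed on the visitor id.
  const variant: "A" | "B" = hashToBucket(visitor.id) < 50 ? "A" : "B";
  return { path, banners, variant };
}
```

None of this is heavy computation. It is a handful of lookups and string checks, which is the kind of I/O-adjacent logic that suits the edge.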
If you want to get really fancy, you can chain together multiple pieces and custom create a website on the fly. “We have a fifth event and it’s called `responseProvider`, and it’s a synthetic origin,” said Daley. “There are some internal demos where I’ve seen people do impressive things. If you wanted to, say, call a bunch of different APIs, get all the JSON from those, and stitch it all together and call Edge KV—which is the distributed database—then put it all together, you could actually rewrite a web page right there and send it back.”
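The stitching Daley describes might look roughly like the sketch below. It is not Akamai's `responseProvider` API: `buildPage`, the `Section` shape, and the injected `fetchJson` and `readKv` functions are hypothetical stand-ins for real API calls and a real key-value lookup, passed in as parameters so the example stays self-contained.

```typescript
// Hypothetical shape of the JSON each upstream API returns.
interface Section {
  title: string;
  text: string;
}

type SectionFetcher = (url: string) => Promise<Section>;
type KvReader = (key: string) => Promise<string | undefined>;

interface SyntheticResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Escape text before placing it into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Calls every API and the key-value store concurrently, then stitches the
// results into a single HTML page, acting as a synthetic origin.
async function buildPage(
  urls: string[],
  fetchJson: SectionFetcher,
  readKv: KvReader,
): Promise<SyntheticResponse> {
  const [sections, greeting] = await Promise.all([
    Promise.all(urls.map((url) => fetchJson(url))),
    readKv("greeting"),
  ]);

  const heading = escapeHtml(greeting ?? "Hello");
  const parts = sections.map(
    (section) =>
      "<section><h2>" + escapeHtml(section.title) + "</h2><p>" +
      escapeHtml(section.text) + "</p></section>",
  );

  return {
    status: 200,
    headers: { "content-type": "text/html; charset=utf-8" },
    body:
      "<!doctype html><html><body><h1>" + heading + "</h1>" +
      parts.join("") + "</body></html>",
  };
}
```

Because the upstream calls run concurrently, the page costs roughly as much time as the slowest call rather than the sum of all of them, which matters when the whole point is time to first byte.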
What it enables now is pretty impressive, but it gets even more interesting when considering how this functionality will help enable future AI functionality. “It basically enables the AI revolution because you can’t afford to run it on a traditional serverless,” said Ubl. “But in the I/O bound use case, which edge functions are ideal for, you outsource the model and inference to somewhere else and just call that API.”
With the increasing prevalence of generative AI, what’s to stop people from combining the conditional logic that edge functions excel at with generative code? “We’re gonna see more AI building these websites and generating them and calling functions,” said Lawson. “It’d be really cool to see it on traffic patterns too, for it to be smart. Where you’re coming in and saying, okay, we wanna make sure this campaign hits this amount of audience. It hits a threshold, hits a metric, maybe it cascades it. Just automatic detection. I think it will be personalized experiences. We will not be as much driven by humans doing research and looking at analytics, but analytics calling code and doing it.”
What looks fast and seamless to an end user takes a lot of behind-the-scenes work to maximize the speed at which a request hits an origin server, processes, and returns to their screen. With multiple layers of abstractions behind them, edge functions can make all the difference for I/O-heavy web applications of today, and the AI-enhanced experiences of the future.
The post Exploring the infrastructure and code behind modern edge functions appeared first on Stack Overflow Blog.