How to Do Virtual Hosting on Google App Engine

Background: What is Virtual Hosting?

Virtual hosting is when websites for multiple domains all run on a single web server. It depends on the client (i.e. the web browser) including the requested host name in the HTTP request, which it does in the Host header. This name is then (typically) made available as the CGI HTTP_HOST variable, so that the web server (or application) can serve content as a function of the specified host name.
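As a minimal sketch of serving content as a function of the host name, here is a small Python WSGI application that looks up HTTP_HOST in a table. The host names and the SITES table are made up for illustration, and a real application would fetch per-customer content from storage instead.

```python
# Hypothetical mapping from host name to site content (illustrative names only).
SITES = {
    "www.example-one.com": "Welcome to site one",
    "www.example-two.com": "Welcome to site two",
}


def application(environ, start_response):
    """Minimal WSGI app that chooses its response from HTTP_HOST."""
    # HTTP_HOST may carry a port ("host:8080"); strip it and normalise case.
    # (IPv6 literals are not handled in this sketch.)
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    body = SITES.get(host)
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown host"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]
```

The same code serves both sites; only the host name in the request decides which content comes back.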

Google App Engine is itself a virtual hosting environment, since multiple domains of the form my-application-name.appspot.com all point to one or a limited number of Google servers. So when I talk about virtual hosting on a Google App Engine application, I am talking about a form of nested virtual hosting, where my customer's domain points to my Google App Engine application, which itself runs on a domain which points to a Google server.

Why Would Anyone Want to Provide Virtual Hosting on an App Engine Application?

The idea of Google App Engine is that it is a scalable web application environment/framework where Google does all the low-level administration for you.

It follows that you might use App Engine to develop and deploy a web application which could be used by thousands or even millions of users. And depending on the nature of your business model, and the type of application, you might want to offer your customers the option of serving their data, as managed by your application, on a host running on their domain.

How to Do It: the Easy Way (i.e. how it should be implemented)

If configuring a customer's account on a developer's App Engine application for virtual hosting was as easy as it could possibly be, it would consist of the following single step, or something very similar:

Customer creates a DNS CName record which points a host name of the form <subdomain>.<customersDomain>.com to a host name of the form <customerUserId>.<developersAppSubDomain>.<developersDomain>.com

Of course there may be additional steps internal to the application, such as the customer actually paying for a virtual hosting service, and maybe specifying options for different sub-domains, but I won't count these steps as they are a function of the application itself, and in the simplest case there might be no additional steps at all.

Note various things about this configuration requirement:

There is no specific reference to any Google servers. A link from some domain within <developersDomain>.com to some relevant Google domain will be entered by the developer on the developer's DNS, but there is no need for the customer to know about that.
Because the CName record for the customer's subdomain was entered into the customer's DNS, you know that it was created by the owner of <customersDomain>.com.
The <customerUserId> value reliably indicates which customer of the application is "owner" of that subdomain for the purpose of serving content on the specified customer subdomain. (Of course the user ID becomes public information, so customers have to take that into account, and the application may even use some alternate ID which is not the actual user ID, but which is reliably and uniquely related to it.)

Implementation

I said this was the "easy way", but unfortunately it's not a method that currently works with Google App Engine.

If it did work, then it could be implemented as follows:

The developer creates a CName record from *.<developersAppSubDomain>.<developersDomain>.com to <appEngineAppId>.appspot.com (if wildcard CName values aren't available on the developer's DNS server, then the developer will have to create one entry per customer).
When Google receives a request on the App Engine Server with an unknown host name such as <subdomain>.<customersDomain>.com, it does the CName look-up, until it finds a host name of the form <appEngineAppId>.appspot.com. It then sends the server request on to that App Engine application, ideally including the whole CName "chain" as a special CGI variable, so that the application can determine the user ID of the customer that owns the domain. (And if the App Engine server doesn't find a host name for an App Engine application, it returns 404.)
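If the CName "chain" were passed to the application, extracting the customer's user ID from it could look something like the following Python sketch. This is hypothetical: App Engine does not supply such a chain, and the function name, the list-of-host-names representation and the example domains are all assumptions made for illustration.

```python
def customer_id_from_chain(cname_chain, app_suffix):
    """Return the customer ID encoded in a CNAME chain, or None.

    cname_chain: host names in resolution order, e.g. the customer's host,
                 then <customerUserId>.<app_suffix>, then the appspot.com name.
    app_suffix:  the developer's wildcard domain, e.g. "myapp.mydomain.com".
    """
    suffix = "." + app_suffix.lower().rstrip(".")
    for name in cname_chain:
        name = name.lower().rstrip(".")
        if name.endswith(suffix):
            label = name[: -len(suffix)]
            # Accept exactly one non-empty label in front of the suffix.
            if label and "." not in label:
                return label
    return None
```

For example, a chain of service.mycustomersdomain.com, then user42.myapp.mydomain.com, then myapp.appspot.com would yield user42 when the suffix is myapp.mydomain.com.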

Minor variations on this scheme are possible. For example, the App Engine API could include a method for pre-registering customer sub-domains, and the Google App Engine server would follow the CName "chain" once and once only when that registration happened. (It doesn't matter if this registration becomes stale, because if the customer sub-domain ceases to point to the developer's app, then the App Engine server won't receive any requests for it anyway, and if the customer chooses to host the sub-domain on a different App Engine application, then the new choice of app will also have to be registered, and the new registration will replace the old one.)

Implementing App Engine application virtual hosting this way may have some cost (in terms of storage and DNS lookups), and I'm sure developers would be happy to pay for it as an extra option. Unfortunately, the current method of configuring virtual hosting for a customer of an App Engine application imposes unnecessary cost and hassle on the customer. Which is not good.

An Aside: "Naked Domains"

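Naked domains are domains like <customersDomain>.com, without even the "www", or any other sub-domain. The rules of DNS do not allow a CName record on a naked domain, because a CName record cannot coexist with other records for the same name, and the naked domain has to carry the SOA and NS records for the zone. So a naked domain has to be pointed at a server with "A" records instead.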

Currently Google App Engine does not support naked domains. So it is not even possible for an App Engine developer to provide virtual hosting on a customer's naked domain.

In practice this may not matter too much, at least for applications that provide content subsidiary to a customer's main website (so that the customer would be happy with using a subdomain for that subsidiary content).

How to Actually Do It: the Hard Way

Entering a single CName entry pointing from the customer's domain to the developer's subdomain should be necessary and sufficient to configure and verify a virtual hosting option.

But what a customer currently has to do to configure their sub-domain to point to a developer's App Engine application is as follows:

1. Sign up for Google Apps.
2. Verify ownership of their domain by creating a special CName entry (with a long obscure subdomain which isn't going to be used for anything else) which points to google.com.
3. Go to "Dashboard".
4. Click on "Add more services".
5. Enter the developer's app id into the Google App Engine field, and press "Add it now".
6. Check the "Agreement" check box.
7. Click on "Activate this service".
8. Click on "Add new URL".
9. Enter the subdomain name and press "Add".
10. Create a CName entry (in their DNS) which points from their subdomain to ghs.google.com.
11. Press "I've completed these steps".

Note that even after performing all these steps, the customer has still not proven to the App Engine application that any particular user ID is the "owner" of that subdomain. So the application developer would still have to ask the customer to perform some additional validation step, perhaps involving the addition of the customer's user ID to a CName record, as described above (although it could be difficult to read DNS information from within a Google App Engine application).

Additional Domains

When a customer has more than one domain that they wish to use with the developer's App Engine application, they have the option of adding additional domains as "aliases" to the same Google Apps account. This saves the customer from having to create more Google Apps accounts and add the application as a service to each one. However it does require the customer to click on a button at the bottom of a screen full of instructions for configuring MX records, which says: "I have completed these steps", even though the customer hasn't, and has no intention of doing so. (Also they may wonder if clicking on the button risks interfering with their existing email, even though it shouldn't, since their existing MX records point to whatever they currently point to. And they may wonder if clicking on a button saying they've done something that they haven't done violates their Google Terms of Service, which might result in Google disabling their Google Apps account, which of course will disable the virtual hosting configuration.)

Conclusion

Virtual hosting of customer domains on a Google App Engine application is possible.

However it would appear that Google has implemented this functionality in a manner which is convenient mainly for customers who are already using Google Apps.

If Google aims to make Google App Engine convenient for developers developing "major" applications, which may contain virtual hosting as a customer option, then they need to vastly simplify the setup procedure for virtual hosting on App Engine applications, and they need to totally remove the requirement for the customer of the App Engine application to have a Google Apps account.

And if they can't do that, they should at least allow for the possibility that a user of Google Apps is only interested in accessing App Engine apps, and that the user does not necessarily want to access any of the "standard" features of Google Apps.

Appendix: Customer Id Verification

The problem of verifying which user ID for an application owns a particular customer domain is non-trivial.

Looking at it as a security problem, the "attack" is that the customer enables a particular subdomain to point to the application, but then some other user of the application "hijacks" the domain by claiming it as their own.

Here are some pros and cons of various approaches:

1. No user ID identification, but claim first

The user "claims" the domain, then (quickly) creates the CName record, then asks the application to verify the record. If someone else has already claimed the domain, the application will report this to the user, who can then raise a dispute before creating the CName record (or, wait for the first claim to expire, since it will never be verified, and then try again).
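The "claim first" bookkeeping might be sketched in Python as follows. This is an in-memory illustration only: the class name, the method names and the one-hour default expiry are assumptions, and a real App Engine application would keep the claims in persistent storage.

```python
import time

CLAIM_TTL_SECONDS = 3600  # assumed expiry period for unverified claims


class ClaimRegistry:
    """In-memory sketch of 'claim first' domain ownership."""

    def __init__(self, ttl=CLAIM_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._claims = {}    # domain -> (user_id, expires_at); None = verified

    def claim(self, domain, user_id):
        """Return True if user_id now holds the claim, False if another user does."""
        domain = domain.lower()
        holder = self._claims.get(domain)
        if holder is not None:
            owner, expires_at = holder
            active = expires_at is None or expires_at > self._clock()
            if active and owner != user_id:
                return False  # someone else got there first
            if active and expires_at is None:
                return True   # already verified for this user
        # New claim, renewed claim, or an expired claim being replaced.
        self._claims[domain] = (user_id, self._clock() + self._ttl)
        return True

    def mark_verified(self, domain, user_id):
        """Make the claim permanent once the CName record has been checked."""
        domain = domain.lower()
        holder = self._claims.get(domain)
        if holder is None or holder[0] != user_id:
            return False
        expires_at = holder[1]
        if expires_at is not None and expires_at <= self._clock():
            return False  # the claim expired before it was verified
        self._claims[domain] = (user_id, None)
        return True
```

The application would call claim() when the user claims the domain, and mark_verified() only after it has checked that the CName record really points at the application.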

2. Include User ID in the "to" domain of CName record

This directly "describes" the UserID/Application as the target of the customer's subdomain, and is perhaps the most "natural" approach. However it requires the application to have full access to the CName "chain", and it requires the developer's DNS to have a wildcard option, so that the user ID is ignored for the purpose of resolving the developer's subdomain which describes the application.

3. Add the User ID to the "from" domain of a second "verification" CName record

For example, service.mycustomersdomain.com points to myapp.mydomain.com (which points to myapp.appspot.com), and userid.service.mycustomersdomain.com also points to myapp.mydomain.com. The application verifies that service.mycustomersdomain.com belongs to user userid by doing a single web request to userid.service.mycustomersdomain.com, and checking for a successful result. (The request may include a one-time secret token to be doubly sure that the application is indeed receiving the request from itself.) Once verification is complete, the second domain can be deleted.

The advantage of this approach is that the application does not require direct access to DNS records to do the verification.
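A rough Python sketch of this self-request check, including the one-time token, is given below. The Verifier class, the /verify path and the "ok:" response format are invented for illustration. The HTTP fetch is passed in as a function so that the sketch does not depend on any particular library, and the pending tokens are held in memory, whereas a real application would need to share them between server instances.

```python
import secrets


def verification_host(user_id, customer_host):
    """Host name of the temporary record, e.g. userid.service.example.com."""
    return f"{user_id}.{customer_host}".lower()


class Verifier:
    def __init__(self, fetch):
        # fetch(url) -> response body as str (hypothetical, supplied by caller).
        self._fetch = fetch
        self._pending = {}  # token -> host name the request must arrive on

    def handle_request(self, http_host, token):
        """What the app's own /verify handler would do with an incoming request."""
        expected = self._pending.get(token)
        host = http_host.split(":")[0].lower()
        if expected is not None and expected == host:
            return "ok:" + token
        return "denied"

    def verify(self, user_id, customer_host):
        """Request our own app via the verification host and check the reply."""
        host = verification_host(user_id, customer_host)
        token = secrets.token_urlsafe(16)  # one-time secret
        self._pending[token] = host
        try:
            body = self._fetch(f"http://{host}/verify?token={token}")
        except OSError:
            return False  # record missing, or not yet visible in DNS
        finally:
            self._pending.pop(token, None)  # the token is single-use
        return body == "ok:" + token
```

If the request comes back with the expected token, then the verification host must resolve to this application, which only the owner of the customer's domain could have arranged.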

4. Combine Options 1 and 3

Options 1 and 3 can be combined, where option 1 is used if it works, and option 3 is a fallback.

The required steps are as follows:

The user claims their domain within the application, prior to creating any CName records.
If no other user has claimed the domain, the application confirms that the current claim is active, that it has an expiry period (e.g. 1 hour), and that the user should activate the CName record.
The user creates the CName record e.g. service.mycustomersdomain.com points to myapp.mydomain.com.
The user tells the application to verify the CName record (i.e. the application does the DNS lookup directly, or it does the DNS lookup indirectly by invoking itself via the specified domain).
The lookup succeeds, and the application confirms that the user's claim to their domain has succeeded.

However, if some other user of the application has already claimed the domain, then the steps are as follows:

The user is told that some other user has claimed the domain (information about which user is the other user is not immediately revealed, but may be revealed if this user's counter-claim succeeds).
The user is told not to activate the CName record. (However if the CName record is already active, or if the previous claim has already been confirmed, the user should still continue with the following steps.)
The user creates the temporary verification CName entry which contains the user ID prepended to the customer's subdomain which is being claimed, i.e. userid.service.mycustomersdomain.com points to myapp.mydomain.com.
The application verifies the verification CName entry (by direct or indirect DNS lookup), and confirms the successful claim.
The user then creates the actual application CName entry (and deletes the temporary verification CName entry which is no longer required).
The user requests the application to confirm the application domain name. (If this fails due to existing cached DNS records, then the user should repeat the request for confirmation until such time as the new DNS record is available to the application and the claim is confirmed.)
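The choice between the two paths can be summarised in a short Python sketch that works out which CName records the user has to create, and in what order. The function name and its arguments are hypothetical, and the claim look-up and the DNS checks themselves are left out.

```python
def records_to_create(user_id, customer_host, app_host, claimed_by=None):
    """Return the CName records (from, to) the user should create, in order.

    claimed_by: ID of the user holding an existing claim on customer_host,
                or None if nobody has claimed it.
    """
    main_record = (customer_host, app_host)
    if claimed_by is None or claimed_by == user_id:
        # Option 1: the claim is uncontested, so one record is enough.
        return [main_record]
    # Option 3 fallback: prove ownership with a temporary verification
    # record first, then create the real record once the claim is confirmed.
    verification_record = (f"{user_id}.{customer_host}", app_host)
    return [verification_record, main_record]
```

In the contested case the first record returned is the temporary one, which can be deleted again once the counter-claim has been confirmed.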

Caveat

I should point out that I have not actually constructed a full-blown Google App Engine application with a virtual hosting option. (I am currently researching which hosted application environment best supports the development of a web application with a virtual hosting option, before I invest major effort in actually developing such an application.)

However I have confirmed the steps required to configure a simple test App Engine application to work with alternative domains via Google Apps – creating a "customer" Google Apps account distinct from the account which owns the application, using the customer account to map more than one domain to the application, and verifying that the application can display the correct HTTP_HOST value for each configured "customer" domain.

Working Example

My current test application is the App Engine application philips-experimental, and an example domain configured to be virtually hosted by that application is showkeys.aptrow.com. (You can do an nslookup on showkeys.aptrow.com to see how it is configured.)

(You can try configuring your own subdomain to have philips-experimental as a Google Apps "service". If you configure it correctly, you should see your domain as the value of the key HTTP_HOST. Note: no guarantees are made about the continuing stability or existence of the test application. Also, note that although SERVER_NAME happens to show the same value as HTTP_HOST, HTTP_HOST is the correct variable to use for implementing virtual hosting.)
