Tuesday, November 13, 2007

Interdomain teleports, revised.

This is going to be one hell of a geeky post :)

Ok, after a lot of bloody experiments (blood was mine, that is), I have the interdomain teleports working.

Needless to say, I will need to completely rewrite it, but for now I have pretty solid proof that it works :)

So, the first edition of the architecture is below. (When I say "grid" further on, the standalone setup works exactly the same way.)

Suppose we have a sim in the local grid (LS), where we are, and a sim on the remote grid (RS), where we want to go. The grid coordinates can be the same for the two. Also, each of the two grids defines two regions that I call "Area51" - they are used as a transit area for the teleports. Hence the name :) Let's denote them "A51L1", "A51L2", "A51R1", "A51R2". If we consider the teleport from LS to RS, we can forget for a moment about A51R1/2 - they are not needed, only the local transit areas are used.
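To make the role of the two local transit regions concrete, here is a minimal Python sketch of the selection rule from step 5 of the process: take the Area51 region whose grid coordinates do not coincide with those of the RS. The function name and the coordinate values are made up for illustration and are not taken from the prototype.

```python
def pick_transit_region(transit_regions, remote_coords):
    """Return the name of the first transit region that does not overlap the RS.

    transit_regions: dict mapping a region name to its (x, y) grid coordinates.
    remote_coords:   (x, y) grid coordinates reported by the RS.
    """
    for name, coords in transit_regions.items():
        if coords != remote_coords:
            return name
    raise ValueError("every transit region overlaps the remote sim")


# Hypothetical coordinates, for illustration only.
local_transit = {"A51L1": (1000, 1000), "A51L2": (1000, 1001)}
chosen = pick_transit_region(local_transit, (1000, 1000))
```

Since the two transit regions sit at different coordinates, at least one of them never collides with the RS, which is why two of them are enough.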

The RS has the URL "http(s)://x.x.x.x:pppp/sim/RS-name", where x.x.x.x is the address of the HTTP listener holding the RS (the one you would use in loginuri). We denote this "RS-TP-URI" (as belonging to the RS).
There is also another URL used for the interdomain teleport to the RS, "http(s)://x.x.x.x:pppp/interdomain-teleport/RS-name" - we denote it "RS-ID-URI".
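As a rough illustration of the two URL shapes, the Python sketch below builds both from a listener base address and a sim name, and also shows one possible way to decide whether a string typed into the map search is a URL rather than a plain sim name. The function names, the percent-encoding of the sim name and the example address are my assumptions, not part of the prototype.

```python
from urllib.parse import quote, urlsplit


def sim_uris(listener_base, sim_name):
    """Build the two URLs of a sim from its HTTP listener base address."""
    base = listener_base.rstrip("/")
    name = quote(sim_name, safe="")  # assumption: the sim name is percent-encoded
    return {
        "RS-TP-URI": f"{base}/sim/{name}",
        "RS-ID-URI": f"{base}/interdomain-teleport/{name}",
    }


def is_uri_simname(text):
    """True if the text looks like an http(s) URL rather than a plain sim name."""
    parts = urlsplit(text.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# 192.0.2.1 is a documentation-only address; "Wonderland" is a made-up sim name.
uris = sim_uris("http://192.0.2.1:9000/", "Wonderland")
```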


The process is as follows.

1) The user brings up the "Map" window while being on LS, and in the "name" editor pastes the RS-TP-URI. The flag "ID teleport pending" for this client is reset to "false"

2) LS detects that it is a URI-based simname and does not do the local search; instead it proceeds to the next step.

3) LS fills in the XML data structure, containing the key of the user requesting teleport, RS-TP-URI, possibly the credentials from the user/source sim, and performs an HTTP POST request to the RS-TP-URI. This happens in the background while the user is waiting for the search result to pop up.

4) RS receives the request, verifies that its domain allows interdomain teleports, and fills in the reply - echoing the agentID and adding the following info: RS-ID-URI, RS's name/location/external endpoint, regionID, regionHandle, and also the cookie, which will be used subsequently to verify that the request is not spoofed. The reply can also include the map image of the sim for showing on the map.

5) LS checks the grid coordinates of the RS, and picks one of the two A51L's - the one that does not overlap with RS in grid coordinates. Let it be A51L1. After that, it formats the map block with the coordinates of A51L1, and sends it to the user. At the same time, the InterdomainTeleport structure is created, which holds the reply for further use, and the flag "ID teleport pending" for this client is set to true.

6) The user selects the desired location on the RS's picture, and clicks "teleport".

7) The regular teleport function checks whether the "ID teleport pending" flag is set. If it is set, some more info is stuffed into the "InterdomainTeleport" structure - namely, delegate functions for running before/after TP, position, and lookAt values.

8) The teleport proceeds as normal - note, that since we've specified the coordinates of A51L1, we're going to land there first.

9) There's a hook on the "UseCircuitCode" function, which checks whether there is a pending IDTP. It would be reached (I think :) once we have successfully landed on A51L1. If there is a pending IDTP, then control is passed to the IDTP module. (Also, since there can be more than one of these messages, make sure to react only to the first one.)

10) IDTP module calls the "before TP" delegate, and then fills in the TeleportRequest, which it submits via HTTP POST to the RS-TP-URI - we have received it in step 5. Also it sends the "Start teleport" to the client. The TeleportRequest is basically the same idea as in the gridmode/standalone mode - a request for the RS to start accepting the client packets.

11) RS checks the cookie, and if everything is ok, prepares itself, and sends the positive reply.

12) LS receives the reply, sends the "teleport" request to the client with the grid coordinates of the RS and its IP endpoint. Then it sends another POST request to the RS-TP-URI with the additional parameter "final" - which will not return until the RS sees the client (i.e. "UseCircuitCode" seen by the RS). In the prototype version it is just a delay.

13) When the reply from RS is received, the LS knows that the client is already hanging "one foot" on the other grid, so now is the time to clean up. LS triggers the "SimDisable" messages for the sims on the local grid, and (since they do not seem to do jack, apparently) also triggers the forceful disconnect of all the client's agent connections to the local grid.

14) At this stage the client is fully sitting on the remote grid, so the LS can delete the "InterdomainTeleport" structure and return to initial state.
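The bookkeeping in the steps above can be sketched in a few lines of Python. This is not the prototype code, just a toy in-memory model under my own assumptions: the class and method names are invented, the HTTP exchange is replaced by direct method calls, and the cookie is a random hex string. It shows the pending flag, the InterdomainTeleport structure, the "react only to the first UseCircuitCode" rule, and the cookie check on the RS.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class RemoteSim:
    """Toy stand-in for RS: issues a cookie (step 4) and checks it (step 11)."""

    def __init__(self, name, allow_interdomain=True):
        self.name = name
        self.allow_interdomain = allow_interdomain
        self._cookies = {}  # agent_id -> cookie issued at lookup time

    def lookup(self, agent_id):
        if not self.allow_interdomain:
            return None
        cookie = secrets.token_hex(16)
        self._cookies[agent_id] = cookie
        return {"agent_id": agent_id, "sim_name": self.name, "cookie": cookie}

    def teleport_request(self, agent_id, cookie):
        expected = self._cookies.get(agent_id)
        return expected is not None and secrets.compare_digest(expected, cookie)


@dataclass
class InterdomainTeleport:
    reply: dict  # the RS reply kept for later use (step 5)
    position: Optional[Tuple[float, float, float]] = None
    look_at: Optional[Tuple[float, float, float]] = None
    before_tp: Optional[Callable[[], None]] = None
    started: bool = False  # guards against duplicate UseCircuitCode


class LocalSim:
    """Toy stand-in for LS; an entry in `pending` plays the role of the flag."""

    def __init__(self):
        self.pending = {}

    def map_search(self, agent_id, remote):  # steps 1-5
        self.pending.pop(agent_id, None)  # flag reset to "false"
        reply = remote.lookup(agent_id)
        if reply is None or reply["agent_id"] != agent_id:
            return False
        self.pending[agent_id] = InterdomainTeleport(reply)
        return True

    def teleport(self, agent_id, position, look_at, before_tp=None):  # step 7
        idtp = self.pending.get(agent_id)
        if idtp is not None:
            idtp.position, idtp.look_at, idtp.before_tp = position, look_at, before_tp

    def on_use_circuit_code(self, agent_id, remote):  # steps 9-11
        idtp = self.pending.get(agent_id)
        if idtp is None or idtp.started:
            return None  # nothing pending, or a repeated message
        idtp.started = True
        if idtp.before_tp:
            idtp.before_tp()
        return remote.teleport_request(agent_id, idtp.reply["cookie"])

    def finish(self, agent_id):  # step 14
        self.pending.pop(agent_id, None)
```

Keeping the pending state keyed by agent means the regular teleport path stays untouched for everyone who has not done a URI-based map search.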


Some observations:

1) If you think that the function which is called "CloseConnection" closes the connection to the client - no way. It does not :)

2) There's no good infrastructure for doing REST-like talk. The built-in library is dead buggy on Mono - passes away after a few HTTP requests... So I had to hack around the ObjectPoster class by implementing a callback method (so 12 is a callback, and 13 is a callback upon the request done by the first callback :). Also made a couple of methods to simplify the work with the XML-encoded requests and replies.
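To show the shape of that callback chain, here is a small Python sketch. It is only an analogy for the approach described above, not the ObjectPoster hack itself: FakePoster is an invented stand-in that answers every request immediately with a canned reply, and the XML element names are assumptions.

```python
import xml.etree.ElementTree as ET


def encode_xml(root_tag, fields):
    """Encode a flat dict as a one-level XML document."""
    root = ET.Element(root_tag)
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def decode_xml(xml_text):
    """Decode a one-level XML document back into a dict of strings."""
    return {child.tag: child.text or "" for child in ET.fromstring(xml_text)}


class FakePoster:
    """Invented stand-in for an HTTP poster: logs the request, then calls back."""

    def __init__(self):
        self.log = []

    def post(self, uri, body, callback):
        self.log.append((uri, decode_xml(body)))
        callback(encode_xml("Reply", {"ok": "true"}))  # canned positive reply


def run_teleport(poster, uri, agent_id, cookie, events):
    def on_final_reply(xml_text):  # step 13: RS has seen the client
        if decode_xml(xml_text)["ok"] == "true":
            events.append("clean up local connections")

    def on_accept_reply(xml_text):  # step 12: RS accepted the request
        if decode_xml(xml_text)["ok"] != "true":
            events.append("teleport refused")
            return
        events.append("send teleport to client")
        final = {"agent_id": agent_id, "cookie": cookie, "final": "true"}
        poster.post(uri, encode_xml("TeleportRequest", final), on_final_reply)

    first = {"agent_id": agent_id, "cookie": cookie}
    poster.post(uri, encode_xml("TeleportRequest", first), on_accept_reply)


events = []
poster = FakePoster()
run_teleport(poster, "http://192.0.2.1:9000/sim/RS", "agent-1", "c00kie", events)
```

In this toy the callbacks run synchronously inside post(); with a real HTTP client they would fire later, when each reply arrives, but the nesting stays the same.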

Notes:

1) The "post + dual post" is deliberate. It allows hooking something else onto the GET handler (e.g. if the user just navigates to the RS-TP-URI with a browser), and will also allow "jumping out" of firewalled or non-interdomain-enabled sims, if necessary.

2) There may currently be a bug when the client jams into an already existing circuit - everything is normal, except you are walking in the middle of the sea. Since I finally managed to fully disconnect the client upon the teleport, normally it should be no big deal.

With my experimental code (no selection of A51, no access control check), I've fired up 3 standalone sims, and jumped around with 2 clients. Did around 10 hops or so - seems to work.

3) Physics when TP-ing can play funny tricks :) Be careful.

2 comments:

Anonymous said...

Short summary for those of us whom only understand real words: Yay!!! ? ;-)

Dalien said...

not yet :) Yay will be once it is committed and available in the stock distribution.