This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

Wednesday, March 03, 2004

The following is in response to Daniel Brookshier's blog entry:

Since you are the one who created the Binary ID work for JXTA, it is cool to see you lay out your full vision for that code.

I also didn't know that the JXTA team is focusing on optimizing search for pipe endpoints rather than search for advertisements. This is important to know. For example, when designing a JXTA-based P2P app, such as a content management system, I would ask myself where to store and search for the files for a particular content site. I might store the actual content inside of JXTA advertisements, and then use JXTA's Rendezvous or Resolver systems to search for those bits of content. However, if the JXTA team is optimizing pipes, then it makes more sense to advertise the existence of that content in an advertisement along with a pipe endpoint for reaching it, but not to put the actual content into the ad. Instead, I find the advertisement that says where the content lives, then open a pipe to the peer that actually has the content.

This is interesting, because it makes JXTA more about service discovery than about being a place to actually implement your services. JXTA becomes more of a directory service, where I search for what I need and find pipes to actually request the service, rather than a place to implement the service itself, such as by putting instant messages between two peers into advertisements that are found using the Resolver service.

Does this hold in general for most JXTA applications? One of my interests is in vastly simplifying JXTA. If this is true, then we can use the JXTA framework mostly as a distributed directory service + firewall NAT traversal system, which is what P2P Sockets does (http://p2psockets.jxta.org). In P2P Sockets, once you resolve a well-defined endpoint (using your Binary ID work), you get back a pipe to communicate with that endpoint (made to look like a standard socket). Publishing a service is exactly the same, but looks like a standard server socket. The main issue at this time is that these "friendly" service endpoints, such as the email address you give in your article or "www.nike.laborpolicy" are easily spoofed. This might possibly be an unsolvable problem if we want the distributed directory service to be secure, decentralized, and easy to use for end-users.

I also read your blog entry comparing JXTA to the Apple II. I think JXTA is more like the Commodore 64 or Tandy Color Computer; it shows the way to an exciting new way of building applications and using computers, but it is missing something that can take it mainstream like the Apple II or IBM PC did. I think part of the problem is that JXTA is simply too complex for developers to use; it is also too complex for other developers to reimplement in other computer languages, which is keeping it from becoming a standard. Both RSS and XML were relatively easy for developers to reimplement in other languages and systems, which aided their growth. You could build an XML or RSS system, or incorporate either one into an existing system, in anywhere from a weekend to several weeks. Compare that to JXTA; the JXTA-C team is still struggling to just get the TCP endpoints working, never mind having C-based rendezvous or relay peers. JXTA is more right than any other P2P framework so far, but it isn't right enough to go mainstream yet. I don't know what the answer is; P2P Sockets is simply one exploration of a possible solution, but it still falls short.

You also discuss ad-hoc discovery in your article. The funny thing in JXTA is that everything, even a Binary ID, is really a runtime "late binding" search for an ID. Everything in JXTA is really ad-hoc. That is how it can be dynamic and recover from failures or changes, because every time you "bind" to a resource you are really sending out a search that might return a different way of connecting to that resource.
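
To make the late-binding idea concrete, here is a minimal sketch in Java. It is not JXTA code: LateBindingDirectory, advertise, withdraw, and bind are hypothetical names, and an in-memory map stands in for the network search. The only point is that bind runs a fresh lookup every time, so the answer can change between calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a discovery service; this is NOT a JXTA API.
class LateBindingDirectory {
    // Well-known name -> endpoints currently advertising that name.
    private final Map<String, List<String>> advertised = new HashMap<>();

    // A peer announces that it can be reached for this name.
    void advertise(String name, String endpoint) {
        advertised.computeIfAbsent(name, k -> new ArrayList<>()).add(endpoint);
    }

    // A peer goes away or stops serving this name.
    void withdraw(String name, String endpoint) {
        List<String> endpoints = advertised.get(name);
        if (endpoints != null) {
            endpoints.remove(endpoint);
        }
    }

    // Every bind is a fresh search; nothing is remembered from earlier calls.
    Optional<String> bind(String name) {
        List<String> endpoints = advertised.getOrDefault(name, Collections.emptyList());
        return endpoints.isEmpty() ? Optional.empty() : Optional.of(endpoints.get(0));
    }
}
```

If the peer behind the first endpoint withdraws, the next bind for the same name simply returns whichever endpoint is still advertised, which is the recovery behavior described above.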

I actually think well-known IDs are one of the pieces we need for vastly more usable P2P frameworks. It is much easier to write a P2P app that deals with human-friendly strings (and much easier for end-users) than with 128-bit GUIDs.

You also talk about Jini-like services for JXTA. If you look closely at P2P Sockets, that is kind of what we are trying to do, but using a service-description language that already exists: the HTTP-based web. In the REST philosophy, or for web sites in general, we can "treat" any resource using a limited set of verbs: Get, Put, Post, etc. In essence, we can treat any arbitrary resource the same way Unix does, as a file descriptor. Even if that thing on the other side is actually an object-oriented database, a camera running an embedded web server, or an administration console, we can Get from it, Post to it, etc. We have a baseline abstract way of dealing with it. Why shouldn't we be able to do the same thing with P2P networks? Why should we have to delve into JXTA's overly complex ModuleClassAdvertisement, ModuleSpecAdvertisement, etc.? Why should we have to deal with Jini's mobile code? Why can't we simply have a collection of easy-to-contact endpoints, with domain names or easy names just like web resources, which can be contacted and spoken to with HTTP? Under the covers that HTTP may actually be going over JXTA pipes and traversing firewalls, but at the end of the day it looks like a stream. I get the resource using the well-known name, then start talking to the resource. I first talk to it using HTTP verbs; I could be Getting a file from that peer, getting a web page, getting the results of some computation running on an embedded device, whatever. If I need to get more complex, I can start using XML-RPC or running things that look more like servlets that I Post to. This comes to the other side: where I implement my service. Why not use a technology that already exists for implementing services, called Servlets? A servlet simply takes a request, handles it, then gives a response; it doesn't care whether that request comes in from a peer-to-peer mesh or from a standard TCP HTTP connection.
P2P Sockets includes a servlet engine that has been tricked into receiving all of its requests from, and sending all of its responses to, peers on a peer network. You publish yourself to an endpoint, such as "www.nike.laborpolicy", and build a servlet that receives a request, such as "GET /somefile.txt HTTP/1.0" addressed to "www.nike.laborpolicy", does something with it, then sends the response back to the original peer. If several peers are servicing that name and can receive requests for "www.nike.laborpolicy", then you've now created something like a JXTA Peer Group Service, without having to read a book about JXTA (you just have to know how to work with standard Jetty and servlets).
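
Here is a rough sketch in Java of the shape I mean. It deliberately avoids the real Servlet and P2P Sockets APIs: RequestHandler and NamedServiceGroup are hypothetical names, and the round-robin choice among peers is my own assumption for illustration, not how JXTA picks a peer. What it shows is that the handler only sees a request and returns a response, and that several peers can publish themselves under the same well-known name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical servlet-like contract: request in, response out, no transport details.
interface RequestHandler {
    String handle(String method, String path);
}

// Hypothetical registry mapping a well-known name to the peers serving it.
class NamedServiceGroup {
    private final Map<String, List<RequestHandler>> peersByName = new HashMap<>();
    private int next = 0;

    // A peer publishes itself under a friendly name such as "www.nike.laborpolicy".
    void publish(String name, RequestHandler peer) {
        peersByName.computeIfAbsent(name, k -> new ArrayList<>()).add(peer);
    }

    // Picks one of the peers serving the name (round-robin is an assumption here)
    // and hands it the request; the handler never learns how the request arrived.
    String dispatch(String name, String method, String path) {
        List<RequestHandler> peers = peersByName.get(name);
        if (peers == null || peers.isEmpty()) {
            return "404 no peer serves " + name;
        }
        RequestHandler peer = peers.get(Math.floorMod(next++, peers.size()));
        return peer.handle(method, path);
    }
}
```

For example, two peers could each call publish("www.nike.laborpolicy", handler), and successive dispatch calls for that name would then be answered by either of them.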

So the web itself becomes our abstraction layer that programmers "write" to, underneath which hides a P2P network backed by things like your binary ID work, Mohammed's JxtaServerSocket, and JXTA.

The last thing we need is to actually return search to the mix. Just as a directory service can support retrieving something by its well-known ID, such as "someone@someemail.com", it can also support finding something by attribute, as LDAP does. So finding something indirectly by "attribute"/metadata or content is also needed, which JXTA doesn't do as well. We can also hide this and make it look like the existing web.

One way to do this is to reuse a concept that programmers already understand: search engines. Why can't we define well-known endpoints that look like domain names, such as "www.jazzmusic.search", to which we can POST a search request and receive back an XML or HTML list of "endpoints" (i.e. other P2P Sockets domain names) that have the requested search values? Under the covers, the P2P Sockets framework would know that any endpoint ending in ".search" actually means to use JXTA's Rendezvous or Resolver functionality to search for advertisements that have the metadata you want in the Post request.
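
The suffix rule could be as small as the following Java sketch. None of this exists in P2P Sockets today; EndpointRouter, Route, and routeFor are hypothetical names, and I am assuming the host arrives without a port and that matching is case-insensitive.

```java
import java.util.Locale;

// Hypothetical routing rule; not part of the current P2P Sockets code.
class EndpointRouter {
    // SEARCH: hand the request to an advertisement search.
    // DIRECT: resolve the name and open a pipe to it as usual.
    enum Route { SEARCH, DIRECT }

    // Assumes a bare host name (no port); the comparison ignores case.
    static Route routeFor(String host) {
        String normalized = host.toLowerCase(Locale.ROOT);
        return normalized.endsWith(".search") ? Route.SEARCH : Route.DIRECT;
    }
}
```

With this rule, "www.jazzmusic.search" would be routed to a search, while "www.nike.laborpolicy" would be resolved and contacted directly.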

For example, a future version of Paper Airplane (http://www.paperairplane.us), which is built on P2P Sockets, will have something called the Paper Airplane Directory. This is a simple, Yahoo-style hierarchical directory of available Paper Airplane Groups, which are basically just web sites with web-style domain names, such as "www.boobah.cat". This directory will look just like a search engine that you can Post metadata to and search for metadata on. To indicate that you have a Paper Airplane Group that you just created for the category "Politics" and the sub-category "Campaign Finance Reform", you might send a standard HTTP Post request to "www.paperairplane.directory/politics/campaign_finance_reform?site=www.campaign.reform". If you want to see what sites are in this category, you would send an HTTP Get request to "www.paperairplane.directory/politics/campaign_finance_reform".
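
As a sketch of how those two requests might be interpreted, here is a small Java example. The directory is not coded yet, so everything here is an assumption: DirectorySketch, post, and get are hypothetical names, and an in-memory map stands in for whatever the real directory would publish to the network.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the directory; the real one is not written yet.
class DirectorySketch {
    // Category path (e.g. "/politics/campaign_finance_reform") -> registered sites.
    private final Map<String, List<String>> sitesByPath = new LinkedHashMap<>();

    // Handles a Post target like "/politics/campaign_finance_reform?site=www.campaign.reform".
    void post(String target) {
        int q = target.indexOf('?');
        if (q < 0) {
            throw new IllegalArgumentException("missing query string");
        }
        String path = target.substring(0, q);
        String site = null;
        for (String pair : target.substring(q + 1).split("&")) {
            if (pair.startsWith("site=")) {
                site = pair.substring("site=".length());
            }
        }
        if (site == null || site.isEmpty()) {
            throw new IllegalArgumentException("missing site parameter");
        }
        sitesByPath.computeIfAbsent(path, k -> new ArrayList<>()).add(site);
    }

    // Handles a Get for a category path: returns the sites registered under it.
    List<String> get(String path) {
        return Collections.unmodifiableList(
                sitesByPath.getOrDefault(path, Collections.emptyList()));
    }
}
```

Posting the example target above and then calling get("/politics/campaign_finance_reform") would return a list containing "www.campaign.reform".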

Here's the cool part. Every peer in the Paper Airplane peer group is running the modified servlet engine and can receive requests for "www.paperairplane.directory" (i.e. it is a JXTA peer group level service). When it receives a Get or Post request, it has to actually talk below the P2P Sockets framework to JXTA and do a search request to JXTA Rendezvous peers to find advertisements for Paper Airplane Groups that have <category>Politics</category> and <subcategory>Campaign Finance Reform</subcategory> tags. Once it gets enough of them, it can use a standard JSP or servlet to process them together and spit out HTML or XML back to the "peer" that contacted this service. Since the original peer "contacted" this "website" through a browser that was configured to use the P2P Sockets P2P to Web Proxy, a local proxy that tricks browsers into thinking they are talking to normal web sites when in fact they are talking on the JXTA network, the original peer simply receives back their HTTP response as a nice web page they can display to the user, which then looks just like a standard Yahoo style page.
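
The peer-side step (match advertisements by their category tags, then render a page) might look roughly like this in Java. GroupAd and DirectoryPage are hypothetical names, the list passed to matching stands in for advertisements that a Rendezvous search would return, and plain string building stands in for the JSP or servlet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified view of a Paper Airplane Group advertisement.
class GroupAd {
    final String site;
    final String category;
    final String subcategory;

    GroupAd(String site, String category, String subcategory) {
        this.site = site;
        this.category = category;
        this.subcategory = subcategory;
    }
}

// Hypothetical helper; a real peer would get the ads from a JXTA search.
class DirectoryPage {
    // Keeps only the ads whose category and subcategory tags both match.
    static List<GroupAd> matching(List<GroupAd> ads, String category, String subcategory) {
        List<GroupAd> hits = new ArrayList<>();
        for (GroupAd ad : ads) {
            if (ad.category.equals(category) && ad.subcategory.equals(subcategory)) {
                hits.add(ad);
            }
        }
        return hits;
    }

    // Minimal HTML escaping so site names cannot inject markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Renders the matching sites as a simple HTML list.
    static String render(String title, List<GroupAd> ads) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body><h1>").append(escape(title)).append("</h1><ul>");
        for (GroupAd ad : ads) {
            html.append("<li>").append(escape(ad.site)).append("</li>");
        }
        html.append("</ul></body></html>");
        return html.toString();
    }
}
```

The string returned by render is what would go back through the proxy to the browser as the body of the HTTP response.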

So it breaks down to this. HTTP becomes our universal way to contact services and provides a low-level way to "talk" to any service; things that look like domain names become our standard way to actually contact these services; and HTML/Mozilla XUL becomes our universal, easy-to-use UI language. You can now build P2P applications using the browser on the client side (though with P2P Sockets locally installed to intercept requests) using all your client-side knowledge and use servlets on the "server"/service side to handle requests. Underneath it all we use JXTA for the P2P primitives we need.

Basically, what you have is something that looks like Universal Plug and Play (HTTP + HTML + SOAP), which is a competitor to Jini, but which doesn't require learning very many new things and which is actually a P2P framework.

What do you think? The whole search engine portion is not coded yet; I feel like it could be made simpler and more universal.
