This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

Tuesday, December 28, 2004

First Entrant for $50 Million Dollar Orbital Prize

There is now an entrant for Bigelow's America's Space Prize $50 million dollar orbital competition:

Solar Skiff is a company, a craft, and a concept. It is a revolutionary idea to begin the earnest exploration of the Solar System with real spaceships manned by explorers and colonists from the planet Earth. The company, Solar Skiff, will help make this happen by manufacturing the spaceships to carry out the exploration - and eventual colonization - of the Solar System.

The immediate aim is to develop and fly a reliable, reusable single-stage-to-orbit spaceship. If this can be demonstrated by January 10, 2010, and no one else does it first, the company will win the $50 million America’s Space Prize being offered by Las Vegas businessman Robert Bigelow. The long-term goal is to develop a versatile spaceship and associated mission architecture enabling routine manned spaceflight throughout the Solar System.

Solar Skiff itself is a new type of reusable space vessel with an elliptical cross-section, pointed bow, and lifting body shape, utilizing free and abundant resources and solar energy in space to manufacture its own high-energy rocket propellants. The source of propellant used by Solar Skiff is ordinary space-storable water or ice, readily available throughout the Solar System. Simple electrolysis using focused energy from the Sun splits the water into liquid hydrogen fuel and liquid oxygen for upcoming velocity increments.

It's hard for me to tell how legit this entrant is; does anyone know if they might have the ability to actually compete in the America's Space Prize competition?

Getting Some Nookie on SpaceShipTwo?

Some more details emerge on SpaceShipTwo and what passengers might be allowed to do:

"Instead of shoulder harnesses and tight seatbelts we want this roller coaster-type bar that you fold out of the way and you can float around," [Burt] Rutan said. "We think that’s important. If you want the view, we have handles there so you can float over and put your nose right against your own window. "Or if you want to pull down your science tray and do whatever you brought along for an experiment, or play with your cat - you have bought the ride, you paid for it.

"This experience is going to have very few restrictions on what you can do because these payloads are doing it for fun and every person has a different idea of what fun is. "Does that mean that some guy and his girl might want to take the whole ship? OK!" [emphasis added]

In exchange for some of the extra altitude and about 30 seconds of weightlessness, passengers also may have the option of landing in a different place from where they took off. "The ship could launch not far from Las Vegas and land in Mojave," Rutan said. "Or, we could launch offshore, start out over the ocean and then… fly over the mountains and land in the desert.

How to Solve the Britney Problem

The "Britney Problem" is a common issue in P2P systems that use distributed hashtable (DHTs) to organize their namespaces. In many DHTs a given key, such as "britney.mpg", is mapped to a single node that will store this file and serve all requests to it. What occurs is that popular key's get mapped to single nodes, causing hotspots that get a tremendous number of requests and therefore can not serve all client peers interested in the file.

Paul Campbell recently posted a very interesting proposal for a solution to the Britney Problem in DHTs on the p2p-hackers list:

There's a simple solution to solve the "Britney Spears" problem in my opinion in the DHT model (randomized searches don't suffer from this problem as a result of their structural fault that they handle common files well but don't handle rare files well). Let's first recast your question in aneven worse situation.

Let's say that we directly implement an inverted index (keyword-->file) on the DHT. Even with common word elimination, this certainly and severely generates a very non-linear index space. But it is something that is very obvious and desirable since it also solves a major DHT and randomized P2Pdilema: how to quickly search for files by keyword.

So far, most proponents of DHT's believe that hashing is the answer. While hashing MASKS problems, it doesn't actually solve the problem. All solutions that I've seen appear to be pretty simple. The idea is to spread the location of a particular key out over a wider arc. Conceptually, one can spread it in multiple points (such as spreading it to equadistant points
around the ring) or on multiple hierarchal rings (like the distributed sloppy hash tables...DSHT's although that was done for delay purposes). But the concept is essentially the same.

The routing algorithm remains valid. Since the routing algorithm incrementally searches for the destination. It just remains to send a search that is in the form of incrementally refining an "area" to search in. As the searchrequest passes through potential nodes, the search area gradually narrows.

Storage is slightly more complicated. It is necessary to store a key/value pair in the appropriate AREA of the DHT. The storage targets the area and not the point. The above search mechanism (which has to be used to find the appropriate destination for a key/value pair anyways) also paradoxically
nails down the appropriate area for storage.

thecircle (from your terminology, it appears that's what you've been looking at) attaches a random bit string to the end of the hash to accomplish this and then spreads a file over a small area using that random hash (although a lot of this isn't actually implemented when you look at the code). The various "tree" designs out there just use the hash keys directly and split storage as necessary. I've seen a few designs that attempt to simply delete common files from the DHT and index those using a randomP2P structure (thus taking advantage of both designs).

What I've noticed in general is this. In order for the routing properties of a DHT to be preserved, it is necessary to maintain the node link structure. But that's not to say that you have to maintain the same structure at the key/value pair level of the system...that is, a direct
one-to-one mapping is necessary. The "skip graph" structures specifically acknowledge this. Simply substitute "key/value pair" for "DNS name" and you convert those structures to the idea I'm referring to.

In a similar way, one could simply store "first key/last key" values in addition to "node keys" and decouple the data structure from the DHT link structure. This is similar to index tabs in a rolodex...no matter how many cards are in the rolodex and no matter what the ordering, you maintain labels and you can approximately locate a card using the same search mechanism all the time.

As to the basic problem of "too many keys with the same value", this is also pretty easy to deal with. Extend the key/value pairs so that for certain operations (searching), you use just the key as before. However, optionally treat the keys as "key+value" meta-keys. For instance, when storing a new key/value pair, use a "search-store" function that treats
the key as "key"+"value" and the value as just "value". Then no matter how identical keys there are, there is still a valid ordering and the number of keys with the same values are no longer a limiting factor. Of course, if your system doesn't require a total ordering on all keys and values (the node balancing algorithms more or less do require this at least right now), then you can dispense with the "key+value" idea and simply ignore this issue.

The one remaining problem is node balancing. The mechanisms that are already published seem to be pretty uniform in design. Nodes that undertake rebalancing can do so using one of two mechanisms. Either an empty node (that may have to be created by dumping it's current payload) is moved into position on the ring, or else incremental data movement is undertaken. Since the former incurs a much higher communication cost, the latter is given
preference as much as possible. However, it has been shown that both mechansims are necessary for load balance to remain bounded.



This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]