Coding In Paradise: Tuesday, April 25, 2006

This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

Tuesday, April 25, 2006

Now in a Browser Near You: Offline Access and Permanent, Client-Side Storage, Thanks to Dojo.Storage

[Note: This blog post is out of date. For up to date information on Dojo Offline, Dojo Storage, and Moxie please see the official web page.]

I'm proud to announce the immediate availability of dojo.storage and a new web-based editor named Moxie.

Imagine if web applications could store megabytes of data on the client-side, in the browser, both persistently and securely. No server needed.

Imagine if web applications could work offline with the click of a button. Want to access your web based word processor when you are not on the network, with your private files stored privately, right on your own machine and not on some server? Now you can.

Even better, imagine if all of this worked across the existing web; 95% of the existing browsers on the web could start using these features right now, with no software installs or funky new browsers.

What could you build if you had these tools? How about a truly collaborative, web-based word processor with client-side storage for your private documents, as well as offline access? Maybe an Ajax RSS aggregator with client-side caching of the feeds you read and offline access? An offline, web-based book reader using data from the Internet Archive's Open Library would be cool.

This is the vision, and now the reality, of dojo.storage.

For years I've been tinkering and prototyping to see whether it was possible to achieve both persistent client-side storage and offline access across the existing installed browser base, cross-platform and cross-browser. There were many times I thought it was impossible, but I kept at it because it was an obsession I wanted to see exist. For the last eight months I have worked on it at least two or three days a week, on my off days when I am not working for my consulting clients. Also, a few months ago, Julien Couvreur had a breakthrough when he figured out the HTTP header to send from a server to achieve offline access.

I've finally finished; dojo.storage is now done, in beta, and fully open source under a business-friendly BSD license.

I've put together an example application to show dojo.storage called Moxie, a web-based word processor with persistent, client-side storage, no server, and full offline access. It is open source and is in the Dojo repository. It works across the big three browsers: Internet Explorer 6+, Firefox, and Safari.

Take it for a spin. Use the rich editing control to enter a document, type in a file name, and select the Save button. If the document is above 100K, you will be prompted on whether to accept this storage request; users can deny storage requests above 100K. Choose the file from the Load File pull down to reload the file, even if you have closed the browser. Follow the directions on the page on how to do offline access. I've done a bunch of QA around all of this, but the dojo.storage framework I created that powers this is still in beta; if you run into a bug or problem, email me at bkn3@columbia.edu with the details.

Moxie was put together in one day using the power of the Dojo framework; it uses Dojo Widgets (the rich editing control); Dojo Events; and the newest member of the Dojo team, Dojo Storage.

What is dojo.storage? Dojo.storage is a unified API to provide JavaScript applications with storage. It is a generic front-end to be able to provide all JavaScript applications a consistent API for their storage needs, whether this JavaScript is in a browser, a Firefox plugin, an ActiveX control, using Windows Scripting Host, etc. Further, the storage backends can use whatever mechanism is appropriate; dojo.storage automagically detects its environment and available storage options and selects the most appropriate one.

You might have heard about AMASS, the Ajax MAssive Storage System, and are wondering about the relationship of AMASS and dojo.storage. AMASS was a proof-of-concept prototype of Flash based storage I released in October 2005. It was not widely QAed and only worked in Firefox and IE. Dojo.storage now supersedes and replaces AMASS.

The dojo.storage architecture is simple; a JavaScript application interacts with the Dojo Storage Manager, which selects the best available Storage Provider and makes it available. Storage Providers implement a generic interface, which makes the underlying storage system look like a simple hash table that can be saved to and loaded from. Storage providers can optionally be persistent, and can provide metadata about their capabilities (isPersistent, hasSettingsUI, getMaximumSize, etc.).

Some possible storage providers:

Cookie Storage Provider - uses cookies to persist the hash table
Flash Storage Provider - uses Flash's SharedObjects to persist data
ActiveX Storage Provider - uses COM's File APIs to persist data
XPCOM Storage Provider - uses XPCOM's File APIs to persist data
Form Storage Provider - uses the text autosave features of a hidden form to save transient data (the Really Simple History library uses this trick)
WHAT WG Storage Provider - uses native browser persistence to store data, as defined by the WHAT Working Group.
IE Storage Provider - uses IE's proprietary abilities to store up to 60K of data.

Right now the dojo.storage system includes one provider, the Flash Storage Provider; others are not implemented yet but should be straightforward if folks want to contribute.
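To make the manager/provider split concrete, here is a minimal sketch of the idea. It is not the dojo.storage source: MemoryProvider, UnavailableProvider, StorageManager, isAvailable, and selectProvider are hypothetical names I made up for illustration, and only isPersistent and getMaximumSize echo the metadata methods mentioned above.

```javascript
// Hypothetical sketch of the manager/provider split; not the dojo.storage source.
// A provider exposes hash-table style put/get plus capability metadata.
function MemoryProvider() {
  this._data = {};
}
MemoryProvider.prototype = {
  isAvailable: function () { return true; },   // can this provider run here?
  isPersistent: function () { return false; }, // data is lost on page unload
  getMaximumSize: function () { return -1; },  // -1 means "no fixed limit" (assumed convention)
  put: function (key, value) {
    // Prefix keys so names like "toString" never collide with built-ins.
    this._data["$" + key] = value;
  },
  get: function (key) {
    var k = "$" + key;
    return Object.prototype.hasOwnProperty.call(this._data, k) ? this._data[k] : null;
  }
};

// A stand-in for a provider whose backing technology is missing.
function UnavailableProvider() {}
UnavailableProvider.prototype = {
  isAvailable: function () { return false; },
  isPersistent: function () { return true; },
  getMaximumSize: function () { return 0; },
  put: function () { throw new Error("provider not available"); },
  get: function () { throw new Error("provider not available"); }
};

// The manager walks its providers in priority order and picks the first usable one.
function StorageManager(providers) {
  this.providers = providers;
}
StorageManager.prototype.selectProvider = function () {
  for (var i = 0; i < this.providers.length; i++) {
    if (this.providers[i].isAvailable()) {
      return this.providers[i];
    }
  }
  return null;
};

var manager = new StorageManager([new UnavailableProvider(), new MemoryProvider()]);
var storage = manager.selectProvider();
storage.put("greeting", "hello");
```

Application code only ever touches the object returned by the manager, which is why swapping in a different provider later should not require changes to the application.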

The Flash Storage Provider uses features available since Flash 6, including Flash's SharedObjects.

Why Flash? Flash now has a greater installed base than Internet Explorer; Flash 6+ has a 97.1% penetration across the installed base of PCs, while IE 5, 6, and 7 have 64.7%. Flash is probably one of the most installed pieces of software on the planet.

The Flash Storage Provider uses Flash as a hidden runtime to extend the browser, because of its ubiquity and cross-platform/cross-browser qualities. When browsers get native persistence support, your dojo.storage applications will continue to run since you write them against the generic dojo.storage APIs.

I'm going to quickly take you through Moxie, to show you how to use dojo.storage to build applications.

Moxie has two files: editor.html and editor.js. It has a singleton JS object named Moxie, which you can see if you look at the editor.js file.

Its HTML is straightforward, except for one thing: we use the Dojo Editor Widget, which we specify in HTML as follows:

<div id="storageValue" dojoType="Editor"
       items="textGroup;;blockGroup;;
           justifyGroup;;colorGroup;;
           listGroup;;indentGroup;;
           linkGroup;">
  Click Here to Begin Editing
</div>

The items attribute tells the Editor widget what toolbars we want visible, and the ;; between them says to put a separator between them.

The next big step, inside editor.js, is to import our packages, including dojo.storage; this is at the top of the file:

dojo.require("dojo.dom");
dojo.require("dojo.event.*");
dojo.require("dojo.html");
dojo.require("dojo.fx.*");
dojo.require("dojo.widget.Editor");
dojo.require("dojo.storage.*");

Next, at the bottom of the file, we subscribe to find out when the storage system is ready for us to work with it; we cannot load or save values until the underlying storage provider is ready. In this case we want to call Moxie.initialize when storage is ready; if, on the off chance, it is already ready when we get to this code block, then we want to wait until the page is fully loaded before working with it:

if(dojo.storage.manager.isInitialized() == false){
  dojo.event.connect(dojo.storage.manager,
                     "loaded", Moxie,
                     Moxie.initialize);
}else{
  dojo.event.connect(dojo, "loaded",
                     Moxie,
                     Moxie.initialize);
}

When we are all loaded up, we can start to play with the storage system. For example, to load a value that was previously saved with some key, we just do the following:

var results = dojo.storage.get(key);

To save some value:

try{
  dojo.storage.put(key, value,
                   saveHandler);
}catch(exp){
  alert(exp);
}

Value can be a string or even a complicated JS object; internally we serialize everything to JSON before storing it as a flat string, and turn it back into an object on later retrieval. Dojo.storage also detects when it is working with a plain string and bypasses JSON entirely, avoiding the performance hit of eval(); this gives much better performance when storing large strings, such as XML files or digital books, as you will see in a demo later in this post.
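Here is a rough sketch of what that string shortcut could look like. This is not how dojo.storage does it internally; the one-character "s" and "j" tags and the serialize/deserialize names are assumptions for illustration, and the built-in JSON object stands in for whatever JSON encoder the library actually uses.

```javascript
// Hypothetical sketch: skip JSON for plain strings. The one-character tags
// ("s" and "j") are an illustration only, not dojo.storage's actual format.
function serialize(value) {
  if (typeof value === "string") {
    return "s" + value;               // stored verbatim, no JSON escaping pass
  }
  return "j" + JSON.stringify(value); // everything else goes through JSON
}

function deserialize(stored) {
  var tag = stored.charAt(0);
  var body = stored.substring(1);
  if (tag === "s") {
    return body;                      // no parsing needed for raw strings
  }
  return JSON.parse(body);
}

var xml = "<feed><title>Coding In Paradise</title></feed>";
var roundTrippedXml = deserialize(serialize(xml));
var roundTrippedObj = deserialize(serialize({ title: "Faust", pages: [1, 2, 3] }));
```

The point of the tag is that a large XML document or book never passes through an escape-and-parse cycle; only structured objects pay that cost.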

Notice the saveHandler above; a storage system can optionally ask the user if they want to accept or deny your storage request, so you must be ready for a save request to fail.

saveHandler is a callback function that receives two arguments. The first is the status, which can be one of three values:

dojo.storage.SUCCESS - Saving was successful
dojo.storage.FAILED - User denied storage request
dojo.storage.PENDING - The user is being prompted with some UI on whether to approve this storage request

The second argument is the keyName that is being saved; since saving is asynchronous, it is sometimes useful to know which key is being worked with on the callback.

The Flash Storage Provider pops up an underlying Flash storage dialog to the user after 100K, and at every order of magnitude increase after that.

Here's an example saveHandler:

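var saveHandler = function(status, keyName){
  if(status == dojo.storage.PENDING){
     // the user is being prompted; wait for the next callback
  }else if(status == dojo.storage.FAILED){
     // the user denied the request; keyName was not saved
  }else if(status == dojo.storage.SUCCESS){
     // keyName was saved
  }
};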

try{
  dojo.storage.put(key, value,
                   saveHandler);
}catch(exp){
  alert(exp);
}
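If you want to see the order in which a handler gets called without Flash in the picture, here is a small stand-in. Everything about mockStorage is an assumption for illustration: the status strings, the promptThreshold and userApproves fields, and the fact that it calls the handler synchronously (the real provider is asynchronous and shows a Flash dialog).

```javascript
// Hypothetical stand-in for a storage provider that prompts above a quota.
// The constant names mirror the statuses above; everything else is assumed.
var mockStorage = {
  SUCCESS: "success",
  FAILED: "failed",
  PENDING: "pending",
  promptThreshold: 100 * 1024, // prompt once a value exceeds 100K characters
  userApproves: true,          // simulates the user's answer to the dialog
  _data: {},

  put: function (key, value, resultsHandler) {
    var text = String(value);
    if (text.length > this.promptThreshold) {
      resultsHandler(this.PENDING, key); // a dialog would be showing now
      if (!this.userApproves) {
        resultsHandler(this.FAILED, key);
        return;
      }
    }
    this._data["$" + key] = text;
    resultsHandler(this.SUCCESS, key);
  }
};

// Record every callback so the sequence can be inspected afterwards.
var log = [];
function recordStatus(status, keyName) {
  log.push(keyName + ":" + status);
}

// Build a string of n copies of ch.
function repeat(ch, n) {
  return new Array(n + 1).join(ch);
}

mockStorage.put("small", "hello", recordStatus);
mockStorage.put("big", repeat("x", 200 * 1024), recordStatus);
mockStorage.userApproves = false;
mockStorage.put("denied", repeat("x", 200 * 1024), recordStatus);
```

A small save goes straight to SUCCESS, while a large one reports PENDING first and then either SUCCESS or FAILED, which is why a handler has to cope with being called more than once for the same key.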

There is more to the code, but those are the important bits in terms of understanding dojo.storage. It's pretty straightforward to use.

I'll briefly describe dojo.flash here, which is the layer that separates out JS and Flash communication. Cross-browser, fast, reliable JS+Flash communication is really hard and ugly, so I encapsulated this portion out into its own layer. The great thing is you don't have to know any of this externally; dojo.flash and dojo.storage work together to figure out your Flash capabilities, and use the appropriate mechanism internally. Zero hassle.

Dojo.flash provides several major services:

dojo.flash.Info - Is Flash available + what version of Flash?
dojo.flash.Embed - Embeds Flash into page for Flash+JS communication
dojo.flash.Communicator - Provides uniform, fast, reliable, JS + Flash communication
dojo.flash.Install - Uniform installation and upgrading of Flash

dojo.flash.Communicator was the real doozy to create; it was a pain in the butt, to put it lightly. This area of the system provides a method abstraction between Flash + JS. For example, JavaScript can call sayHello(), which is some Flash method, while Flash can execute DojoExternalInterface.call("dojo.storage.save", resultsHandler) to run some JS method.

The heart is something called DojoExternalInterface, which is a backport of Flash 8's ExternalInterface to Flash 6. I didn't want to create this backport, but the complexity of handling all the internal tricks I was doing to make this stuff work required that I wrap this magic in some kind of maintainable API. DojoExternalInterface makes it possible to register Flash methods that can be called from JavaScript:

DojoExternalInterface.initialize();
DojoExternalInterface.addCallback("put",
                                 this, put);
DojoExternalInterface.addCallback("get",
                                 this, get);
DojoExternalInterface.addCallback("remove",
                                 this,
                                 remove);
DojoExternalInterface.loaded();

There are three ways to do Flash+JS communication:

1. Using LiveConnect/ActiveX + fscommands - Flash 6
   Pro: Extremely fast, can send very large data, mature
   Con: Only works on IE and Firefox
2. Using ExternalInterface - Flash 8
   Pro: Easy to use, works on Safari
   Con: Unbelievably slow, performance degrades O(n^2), serious serialization bugs
3. Using getURL/LocalConnection/a new Flash object for each call - Flash 7
   Pro: Very cross-platform
   Con: Destroys history, serious data size limitations and performance issues

Only methods 1 and 2 are acceptable for the Flash Storage Provider's needs. It turns out that we use method 1 for IE and Firefox, and method 2, ExternalInterface, for Safari. I found workarounds to fix ExternalInterface's serious bugs, so that it works on Safari and is fast and serializes correctly, but these workarounds only work on Safari.

The Flash 8 communication support for Safari took 3 months to figure out; pain in the behind. I won't go into detail on the method and workarounds, but here is a bit of info on the workarounds:

We chunk method call data into many small calls through ExternalInterface, which makes performance linear rather than the O(n^2) of the standard ExternalInterface.
I used a debugger to find hidden JS serialization methods used by the Flash plugin, which internally uses XML serialization and makes all sorts of mistakes when serializing characters; for example, it doesn't escape characters, and it also uses eval(), so it's slow as dirt. I found a way to bypass this internal serialization, do it all manually, and make the method call myself using an undocumented JS function.
I now double encode and decode all XML characters on both sides:
& --> && This is very important for persisting XML, and I did lots of testing around this to catch every single problem character (including nulls, for example)

With these workarounds, performance and reliability on Safari are great. Unfortunately, these workarounds only work on Safari, so we can't use ExternalInterface cross browser.
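The chunking idea can be sketched on its own, without any Flash involved. This is not the dojo.flash code: splitIntoChunks, sendInChunks, and receiveChunk are hypothetical names, and the sendChunk callback merely stands in for one small call across the JS/Flash bridge. The reasoning is that if the cost of a single call grows with the square of its size, then n characters sent in fixed chunks of size c cost roughly (n/c) * c^2 = n * c, which is linear in n.

```javascript
// Hypothetical sketch of chunked transfer; sendChunk stands in for one small
// call across the JS/Flash bridge and is not a real Flash or Dojo API.
function splitIntoChunks(text, chunkSize) {
  var chunks = [];
  for (var i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.substring(i, i + chunkSize));
  }
  return chunks;
}

// Send each chunk along with its position and the total chunk count.
function sendInChunks(text, chunkSize, sendChunk) {
  var chunks = splitIntoChunks(text, chunkSize);
  for (var i = 0; i < chunks.length; i++) {
    sendChunk(chunks[i], i, chunks.length);
  }
  return chunks.length;
}

// Receiving side: collect the pieces by index and join them once at the end.
var received = [];
function receiveChunk(chunk, index, total) {
  received[index] = chunk;
}

var payload = "<book>" + new Array(1001).join("Faust & Mephistopheles. ") + "</book>";
var callCount = sendInChunks(payload, 1024, receiveChunk);
var reassembled = received.join("");
```

Joining once at the end, rather than appending to a growing string on every call, keeps the receiving side from reintroducing the same quadratic behavior.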

Check out this testing page, which provides a lower-level interface to dojo.storage; there are quick links on the left to save an entire book into the storage system, in this case Faust by Goethe at 250K, courtesy of Project Gutenberg; there is also a quick link to save an example RSS XML feed into storage, in this case an Atom feed from my weblog. Try this on Safari; in the past, saving the book took minutes and froze the machine, and now it takes seconds.

I won't go into the Flash 6 communication necessary for IE and Firefox support; there were lots of fun different things that had to be figured out, like how to center the Flash dialog even if you are across many different kinds of HTML doctypes, browsers, and platforms, or isolating timing issues like the fact that the fscommand infrastructure on some versions of IE comes up after your Flash applet is already running.

dojo.flash was really hard to make, but I tried to isolate the pain into that module so that it is encapsulated and ready for folks to use. Externally it's easy, internally it was hell.

Dojo.storage has had lots of what I call poor man's QA testing, which means I went to coffee shops and Kinkos, paid them money, and used their machines to test all of this code; I also begged random people to go to the pages and watched how it worked. I've spent a bunch of money and done a lot of QA testing, but this is still beta; if you find a bug or see something weird, email me your browser and platform details at bkn3@columbia.edu.

For offline, Julien Couvreur discovered the necessary HTTP headers. The core part of it is to use HTTP caching; your site must have the following kinds of HTTP headers on:

ETag
Last-Modified
Expires
Cache-Control

ETags and Last-Modified are on by default on Apache 2.0; the rest have to be turned on in httpd.conf with mod_expires:

LoadModule expires_module modules/mod_expires.so

<Directory "c:/dev/dojo/">
  ExpiresActive On
  ExpiresDefault "access plus 1 month"
</Directory>
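If you are not on Apache, the same four headers can be produced by hand. The following is only a sketch under assumptions: buildCacheHeaders and its parameters are hypothetical, the weak ETag built from size and modification time is one common convention rather than a requirement, and thirty days is used as a rough stand-in for the one month in the config above.

```javascript
// Hypothetical helper: build caching headers for a static file by hand.
// The header names are standard HTTP; the function and its inputs are assumed.
function buildCacheHeaders(lastModified, sizeInBytes, now, maxAgeSeconds) {
  var expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    // A weak validator derived from file size and modification time.
    "ETag": 'W/"' + sizeInBytes.toString(16) + "-" +
            lastModified.getTime().toString(16) + '"',
    "Last-Modified": lastModified.toUTCString(),
    "Expires": expires.toUTCString(),
    "Cache-Control": "max-age=" + maxAgeSeconds
  };
}

var thirtyDays = 30 * 24 * 60 * 60; // in seconds
var headers = buildCacheHeaders(
  new Date(Date.UTC(2006, 3, 25, 12, 0, 0)), // when the file last changed
  2048,                                      // file size in bytes
  new Date(Date.UTC(2006, 3, 26, 12, 0, 0)), // time of the request
  thirtyDays
);
```

Whatever server you use, the goal is the same: the response for every file of the application should carry all four headers so the browser keeps it in cache.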

The page is now in the browser cache after the first access. In IE and Firefox, the user has to go to File > Work Offline to work offline, and can then simply navigate to the app's URL. In Safari, this is not necessary, and you can just go to the URL. I like to provide a link that can be dragged to the toolbar.

Offline access and dojo.storage work together: whether you are offline or online you can access the same persistent storage, saving data while offline and then syncing when online. Expect a dojo.offline and dojo.sync in the future that will provide abstractions for common operations like this. I'm looking for financial sponsors on this if you are interested.
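Until those abstractions exist, the "save offline, sync later" pattern can be sketched by hand. Nothing here is a Dojo API: OfflineQueue, record, sync, the "pendingChanges" key, and fakeStore are all hypothetical, and the upload callback stands in for whatever network call your application would make once it is back online.

```javascript
// Hypothetical sketch of "save offline, sync later"; dojo.sync does not exist
// yet, so every name here is an assumption made for illustration.
function OfflineQueue(storage) {
  this.storage = storage; // anything with get(key) and put(key, value)
  this.key = "pendingChanges";
}

// Append one change to the list kept in storage.
OfflineQueue.prototype.record = function (change) {
  var pending = this.storage.get(this.key) || [];
  pending.push(change);
  this.storage.put(this.key, pending);
};

// Push every queued change through `upload`; keep the ones that fail.
OfflineQueue.prototype.sync = function (upload) {
  var pending = this.storage.get(this.key) || [];
  var stillPending = [];
  for (var i = 0; i < pending.length; i++) {
    if (!upload(pending[i])) {
      stillPending.push(pending[i]);
    }
  }
  this.storage.put(this.key, stillPending);
  return pending.length - stillPending.length;
};

// In-memory stand-in for a persistent store.
var fakeStore = {
  _data: {},
  get: function (k) { return this._data.hasOwnProperty(k) ? this._data[k] : null; },
  put: function (k, v) { this._data[k] = v; }
};

var queue = new OfflineQueue(fakeStore);
queue.record({ file: "notes.html", action: "save" });
queue.record({ file: "draft.html", action: "save" });

var uploaded = [];
var syncedCount = queue.sync(function (change) {
  uploaded.push(change.file);
  return true; // pretend the server accepted it
});
```

Because the queue itself lives in storage, changes recorded while offline would survive a browser restart if the underlying store is persistent.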

Dojo.storage is in beta and is in the Dojo Subversion repository now. It will be bundled with the next release of Dojo, along with Moxie, but you can download it now.

If you try out dojo.storage and run into problems, remember that I did this all for free and make my money and food from consulting, so I'm not available to provide free support; please consider a consulting contract. I do want bug reports though.

// posted by Brad GNUberg @ Tuesday, April 25, 2006
