Tales of a Developer Advocate

Getting Your App to Support Web Intents on Chrome

Chrome just got Web Intents support in Dev and Canary builds (18 onwards).  This is a huge milestone and I am very excited by this first step along the path of building a more connected web of apps.

A lot of developers have asked me how to get started, as it seems some of the demos on http://demos.webintents.org don’t register correctly.  I have a good answer for that.  In short: Chrome doesn’t yet detect the intent tag; instead, applications can currently only register their support for an action such as “share” via the Chrome apps manifest.

The longer version is a little more complex:
  1. Consensus over the introduction of a new tag into the spec has not yet been reached.
  2. Working with members of the DAP in the intents task force, it is clear that discovery of applications and services shouldn’t only take place by detecting a tag on a web page.  What happens if the service you want to “Share” a video to is a TV connected to your local network?  Or if an external native application wants to be able to support a “Save” action?  To enable these important use cases the User Agent should be able to determine the services it presents to users, and this is why this is allowed in the specification (3rd paragraph).
Bringing this closer to home, because the discovery and presentation of an app’s capabilities can be managed by the User Agent, and Chrome has the concept of extensions and installed apps we can quickly enable the intents feature by letting developers declare their support for actions in the manifest.

So what does the declaration in the Chrome apps/extension system look like?  It is pretty easy: it is an entry in the manifest called “intents”.  It looks like:

{
  "name": "Share to Gmail™",
  "version": "0.0.0.2",
  "icons": {
    "16": "favicon.ico"
  },
  "intents": {
    "http://webintents.org/share": {
      "type": ["text/uri-list"],
      "title": "Share to Gmail",
      "path": "/launch.html"
    }
  }
}

It is that simple.  The intents section includes a dictionary of supported actions (here http://webintents.org/share), and in each action object there is an array of data types that the application or extension can handle, the friendly name to appear in the picker and a path to what should be opened when the user selects your app.  The client-side code remains exactly the same as it would in a normal web app.

In the long term we want applications to be able to declare their capabilities and services directly through their HTML, and this will be done with the intent tag.  However, whilst the standardisation work continues, we want to make sure that developers today can start building apps that can take advantage of the Web Intents system.

A lot more examples can be found on the Web Intents Github repository.

Expect a lot more posts about how to build applications that love each other with Web Intents.

Two Years and Counting at Google

As of February 1st I have been at Google for two years! Yay!  It has been an amazing time and I am truly honored to work for such a company.

A year ago (+1 week) I wrote about my first-year experiences.  It was pretty crazy in my first year: not only did the scale of the work overwhelm me, but so did the sheer number of excellent engineers and colleagues.  My overall feeling was one of awe.

My second year has taken that to the next level again!

One of the highlights of the year was that I got to speak at Google IO - Mobile Web App development: Zero to Hero - where Mike Mahemoff and I (with the help of Eric Bidelman and Boris Smus) built an app that worked across 5 different form factors using one code base.  The awesome thing about this is that along the way we built LeviRoutes (a URL routing framework) and FormfactorJS (a form factor detection library).

I got to travel a lot too - I am trying to count all the countries that I have been to; it is a lot - USA, France, Spain, Portugal, Romania, Belgium, Germany, Poland, Netherlands, Czech Republic.  Whilst I was traveling I did a lot of presentations, which is a really good way to meet people that I wouldn't normally get the chance to meet. I also managed to get completely misquoted on TechCrunch - I say misquoted, I didn't even say the words that they attributed to me.

The biggest thing for me was that Web Intents became a real living and breathing project and the first native implementations have landed in Chrome.  It has changed a lot since my first announcement late in 2010.  This is a huge change for users of the web and I am immensely proud to be working on the project with some amazingly talented engineers (James Hawkins, Greg Billock, Rachel Blum, Ben Smith and many others) and helping to define the specifications.  We have big goals for Web Intents so stay tuned.

One of my resolutions last year was to be in Liverpool a little more, and I managed to do that.  It turns out that I unwittingly pre-announced another project (that I honestly had no idea was about to launch).  I am glad it happened though: there is a huge swathe of developers in the UK who are outside London, I don't think anyone reaches them properly, and this gives me a chance to work with them.

So.  What is happening in the year to come?  It is hard to say, there are a lot of cool things happening, but I know two things for sure: we will keep working on making sure users' experience of the web is improved through technologies such as Web Intents and we will keep working with developers to build awesome sites and applications on the web and prove that the web is our future.

On to my third year! 

Web Intents: A Fresh Look

We have a huge problem on the web today. If I built an image gallery application and I wanted to let users edit an image so that they can remove red-eye from a photo I either have to build an application that edits the images, or integrate with a 3rd party solution. Doing this is hard and stops you from building an awesome image gallery; and what happens if the user has a favorite service that they already use to remove red-eye? Simple, you have a frustrated user.

We have a solution!

In December 2010 I announced a project called Web Intents whose goal was to allow developers to build applications and services that could work with each other, but not need to explicitly know about each other – the concept was heavily inspired by the Intent system in Android, although the API bore no resemblance. It would allow you to build applications using just the functionality you cared about, and then delegate the other functionality to the user's preferred choice of service.

After some conversations, I moved to try and support Web Introducer as a specification over Web Intents. For one reason or another this didn’t quite work out so I decided to plug away at revising the WebIntents work that I started back in November.

It turns out there is a lot of interest internally in the idea of Web Intents and how it can work in modern browsers. We set up a small crack team and, after a flurry of work speccing and prototyping how we think it might look, we have put a prototype API on to Github. Have a play, it is really easy to get started.

So what changed?

A lot as it happens. It is not the same as the initial project that I experimented with, although the goals are the same. We have an objective to make the developer experience of the API so painless that most developers can start integrating with applications within 5 minutes of reading the spec – in fact we want it so that most developers can just copy and paste examples and they will work with their service. We have tried to drastically reduce the API surface and make it so there are literally only one or two lines of code you need to start an activity.

Service registration has been made even easier than in my initial project through the use of a new tag, for example:

<intent
  action="http://webintents.org/share"
  type="text/uri-list"
  href="share.html"
/>

This small tag, which is included in the head of your application, will signal to the browser the intention to handle a “share” action for a selection of URIs (think “share this page”), and will register it in the system so that the user can choose it when a client application wants to provide “share” functionality in their app.

When the service is chosen by the user and loaded, the intent data is passed to the opened application and is available on the window.intent object.

For clients to initiate an Activity it is easy too. Simply declare an intent and start the Activity as follows:

var intent = new Intent();
intent.action = "http://webintents.org/share";
intent.type = "text/uri-list";
intent.data = "http://paul.kinlan.me";

window.navigator.startActivity(intent);

The system will take care of the service resolution for the action and compatible data formats and give the user the choice of using their favorite application to handle the “share” intent.
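To make the resolution step a little more concrete, here is a minimal sketch of how a user agent might match a client's intent against the services that have been registered. It is only an illustration: the `registry` array, `typeMatches` and `resolveServices` are hypothetical names, and the wildcard rule is an assumption rather than something the prototype or Chrome is confirmed to do.

```javascript
// Hypothetical registry, mirroring the attributes of the <intent> tag.
var registry = [
  { action: "http://webintents.org/share", type: "text/uri-list", href: "share.html" },
  { action: "http://webintents.org/share", type: "image/*", href: "share-image.html" },
  { action: "http://webintents.org/pick", type: "image/*", href: "pick.html" }
];

// Assumption: a registered type matches exactly, or as a "*" or "major/*" wildcard.
function typeMatches(registered, requested) {
  if (registered === "*" || registered === requested) return true;
  if (registered.slice(-2) === "/*") {
    var prefix = registered.slice(0, -1); // "image/*" -> "image/"
    return requested.indexOf(prefix) === 0;
  }
  return false;
}

// Return every service the user could be offered for this intent.
function resolveServices(services, intent) {
  return services.filter(function (service) {
    return service.action === intent.action &&
           typeMatches(service.type, intent.type);
  });
}
```

Whatever list comes back is what a picker would show to the user; the final choice stays with them, not with the client application.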

I have only just touched the surface of what you can do with the API. There are a lot of things that you can do with the API over and above what I have described in this 5 minute overview. A selection of examples can be found at http://examples.webintents.org/ where we show you how to build applications that solve some common use-cases. I particularly like the cloud kitten service provided by the “pick” example.

We are working with Mozilla to define a common approach to solving the challenges that web integrators face today. We are interested in hearing your thoughts and we are still thrashing out the API so bits of it might change but the intent is still the same.

My closing thoughts are: “This project will fundamentally change and improve the way we build applications on the web today for our users.”

window.name

I have learnt a lot over the last couple of days about inter-window and inter-iframe communication. I documented some of my frustrations about Web Messaging APIs and an attempted workaround.

For you to be able to pass data into a window (that isn’t on your domain) so that it is available before the onload event fires in the opened window, the only sane way I have found is to set the window name via window.open.

Client:

var w = window.open("list.html", "some data");

Service:

window.onload = function () {  alert(window.name); };

Now that we can pass data between the windows, you can quickly imagine that you stringify a JSON object on open and parse it in the opened window. Pretty simple.

The good news is that the work-around works in FF, WebKit and Opera as is, but not IE.

To get it working with IE, it takes a few hacks, so I thought it best to document them here.

When you open a window via window.open, the second parameter is the name. In IE it must only contain [A-Za-z0-9_], which means that you have to Base64 encode the JSON object for it to be able to be sent across. That is not enough on its own, though, because Base64 output is not limited to those characters: it will likely include == padding at the end, and = is not an allowed character.

However, IE doesn’t include a btoa and atob function for managing base64, so you will also need to find a library to use.

To encode the data I used the following:

// UTF-8 encode the JSON, Base64 it, then swap the = padding for _
var winname = window.btoa(
  unescape(
    encodeURIComponent(JSON.stringify(obj))
  )).replace(/=/g, "_");
var w = window.open(e.target.href, winname);

To decode the data I used the following:

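// Reverse each encoding step: _ back to =, Base64 decode, then UTF-8 decode
var obj = JSON.parse(decodeURIComponent(escape(window.atob(window.name.replace(/_/g, "=")))));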

Pretty hacky, but it seems to work.
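One caveat with the Base64 route is that its output can also contain + and /, which fall outside [A-Za-z0-9_]. An alternative sketch that avoids Base64 (and the need for a btoa library) altogether, assuming the allowed set really is [A-Za-z0-9_], is to write each UTF-16 code unit of the JSON string as four hex digits. The helper names `encodeWindowName` and `decodeWindowName` are made up for this example, and the result is four times longer than the JSON, so it only suits small payloads.

```javascript
// Hypothetical helper: turn any JSON-serialisable value into a string
// that uses only the characters 0-9 and a-f.
function encodeWindowName(obj) {
  var json = JSON.stringify(obj);
  var out = "";
  for (var i = 0; i < json.length; i++) {
    // Each UTF-16 code unit becomes exactly four hex digits.
    out += ("000" + json.charCodeAt(i).toString(16)).slice(-4);
  }
  return out;
}

// Hypothetical helper: the inverse of encodeWindowName.
function decodeWindowName(name) {
  var json = "";
  for (var i = 0; i < name.length; i += 4) {
    json += String.fromCharCode(parseInt(name.substring(i, i + 4), 16));
  }
  return JSON.parse(json);
}
```

The client would then call window.open(url, encodeWindowName(obj)) and the service would call decodeWindowName(window.name).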

As always, if anyone has a better suggestion, or there are any obvious flaws let me know.

WebMessaging Is Broken

I have been working on a rather cool project recently that initially used a lot of WebMessaging (postMessage etc) to talk between all the components. However, even though these APIs look simple and easy to grok, there are some bizarre limitations and usage of them is frustrating to say the least.

I can ignore the fact that Chrome passes structured clones and Firefox passes strings; that is a simple difference to resolve. It is not even the fact that WebKit supports MessageChannels and Entangled Ports and no one else seems to. These we can work around in sane ways.

The normal developer's flow is as follows: open a window/iframe, get a reference to that window, send it a message, have the frame or window handle the message.

Client App:

var w = window.open("test.html");
w.postMessage({ data: "some more data"}, "*");

Service App: test.html

window.addEventListener("message", function(e) {
  // Do something with the data
}, false);

This would be pretty simple and intuitive, something that nearly every developer would be able to pick up in an instant. But this isn’t the case: if you want to send a window a message you have to wait for it to load. That might sound like a logical requirement, but if the page you are opening is outside the origin of the opener, you can’t easily tell when it has loaded. So the current solution is for the opened page to postMessage back to window.opener once it has loaded, and for the opener to handle that message before sending its data.

Client App:

var w = window.open("test.html");
window.addEventListener("message", function(e) {
    if (!e.data || e.data.state !== "ready") return; // ignore anything that is not the ready signal
    // Send data
    e.source.postMessage({ data: "some more data"}, e.origin);
}, false);

Service App: test.html

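window.addEventListener("load", function(e) {
    // tell the opener that it is ready to receive messages.
    // A load event has no source or origin, so use window.opener; "*" is
    // used because the service does not know the opener's origin in advance.
    window.opener.postMessage({ state: "ready" }, "*");
}, false);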

window.addEventListener("message", function(e) {
    // Process data from opening window.
}, false);

This is bonkers! Developers just want it to work.
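A small wrapper can at least hide the handshake from the rest of the client code by queueing messages until the ready signal arrives. This is a sketch under assumptions, not a fix for the underlying API: `createReadyQueue`, `post` and `markReady` are hypothetical names, and `send` is whatever function actually delivers a message (for example a thin wrapper around w.postMessage).

```javascript
// Hypothetical helper: buffers messages until the other side says it is ready.
// `send` is supplied by the caller and does the real delivery.
function createReadyQueue(send) {
  var ready = false;
  var pending = [];
  return {
    // Deliver straight away if the service is ready, otherwise hold the message.
    post: function (message) {
      if (ready) {
        send(message);
      } else {
        pending.push(message);
      }
    },
    // Call this when the "ready" message arrives; flushes in the original order.
    markReady: function () {
      ready = true;
      while (pending.length > 0) {
        send(pending.shift());
      }
    }
  };
}
```

The client would call markReady() from its message listener when it sees { state: "ready" }, and everywhere else simply call post() without caring whether the window has loaded yet.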

There is a hack that allows you to pass data to a window so that it is available to the window as soon as the script starts executing. It is probably not safe, nor is it likely to be secure. When you open a window, you can pass it data immediately using the name parameter on window.open().

window.open("test.html", JSON.stringify({ data: "ABC123" }));

And then in the opened page, you can read it back.

var data = JSON.parse(window.name);

There are problems here though:

  • We have no real proof of where the data came from. We can check window.opener, but that is not enough.
  • We have to ensure that window.name is cleared down as soon as we parse it, because otherwise it will be available for the lifetime of the application.

This is a quick hack, that allows the opened window to read the data that was passed to it as soon as it opened.
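The second problem can be sketched in a few lines: read the name, clear it straight away, and only then try to parse it. The function name `takeWindowData` is hypothetical, and `win` is the opened window (or, for testing, any object with a name property). Note that this does nothing about the first problem; the data is still unauthenticated.

```javascript
// Hypothetical helper: read the data passed via window.name exactly once.
function takeWindowData(win) {
  var raw = win.name;
  win.name = ""; // clear immediately so the data does not outlive this read
  if (!raw) return null;
  try {
    return JSON.parse(raw);
  } catch (e) {
    return null; // the name was not JSON, e.g. an ordinary window name
  }
}
```

In the opened page this would be called as takeWindowData(window) as early as possible in the first script.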

What are your thoughts? Have you come across these limitations? Have you solved them in any other interesting ways?

Landing My First WebKit Patch. OnPopState Lock and Load.

This is a story all about how my life got flipped turned upside down….. wait what?!?! I can’t start a blog post with The Fresh Prince.

Last week, when I was still in my 20’s, I wrote a blog post about HTML5 History API needing a new event. This came about because the LeviRoutes framework would work better if it could understand when state had been pushed via History.pushState. Whilst investigating pushState and adding some tests to the LeviRoutes framework I wanted to be able to simulate an “onpopstate” event.

Let’s just quickly digress with a little bit about HTML DOM events. HTML defines a rich series of events that are fired when a user clicks on something, the page loads or….. well let’s just say there are hundreds of events. Not only do the events get triggered when the user or system does something, but the developer can easily simulate events. If you want to click on a button via script. Simple:

var evt = document.createEvent("MouseEvent");
var anchor = document.getElementById(......);
evt.initMouseEvent("click", true, true, window, 0, 0, 0, 0, 0, false,
                   false, false, false, 0, null);

anchor.dispatchEvent(evt);

Why is this programmatic dispatch important? You can build individual tests that respond to the events that user actions would fire, without having to build a system that tries to automate the UI by, say, managing the mouse pointer through hardware control.

Digressions aside, I was building tests that would test my HTML5 History handling logic without me having to physically invoke a History.pushState command. In summary, separating the physical navigation from my logic.

var evt = document.createEvent("PopStateEvent");
evt.initPopStateEvent("popstate".....);
window.dispatchEvent(evt);

This should have worked. Instead all I got was:

var evt = document.createEvent("PopStateEvent");
DOMException

Which if you read the Event specification is what occurs when an Event type is not implemented. I quickly jumped across to Firefox and tested the same code. It worked. So it must be a bug in WebKit.

Now, this is where my real story starts, and what I hope will demonstrate the power of Open Source software.

I was in a bind. I could raise a bug and hope someone might pick it up at some point in the future, or I could raise a bug and try to fix it myself. I chose the latter. I didn’t think it would be hard – after all the Event system is already in WebKit and PopStateEvent is already implemented; it is only the hookup with createEvent that was missing.

Where do you start? I started by downloading the latest WebKit code and building it. In all this process took longer than the actual fix.

Once I had a build, I decided to create a very simple test case to prove that it was still broken. With this in hand, I had a quick peek at the WebKit code. Google Code search is your friend here: I just searched for “PopStateEvent” and it returned a list of important places to look.

Inspecting PopStateEvent.cpp and PopStateEvent.h, I could see that there were create and initPopStateEvent methods, so I was pretty sure I was in roughly the correct place. I also knew that createEvent is on the document object in the DOM, so I did a quick search for createEvent and there was a file called “Document.cpp”, which looked promising. A quick search in Document.cpp highlighted the area where the events are created, and there was a suspicious lack of PopStateEvent.

Bingo!

I quickly raised a bug, on http://bugs.webkit.org/ detailing the error with a simple test case attached, and then went about fixing the code. Raising the bug seemed to take longer than the fix, which amounted to adding in a condition for the type of event, adding in a parameterless constructor and then calling it.

Pretty quick. My own test case passed, so I had a strong indication that it was fixed, but I knew if I submitted it without an automated test it would probably get rejected. The problem is that I had no idea how to build the automated tests or where to put them.

I had a quick scan through LayoutTests, and in the “fast” directory there is an “events” directory which seemed like the logical place to start. Following the examples of other tests, I created a simple test and an “expected” results file and then gave the test runner a go. Boom! It failed. It took a little bit of looking, but I found that the results of the test run were stored in “/tmp/layout-test-results/results.html”, which gives you a visual diff of the actual output vs the expected – it was a single new line character that was causing the problem.

That was me done. I created the ChangeLog and attached it to the bug, set r to ? (this was an oddity that I had to learn about). After the first review there were a couple of changes I needed to make. The second review indicated that I updated the wrong ChangeLog and some other smallish issues. But after that it was ok and submitted.

And here it is: http://trac.webkit.org/changeset/88187.  It is not a complex fix, but it is one I am proud of.  If all the ports include the fix then my code will be used by the eleventy billion users of WebKit (ok – I have no idea of the number of users, but I know it is a very large number), and now I can get on with fixing my LeviRoutes framework ;)

In my eyes, this is one of the powers of Open Source. Rather than just report a bug and hope someone picks it up, and then wait for the next major release of the software to see if it is fixed, I have the power to go in and fix the problem, and if it passes muster I can get the solution published.

Beautiful!

HTML5 History Needs Another Event

I love the HTML5 History API, it makes developing applications with a consistent URL scheme across server and client super simple, however it doesn’t come without its problems.

When developing the LeviRoutes URL routing framework it became obvious that we need some changes to the specification as-is. The Mozilla documentation reports that onpopstate is called whenever the history object changes. Unfortunately this is not the case, and it is not what the spec says either. The HTML5 Spec indicates: “The popstate event is fired in certain cases when navigating to a session history entry.”, which you can logically assume to mean only on a forwards or backwards navigation, not on a replaceState or pushState.

This might sound odd. There is no mechanism to detect a change in URL after it changes. You get notification via onpopstate when the user navigates forwards or backwards through your application, but not actually when the URL changes via pushState or replaceState. This is as opposed to onhashchange, which does fire when the document fragment changes.

This causes us problems.

LeviRoutes listens to changes in your URL.  It allows you to build applications that are responsive to the current URL and are decoupled from navigation. So rather than have your code littered with “if (url == "/") { doA(); } if (url == "/categories") { doB(); }” you can now specify:

var app = routes();
app.get("/", doA);
app.get("/categories", doB);

Simple right?
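For readers who have not used a routing framework, the idea can be sketched as a lookup table from path to handler. This is illustrative only and is not the LeviRoutes implementation: `createRouter` and `dispatch` are hypothetical names, and only exact path matches are handled.

```javascript
// Illustrative only: NOT how LeviRoutes is written, just the shape of the idea.
function createRouter() {
  var table = {};
  return {
    // Register a handler for an exact path.
    get: function (path, handler) {
      table[path] = handler;
    },
    // Run the handler registered for `path`; returns true if one was found.
    dispatch: function (path) {
      if (!Object.prototype.hasOwnProperty.call(table, path)) return false;
      table[path]();
      return true;
    }
  };
}
```

Something has to call dispatch whenever the URL changes, and knowing when that happens is the hard part.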

It would be excellent if it was that simple, but it is not. We don’t know when the URL changes via pushState. This forces us to bind our logic in our controller to call the same code that LeviRoutes would call when it detects a change in URL via normal navigation.

Now I have to bastardise my code:

var app = routes();
app.get("/", doA);
app.get("/categories", doB);

.....
function gotoA () {
    history.pushState({}, "A", "/");
    doA();
}
...

Pretty messy right?

I would love to see an event “onstatechanged” (or perhaps, onpushstate and onreplacestate) triggered when the user pushes or replaces state on to the history object so that I can capture the code and return to my simple routing logic.

var app = routes();
app.get("/", doA);
app.get("/categories", doB);

function gotoA () {
    history.pushState({}, "A", "/");
}

I am not the only person asking for this: http://www.google.com/search?q=onpushstate

So, how am I going to solve this problem? I am going to wrap the History API and proxy pushState to call the natural pushState and also fire a new custom event. Hmph :\
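A minimal sketch of that wrapper might look like the following. The names are hypothetical: `wrapPushState` is made up for this example, `historyObject` would be window.history in a browser, and the `onChange` callback stands in for dispatching a custom event such as “statechanged”.

```javascript
// Hypothetical wrapper: proxies pushState so that callers are told about it.
function wrapPushState(historyObject, onChange) {
  var nativePushState = historyObject.pushState;
  historyObject.pushState = function (state, title, url) {
    // Call the natural pushState first, so the URL has changed by the time we notify.
    var result = nativePushState.apply(historyObject, arguments);
    onChange({ state: state, title: title, url: url });
    return result;
  };
  // Return a function that puts the original pushState back.
  return function unwrap() {
    historyObject.pushState = nativePushState;
  };
}
```

With this in place, wrapPushState(window.history, function (change) { ... }) lets a router re-run its matching logic after every pushState call; replaceState could be proxied in the same way.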

When Are We Going to See the Death of SVG?

I have these bizarre mixed feelings about SVG: I loathe it and love it at the same time (according to urban dictionary the word is loave) and I hate myself for it - okay, hate is a strong word.

I constantly feel frustrated by its complexity, requirement for tooling (have you tried to create a path by hand) and half-arsed integration into the web of today.  I see the <svg> element much like I see the <object> tag, that is a boundary that rarely if ever should be crossed by mere mortals, a semi-permeable barrier where only through reverse osmosis can we wrangle some of the elements in our usermode. (to be read as, we can script it and hook it up in our app but that is about it).

The way I see it, once you get in to <svg> it is like a context switch, the HTML DOM and the SVG DOM will never really truly mingle.

But SVG has some serious awesomeness too - for one it is scalable, two it is vector based…. can you guess the third?  It has awesome graphical capabilities: you’ve seen the filters and paths, right? There have been lots of projects that I have worked on where I simply can’t build what I want because it is not available in the “web” sans SVG.  I want to be able to apply filters without importing an SVG declaration, I want to be able to flow elements out along a path.

Why can’t path be a css property?

p {
  path: "M 100 100 L 300 100 L 200 300 z";
}

Text inside the <p> will be rendered along the path.  But what if <p> was a block element like a <div>?  Even better all elements be they block or inline will be rendered along the path…… That is powerful!

For me this is better than what we have now. I don’t want to have to have <p><svg>……</svg></p> when I want to do something awesome that SVG lets me do, that just doesn’t scale.

App Cache and HTML5 History

Whilst developing our latest app (https://github.com/PaulKinlan/ioreader) for Google IO, we ran into several large limitations with AppCache and HTML5 History that I wanted to share (and at some point hopefully solve).

Putting the current discussion of the issues with AppCache aside for a couple of minutes, there is no provision in HTML5 History to include pages "pushStated" into the current App Cache Group.

Consider this flow:  We have a multi paged app with pages A and B when rendered from the server sharing the same AppCache and thus in the same group.
  • User visits page A, it uses an app cache, so everything is cached.
  • User navigates from A to B, page B is added to the app cache group.
  • User goes offline
  • User refreshes A, it is served from the Cache.  User refreshes B, it is served from the Cache.

Update the AppCache, A and B are re-downloaded and cached [see: Death by App Cache].

Now if we inject HTML5 History for the same flow:
  • User visits page A, it uses an app cache, so everything is cached.
  • User navigates from A to B (via pushState) …..  currently B is not added to the App Cache group.
  • User goes offline
  • User refreshes A, it is served from the Cache.  User refreshes B, fail, because it is not in the Cache.
I believe that as URLs change, if the master page is App Cached, the new URLs should be added to the App Cache group.  You are still in the same application; the state has just changed dynamically, but in essence you are on a page that should be available offline.  The act of changing the URL client-side normally requires you to generate the correct content on the server anyway, in case the page is simply fetched directly.

Death by App Cache: The other problem that we faced is that as master entries are added to an App Cache group, an update to the App Cache causes all of those pages to be refreshed.  With more dynamic applications this could mean that tens or hundreds of pages are downloaded in quick succession by the App Cache software, which can quickly cause a mini DDoS.

I am interested to hear your thoughts.

IO Question: How Long Did It Take to Develop the App? #io2011

One of the many questions that we didn't get to answer in our talk - Mobile web development: Zero to Hero - is "How long did it take to develop the app?".  Luckily, we have an answer: the first commit of the project was on the 3rd of March, but that was mainly just a simple README file.  The first commit of the server was on the 25th of March, which is when we really started working on the code and began turning our ideas and basic concepts into a fully fledged solution.

You might think that just over a month's worth of work went into this (it was, in theory), but it is a little more complex than that.  We developed 2 frameworks, LeviRoutes and FormfactorJS, which allowed us to push a lot of the common logic into a single controller.  Using FormfactorJS it was possible to assign engineers to each distinct UI and have them work on it in isolation.  FormfactorJS was used to specialize the base controller that was in every single page.  If you look in our "scripts" directory, you can see how we structured the project - controller.js is common across all interfaces, desktop/controller.js is then dynamically injected at run-time, and using prototypal inheritance we added custom formfactor specific features (such as swipe detection in the case of tablet and mobile).
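As a rough illustration of that specialisation step (not the project's actual code; `baseController`, `specialise`, `showCategory` and `onSwipe` are all hypothetical names), a form factor specific controller can be created with the shared controller as its prototype, so it inherits the common logic and only adds or overrides what it needs:

```javascript
// Hypothetical stand-in for the shared controller.js.
var baseController = {
  formfactor: "base",
  showCategory: function (name) {
    return "[" + this.formfactor + "] showing " + name;
  }
};

// Create a specialised controller whose prototype is the shared controller.
function specialise(base, overrides) {
  var controller = Object.create(base);
  for (var key in overrides) {
    if (Object.prototype.hasOwnProperty.call(overrides, key)) {
      controller[key] = overrides[key];
    }
  }
  return controller;
}

// Hypothetical stand-in for a tablet controller that adds swipe handling.
var tabletController = specialise(baseController, {
  formfactor: "tablet",
  onSwipe: function (direction) {
    return "swiped " + direction;
  }
});
```

Because the shared methods live on the prototype, a fix made in the base controller is picked up by every form factor without touching the specialised files.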

In total we had 4 engineers working on the project, each working on their own UI.  Each engineer spent roughly 20% of their time developing the parts of the UI that they were responsible for.

In all, it took just over a month of man effort to develop this application.