XMPP widgets
2008-02-04 19:31:01 by Fabio Forno
This morning on the train I started writing a post about this concept, to which I have dedicated quite a lot of thought so far. Then I read this article wondering whether XMPP is the "Next Big Thing", and especially the possible counterarguments in one of the links (Kirkpatrick is quite positive in the piece, but he objectively points out some possible obstacles):
The primary arguments against a future powered by XMPP are two. First, so much of what's already been developed is web-centric, based on http, that the options for mashup-fodder are relatively limited for XMPP. For a service to integrate a number of new and existing communication features, making the leap to a less ubiquitous protocol might not make sense. Time will tell if things like IM, Android and software like Jive's can change the near total imbalance in marketshare between communication protocols online.
The second argument against this rosy picture of the future could be that open standards-based technology falls outside the profit model of many larger companies. If one vendor can corner their respective model with proprietary technology and charge a monopolist's premium for superior service, then a standards based competitor will have their work cut out for them. Google Talk's use of XMPP may have been the straw that broke the camel's back in IM because the chat client rode the coat-tails of so many other Google services.
While the second argument is a non-argument, since the key factor in the success of the Internet has been its lack of legacies and the ability to freely add value at the edges, the first argument is serious. The Internet is web-centric, not only for legacy reasons but because the web is still a successful model. The web deployment chain is straightforward, incredibly simple and cheap: you (almost) don't have to worry about the clients, just build your service and convince people to use it.
With XMPP the situation is slightly different. Let's suppose I build what could be the next killer application in social networking, and I'd like to build it upon XMPP. XMPP is the ideal candidate, since it already supports virtual identities, presence, information push, complex event distribution with pubsub, federation and many other interesting features that are required for the next generation of social tools. From the engineer's and the geek's point of view the environment seems perfect. But there is a problem, and it comes when facing the client side and the number of potential users: what are my options for making a nice, easy to use interface for my service? So far the available options are:
- Text only messaging (think of jaiku or twitter bots). It is not that appealing, since users must remember text based commands, and navigation between text menus, if the application grows, may be quite cumbersome; moreover: where are the nice icons and graphics making my application cute?
- Ad hoc commands. I've been a fan of ad hoc commands since the beginning and built a few services based on them, but I have never been 100% satisfied. At the beginning the frustration came from the lack of support in clients, then I believed that some of the problems came from the different implementation levels (e.g. some clients don't support <reported/> fields for complex results), but now I'm sure that, though extremely powerful, ad hoc commands fit only one scenario: when all the interaction can be done using linear wizards. The problem is that most interactions aren't linear: users navigate through menus and jump to different sections of the application. For example I receive a list of blog posts, I read a few and mark them as read, then I reply to a few others and finally I get back to reading the remaining posts. Having to start a wizard for each of these operations wastes time and bandwidth. That's the reason why all the rich and successful distributed applications keep some state (sometimes a lot of state) on the client side. AJAX is an example: the DOM structure of the displayed document may be quite a complex state kept by the client. More on this later.
- Extend a client and add special support for your application. This is tempting, but that's the evil side of the force, you're just shifting the problem: from many incompatible messaging layers to many incompatible services above the same layer. This approach will force users to upgrade their software just to use your service, and perhaps to keep multiple client instances for using the different applications. I don't think this is the way to go.
So what to do? Indeed the solution is quite simple and already in front of us: if you can't beat them, join them. The web model is a winning model, so why not merge with it or use the same approach? Just one client, infinite services, with the only difference that this time everything is included not in a browser, but in a communication platform that can run on your desktop, but also on your mobile or on your car navigator. The communication platform is XMPP of course, and the infinite services can run in small widgets living inside the client. Which client? You have at least two options:
- the traditional web browser
- any jabber client supporting "xmpp widgets"
In the browser case the implementation is already available and quite simple: embed in your web page a javascript XMPP library like JSJaC or our upcoming kometo and start manipulating the DOM of the hosting page through commands sent via those libraries.
More interesting, imho, is the concept of XMPP widgets, since there is the possibility to embed small XMPP based UI snippets in almost everything and every device. We just reverse the paradigm: we no longer embed an XMPP library in a rendering engine (the browser), but we embed a rendering engine in an XMPP client. How? We could use plain XHTML + javascript, since engines like WebKit are very powerful and available almost everywhere (android has it and it's embeddable in user applications, symbian has a similar control). But it is possible to have some more basic profiles for very limited platforms such as J2ME, with simplified rendering environments and scripting languages like Hecl.
So, finally, what is an XMPP widget? Basically it's a restricted environment inside an XMPP client, directly controlled by a remote XMPP entity, supplying:
- a display area where the widget is drawn; this area may be either visible or hidden, but even when hidden it is capable of receiving events and updates from remote services;
- a rendering engine, capable of drawing the model of the widget in the display area (where available XHTML is the best choice, but simpler models may be useful);
- an optional local storage for persisting some data required at client side;
- an optional scripting language for running tasks at client side (e.g. JavaScript);
- a model for possible input events that must be sent to the remote controlling entity (e.g. mouse events, button clicks, list selections; hiding, displaying or pausing the widget, etc.).
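The list above can be read as a small data model. What follows is only a minimal sketch of that reading in Python, not an implementation of any existing client: the class name Widget, its fields, and the example JID blog.example.org are all assumptions made for illustration, and the "outbox" merely stands in for stanzas that a real client would send over XMPP.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Widget:
    """Hypothetical client-side container for one XMPP widget."""
    controller_jid: str          # remote entity allowed to control the widget
    markup: str = ""             # model to render, e.g. an XHTML fragment
    visible: bool = True         # the display area may be shown or hidden
    storage: Dict[str, str] = field(default_factory=dict)  # optional local storage
    outbox: List[Tuple[str, str, dict]] = field(default_factory=list)  # queued events

    def apply_update(self, sender_jid: str, markup: str) -> bool:
        """Accept a new model from the controller, even while hidden."""
        if sender_jid != self.controller_jid:
            return False         # ignore updates from any other entity
        self.markup = markup
        return True

    def emit(self, kind: str, payload: dict) -> None:
        """Queue an input event (click, selection, hide, ...) for the controller."""
        self.outbox.append((self.controller_jid, kind, payload))

    def hide(self) -> None:
        """Hide the display area and tell the controller about it."""
        self.visible = False
        self.emit("hide", {})


# Usage: a hidden widget still receives updates and reports user events.
w = Widget(controller_jid="blog.example.org")
w.hide()
w.apply_update("blog.example.org", "<p>3 new posts</p>")
w.emit("click", {"target": "mark-read"})
```

The rendering engine and the scripting language are left out on purpose; the sketch only covers the parts that can be described as plain state and events.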
Just try to think what would be possible if jabber clients supported this feature. You could write your killer microblogging application and instantly have zillions of possible users, instead of only being able to send the commands using the PSI xml console ;)
More soon about the actual implementation of xmpp widgets.
Hi Fabian, I was trying to fit the XMPP widget idea long ago :)
http://jabbermania.blogspot.com/2007/07/why-xmpp4moz-needs-to-be-standard-whats.html
http://jabbermania.blogspot.com/2007/11/xmpp-services-as-opensocial-providers.html
http://jabbermania.blogspot.com/2007/11/meebo-invented-everything-us-in-2007.html
Also, Pedro Melo (http://www.simplicidade.org/notes/ ) is working on such (he calls them "Xicklets"); you may know him as one of the forces behind SAPO messenger (which has 3 million potential users, not bad for the start of such a platform).
Now let's get back to the topic specifically. The thing is about code mobility, and now we can safely refer to the original REST article (which isn't as bad as restafarians usually are): http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
I'll show you what design issues my team had on the same topic:
1) Features provided by XMPP
- (RealTime) Communication channel between two application instances from the same source
- (RT) Communication channel between an application instance and the participating users
- RESTRICTED? Presence information of the participating entities (users + bots)
- STRICTLY RESTRICTED: Roster information + all contacts' presence of the user
2) Features provided by HTTP
- Remote script inclusion (like: Google Gadgets)
- Developers MUST be able to write XMPP-supported applications in pure web-style (e.g. PHP scripts)
- It SHOULD be easy to adapt already written applications to the new platform, both those that use real-time communication and those that don't
3) Features derived from browsers:
- Ability to run javascript applications
- Ability to display XHTML + CSS
- DOM API (perhaps E4X as well? at least, at a later time)
- Sandboxing and cross-domain code execution
- Some kind of interaction between the XMPP client (which SHOULD be able to be a web client too!) and the application
Restricted features could be accessed if certain security checks are met (like: the page is served over HTTPS, signed by an acceptable authority, like the enterprise running the XMPP infrastructure, or an official XMPP extension).
Our current example is a Go (simpler than chess) client which should be written in pure PHP + JavaScript (not because we cannot use something more sophisticated ourselves, but because it's the lowest entry barrier).
Another good example is whiteboarding, it's done that way in Yahoo! Messenger. We'll see if we could extend it further.
Hope we could meet at DevCon and have a conversation about that (but it shouldn't stop you from answering my comment :)
At DevCon I'm sure we will have some interesting conversations on the topic. Indeed the idea of XMPP widgets was blinking in some of my background processes long ago, and the posts you cited helped it emerge ;)
About REST principles... you are right, you just don't have to pay too much attention to "restians". Some are too fanatical, others believe they are REST compliant just because they use HTTP and some obscure action beyond GET and POST, believing that REST principles can't be applied to other situations. Unfortunately it's a buzzword that MUST be used, often out of context. For example last week, when I started speaking about XMPP to potential partners, they replied "uhm, we use only REST", as if it were a protocol...
Anyway you got it right, some of the basic principles of widgets should be designed with REST in mind. AFAIK REST simply says: all you need is a stateless protocol capable of performing actions on remote resources, that's all. There is a client (the subject), an action (the verb), a target resource (the object) and a stateless protocol (a simple tool), and that's what we have with XMPP widgets. All the operations on the widgets are simple GET, SET, DELETE operations on particular items, which are identified by pairs of values that are equivalent to URIs: (target jid, node name). If you consider it from this point of view the approach is very RESTful and you have a simple, extensible framework for controlling remote objects via XMPP.
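To make the (target jid, node name) idea a bit more concrete, here is a minimal sketch in Python of a store that answers GET, SET and DELETE on items addressed by such pairs. Everything in it is an assumption made for illustration: the class name WidgetItems, the handle() signature and the example JID are invented here, and no XMPP stanza format is implied.

```python
from typing import Dict, Optional, Tuple

# (target JID, node name): the pair plays the role a URI plays on the web.
ItemKey = Tuple[str, str]


class WidgetItems:
    """Hypothetical item store addressed by (jid, node) pairs."""

    def __init__(self) -> None:
        self._items: Dict[ItemKey, str] = {}

    def handle(self, verb: str, jid: str, node: str,
               value: Optional[str] = None) -> Optional[str]:
        """Apply one self-contained request; no session state is kept between calls."""
        key = (jid, node)
        if verb == "GET":
            return self._items.get(key)
        if verb == "SET":
            if value is None:
                raise ValueError("SET needs a value")
            self._items[key] = value
            return value
        if verb == "DELETE":
            return self._items.pop(key, None)
        raise ValueError(f"unsupported verb: {verb}")


# Usage: the same three verbs work on any item of any widget.
store = WidgetItems()
store.handle("SET", "blog.example.org", "posts/42/title", "XMPP widgets")
title = store.handle("GET", "blog.example.org", "posts/42/title")
store.handle("DELETE", "blog.example.org", "posts/42/title")
```

Each request names its verb and its target in full, so the handler needs no memory of earlier requests, which is the stateless property mentioned above.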
I hope to be able to write more actual examples before FOSDEM so we can have more beef for the discussion ;)
Unfortunately, this is not AS easy :)
In order to use the XMPP channel, some tricks must be done.
Consider a Chess or Go server. You're one of the players, and now it's the opponent's turn. Where do you get the move from?
- Of course, you should get it in realtime (otherwise what is XMPP for in the game?)
- How do you know it was a legal move? Only the application (chess/go) server, the game manager, should be trusted, so normally you'd send your move to the server, which in response asks every client to update, since it's the authoritative entity. (It's a simple "observer pattern", and is bound strongly to the Document-View / MVC model which GUI developers are usually accustomed to).
- But how can a PHP script send updates to a client that isn't in the current connection? (remember, the OPPONENT moves, but it needs to notify you!)
- Maybe a daemon should be needed? Where should it reside?
- Or make it easier and have it in a P2P-fashion, so clients notify each other?
- Or have HTTP-Bind connection with that PHP script? Or Comet? Would they like this approach?
- Or use a Publish-Subscribe node, which could be accessed from PHP?
- Does it need a toolkit on PHP developers' side? Wouldn't it hinder adoption?
- Or does it need a special module on a jabber server? On what jabber server? (consider you're on a bluendo host, your opponent is using gTalk)?
- What about gTalk users? Google is slow to adopt anything, how do you enable them to play?
A lot of questions, some are really hard.
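The "authoritative server plus observers" arrangement mentioned in the list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the GameManager class, its legality rule (your turn, empty cell) and the player names are invented for the example, and the callbacks stand in for whatever delivery mechanism is finally chosen (a daemon, pubsub node, HTTP binding and so on), which the sketch does not decide.

```python
from typing import Callable, Dict, List, Tuple

Move = Tuple[int, int]                      # board coordinates
Observer = Callable[[str, Move], None]      # called with (player, move)


class GameManager:
    """Hypothetical authoritative game server: validates, then notifies everyone."""

    def __init__(self, players: Tuple[str, str]) -> None:
        self.players = players
        self.turn = 0                        # index of the player who must move
        self.board: Dict[Move, str] = {}
        self.observers: List[Observer] = []

    def subscribe(self, callback: Observer) -> None:
        """Register a client that wants to be told about accepted moves."""
        self.observers.append(callback)

    def submit(self, player: str, move: Move) -> bool:
        """Accept a move only if it is this player's turn and the cell is free."""
        if player != self.players[self.turn] or move in self.board:
            return False                     # rejected: nobody is notified
        self.board[move] = player
        self.turn = 1 - self.turn
        for notify in self.observers:        # push the update to every client
            notify(player, move)
        return True


# Usage: both clients learn about a move only after the server accepted it.
seen: List[Tuple[str, Move]] = []
game = GameManager(("alice", "bob"))
game.subscribe(lambda player, move: seen.append((player, move)))
game.submit("bob", (3, 3))     # not bob's turn: rejected
game.submit("alice", (3, 3))   # accepted and broadcast
game.submit("bob", (4, 4))     # accepted and broadcast
```

The point of the sketch is only that clients never trust each other directly; how the notification actually reaches a player who is not currently connected to the script is exactly the open question raised above.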
We should look at how meebo did this, perhaps try to contact them; after all, their system uses jabber, they're young, I hope they could participate in such a conversation.
We MUST keep it easy ;) I understand all the issues you pointed out, but they belong to what I may call an evolutionary step of what I call XMPP widgets. Another key factor of the success of HTTP that I didn't cite is its inner simplicity (see the comment of Jonas below), which allows almost any extension you can think of, and so must it be for any protocol aiming at being a general purpose framework. The starting scenario I considered is far simpler: one in which no coordination is needed. The range of applications is still pretty wide: microblogging à la twitter, remote control of objects when some lightweight ui delegation is needed, realtime monitoring of remote processes, access to remote services that could take advantage of presence such as email, news and so on. Basically I'm talking about human machine interaction, where the MVC approach applies very well. The main difference from the traditional web approach is presence, which means much more than the simple "hey, don't bother me now, I'm busy" that people usually have in mind. Presence is the ability to deliver the right piece of information according to the current needs or status (meant in its widest meaning) of the user, to shape it to the current device or connection type, to receive it in a timely manner with minimum bandwidth. The further step will be real realtime collaborative applications between humans, in which the MVC approach should be rethought in order to cope with the fact that all entities may be either view or control and the model is distributed. However I'm sure that with a strong and working ground allowing ui delegation or remotization over XMPP that task will be much easier.
I'm sorry, but I fail to see the point of this. There are close to a zillion different RPC protocols out there, why would anyone want to use XMPP? If you want a pervasive protocol you could use HTTP. Or if you have really complex use cases, SOAP over HTTP. You'd have a much greater chance of bypassing firewalls. Plus HTTP is more in the Internet tradition (read: flat like SMTP) compared to the relatively complex XML streams employed by XMPP, so using it as a message transport is much more straightforward and well understood. Using XMPP for RPC end point addressing is an idea, but I think there is just no incentive compared to hostnames. Existing peer to peer widgets (read: botnets) sometimes use IRC for this purpose already. But if you really want your idea to succeed, I think you should focus on a great developer kit. Ease of use for the programmer is always a winner in the marketplace. A widget construction kit could be the killer app you are looking for.
If you keep the comparison at the level of "transport" you are right, what's the point? With HTTP you can go almost anywhere and carry any data, but then you need an infrastructure for building distributed collaborative applications.
To make a long story short, the added value of XMPP is presence and a federated distributed system of online identities, which means the simplification of an incredible number of tasks for which in HTTP you are now forced to find clumsy and unscalable solutions.
A quick example to make the concept clear. You are using a dozen such services which give you realtime updates about an important project you are following (shared documentation), general news with different sources and importance levels, a simulation you are running on a server, a role playing game, the alarm of your house, a weather ticker, an agent monitoring your options, an assistant helping you in a learning task, a monitor of the movements of your kids, and so on. This is a possible portfolio of services you may use in the near future. They have different sources, different importance and "realtimeness" requirements, and you may need to access them from different devices: the home and office pc, the car navigator, your phone, a flat monitor in your room. What can you do with the current "web" technology (please note, I'm not talking about HTTP, I'm talking about an approach for building applications)? Each time you change device you must log in with the browser to a dozen different pages and keep checking them for updates. OpenID may help you in managing the identity, without having to remember all the passwords, but you still have to go to those pages and log in, wasting a lot of time. Furthermore the remote services will have little knowledge about the capabilities of your client and your current level of availability. For example you could be interested in being alerted of any little alarm from the agent monitoring the kids at any time and in any situation, while when you are logged in on the phone many other services may have lower or no priority. That's very difficult with the current web infrastructure, while almost free with XMPP: just update your status once and this information will be available to all the interested parties, and if there are new events from any of these sources you are promptly informed.
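A minimal Python sketch of the filtering described in this example follows. It is only an illustration under loud assumptions: the priority scale, the per-device thresholds and the source names are all made up, and in a real system the device and availability would come from the presence the user published once, not from a hard-coded table.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Event:
    source: str      # which service produced the event
    priority: int    # higher means more urgent (the scale is an assumption)


# Hypothetical minimum priority that each kind of device is willing to show.
THRESHOLDS: Dict[str, int] = {"desktop": 0, "car": 5, "phone": 7}


def deliverable(events: List[Event], device: str,
                always: FrozenSet[str] = frozenset()) -> List[Event]:
    """Keep events urgent enough for the device, plus sources the user always wants."""
    floor = THRESHOLDS[device]
    return [e for e in events if e.source in always or e.priority >= floor]


# Usage: the kids monitor gets through everywhere, the weather ticker does not.
events = [Event("weather", 1), Event("kids-monitor", 2), Event("house-alarm", 9)]
on_phone = deliverable(events, "phone", frozenset({"kids-monitor"}))
on_desktop = deliverable(events, "desktop")
```

With these made-up numbers the phone receives the kids monitor and the house alarm, while the desktop receives all three events.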
Jonas: while as an enterprise architect, I must agree with you that using XMPP as a middleware solution is at least questionable, the thing is that there's no other widely deployed IM standard today.
So, for building real-time collaborative (human2human) applications, XMPP could just fill the gap.
Presence theory has a very wide spectrum, and I don't know which part of the spectrum is real and/or useful, but I'm certain it's not the whole. I had long arguments in which I claimed that differentiated presence (as in Jaiku) is better than simple status message passing (as in twitter), but no one outside the jabber community agreed.
As for widgets: forget human-machine interaction. It's just not interesting. Now that WLM 9 is upon us (with a new API and XMPP integration), and now that meebo has effectively built a proprietary widget platform upon XMPP, we should concentrate on the core function of jabber: real-time messaging between humans.
It's an often cited quote here in Hungary: the killer application for computers was the internet. And in fact, the internet is about communication. That is, there are people at the other side of the wire too!
It may be a pre-stored information, like newspapers, still journalists do a communication BSc (or BA?) course usually. It may be a wiki, or a forum.
I could cite Wellman and Gulia's famous publication, "Net surfers don't ride alone: virtual communities as communities", which showed statistically that the most used features, even back in the time of BBSes, were in fact communication facilities: e-mail lists, newsgroups, message boards, private messages.
(Their study is online, google for the title)
I'm sure that social networks succeeded because, as an information system, they are centered around one entity (object, class, record, "thing"): people. That's the kind of entity that is the most exciting and important for humans.
Accounting systems will never be as popular as social networks. One factor is that they're needed for work, which is a hateful thing for most of humanity. The other is that they're about something strange, something alien to humans.
Most human-computer interaction is also for work, so it won't be too popular either, unless humans are turned into machines, or machines are turned into humans, so that communication with machines becomes interesting to them.
In the meantime, I think we should focus on the original goal of Jabber: communication, interaction [and collaboration] between people.
Uhm, I don't agree that with meebo and WLM widgets there's everything we need. AFAIK all you can do is place a messenger area within your application. That's not integrating UIs with messaging and presence, it's just part of the work, since the messaging and presence services still live in a different place from the business logic of your application. Correct me if I'm missing something about those services.
Usually I suddenly ask people to join a meebo game or a voip conversation with me; it arrives in their standard client, since if the other party doesn't use meebo, it arrives as a text link to a webpage (where there is in fact a MUC with "httpbind", but that's the technical side of the story).
When I speak of meebo widgets, I don't speak about the "MeeboMe", and "Meebo Group" embeddable applications: I speak about their development platform. It looks like this:
http://wwwe.meebo.com/platform/javascript.html
It can do whiteboarding, it can do voice over IP, videoconferencing, playing chess, Go (we prefer the latter:), and so on... and all of them are third party applications using the standard API (although mostly the flash one somehow).
If WLM supported such an API and jabber didn't, I don't think anyone with windows and some common sense (i.e. without enthusiasm about open protocols) would prefer the latter. The equation would be simple: on one side the possibilities are infinite, on the other there would be fewer than in the current version of WLM (taking the average jabber client: Pidgin or Psi or Google Talk for Desktop).
So hurry up, we have to solve a problem BEFORE MSN solves it for us! :)