http://capedwarf.org/CapeDwarf Blog2015-07-27T13:31:16+00:00http://capedwarf.org/blog/2013/11/13/ImplementingCapeDwarfsUsersAPIWithPicketLinkSocialsOpenIDSupport/Implementing CapeDwarf’s Users API with PicketLink-Social’s OpenID support2015-07-27T13:31:16+00:002013-11-13T15:02:00+01:00Marko Luksa
JBoss CapeDwarf is an implementation of the Google App Engine API, which allows applications written for the Google App Engine to be deployed on JBoss Application Servers without modification. Behind the scenes, CapeDwarf uses existing JBoss APIs such as Infinispan, JGroups, PicketLink, HornetQ and others.
The Users API
One of the smallest APIs of the Google App Engine in terms of the number of API methods is the Users API. It basically contains only the following API methods: |getLoginUrl|, |getCurrentUser| and |getLogoutUrl|. By using the Users API, the application developer need not implement any kind of login system, because authentication is handled...
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p><a href="http://www.jboss.org/capedwarf">JBoss CapeDwarf</a> is an implementation of the Google App Engine API, which allows applications written for the Google App Engine to be deployed on JBoss Application Servers without modification. Behind the scenes, CapeDwarf uses existing JBoss APIs such as Infinispan, JGroups, PicketLink, HornetQ and others.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="the-users-api">
<a class="anchor" href="#the-users-api"></a>The Users API</h2>
<div class="sectionbody">
<div class="paragraph">
<p>One of the smallest APIs of the Google App Engine in terms of the number of API methods is the Users API. It basically contains only the following API methods: |getLoginUrl|, |getCurrentUser| and |getLogoutUrl|. By using the Users API, the application developer need not implement any kind of login system, because authentication is handled by AppEngine itself. Instead of directing the user to a custom login form, the application simply needs to point the user to the URL returned by |getLoginUrl|. This will allow the user to log in with either their Google account, their Google Apps Domain account or through an external OpenID provider. After the user logs in, they are redirected back to the application. The application can then obtain the logged-in user’s email address simply by calling the |getCurrentUser| API method.</p>
</div>
<div class="paragraph">
<p>In order to allow migration of existing GAE applications to and from CapeDwarf, CapeDwarf must also support using Google accounts and not introduce any custom method(s) of user authentication. Thanks to OpenID support in PicketLink Social this was pretty straightforward to achieve.</p>
</div>
<div class="paragraph">
<p>OpenID is an open standard that allows users to be authenticated by a trusted third party service. In CapeDwarf’s case, the third party service is the Google Accounts OpenID provider.</p>
</div>
<div class="paragraph">
<p>Using PicketLink Social to authenticate the user through Google Accounts is a very simple process.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="directing-the-user-to-the-openid-provider-s-authentication-page">
<a class="anchor" href="#directing-the-user-to-the-openid-provider-s-authentication-page"></a>Directing the user to the OpenID provider’s authentication page</h2>
<div class="sectionbody">
<div class="paragraph">
<p>When the user requests the login URL returned by |getLoginUrl|, CapeDwarf’s AuthServlet instantiates an instance of PicketLink Social’s |OpenIDManager|, associates it with CapeDwarf’s own implementation of |OpenIDProtocolAdapter| and then simply instructs the |OpenIDManager| to authenticate the user:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay"><code class="java language-java">OpenIDManager manager = <span style="color:#080;font-weight:bold">new</span> OpenIDManager(<span style="color:#080;font-weight:bold">new</span> OpenIDRequest(<span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">"</span><span style="color:#D20">https://www.google.com/accounts/o8/id</span><span style="color:#710">"</span></span>));
CapedwarfOpenIDProtocolAdaptor adapter = <span style="color:#080;font-weight:bold">new</span> CapedwarfOpenIDProtocolAdaptor(request, response, getReturnUrl(request));
OpenIDManager.OpenIDProviderList providers = manager.discoverProviders();
OpenIDManager.OpenIDProviderInformation providerInfo = manager.associate(adapter, providers);
manager.authenticate(adapter, providerInfo);</code></pre>
</div>
</div>
<div class="paragraph">
<p>Behind the scenes, the manager calls the |OpenIDProtocolAdapter| and instructs it to redirect the user to the OpenID provider’s URL. The redirect is achieved either through a standard HTTP redirect or by sending a self-submitting HTML form to the browser (this is necessary when the OpenID payload is larger than 2048 bytes).</p>
</div>
<div class="paragraph">
<p>The OpenID provider then displays the login form to the user (or authenticates the user in some other way).</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="handling-the-user-s-return-from-the-openid-provider-s-authentication-page">
<a class="anchor" href="#handling-the-user-s-return-from-the-openid-provider-s-authentication-page"></a>Handling the user’s return from the OpenID provider’s authentication page</h2>
<div class="sectionbody">
<div class="paragraph">
<p>After the user is authenticated successfully, the provider redirects the browser back to the consumer - CapeDwarf. This returning request is also handled by AuthServlet. The servlet verifies if the user is authenticated by calling verify() on the |OpenIDManager|. If the user has been authenticated, CapeDwarf can now access the email address of the authenticated user, store it in the session and redirect the browser to the destination URL that was supplied by the application when it requested the login URL in the first step.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay"><code class="java language-java"><span style="color:#339;font-weight:bold">boolean</span> authenticated = manager.verify(adapter, getStringToStringParameterMap(request), getFullRequestURL(request));
<span style="color:#080;font-weight:bold">if</span> (authenticated) {
response.sendRedirect(request.getParameter(AuthServlet.DESTINATION_URL_PARAM));
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>During the invocation of the verify() method, the manager calls back our |OpenIDProtocolAdapter| with two types of |OpenIDLifecycleEvents|: |SESSION| and |SUCCESS|. The |SESSION| events instruct the adapter to store certain data in the session, while the |SUCCESS| event obviously signals that the authentication was successful.</p>
</div>
<div class="paragraph">
<p>The authentication process has now completed. Whenever the application now calls the User API’s |getCurrentUser| method, CapeDwarf will return the authenticated user’s email address. The email address simply identifies the user. It is up to the application to use this information whichever way it wants to.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="how-we-handle-application-admins">
<a class="anchor" href="#how-we-handle-application-admins"></a>How we handle application admins</h2>
<div class="sectionbody">
<div class="paragraph">
<p>On a final note, there is another API method that I haven’t mentioned yet: |isUserAdmin|. As is obvious from the method’s name, this method returns true if the logged in user is the admin of the application - either the Google user that uploaded the application to Google’s AppSpot or another user that was manually added as an admin by the application uploader through the Google Cloud Console.</p>
</div>
<div class="paragraph">
<p>Since there is no user account/email associated with deploying an AppEngine application to CapeDwarf, there is no notion of an admin. To specify who the app’s admins are, you need to list their email addresses in capedwarf-web.xml like this:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay"><code class="java language-java"><capedwarf-web-app>
<admin>my.email<span style="color:#007">@gmail</span>.com</admin>
<admin>another.admin.email<span style="color:#007">@gmail</span>.com</admin>
</capedwarf-web-app></code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="further-info">
<a class="anchor" href="#further-info"></a>Further info</h2>
<div class="sectionbody">
<div class="paragraph">
<p>For the complete source code of how CapeDwarf uses PicketLink Social, please turn to the <a href="https://github.com/capedwarf/capedwarf-blue/tree/master/users/src/main/java/org/jboss/capedwarf/users">Users module</a> in CapeDwarf’s <a href="https://github.com/capedwarf/capedwarf-blue">GitHub repo</a></p>
</div>
</div>
</div>
http://capedwarf.org/blog/2012/12/11/2012-12-1-InsideCapeDwarfHowWeImplementedTheDatastoreAPI/Inside CapeDwarf: How we implemented the Datastore API2015-07-27T13:31:16+00:002012-12-11T16:11:00+01:00Marko Luksa
Inside CapeDwarf is a series of blog posts about the internals of the CapeDwarf project. CapeDwarf is an open-source implementation of Google App Engine APIs on top of various JBoss technologies. You’ll find more info on CapeDwarf at the project’s page at http://www.jboss.org/capedwarf
AppEngine Datastore API
The single most important AppEngine API is probably the Datastore API, which (as evident from the name itself) provides an API for storing, retrieving and querying data. This was the first API we set out to implement in CapeDwarf. It basically served as proof-of-concept for the whole project.
Those familiar with JBoss technologies will know that JBoss...
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Inside CapeDwarf is a series of blog posts about the internals of the CapeDwarf project. CapeDwarf is an open-source implementation of Google App Engine APIs on top of various JBoss technologies. You’ll find more info on CapeDwarf at the project’s page at <a href="http://www.jboss.org/capedwarf">http://www.jboss.org/capedwarf</a></p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="appengine-datastore-api">
<a class="anchor" href="#appengine-datastore-api"></a>AppEngine Datastore API</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The single most important AppEngine API is probably the Datastore API, which (as evident from the name itself) provides an API for storing, retrieving and querying data. This was the first API we set out to implement in CapeDwarf. It basically served as proof-of-concept for the whole project.</p>
</div>
<div class="paragraph">
<p>Those familiar with JBoss technologies will know that JBoss already has an existing project that would offer most of the things needed to implement the Datastore API - <a href="http://www.jboss.org/infinispan">Infinispan</a>. Infinispan is an extremely scalable, highly available key/value NoSQL datastore and distributed data grid platform. So basically Infinispan offers everything we need - all we needed to do is implement an adapter between Infinispan and the Datastore API.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="hacking-into-google-s-factories">
<a class="anchor" href="#hacking-into-google-s-factories"></a>Hacking into Google’s factories</h2>
<div class="sectionbody">
<div class="paragraph">
<p>No, of course I’m not talking about breaking into Google’s real-world facilities. What I am talking about are all the XYServiceFactory classes in the AppEngine API. They represent the entry point into the API and have methods like getXYService(), which you use to obtain references to all the various services the API has to offer. One of those factories is the DatastoreServiceFactory, which we needed to force into returning our own custom implementation of DatastoreService, so that every time anyone would call DatastoreServiceFactory.getDatastoreService(), they would get the reference to CapeDwarf’s implementation of the service.</p>
</div>
<div class="paragraph">
<p>Since the factory itself is not configurable and always returns Google’s own implementation of DatastoreService, we needed to resort to bytecode manipulation of the factory. Javassist made this pretty simple - we simply replaced the whole body of the getDatastoreService method, so it would create a new CapedwarfDatastoreService instance and return it.</p>
</div>
<div class="paragraph">
<p>We used the same technique with all the other XYFactory.getXYService() methods.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="making-sure-we-re-actually-implementing-the-api-correctly">
<a class="anchor" href="#making-sure-we-re-actually-implementing-the-api-correctly"></a>Making sure we’re actually implementing the API correctly</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Much of the coding was done TDD style. We used JUnit and Arquillian, which enables you to programatically create micro-deployments, automatically deploy them to a running application server and run tests inside the deployment. Initially, we only ran the tests against CapeDwarf and JBossAS7.1, but later also added the option of running the same tests against Google’s own development app-server and even the production system (appspot). This ability to run our tests against the real Google App Engine proved to be extremely valuable, as it allowed us to validate our tests and to see if our implementation of the GAE API was aligned with that of GAE itself.</p>
</div>
<div class="paragraph">
<p>Of course, it’s hard to write perfect tests based only on API documentation, which usually doesn’t go into every implementation detail of the API. This meant that when we initially ran the tests against the real GAE, quite a few of them actually failed, even though they had been passing against CapeDwarf. There were minor differences between Google’s and CapeDwarf’s implementation of the API, and through the tests, we were able to pin-point the differences and iron-out CapeDwarf so it would behave exactly like GAE. This would have been a lot harder if we had not had Arquillian at our disposal.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="storing-and-retrieving-data">
<a class="anchor" href="#storing-and-retrieving-data"></a>Storing and retrieving data</h2>
<div class="sectionbody">
<div class="paragraph">
<p>OK, let’s finally move on to the actual implementation of the datastore. The most basic operations of the Datastore are storing and retrieving Entity objects by key. Implementing this with Infinispan was very straight-forward, since Infinispan exposes a Cache interface, which extends java.util.Map. So, basically, implementing datastore’s get and put methods was as simple as invoking put and get on a Map. It really doesn’t get any easier than this.</p>
</div>
<div class="paragraph">
<p>One caveat that would reveal itself later is that by default Infinispan does not make a defensive copy when you store an object in the cache. This means any modification made to the object after storing it would also be seen by clients retrieving the object from the cache (this is only true if the object hasn’t been passivated and is being accessed on the same node in the cluster). We could have used Infinispan’s storeAsBinary option, but we figured it was faster to simply clone the Entity prior to storing and returning it, since the clone() method was already implemented on Entity.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="querying">
<a class="anchor" href="#querying"></a>Querying</h2>
<div class="sectionbody">
<div class="paragraph">
<p>With the basic operations implemented (storing, retrieving and deleting entities by their keys), we moved on to the hard(er) part - querying. Infinispan-Query and <a href="http://www.hibernate.org/subprojects/search.html">Hibernate-Search</a> already offered the ability to index the properties of entities into a Lucene index and perform queries against this index and then retrieve the results from the Infinispan cache.</p>
</div>
<div class="paragraph">
<p>In order to make Infinispan index the entities, we needed to add a few annotations to the Entity class. We accomplished this through bytecode manipulation with Javassist as well. Since every datastore Entity can have a completely dynamic set of properties (the properties don’t have to be specified up-front in any kind of schema), we needed to implement a Hibernate-Search Field Bridge that would map all the properties of the Entity to a Lucene document.</p>
</div>
<div class="paragraph">
<p>With the entities now stored in the cache as well as in the Lucene index, all that was left to do was implement a query converter that would convert Google App Engine queries into Infinispan’s cache-queries. Actually, Infinispan’s CacheQuery is not much more than a wrapper around a Lucene query, so the converter actually converts GAE queries into Lucene queries. It does this through a DSL provided by Hibernate-Search.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="appengine-splits-certain-types-of-queries-capedwarf-doesn-t-need-to">
<a class="anchor" href="#appengine-splits-certain-types-of-queries-capedwarf-doesn-t-need-to"></a>AppEngine splits certain types of queries, CapeDwarf doesn’t need to</h2>
<div class="sectionbody">
<div class="paragraph">
<p>One interesting thing about Google’s implementation of Datastore queries is the fact that certain types of queries (IN, OR, NOT_EQUAL) are split up into multiple queries with the results then merged into a single result set. We opted not to do this, since Infinispan-Query/Hibernate-Search/Lucene are quite capable of doing this in a single query. However, there is a problem with using this single-query approach with queries containing the IN operator. Since GAE performs multiple queries in this case, the order of the results depends on the order of the items in the IN list. Since this is clearly documented in the GAE documentation and certain applications may depend on this, we were forced to implement sorting of the results according to the same rules. While this is not really a problem when queries return whole entities, it is quite a pain when performing projection queries (these queries return only a subset of the entity’s properties). When a client performs a projection query returning property <em>foo</em>, and filtering (with an IN operator) on property <em>bar</em>, we have to add <em>bar</em> to the list of requested projection properties, just so we can sort the results according to the order of items in the IN clause. In hindsight, maybe we should have simply gone the same route as GAE does and split these kinds of queries into multiple queries as well. It is possible we will do this in the future.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="wrap-up">
<a class="anchor" href="#wrap-up"></a>Wrap-up</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This was a quick look at how we implemented the Datastore API in CapeDwarf. I didn’t cover Datastore statistics, Callbacks and Metadata yet. These are fairly recent additions to the Datastore API and I’ll go into how we implemented them in one of my future "Inside CapeDwarf" blog posts.</p>
</div>
<div class="paragraph">
<p>If any of you have an application running on Google App Engine, we’d really appreciate it if you could give CapeDwarf a try and report to us any problems you encounter. For detailed instructions on how to run your app on CapeDwarf, see Aleš Justin’s recent <a href="http://in.relation.to/Bloggers/FirstCapeDwarfRelease">blog post</a> about the first CapeDwarf release.</p>
</div>
</div>
</div>