Posts Tagged ‘jcr’

eXo JCR 1.14: New Features for eXo Platform 3.5

Thursday, September 22nd, 2011

For almost everything it can do – and it can do a lot, from websites to community extranets to enterprise social intranets and much more – eXo Platform relies on its built-in Java Content Repository (JCR). With eXo Platform 3.5, working with content and data will be both easier and more ready for prime-time enterprise deployments.

At eXo, I lead a team of developers in building eXo JCR, our open source implementation of the JSR-170 spec, which is used in eXo Platform. We’ve just released eXo JCR 1.14.0-GA, which will add many new features and enhancements to the upcoming eXo Platform 3.5. This version is quite a big step forward since JCR 1.12: between JCR 1.12 and 1.14, we worked on no fewer than 50 improvements and 100 new features! In this post, I will describe the new features and improvements that we have added since my last blog post.

New Features

Managed Transactions & JCA Support

In eXo JCR 1.12 and earlier, our Java Transaction API (JTA) support was limited and experimental. Because we wanted to be able to support managed transactions and JCA 1.5 in eXo Platform 3.5 and beyond, we completely overhauled our JTA support for eXo JCR 1.14.

  • Managed Transactions Support: If you use managed data sources, you simply need to configure your eXo JCR instance to allow it to distinguish the managed data sources from the others. eXo JCR will then know whether or not it needs to delegate the commit and rollback calls to the Application Server.
  • Java EE Connector Architecture (JCA) Support: If you intend to use eXo JCR in your custom application to store your data, and you would like to delegate the management of the JCR session life-cycle to your application server, you can now use the JCA Resource Adapter for eXo JCR.

Asynchronous Indexing

Lucene is used any time we execute a JCR query, which means it’s a critical part of eXo JCR. That’s also why, in previous versions, full indexing was blocking. Now you can make it non-blocking so that you can access your application immediately, even while a full indexing is running. However, as long as the indexing is running, you won’t be able to execute a JCR query; only pure API calls will be allowed. Another interesting aspect of this feature is that you can rebuild the Lucene indexes of a given workspace at runtime, using JMX.

External Backup Tools Allowed

eXo JCR now provides a secure way for you to use external backup tools to back up the data of your JCR instance. It was possible to use third-party backup tools with previous versions, but you had to ensure that no transactions were running (which is not trivial). The latest version of eXo JCR leverages JMX, which lets you suspend and resume all the current transactions on your JCR instance. So if you want to use your own tools, simply suspend the current transactions, launch your tools to back up your data, then resume the transactions.

Other Interesting New Features

  • If you want to keep big objects or non-serializable objects in a replicated eXo cache instance, you can configure your eXo cache to enable an invalidation mechanism. This will automatically invalidate your data anytime it detects a changed value in your cluster.
  • A new syntax allows you to define a default value in your configuration files. For example, ${my.value:10} will be understood by the kernel as: try to find the value of the variable called “my.value”, and if no value is found, use “10”.
  • You can use your own Lucene lock factory and/or Lucene directory if the default ones don’t fulfill your requirements.
  • We added a mime type detection mechanism that allows eXo JCR to properly extract the meta-data and the full text content of a document, even if the extension is missing in the name of the file.
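
The default-value syntax mentioned in the list above can be illustrated with a small, self-contained sketch. This is not the eXo kernel's implementation; the class name DefaultValueSketch, the resolve method and the use of a plain Map as the variable source are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of eXo: resolves ${name} and ${name:default} placeholders.
final class DefaultValueSketch {
    // Group 1 is the variable name, group 2 the optional default after the colon.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    static String resolve(String text, Map<String, String> variables) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = variables.get(m.group(1));
            if (value == null) {
                // No value found: use the default, or keep the placeholder if there is none.
                value = m.group(2) != null ? m.group(2) : m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // With no variable defined, the default after the colon is used.
        System.out.println(resolve("${my.value:10}", Collections.<String, String>emptyMap()));
        // With the variable defined, its value wins over the default.
        System.out.println(resolve("${my.value:10}", Collections.singletonMap("my.value", "42")));
    }
}
```

The sketch leaves a placeholder untouched when it has neither a value nor a default; how the real kernel treats that case is not covered here.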

Improvements

Multi-Database Schema Support

Frequently a single database instance must be shared by several other applications. But some of our customers have also asked for a way to host several JCR instances in the same database instance. To fulfill this need, we had to review our queries and scope them to the current schema in a different way; it’s now possible to have one JCR instance per DB schema instead of per DB instance.

Monitoring

In this new version, we worked a lot on improving the monitoring capabilities. We did this by exposing many new MBeans in the JMX console, and by simply giving more understandable names to all the threads of eXo JCR. You now have a better overview of what’s going on in your eXo JCR instance.

Pattern-based Methods

In many applications based on eXo JCR, we realized that people obtain their data property by property; this is not necessarily a good practice, especially when there are many properties to fetch. To provide a much more scalable solution, we worked on improving the methods Node.getNodes(NamePattern) and Node.getProperties(NamePattern). So now, instead of potentially making the eXo JCR execute as many queries as you have properties in your node (assuming that the cache is empty), we can obtain all data properties using a single query.
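
To give an idea of what a name pattern selects, here is a minimal sketch of the matching semantics as I understand them from the JCR specification: a pattern is a list of alternatives separated by "|", and "*" stands for any run of characters. The class NamePatternSketch and its methods are hypothetical stand-ins; in a real application the repository does the matching and fetching when you call something like node.getProperties("jcr:* | title").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of JCR-style name patterns; not eXo JCR code.
final class NamePatternSketch {
    /** Returns true if name matches any '|'-separated alternative of namePattern. */
    static boolean matches(String namePattern, String name) {
        for (String glob : namePattern.split("\\|")) {
            // Split on '*' and keep empty pieces so leading/trailing wildcards survive.
            String[] literals = glob.trim().split("\\*", -1);
            StringBuilder regex = new StringBuilder();
            for (int i = 0; i < literals.length; i++) {
                if (i > 0) {
                    regex.append(".*");
                }
                regex.append(Pattern.quote(literals[i]));
            }
            if (name.matches(regex.toString())) {
                return true;
            }
        }
        return false;
    }

    /** Keeps only the names selected by the pattern, in their original order. */
    static List<String> select(String namePattern, List<String> names) {
        List<String> selected = new ArrayList<String>();
        for (String name : names) {
            if (matches(namePattern, name)) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

The point of the improved methods is that one such pattern lets the repository load every matching property in a single round trip instead of one per property.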

Other Interesting Improvements

  • Access Control List (ACL) management has been updated to drastically reduce the total amount of database accesses needed to get the ACL of a given node.
  • We added a new method in our internal interface called ExtendedNode.getNodesLazily. This can be used as an alternative to Node.getNodes in the parts of your application where you can have a lot of sub-nodes, and where consistency is not necessarily an issue.

Download

You can download eXo JCR 1.14.0-GA from here and get the documentation from jboss.org here.

Enjoy,
Nicolas

The Top 5 New Features in eXo Cloud IDE

Wednesday, July 20th, 2011

Four months ago we launched cloud-ide.com, the first free online service eXo has ever provided, and its success has been incredible. Our goal: to be the preferred path for developers to Platform-as-a-Service (PaaS) deployments.

Today we are unveiling a major upgrade to the service, with more than 75 new features. I would like to show you my top 5 favorites.

1) Git Support

Git’s popularity is huge, and more and more projects use it to manage their source code. Even some PaaS offerings, such as Heroku or OpenShift Express, use it as an application deployment paradigm.

Supporting Git in eXo Cloud IDE was clearly our number 1 priority, and we focused on improving the integration we announced in May at Red Hat Summit. Now we support most of the protocol commands, all natively integrated within eXo Cloud IDE.

As you can see in the first screenshot, we support many Git commands that are exposed in a new Git menu in the IDE. It is possible to init or clone a remote repository, add a file to the index, create a branch, add remote repositories and push the code to different branches on different remote repositories! And at every step of the way, you can view the current status of your repo.

image00

To be able to support private Git repositories, and to communicate with them using the SSH protocol, we have also added the capability to create private and public keys for dedicated domains, and the ability to upload existing private keys and bind them to a domain. In the next screenshots, you can see that I have created 2 private/public keys for the Heroku and GitHub domains, as well as uploaded 2 private keys for Red Hat OpenShift and CloudBees.

image07

It is also possible to browse the version history of Git repositories, see the changes and who made them!

image03

2) OpenShift and Heroku Support

The primary goal of eXo Cloud IDE is to be able to develop apps in the cloud, then deploy them to the different PaaS available in the market. With this upgrade, we now support 3 different PaaS, each with a different deployment model.

For Heroku and OpenShift, we use some REST commands from the PaaS menu (see the next screenshot) to create applications bound to a Git repository.

image05

Then we use the Git menu to clone and push modifications to this remote repository, such as the OpenShift repo shown in the next screenshot.

image04

We announced our Red Hat OpenShift support at Red Hat Summit in Boston in May. You can see the video demonstrating how to deploy to OpenShift here.

3) CloudBees Support

For deploying Java apps to CloudBees RUN@cloud PaaS, we only use Git and the CloudBees DEV@cloud service.

A developer first has to create a Java project in eXo Cloud IDE. Then he has to init a Git repository for it and push the code to the CloudBees Git repository (after having registered his public SSH key in the service). From here, we leverage CloudBees DEV@cloud, which uses Maven and Jenkins to manage both the build of the Java WAR artifacts and the deployment to CloudBees RUN@cloud PaaS.

image02

4) Java / JSP support

Java is the language of choice for most eXo developers. In the first version of Cloud IDE, a developer could write REST APIs in Java using the JAX-RS specification. He could also store structured data inside a Java Content Repository (JCR).

With this upgrade, we now also support standard Java classes (Servlets or POJOs) and Java Server Pages (JSP).

As before, every file has syntax highlighting, code completion and an outline. The next screenshot shows those features for a JSP page.

image06

Once the Java classes and JSPs have been written, it is possible to deploy them to CloudBees DEV@cloud, which manages the build (it can also run any unit tests that you add in the Cloud IDE) and the deployment of the generated WAR.

As you can see, you can now create, test and deploy standard Java projects directly in the Cloud.

5) Ruby and PHP Support

With the launch of Red Hat OpenShift Express, we announced support for the Ruby language within eXo Cloud IDE. A developer can quickly create a Ruby file; the IDE provides syntax highlighting, an outline and auto-completion. A Ruby project can then be deployed to either OpenShift or Heroku, as described previously.

image01

Red Hat OpenShift also supports the PHP language, so it was a great opportunity for us to add support for this dynamic language to our catalog. And of course, we have syntax highlighting, an outline and auto-completion.

These are my top 5 favorite new features. I hope you will check out the new and improved Cloud IDE and give us your feedback!

For now, we’re getting back to work – this is just the beginning of a new era.

eXo JCR 1.14.0-CR1 is out

Thursday, March 31st, 2011

The core Java Content Repository functionality that we leverage in both GateIn and eXo Platform 3 is eXo JCR (developed in JBoss.org, the JBoss open source forge). We have made many improvements and introduced new features in our upcoming version of eXo JCR (eXo JCR 1.14.0-CR1). The following post provides a technical deep-dive into some of these changes.

New Features

JBoss Infinispan

The most significant of the new features found in eXo JCR 1.14 is the ability to rely on Infinispan as the underlying cache; this provides a more scalable clustering solution. For now, we have only used and tested Infinispan (also known as ISPN) as an alternative to JBoss Cache (also known as JBC, which you can still use if you prefer). In other words, Infinispan is only used as a simple replicated cache, which is still interesting in terms of memory footprint and concurrency.

According to our first internal tests, ISPN seems to consume less memory than JBC; more importantly, ISPN clearly reduces the contention compared to JBC. With JBC, you can face contention issues especially when you use any eviction algorithm other than expiration, since any read access to a JBC Node will add an eviction event to the LinkedBlockingQueue instance of the whole region. In ISPN, they had the brilliant idea to implement their own version of ConcurrentHashMap, which they call BoundedConcurrentHashMap, to manage the eviction within each segment. This means that we now have one LinkedBlockingQueue instance per segment, so you can reduce the contention generated by the eviction algorithm by simply increasing the concurrency level.
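
The idea of handling eviction inside each segment can be sketched in a few lines. This is not Infinispan's BoundedConcurrentHashMap: the class SegmentedLruSketch is hypothetical, and it uses one access-ordered LinkedHashMap with its own lock per segment as an assumption for illustration. What it shows is that eviction bookkeeping only contends within a segment, so a higher concurrency level spreads the contention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of per-segment eviction; not Infinispan code.
final class SegmentedLruSketch<K, V> {
    private final LinkedHashMap<K, V>[] segments;

    @SuppressWarnings("unchecked")
    SegmentedLruSketch(int concurrencyLevel, final int maxEntriesPerSegment) {
        segments = new LinkedHashMap[concurrencyLevel];
        for (int i = 0; i < concurrencyLevel; i++) {
            // accessOrder=true keeps each segment in LRU order; eviction is decided per segment.
            segments[i] = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntriesPerSegment;
                }
            };
        }
    }

    private LinkedHashMap<K, V> segmentFor(Object key) {
        return segments[(key.hashCode() & 0x7fffffff) % segments.length];
    }

    V get(K key) {
        LinkedHashMap<K, V> segment = segmentFor(key);
        synchronized (segment) { // a read only locks its own segment
            return segment.get(key);
        }
    }

    V put(K key, V value) {
        LinkedHashMap<K, V> segment = segmentFor(key);
        synchronized (segment) { // a write may evict, but only within this segment
            return segment.put(key, value);
        }
    }

    int size() {
        int total = 0;
        for (LinkedHashMap<K, V> segment : segments) {
            synchronized (segment) {
                total += segment.size();
            }
        }
        return total;
    }
}
```

With a single segment every access would queue on the same lock, which is roughly the situation described for a region-wide eviction queue.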

Another significant improvement ISPN offers is the remove method. In our internal tests we realized that in some use cases, it could be over 800 times faster to remove a cache entry in ISPN compared to JBC. This is mainly due to the notion of Node trees in JBC that is not found in ISPN. Actually, when you remove a node in JBC, it needs to remove all its descendants – which consumes a lot of time and CPU when you have a lot of child nodes.

Next we will try to improve our ISPN integration, to fully benefit from the distributed cache capabilities offered by ISPN. In real-life scenarios, it is difficult to ask a customer to deploy their application on hundreds of instances of a given application server, since the required licenses and support would be cost-prohibitive (not to mention a nightmare for the administrator). On the other hand, it sounds more acceptable if the customer only needs to deploy their application on 3-8 app server instances; these would be used as front-end servers, while hundreds of ISPN cache instances could be deployed in standalone mode to act as the cache server. This would allow ISPN to be used as a cache server, although in our context this is not possible out of the box (due to a lack of JTA support when ISPN is used as cache server – more details here).

Java Security

As you may know, our new eXo Cloud IDE is a free developer service for Java Platform as a Service (PaaS). This ability to easily create and deploy REST components on the fly is very interesting in terms of productivity. However, it needs to be over-protected to ensure that no malicious users affect the integrity of your environment. So we made the entire eXo JCR stack rely on Java Security, meaning that when the SecurityManager is installed, access to sensitive methods is impossible unless the full call stack has enough rights.

Apache Tika

eXo JCR already has a plugin-based framework that enables the extraction of both the meta-data and the full text content of the most common document types, such as Text, XML, HTML, PDF, MS Office and Open Office documents. But we wanted to support many more types of documents, so we decided to implement a plugin for Tika. This is actually an open door to many new document types, including images, audio and video.

Other Interesting New Features…

  • If you dedicate a listener for a specific event broadcast by the ListenerService, you can elect to receive the event asynchronously by adding the annotation @Asynchronous (from the package org.exoplatform.services.listener) to the class declaration level of your listener.
  • eXo JCR can be deployed on Jetty.
  • H2 DB is now supported.
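
The asynchronous listener option in the list above can be pictured with the following sketch. Everything in it is a stand-in: the nested @Asynchronous annotation only mimics the real one from org.exoplatform.services.listener, and the Listener interface and Dispatcher class are hypothetical, not the ListenerService API. It only shows the general mechanism of checking for a class-level annotation and handing the event to a worker thread.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; the real annotation and service live in the eXo kernel.
final class AsyncListenerSketch {
    /** Stand-in for the real class-level annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Asynchronous {}

    interface Listener {
        void onEvent(String event);
    }

    /** Runs annotated listeners on a worker thread and all others on the caller's thread. */
    static final class Dispatcher {
        private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "listener-worker");
            t.setDaemon(true); // do not keep the JVM alive for pending events
            return t;
        });

        CompletableFuture<Void> broadcast(String event, Listener listener) {
            if (listener.getClass().isAnnotationPresent(Asynchronous.class)) {
                return CompletableFuture.runAsync(() -> listener.onEvent(event), executor);
            }
            listener.onEvent(event);
            return CompletableFuture.<Void>completedFuture(null);
        }

        void shutdown() {
            executor.shutdown();
        }
    }

    @Asynchronous
    static final class AuditListener implements Listener {
        volatile String thread; // name of the thread that delivered the event

        public void onEvent(String event) {
            thread = Thread.currentThread().getName();
        }
    }

    static final class InlineListener implements Listener {
        volatile String thread;

        public void onEvent(String event) {
            thread = Thread.currentThread().getName();
        }
    }
}
```

Calling broadcast with an AuditListener returns immediately and the event is handled on the worker thread, while an InlineListener is invoked before broadcast returns.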

Improvements

JCR Re-Indexing

With full text search engines such as Lucene, it is helpful to rebuild the indexes regularly to preserve consistency, get rid of potentially corrupted indexes, and ensure optimal performance. We decided to speed up the re-indexing mechanism by making it multi-threaded, and by relying more on features specific to RDBMS, such as SQL paging. The results are quite interesting: depending on the number of cores and the database type used, the indexing of millions of JCR nodes could be 4 to 6 times faster.
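
The combination of paging and multi-threading can be sketched as follows. This is only an illustration under stated assumptions, not the eXo JCR indexer: the class PagedReindexSketch is hypothetical, an in-memory list stands in for the database table, fetchPage stands in for a paged SQL query, and a concurrent set stands in for the Lucene index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of multi-threaded, page-by-page re-indexing.
final class PagedReindexSketch {
    /** Stand-in for a paged query: returns at most pageSize rows starting at offset. */
    static List<Integer> fetchPage(List<Integer> table, int offset, int pageSize) {
        int end = Math.min(offset + pageSize, table.size());
        if (offset >= end) {
            return new ArrayList<Integer>();
        }
        return new ArrayList<Integer>(table.subList(offset, end));
    }

    /** Indexes every row, one page per task, on a fixed pool of worker threads. */
    static Set<Integer> reindex(List<Integer> table, int pageSize, int threads) {
        Set<Integer> index = ConcurrentHashMap.newKeySet(); // stand-in for the index
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> tasks = new ArrayList<CompletableFuture<Void>>();
            for (int offset = 0; offset < table.size(); offset += pageSize) {
                final int pageOffset = offset;
                tasks.add(CompletableFuture.runAsync(
                        () -> index.addAll(fetchPage(table, pageOffset, pageSize)), pool));
            }
            for (CompletableFuture<Void> task : tasks) {
                task.join(); // wait until every page has been indexed
            }
        } finally {
            pool.shutdown();
        }
        return index;
    }
}
```

Because each task reads its own page, the workers never compete for the same rows, and the number of threads can be matched to the number of available cores.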

Lucene Indexing in Clustered Environments

In the previous version of eXo JCR, we stored the Lucene indexes in a shared file system, so it was possible to add a node to the cluster dynamically (meaning the new node could access the Lucene indexes directly, so it could be started and made available quickly). The problem with this approach was that read and write performance suffered, and that using a shared file system could have side effects. In addition, only the main cluster node (a.k.a. coordinator in JGroups terminology) could see the latest changes. This is because, for performance reasons, changes are only persisted after a certain amount of time, and the rest of the cluster could only see the persisted changes.

To improve this, we took a new approach. Each node can see all changes in near-real time, and has its own copy of the Lucene indexes. This improves performance and means we no longer rely on a shared file system. This change is possible because we were able to improve the index recovery. Now, when a new node is launched that doesn’t have its own copy of the Lucene indexes, you can decide, through configuration, either to rebuild them (if the DB is not too big, given that re-indexing has been improved too) or to get them from the coordinator. The latter method allows you to get a new node up and running in a reasonable amount of time, and fully benefit from having the Lucene indexes locally.

The next step will be to implement a non-blocking index recovery in order to have the new cluster node ready to use even faster.

Backup/Restore

The backup/restore feature has been completely revised to better meet enterprise requirements; it is now faster, more reliable and much easier to use.

Other Interesting Improvements

  • An application with a lot of workspaces requires a lot of JBC instances (3 per workspace: JCR Cache, JCR Indexing and JCR Lock). To reduce the total amount of JBC instances, you can configure your JCR to make your cache instances shareable – meaning you only need 3 JBC instances, and they can be used by all your workspaces. With this configuration, your JCR will create a dedicated JBC region per workspace instead of launching a new JBC instance.
  • The way missing values are cached is optimized for applications that require frequent testing of the existence of specific nodes or properties. If the searched-for node or property does not exist, the information indicating that the data is missing in the DB is stored in the cache, instead of accessing the database at each call. Because future re-tests will find this information in the cache instead of having to query the DB, your application will be faster and more scalable.
  • All cluster nodes can now be launched in parallel, even when the JCR has never been initialized. This was a limitation in the previous version, since the JCR had to be initialized first.
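
The caching of missing values described in the list above is often called negative caching. Here is a minimal sketch of the idea, not the eXo JCR cache itself: the class MissingValueCacheSketch is hypothetical, a Function stands in for the database lookup, and an empty Optional is used as the marker for "known to be missing". The sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of caching "missing" results; not eXo JCR code.
final class MissingValueCacheSketch {
    private final Map<String, Optional<String>> cache = new HashMap<String, Optional<String>>();
    private final Function<String, String> database; // returns null when the item does not exist
    int databaseHits = 0;

    MissingValueCacheSketch(Function<String, String> database) {
        this.database = database;
    }

    /** Returns the stored value, or null if the item is known not to exist. */
    String get(String path) {
        Optional<String> cached = cache.get(path);
        if (cached == null) {
            // Never looked up before: ask the database once and remember the answer,
            // including the answer "this item does not exist" (an empty Optional).
            databaseHits++;
            cached = Optional.ofNullable(database.apply(path));
            cache.put(path, cached);
        }
        return cached.orElse(null);
    }

    /** Must be called when the item is written, so a stale "missing" marker is dropped. */
    void invalidate(String path) {
        cache.remove(path);
    }
}
```

Repeated existence checks on the same absent node then cost one database access instead of one per call.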

Download

You can download eXo JCR 1.14.0-CR1 from here and get the documentation from jboss.org here.

You can test it with Jetty or Tomcat; for both, be sure to read the file exo-readme.txt to learn how to test it with the default configuration, JBoss Cache or Infinispan. In a nutshell, you simply need to launch it from the eXo batch scripts with a new parameter: jbc for JBoss Cache and ispn for Infinispan.

The best ways to quickly test it are:

  • The WebDav access available here
  • The FTP access available from port 2121

For both, use the account root with the password exo.

Enjoy,
Nicolas

Introduction to CRaSH

Tuesday, January 11th, 2011

I’ve just written a new tutorial that gives a technical introduction to CRaSH, an open source project I lead that makes interacting with Java Content Repository (JCR) technology easier. The complete tutorial can be found on the eXo Resource Center – but here’s a sneak peek:

It’s been a year now since I started the CRaSH project. We use Java Content Repository (JCR) technology a lot at eXo, and I realized we all spent too much time and effort trying to interact with content repositories. We needed a tool to make this easier – so I decided to write a shell for JCR. While this new project, CRaSH, started as an interactive shell for browsing, querying and modifying JCR repositories, it has evolved into more than that.

The architecture of CRaSH is founded on two ideas:

  • The capability to serve multiple protocols: telnet and SSH are must-haves
  • Extending the shell should be easy, and possible at runtime

CRaSH started very simply, so the first usable version took me only a few days to write. In this first version, I remember I used the Netty library to provide connectivity, as it had basic support for the telnet protocol (I didn’t need anything more at the time). I also selected the Groovy language for writing shell commands, thinking it was the perfect match for two reasons. First, Groovy is dynamic and easy to compile, and second, you only need a little knowledge of Groovy to begin using it.

Since then, CRaSH has evolved to become richer and offer more capabilities. Netty was dropped because its telnet support was too basic; instead, Wimpi Telnetd and Apache SSHD were adopted to provide a real shell experience. CRaSH benefited from a couple of contributions as well (it’s always nice to have people in the open source community helping you), so it is pretty mature as of the recent 1.0.0-beta18 release (the only missing feature I would like is command line completion).

CRaSH is now a valuable tool to interact with a JVM runtime. The latest release provides two bundles. The first one, the core bundle, can be deployed in any servlet container. The second one is the GateIn bundle, which is built specifically for the GateIn portal server to add a powerful set of JCR features.

In this tutorial, we will focus on explaining basic CRaSH development, and demonstrate this by coding a command that will display a nice list of the JVM system properties.
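
To give a rough idea of the kind of output such a command might produce, here is a plain-Java sketch that lists the JVM system properties as a sorted, aligned table. It is not a CRaSH command (those are written in Groovy, as explained above) and it does not use any CRaSH API; the class SystemPropertiesSketch and its format method are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the listing logic only; not a CRaSH command.
final class SystemPropertiesSketch {
    /** Formats the given properties as an aligned two-column table, sorted by name. */
    static String format(Map<String, String> properties) {
        TreeMap<String, String> sorted = new TreeMap<String, String>(properties);
        int width = "NAME".length(); // the first column is at least as wide as its header
        for (String name : sorted.keySet()) {
            width = Math.max(width, name.length());
        }
        String rowFormat = "%-" + width + "s  %s%n";
        StringBuilder out = new StringBuilder();
        out.append(String.format(rowFormat, "NAME", "VALUE"));
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            out.append(String.format(rowFormat, entry.getKey(), entry.getValue()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new TreeMap<String, String>();
        for (String name : System.getProperties().stringPropertyNames()) {
            properties.put(name, System.getProperty(name));
        }
        System.out.print(format(properties));
    }
}
```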

Continue reading the “Introduction to CRaSH” tutorial on the eXo Resource Center…

eXo at JUDCon Berlin: GateIn Presentation from Julien Viet

Friday, October 1st, 2010

For those of you who are attending the JBoss User & Developers Conference (JUDCon) in Berlin next week, be sure to check out Julien Viet’s session. Julien, who serves as the project manager (from the eXo side) for GateIn, will be giving a talk on Day 2 of the conference, 8 October, at 14:30 in the Workflow and BPM track. He’ll be introducing Chromattic, an open source project that provides GateIn with a JCR persistence layer for rapid development of content-based apps. Here’s the complete abstract:

Julien Viet – Advanced JCR Persistence in the GateIn Portal Framework

The GateIn Portal comes with a built-in Java Content Repository server for managing pages, layouts and portlets. The Chromattic open source project was initiated to develop the GateIn object model persistence in a JCR server. Beyond natively powering the heart of GateIn, Chromattic can be used to rapidly develop rich and complex JCR-based applications.

Chromattic is an object mapper framework that uses JCR as its persistence layer. It provides natural support for various JCR features, thanks to the use of Java annotations. Annotations declare which classes are mapped to nodes and how, turning any repository node into a Java object. It brings important features to JCR development, such as type safety and object orientation, which are lacking when the native JCR interfaces are used. The most-used features of modern IDEs, like code completion and refactoring, are de facto available when developing Chromattic applications.

The key concepts of Chromattic will be presented, through the development of a simple Chromattic application in real-time. This sample application will be made available to the attendees so they can use the sample code as a starting point. This advanced technical session will show:

  • How to integrate Chromattic with a modern IDE using a Maven-based build
  • How to deploy a Chromattic application in GateIn
  • How to connect to and manage a repository server
  • In addition, several advanced features of Chromattic will be highlighted, to demonstrate the power of the framework.