Ontopia JDO

It is with great pleasure that we announce the next step in the Ontopia development, the Ontopia JDO project.

Rationale
As we worked toward a release of Ontopia 5.3.1, we encountered the massive block of code that is the Ontopia RDBMS backend. This huge block of extremely important code was developed as part of the OKS, before it was open sourced. As such, most of this code is now at least 6 years old, and the core code is probably even older. Over the years we have come to understand the limitations of this code. To name a few important ones:

  • Supported databases are limited
  • Optimization requires extensive knowledge of the code
  • Optimization is database dependent
  • Tracing and debugging is complicated
  • Full text search is complicated, and database dependent
  • TOLOG RDBMS is incomplete or broken
  • Adding use case oriented optimized SQL queries requires complex code hacking

When faced with these issues during real world projects, we set out to improve upon Ontopia. We soon came to the conclusion that improving or changing the existing code would be a massive undertaking. So instead we chose to research new, yet proven, technologies that offer the features Ontopia requires.

JDO
JDO, which stands for Java Data Objects, is a specification for Java object persistence. It allows a domain model, represented by POJOs, to be mapped to a persistence store. JDO was originally specified as JSR 12 in 2002, and the latest version (3.0) was released in 2010. JDO was not the only ORM technology we looked at, but it best suited the needs of Ontopia.

DataNucleus
Because JDO is a JSR, several implementations exist. We chose to work with DataNucleus initially, as it is the reference implementation for the latest JDO specification and supports the largest number of data stores.

Benefits
Here are some of the benefits we should be able to achieve with this project:

  • Many datastores: RDBMS (all?), graph based (Neo4j!), document based (Mongo, …), object based, web based.
  • External optimization: optimization is (mostly) part of the JDO abstraction layer, which means we won’t have to program it.
  • Use of open source community: JDO and DataNucleus are maintained by a large open source community, which means we get improvements with each new version.
  • Better integration: extending Ontopia’s data model with your own JDO-persisted POJOs should now be possible.
  • Basic full-text searching for every datastore: The project provides a very basic full text search over JDO.
  • TOLOG RDBMS remake possible: internally, tolog-rdbms creates JDO queries that are converted to SQL. It could now leverage JDO features directly.

Downside
Sadly, there is a downside. The RDBMS schema has changed. Although the new schema closely resembles the Ontopia 5.x schema, it was impossible to fully reuse it. To mitigate this issue, we plan to create a tool that migrates an existing RDBMS backend to the new JDO backend as efficiently as possible.

Project status
The code committed to GitHub at the time of this post has been in development for about a year. It has been tested within the scope of Ontopia code, meaning all the backend tests in net.ontopia.topicmaps.core. Beyond these basic tests, Morpheus has tested the project in combination with existing frameworks and projects based on Ontopia. All these tests are now successful, which means the project is ready to be beta tested.

Roadmap
The project goes into beta testing with this post. We ask you, the Ontopia community, to test it in your projects and frameworks. We would especially like to see all the different datastores tested before we officially claim that Ontopia can be used with all stores. Do not hesitate to ask questions, report issues, or even better: create pull requests. In the coming days we will add known issues and to-dos to the issue tracker. See the README on GitHub to get started with ontopia-jdo.

The Ontopia committers.

Ontopia.toMaven()

Ontopia’s developer team is committed to switching from Ant to Maven as the build and project management tool for the Ontopia code base. Making this switch has been ongoing work since 2009. This blog post summarizes the work that has been done so far and the work that still needs to be done.

Why Maven?
Ontopia’s biggest problem is that the code base forms one massive block that cannot be split up. Many developers and end users have complained about this and have asked for the product to be modularized. The Ant build file that is currently used to build Ontopia is about 3000 lines long and has become difficult to maintain. Also, as we discovered along the way, it contains obsolete parts and many tasks are heavily tangled. Cleaning up the build file is not straightforward and will remain a problem as the project evolves. At TMRA 2010 Morpheus presented a proposal to start using Maven instead of Ant.
Maven is a project management and comprehension tool that has become increasingly popular over the last couple of years. It uses convention over configuration. Instead of configuring every setting over and over again, Maven uses conventions for commonly used tasks. It uses a standard for directory naming and for the build cycle that is used to compile, test, build and deploy software. As a result, it takes a lot less XML to tell the system how Ontopia should be built. Of course this requires the code base to follow the convention, which is what we’ve been working on since July 2009.
Additional benefits of following the Maven convention are that the directory structure starts to reflect the modular architecture of the code base and that test files and resources become separated from the actual code. This creates a more transparent code base, in which developers can find their way more easily.
Maven is used extensively in software written in Java. It is mature software and a lot of support can be found online. Many plugins are available, for instance for pre-compiling JSP or creating Docbook documentation. We believe Maven is currently the best option for Ontopia. Later it would be possible to create build scripts based on other project comprehension tools like Gradle or Buildr, which use the same file layout as Maven.
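To make the "standard build cycle" a bit more concrete, here is a minimal sketch of how the phases of Maven's default lifecycle build on each other. The class DefaultLifecycle and its method are hypothetical names used only for illustration, and the phase list is abbreviated (the real lifecycle contains more phases). The point is that asking for a later phase such as install implies every earlier phase, including test.

```java
import java.util.Arrays;
import java.util.List;

/** Abbreviated, illustrative model of Maven's default build lifecycle. */
final class DefaultLifecycle {

  // Main phases of the default lifecycle, in execution order.
  // The real lifecycle has more phases (for example generate-sources
  // and integration-test); they are left out here for brevity.
  static final List<String> PHASES = Arrays.asList(
      "validate", "compile", "test", "package", "verify", "install", "deploy");

  /** Returns every phase that runs when the given phase is requested. */
  static List<String> phasesRunFor(String requested) {
    int index = PHASES.indexOf(requested);
    if (index < 0) {
      throw new IllegalArgumentException("Unknown phase: " + requested);
    }
    // All phases up to and including the requested one are executed.
    return PHASES.subList(0, index + 1);
  }

  public static void main(String[] args) {
    // prints [validate, compile, test, package, verify, install]
    System.out.println(phasesRunFor("install"));
  }
}
```

Because the order is fixed by convention, a build file does not have to spell out that compiling comes before testing or that testing comes before packaging.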

What needed to be done to support a modularized architecture?
To modularize Ontopia we distinguish three main parts:

  • Java code: core functionality, db2tm, classify, navigator, etc. contain only Java code. We’ve split these up into modules where this was possible.
  • Web applications: Omnigator, Ontopoly and the supporting web applications are now modules of Ontopia. Each application can be built separately if needed, and will eventually end up in the distribution.
  • Distribution: for most users, this is what Ontopia is: the zip file containing Tomcat and all tools and applications. Currently this module builds only a Tomcat distribution, but it is set up to allow distributions for other container servers to be added in the future.

Maven has a specific project structure. To implement this structure we needed to move a lot of files into the correct location. After all the code was moved into the correct location and compiled again, we started working on the test cases. Maven runs the tests on every build, which the current Ontopia build does not. Maven also automatically detects test classes. Most of our work went into changing the test cases into Maven-runnable test cases. During this process we discovered that not every test case of Ontopia is being run in the current build process.
The next step was to move all the web applications into Maven modules. The new Maven web applications are now being pre-compiled, which brought up some old and broken code.
Finally, the distribution needed to be redefined. The Maven modules are collected and placed into a freshly downloaded Tomcat.
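The automatic detection of test classes mentioned above is based on naming conventions. As a rough sketch, assuming the default include patterns of the Maven Surefire plugin (**/Test*.java, **/*Test.java and **/*TestCase.java; newer Surefire versions add more), the check looks roughly like this. The class TestClassDetection and the example class names are hypothetical and not taken from the Ontopia code base.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrates name-based test detection; not Surefire's actual implementation. */
final class TestClassDetection {

  /** True if the simple class name matches one of the assumed default patterns. */
  static boolean isDetected(String simpleName) {
    return simpleName.startsWith("Test")     // **/Test*.java
        || simpleName.endsWith("Test")       // **/*Test.java
        || simpleName.endsWith("TestCase");  // **/*TestCase.java
  }

  /** Keeps only the class names that would be picked up as tests. */
  static List<String> detected(List<String> simpleNames) {
    return simpleNames.stream()
        .filter(TestClassDetection::isDetected)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Hypothetical class names, for illustration only.
    List<String> names = Arrays.asList(
        "TopicMapTest", "TestQueryParser", "ImportTestCase", "QueryVerifier");
    // prints [TopicMapTest, TestQueryParser, ImportTestCase]
    System.out.println(detected(names));
  }
}
```

A test class whose name follows none of these conventions is silently skipped unless it is renamed or listed explicitly in the plugin configuration, which is one reason converting an existing test suite takes time.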

What will change for end users?
We aim toward a build that generates essentially the same distribution as the one that is now available, so that users are not directly affected by the changes. After a successful transfer to Maven, we can start improving the quality of Ontopia. This of course translates into fewer bugs for users.

What will change for developers?
The biggest changes in this process are aimed at making Ontopia easier to use as a project dependency and easier to maintain. Developers using Ontopia will get more choice in which parts of Ontopia they want to use. For example, a topic map browsing web application would depend on the Navigator module only.
The work of the Ontopia developers can now be aimed more directly at a certain module, allowing for easier splitting of developer tasks. The modules now have clear lines between them, so that debugging becomes easier.

What has been done so far?
Currently, we are at about 90% completion of the conversion to Maven. We have created a branch on Google Code, called ontopia-maven, in which we are working. The steps we have taken so far include:

  • Java code has been moved into modules (100%)
  • Most of the test files have been modified to the new situation (99%)
  • Web applications have been moved into modules (90%)
  • A distribution with Tomcat has been added (80%)

We are still working on finishing the TMRAP service, the documentation, the vizlet and the overall fine-tuning of the distribution.

When is the switch expected to be finished?
At the moment there is no date set for any action after finishing the branch. There are several decisions to be made before we can define a time frame. Of course there will be ample notification in advance of any major changes. Our best guess is to finalize the transition in the summer of 2011.

How will the merging be done?
At the moment there are several thoughts about how we can merge the results of our work back into the trunk:

  • Replacing the trunk with the branch. This means we have to apply every commit on the trunk since the branching moment on the branch. The current trunk would then be tagged as the last non-maven version.
  • A ‘normal’ merge to the trunk. The usual way of ending the life of a branch is by merging it back into the trunk. We expect that this will create a lot of conflicts due to the amount of moves, copies and changes that needed to be done. During the process of fixing these conflicts, the trunk would be locked until we reach a stable build.
  • Replacing the trunk with a backup plan. This would be almost the same as the second option, except for a safety precaution: we would create a new branch from the trunk as backup for emergency fixes/builds. Once the merge is finished, we can merge back any emergency changes from the trunk branch.

None of these options has been chosen yet, and we are open to suggestions from people with experience in massive branch merges.

How is the progress monitored?
The Ontopia Maven branch is currently built on a Hudson server at Morpheus, which runs nightly builds and also runs a Sonar analysis. This is used to keep track of the buildability of the Maven project and the status of the test cases. It also provides us with some nice metrics:

  • Modules: 18 (10 java, 7 web applications, 1 distribution)
  • Lines of code: 148,895
  • Java classes: 2145
  • Tests: 4529 (of which 39 are currently failing)
  • Test success: 99.1%
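The test success figure follows directly from the two test counts in the list. A small check of the arithmetic (the class name TestSuccessRate is just for illustration):

```java
import java.util.Locale;

/** Recomputes the test success percentage from the reported counts. */
final class TestSuccessRate {

  /** Percentage of passing tests, given the total and the number failing. */
  static double successPercent(int total, int failing) {
    return 100.0 * (total - failing) / total;
  }

  public static void main(String[] args) {
    // Figures from the list above: 4529 tests, of which 39 are failing.
    // prints 99.1%
    System.out.println(String.format(Locale.ROOT, "%.1f%%", successPercent(4529, 39)));
  }
}
```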

We are looking into the possibility of sharing access to the Hudson and Sonar results.

Is there something to see / play around with?
Yes there is! The Ontopia Maven code is publicly available in the Ontopia repository, under branches/ontopia-maven/ontopia-maven. If you want to build Ontopia yourself, please install Maven and run the following from the project’s root directory:

 mvn clean install -Dmaven.test.failure.ignore=true -Pontopia-distribution-tomcat

The maven.test.failure.ignore setting is a temporary workaround for the last failing test cases. Without it, the build halts at the first module with failing tests. Since Maven will run all test cases (including the RDBMS ones), the number of test failures will be a few hundred if you do not provide an RDBMS property file (-DargLine="-Dnet.ontopia.topicmaps.impl.rdbms.PropertyFile=/path/to/file.props"). The -Pontopia-distribution-tomcat option activates an additional profile that includes the distribution in the build. It is not included in a default build. Once the build is complete, you can find the distribution in the ontopia-distribution-tomcat/target/ontopia-distribution-tomcat-**/ folder.
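The property file setting is wrapped in -DargLine because the tests run in a separate JVM, and argLine is how extra JVM arguments are handed to that JVM. Inside the tests the value is then visible as an ordinary system property. The following is only a sketch of how a test helper could pick it up. The class RdbmsPropertyFile is hypothetical and is not Ontopia's actual test code; only the system property name is taken from the command line above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper that locates the RDBMS property file for tests. */
final class RdbmsPropertyFile {

  // System property name as used on the Maven command line.
  static final String KEY = "net.ontopia.topicmaps.impl.rdbms.PropertyFile";

  /** Loads the property file named by the system property, if one was given. */
  static Optional<Properties> load() {
    String path = System.getProperty(KEY);
    if (path == null || path.isEmpty()) {
      // No file configured: RDBMS tests have nothing to connect to.
      return Optional.empty();
    }
    Properties props = new Properties();
    try (InputStream in = Files.newInputStream(Paths.get(path))) {
      props.load(in);
    } catch (IOException e) {
      throw new UncheckedIOException("Cannot read " + path, e);
    }
    return Optional.of(props);
  }

  public static void main(String[] args) {
    System.out.println(load().isPresent()
        ? "RDBMS property file found"
        : "No RDBMS property file given");
  }
}
```

Running the main method with and without -Dnet.ontopia.topicmaps.impl.rdbms.PropertyFile=/path/to/file.props shows the two cases.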

How can I help? Whom to contact for questions?
You can help us by building Ontopia with Maven yourself and either trying out the distribution or the new artifacts as dependencies in other projects. Issues you find can be reported on the Ontopia issue tracker. Keep in mind however that this branch is quite old and might not contain fixes already committed to the trunk.
Any of the Ontopia contact options can put you into contact with people that can answer your questions.

Conclusion
Switching to Maven will be a great leap in the maintainability of the Ontopia code base. Building, testing, releasing, etc. of the code will be done based on a standardized life cycle. The file layout will be more transparent by standardizing the directory structure and separating test and resource files from the code. Also, by using Maven’s modularized approach, we will be able to build parts of Ontopia separately and gain the possibility to create customized distributions, for example for different web containers.

The conversion is now almost complete, but still resides in a Subversion branch, waiting to be merged back into the trunk. We are looking forward to meeting you on the other side.

Ontopia gets an SDshare implementation

A prototype server implementation of the SDshare protocol was added to the sandbox today. The implementation is not yet tested, and somewhat incomplete, but good enough to be tried out. The readme file has more details, plus build instructions.

This particular implementation of SDshare allows the contents of one topic map to be replicated into another topic map, potentially on a different server. One use case is to have one hub server build a merged topic map from SDshare streams from a number of upstream servers. Another is to transfer updates from a staging server to a production server. And so on.

The implementation is being written for two reasons. One is to learn SDshare. The other is because Bouvet is likely to get a project which will be based on SDshare, in which case using Ontopia for some of the SDshare servers is one of the options we are exploring. So work on the implementation is likely to continue.

Note that the implementation is so far an SDshare server only. There is no client for the time being.

Aranuka 1.0 is released

Hannes Niederhausen of The TopicMaps Lab in Leipzig has released Aranuka 1.0, a Topic Maps object data binding tool, which supports persisting information stored in Java objects in a topic map. Effectively, it means you can write normal Java objects encapsulating your business logic and have Aranuka take care of storing the data in a topic map for you. Aranuka works with Ontopia and tinyTiM.

Aranuka is not part of Ontopia, but since it adds value to Ontopia users we thought it would be good to mention it here.

Here is an example showing how you could implement a simple class representing address topics, and have Aranuka store the data about the address in a topic map for you:

@Topic(subject_identifier="ex:address")
public class Address {

  @Id(type=IdType.ITEM_IDENTIFIER)
  private int id;

  @Occurrence(type="ex:zipcode")
  private String zipCode;

  @Occurrence(type="ex:city")
  private String city;

  @Occurrence(type="ex:street")
  private String street;

  @Occurrence(type="ex:number")
  private String number;

  public int getId() {
    return id;
  }

  public void setId(int id) {
    this.id = id;
  }

  public String getZipCode() {
    return zipCode;
  }

  public void setZipCode(String zipCode) {
    this.zipCode = zipCode;
  }

  public String getCity() {
    return city;
  }

  public void setCity(String city) {
    this.city = city;
  }

  public String getStreet() {
    return street;
  }

  public void setStreet(String street) {
    this.street = street;
  }

  public String getNumber() {
    return number;
  }

  public void setNumber(String number) {
    this.number = number;
  }
}

The example was taken from the Aranuka manual, which has more information.

Toma implementation in Ontopia

Thomas Neidhart of SpaceApplications has implemented the Toma query language on top of Ontopia. Toma is a query language for Topic Maps designed by Rani Pinchuk of SpaceApplications. The implementation is currently in the sandbox part of Subversion, and not part of Ontopia proper, but the wiki page explains how to check it out and run it. For now, Toma queries can only be run using a command-line client (or the API), but work is under way to make Toma available in the Omnigator query plug-in.