eXist 1.2.4 Released
Besides fixing critical bugs in the storage backend, the 1.2.4 release mainly improves the memory consumption of queries on large document sets. Major changes include:
- new node set implementation, which is much more memory efficient compared to previous approaches. The old implementation consumed a lot of memory when used with larger sets of documents. Obviously this had a negative effect on overall performance.
- reduced memory consumption of documents constructed during a query: if you have a query which creates thousands of small XML fragments, each of those fragments used to have its own document context with its own name pool and various fields which may never have been needed. Large parts of the document context are now shared between fragments and we make more use of lazy initialization, thus reducing the memory consumption of in-memory fragments dramatically (in my tests, I could save up to 100 MB of memory when creating a few thousand XML fragments in one query).
- fixed fatal B-tree bugs leading to index corruption (which usually caused an ArrayIndexOutOfBoundsException). The bugs were more likely to occur when indexing large string keys, but they may also have happened in other situations. The failure damaged the index and rendered the database unusable (though it could be repaired).
- fixed concurrency issues leading to an ArrayIndexOutOfBoundsException or NoSuchElementException when querying for attributes
- memory leak: we observed that the Xerces XML parser builds some internal data structures when validating a document, which are unfortunately not properly cleared afterwards. This is a major problem since eXist pools the XML parser instances. To work around this issue, eXist will no longer pool XML parsers which were used on larger documents.
- using full text and ngram indexes at the same time caused eXist to hang in an endless loop
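The parser-pooling workaround in the list above can be illustrated roughly as follows. This is only a sketch of the idea, written in Python rather than taken from eXist's Java code: the class `ParserPool`, the `FakeParser` stand-in and the size threshold are all assumptions made for illustration, and the actual limit eXist uses is not stated here.

```python
class FakeParser:
    """Stand-in for a pooled XML parser instance (hypothetical)."""


class ParserPool:
    """Reuse parsers, but drop those that handled a large document."""

    def __init__(self, max_pooled_doc_size: int = 1_000_000):
        # Assumed threshold in bytes; the real limit is not given in the text.
        self.max_pooled_doc_size = max_pooled_doc_size
        self._idle = []

    def borrow(self) -> FakeParser:
        # Hand out an idle parser if there is one, else create a new one.
        return self._idle.pop() if self._idle else FakeParser()

    def release(self, parser: FakeParser, doc_size: int) -> bool:
        """Return the parser to the pool; report whether it was kept."""
        if doc_size > self.max_pooled_doc_size:
            # Discard: the parser is not pooled again, so it can be garbage
            # collected together with any internal structures it built up.
            return False
        self._idle.append(parser)
        return True


pool = ParserPool()
small = pool.borrow()
pool.release(small, doc_size=10_000)      # kept for reuse
assert pool.borrow() is small

large = pool.borrow()
pool.release(large, doc_size=50_000_000)  # discarded
assert pool.borrow() is not large
```

The trade-off is that a fresh parser has to be created after every large document, which costs some time but keeps leaked validation structures from accumulating in the pool.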
The release is now available for download.
Note: all releases in the 1.2 branch are bug fix releases and can be considered stable. They only contain hand-selected changes which were ported back from the main development version.
Warning: Bad Memory Settings in 1.2.2 and 1.2.3
As reported by users, the 1.2.2 and 1.2.3 releases shipped with a bad memory configuration: in the main configuration file (conf.xml), the cacheSize parameter was set to 256M.
However, Java is started with only 128M max. memory, so using 256M for caches will sooner or later result in eXist hitting the wall.
The problem here is that the effects of an OutOfMemoryError are somewhat unpredictable and may lead to unnoticed corruption in the database. Java doesn't show many warnings before it runs out of memory. All you usually get is a message on stderr.
In general, the cacheSize parameter in conf.xml should never be set to more than 1/3 of the maximum memory available to Java. Please adjust cacheSize accordingly or increase Java's max memory (usually set through the -Xmx parameter which has to be passed on the java command line - see bin/functions.d/eXist-settings.sh or bin/startup.bat).
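The one-third rule can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of eXist: the helper names `parse_size_mb` and `cache_size_ok` are hypothetical, and it assumes that sizes are written with an `M` or `G` suffix as in the values quoted above.

```python
def parse_size_mb(value: str) -> int:
    """Convert a size string such as '256M' or '1G' to megabytes.

    Assumption: only the 'M' and 'G' suffixes are handled.
    """
    value = value.strip().upper()
    if value.endswith("G"):
        return int(value[:-1]) * 1024
    if value.endswith("M"):
        return int(value[:-1])
    raise ValueError(f"unsupported size: {value!r}")


def cache_size_ok(cache_size: str, java_max_memory: str) -> bool:
    """Return True if cacheSize is at most 1/3 of Java's max memory."""
    # Multiply instead of dividing to stay in integer arithmetic.
    return parse_size_mb(cache_size) * 3 <= parse_size_mb(java_max_memory)


# The configuration shipped with 1.2.2 and 1.2.3: 256M cache, 128M max memory.
print(cache_size_ok("256M", "128M"))  # False
# Under the 1/3 rule, a 256M cache needs at least 768M of max memory.
print(cache_size_ok("256M", "768M"))  # True
```

With the default 128M of Java memory, the same check shows that cacheSize should stay at or below 42M.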
Installer Issues
We still had some issues with the installer in the 1.2.2 release. This was a major problem for some users who redistribute eXist with their own application. Version 1.2.3 has been uploaded to solve those issues.
It also fixes the consistency checker, which was introduced with 1.2.1 and unfortunately triggered a false alarm in some cases.
Updating is not really necessary unless you had problems with the installer or rely on the consistency check service. The 1.2 branch is maintained separately from the development branch. This allows us to release selected bug fixes and improvements much more frequently.
Small Fixes in 1.2.2
For those who had problems with the 1.2.1 version, we uploaded a slightly updated release, which is now called 1.2.2.
New Stable Release 1.2.1
eXist 1.2.1 is mainly a bug fix release, which addresses a number of stability- and performance-critical issues. All releases in the 1.2 series are considered to be stable. They are limited to hand-selected changes, which have been ported from the current development trunk. New features or major code changes will be part of the 1.3 development series.
We nevertheless count more than 60 bug fixes in 1.2.1!
New Consistency Check and Emergency Backup Tools
When deploying eXist in a production environment, I really want to make sure that the database is in a consistent state and that potential problems are detected as early as possible. Even if the database is running well, bad things can happen which are outside of eXist's reach (e.g. an OutOfMemory error in the servlet container, which can be fatal).
eXist 1.2.1 will thus offer an automatic consistency and sanity checker. Its main job is to detect inconsistencies or damage in the core database files. This includes the document and collection storage (dom.dbx, collections.dbx) as well as the symbol table (symbols.dbx). While all the indexes can be rebuilt after a crash, corruption in the core files can lead to real data loss.
Another Old Problem is Solved: Processing In-Memory Fragments
I have some good news for all users who suffered from eXist generating too many temporary document fragments in the db: the current SVN trunk version doesn't need to store those temp fragments anymore! eXist is finally able to handle in-memory fragments in nearly the same way as persistent documents, which also means: without causing a performance bottleneck.
Summer of Code Deadline Extended
Google has extended the deadline for student applications until Monday, April 7, 2008. We hope this will convince a few more people to apply. So far we have received 6 proposals. However, I think that a detailed application which concentrates on concrete development steps will still have a realistic chance of being accepted.
Google Summer Of Code
eXist is participating in the Google Summer of Code again. Student applications can be submitted until March 26! The list of proposals along with the timeline can be found here:
http://www.exist-db.org/gsoc/2008/summer.html
Some of our project proposals may sound a bit ambitious, but please don't be scared: we are here to help ;-) We chose those projects because they represent more or less separate work packages which can be handled without knowing the entire eXist code base.
The list includes, for example, a "remote debugging interface" for XQuery. We certainly don't expect someone to write a full-blown debugger. What we need is a well-defined debugging interface on the server and a simple command-line prototype, which demonstrates the interface.
Or take "index-support for order-by, distinct-values and aggregate functions": eXist's new indexing architecture makes it possible to implement the required functionality as an index plugin. Not too difficult. The challenge though will be to make it efficient.
eXist 1.2 Released
After another three weeks of documentation work, I'm really happy to announce that eXist version 1.2 (codename: Rennes) is now available for download at
http://exist-db.org/download.html
We counted more than 2500 software changes for this release, including many serious stability fixes.
eXist at XML 2007
I have not been there myself, but as we heard from those who attended XML 2007, there were quite a few talks mentioning eXist:
Erik Bruchez: XForms and the eXist XML database: a perfect couple
Dan McCreary: Using XForms and eXist to Manage Metadata
Kurt Cagle: Lightweight XML
Mark Birbeck: XForms, XHTML, and RDFa for Internet-Facing Applications
Mark Birbeck: XForms on the Desktop using Sidewinder
The slides for the first two talks are available online. We also heard that Norm Walsh mentioned an XProc implementation in XQuery based on eXist (I guess this must be Jim Fuller's XProcXQ). XProc is an interesting new standard and I would love to see a simple implementation which can be easily integrated with eXist.
99.4% XQuery Conformance
We just reached another milestone in our struggle to make eXist 100% conformant with the XQuery specs: 99.4%! Details can be found on the official W3C XQuery Test Suite pages.
Recent changes were mostly related to namespace handling, though we also had a number of small fixes to the XQuery parser, including whitespace processing and ordering declarations.
Understanding the New Indexing Features
The upcoming release of eXist will introduce quite a few changes with respect to index types and index creation. While your old index configuration should still work with the new version, knowing the new features and possibilities can sometimes result in a dramatic performance boost.
To better understand the changes, we have to look at two different areas of development, which both have direct effects on indexing features:
- The switch to a modularized indexing architecture
- The new query-rewriting optimizer
New Wiki Online
This site is now going public as we are starting to switch the links on the eXist homepage to point to http://atomic.exist-db.org. We will try to move all valuable content from the old wiki over here. This has to be done manually though, as the old server is definitely dead. Well, it's a good chance to re-read and evaluate all the old stuff.
The new site will be in a "private beta" mode for now, which means that only selected users can edit entries. We will open registration for other users once we are sure the system is stable enough.
AtomicWiki: An Atom-based Wiki
What you can see here is a first live version of AtomicWiki, my XQuery-based Wiki engine. AtomicWiki started as an experiment to create a simple blog on top of eXist's existing Atom support. More and more features were added over the past months, so the project has evolved into a wiki-style system rather than "just" a weblog.
AtomicWiki is entirely based on the Atom Publishing Protocol and syndication format. All entries are stored as Atom feeds in eXist. We use the Atom Publishing Protocol to create and manipulate feeds and entries. Nearly all the functionality - except one Java function for parsing Wiki markup - is implemented in XQuery with the help of some XSLT and JavaScript.
What makes AtomicWiki really powerful, though, is its tight integration with XQuery!
Read more about AtomicWiki: /atomic/AtomicWikiFeatures