OpenZoom Description Format

When we look at multiscale or multi-resolution imaging in 2008,¹ we’re mostly looking at a pile of image files² (called tiles) that make up an image pyramid.³ Typically, image file formats such as JPEG and PNG store properties like width and height inside the file. These formats act not only as a carrier of image data but also as a container for the metadata associated with it. This is a manifestation of the “The Truth Is in the File” paradigm.

Deprecation Warning
The OpenZoom Description Format was just a proof-of-concept. For real-world applications, I highly recommend using the Microsoft Deep Zoom file format as it has the widest client & tooling support, is well-documented and can be used within collections.

The Truth Is Out There

Taking this paradigm into account, we suddenly encounter a problem with multiscale images. If the original image is exploded into many little pieces, where do we store its metadata? There are different solutions to this problem. For mapping sites such as Google Maps and Yahoo Maps it is probably sufficient to just hard-code the image pyramid properties and how to access the tiles directly inside the client. However, for general multiscale image viewing technologies such as Zoomify, Deep Zoom or OpenZoom this is not an option since we don’t know the properties of the images until run-time. Again, there’s a simple and elegant solution for this: XML description files that carry the image metadata.

Rumble In the Jungle

In the following section I will first briefly present the two dominant multiscale image description formats out there: Zoomify and Microsoft Deep Zoom. After that, I will introduce a new description format I designed, called the OpenZoom description format.

Note: The following examples all describe a 10 megapixel JPEG image with the name bruges.

Zoomify
Example
This Zoomify image has the following structure on the file system:

Descriptor
bruges/ImageProperties.xml
[filename]/ImageProperties.xml

Tiles
bruges/TileGroup0/0-0-0.jpg
bruges/TileGroup0/1-0-0.jpg
bruges/TileGroup0/1-0-1.jpg
bruges/TileGroup0/1-1-0.jpg
bruges/TileGroup0/1-1-1.jpg
bruges/TileGroup0/2-0-0.jpg
…
bruges/TileGroup0/4-15-9.jpg
[filename]/TileGroup[X]/[level]-[column]-[row].jpg
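The naming pattern above can be expressed as a small helper. This is only a minimal sketch: the function names are hypothetical, and the tile group index is passed in rather than computed, since the grouping rule is not described here.

```python
def zoomify_descriptor_path(name: str) -> str:
    """Path of the descriptor: [filename]/ImageProperties.xml."""
    return f"{name}/ImageProperties.xml"


def zoomify_tile_path(name: str, group: int, level: int, column: int, row: int) -> str:
    """Path of one tile: [filename]/TileGroup[X]/[level]-[column]-[row].jpg.

    `group` is supplied by the caller; this sketch does not derive it.
    """
    return f"{name}/TileGroup{group}/{level}-{column}-{row}.jpg"


if __name__ == "__main__":
    print(zoomify_descriptor_path("bruges"))
    print(zoomify_tile_path("bruges", 0, 4, 15, 9))
```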

Deep Zoom Image (DZI)
Example
  
This Deep Zoom image (DZI) has the following structure on the file system:

Descriptor
bruges.xml
[filename].[xml|dzi]

Tiles
bruges_files/0/0_0.jpg
bruges_files/1/0_0.jpg
bruges_files/2/0_0.jpg
…
bruges_files/9/0_0.jpg
bruges_files/9/0_1.jpg
bruges_files/9/1_0.jpg
bruges_files/9/1_1.jpg
…
bruges_files/12/15_9.jpg
[filename]_files/[level]/[column]_[row].[extension]
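To get a feel for what a client has to compute from a short DZI descriptor, here is a rough sketch. It assumes the usual power-of-two pyramid (each level is half the size of the next one up), a tile size of 256 pixels and no tile overlap. The dimensions 3872 × 2560 are hypothetical, chosen only because they are consistent with the tile names listed above. The function names are illustrative and not part of any Deep Zoom API.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def dzi_max_level(width: int, height: int) -> int:
    """Index of the full-resolution level: ceil(log2(longest side))."""
    return (max(width, height) - 1).bit_length()


def dzi_level_size(width: int, height: int, level: int) -> tuple:
    """Pixel dimensions of a level; each step down halves the size."""
    scale = 2 ** (dzi_max_level(width, height) - level)
    return max(1, ceil_div(width, scale)), max(1, ceil_div(height, scale))


def dzi_tile_grid(width: int, height: int, level: int, tile_size: int = 256) -> tuple:
    """Number of (columns, rows) at a level, ignoring overlap (assumption)."""
    w, h = dzi_level_size(width, height, level)
    return ceil_div(w, tile_size), ceil_div(h, tile_size)


def dzi_tile_path(name: str, level: int, column: int, row: int, extension: str = "jpg") -> str:
    """Path of one tile: [filename]_files/[level]/[column]_[row].[extension]."""
    return f"{name}_files/{level}/{column}_{row}.{extension}"


if __name__ == "__main__":
    WIDTH, HEIGHT = 3872, 2560  # hypothetical dimensions for bruges
    top = dzi_max_level(WIDTH, HEIGHT)
    cols, rows = dzi_tile_grid(WIDTH, HEIGHT, top)
    print(top, cols, rows)
    print(dzi_tile_path("bruges", top, cols - 1, rows - 1))
```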

OpenZoom Description Format
The following is actually a description of the Deep Zoom image we've looked at previously.

Descriptor
bruges.xml
[filename].xml

Tiles
Wherever you wish…

Example
OpenZoom Description Format XML Schema (Draft)

Not Invented Here

Alright, we’ve seen examples of all three description formats for the same image. Before anything else, you might ask yourself: Why the #&$@ another format? Good attitude and glad you asked. Hopefully, I will be able to answer this question for most of you. If not, just leave me a comment, I’d be glad to discuss this further. Now, let’s compare these three formats by looking at where they shine but of course also at their shortcomings.

Conciseness

Obviously, Zoomify and Deep Zoom win big time here. Their description files have a couple of lines vs the 40+ lines of the OpenZoom descriptor which inherently is very verbose — Ed.: Levels 2–7 omitted for esthetic reasons. On the other hand, we should keep in mind that everything we see in the OpenZoom descriptor sample somehow has to be computed by the client for the other two formats. More on that later.

Portability

Not sure if portability is the right term, but let me explain what I mean: How flexible is the format regarding the storage of the descriptor and its image tiles? Deep Zoom is the most extreme case of the three, as the descriptor file and the image tiles are strongly coupled through the original file name of your image. That means if you move your descriptor you always have to remember to move the image data folder as well. This could be considered risky as the two are not contained in one folder. Zoomify has the same limitation but at least the image data and its descriptor are both contained in the same folder that carries the name of the original image. OpenZoom is clearly the most portable of the three as it lets you specify the descriptor file independently of the image tiles.

Important Note: Both Microsoft and Zoomify offer an alternative storage method in the form of a single-file format. They are called Zoomify’s Pyramidal File Format (PFF) and DZIZ (a ZIP-based container for DZI) which I’ve seen used by Microsoft Photosynth.

Flexibility

Flexibility apparently was not a design goal of Microsoft or Zoomify. This is fine considering that the design of such a new format requires these kinds of trade-offs. Their assumption is that the descriptor file and the image tiles are strongly coupled and that the latter are computed with a well-defined algorithm and stored in a fixed file hierarchy. Flexibility is the area where the OpenZoom description format shines. When I worked on the OpenZoom description format, I obviously followed the Zen of Python, which states: “Explicit is better than implicit.” Although one drawback is the verbosity of the format, there are many advantages we can get from it. For example, when I worked on the OpenZoom framework, I wanted to test it with some really large multiscale images that are out there. Well, what is the largest image out there that I know of? A map of the world, of course. The OpenStreetMap Project, for example, features many, many gigapixels of image data. Fine, so how do I test the framework with a map? Hard-code the URLs somewhere? No, no. Let’s create a descriptor for it. So I did. Grab it and try it with your copy of the OpenZoom framework. Look Ma’, no code!

This example demonstrates one of the advantages of the format, namely your descriptor file does not have to be stored along with your image data. Just put your descriptor wherever you wish and point it to the image tiles.

Features: OpenZoom Description Format

The following section gives you a short summary of some of the features in the OpenZoom description format.

Flexible Pyramid Layout

Behind both Zoomify and Deep Zoom, there are well-specified algorithms that create the image pyramid and define its properties. To get an idea of how the formats expand the information you previously saw in their descriptors, feel free to take a look at their implementation in OpenZoom: ZoomifyDescriptor and DZIDescriptor.

The OpenZoom description format doesn’t require a particular layout of the image pyramid. One requirement would be that every level of the pyramid approximately has the same aspect ratio but I’ve even managed to work around that constraint. To give you an idea of how powerful this flexibility is, consider the following couple of facts:

The OpenZoom description format can express both the properties of a Deep Zoom image pyramid and those of one produced by Zoomify. Besides these, it supports the pyramids of OpenStreetMap, Google Maps (road, terrain and satellite) and many more.
Just like in Deep Zoom, you can specify tile overlap⁵ in the OpenZoom description format.
Unlike Deep Zoom or Zoomify, the OpenZoom description format also supports non-square image tiles by exposing a tileWidth as well as a tileHeight property. Deep Zoom and Zoomify obviously don't have to support this as they know that their algorithms don't produce non-square tiles. The OpenZoom format, however, has to accommodate legacy multiscale image data that has non-square tiles.
The thing that surprised me most is that even images on Flickr, which are stored in many different dimensions, can be arranged into an image pyramid. The levels of a Flickr image pyramid are quite irregular compared to Deep Zoom and Zoomify as they are bounded by maximum side lengths of 100, 240, 500, 1024 and original. Even though it isn't very efficient since Flickr doesn't support tiles, images from Flickr can be rendered as multiscale images inside OpenZoom.
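To illustrate how irregular such a Flickr pyramid is, here is a minimal sketch that derives the level dimensions from the maximum side lengths mentioned above. The rounding rule, the function name and the sample dimensions are assumptions; Flickr's actual resizing may differ by a pixel here and there.

```python
def flickr_levels(width: int, height: int, bounds=(100, 240, 500, 1024)) -> list:
    """Return (width, height) per pyramid level, smallest first.

    Each level is scaled so its longest side equals one of the bounds;
    the last level is the original image. Rounding is an assumption.
    """
    longest = max(width, height)
    levels = []
    for bound in bounds:
        if bound >= longest:
            break  # the original is already smaller than this bound
        levels.append((
            max(1, round(width * bound / longest)),
            max(1, round(height * bound / longest)),
        ))
    levels.append((width, height))
    return levels


if __name__ == "__main__":
    # Hypothetical original dimensions
    for level, size in enumerate(flickr_levels(3872, 2560)):
        print(level, size)
```

Note that consecutive levels do not differ by a constant factor of two, which is exactly what a fixed pyramid algorithm could not describe.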

Important Note:

Deep Zoom features a powerful concept called Sparse Images not present in any other format known to me. ~~However, I am considering to incorporate this feature into the OpenZoom description format at a later date.~~

Powerful Addressing Scheme

column, row, cartesian or rectangular coordinate system, {row}, {column}, [0, numRows), [0, numColumns), numRows, numColumns, level, .php, .cfm, type, pyramid
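As a rough illustration of addressing with {column} and {row} placeholders, the following sketch expands a URL template for one tile and checks that the indices lie in [0, numColumns) and [0, numRows). The template strings, the host and the function name are hypothetical, and the real format may define further rules not shown here.

```python
def expand_tile_url(template: str, column: int, row: int,
                    num_columns: int, num_rows: int) -> str:
    """Substitute {column} and {row} in a tile URL template.

    Valid indices are column in [0, num_columns) and row in [0, num_rows).
    """
    if not 0 <= column < num_columns:
        raise ValueError(f"column {column} outside [0, {num_columns})")
    if not 0 <= row < num_rows:
        raise ValueError(f"row {row} outside [0, {num_rows})")
    return template.replace("{column}", str(column)).replace("{row}", str(row))


if __name__ == "__main__":
    # Hypothetical templates: a static file layout and a server-side script
    static = "http://example.com/bruges_files/12/{column}_{row}.jpg"
    script = "http://example.com/tile.php?level=12&x={column}&y={row}"
    print(expand_tile_url(static, 15, 9, 16, 10))
    print(expand_tile_url(script, 15, 9, 16, 10))
```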

Exceptions:

Obviously, no matter how powerful a design is, there are always things it can't handle. For the OpenZoom description format this means sources such as Microsoft's Virtual Earth or the GigaPan project, which both feature a quadtree-based addressing scheme. That the OpenZoom description format cannot describe these kinds of sources doesn't mean the OpenZoom framework can't render them. However, doing that involves some amount of code, which in the case of OpenZoom would mean implementing the IMultiScaleImageDescriptor interface. For Silverlight Deep Zoom that would be the abstract MultiScaleTileSource class.
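To show why a quadtree-based scheme does not fit a plain column and row template, here is a sketch of the quadkey encoding as publicly documented for Virtual Earth's tile system: the bits of the column and row are interleaved into one base-4 string, so the tile name cannot be produced by simple substitution. The function name is illustrative, and GigaPan's exact scheme may differ.

```python
def tile_to_quadkey(tile_x: int, tile_y: int, level: int) -> str:
    """Interleave the bits of column (tile_x) and row (tile_y) into a quadkey.

    One base-4 digit per level, most significant first:
    +1 if the column bit is set, +2 if the row bit is set.
    """
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


if __name__ == "__main__":
    print(tile_to_quadkey(3, 5, 3))
```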

Support for Multiple URLs

limited to 2 concurrent requests per domain
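The usual motivation for listing several URLs is that a client capped at a few simultaneous requests per domain can load more tiles in parallel when they are spread across hosts. Here is a minimal sketch of one way a client might pick a host for a tile. The host names and the modulo rule are assumptions, not necessarily what OpenZoom does.

```python
def pick_tile_host(hosts: list, column: int, row: int) -> str:
    """Choose a host deterministically so the same tile always maps to
    the same host (which keeps caching effective) while neighbouring
    tiles are spread across all hosts."""
    if not hosts:
        raise ValueError("at least one host is required")
    return hosts[(column + row) % len(hosts)]


if __name__ == "__main__":
    # Hypothetical mirrors serving identical tiles
    hosts = ["http://a.example.com", "http://b.example.com", "http://c.example.com"]
    for column in range(4):
        print(column, pick_tile_host(hosts, column, 0))
```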

Example

Ease of Implementation

OpenZoom source code repository

ZoomifyDescriptor

DZIDescriptor

OpenZoomDescriptor

Conclusion

OpenZoom description format

yet another description format

OpenZoom framework

behind the scenes

Acknowledgement

Boris

OpenZoom description format specification

YAMSIDF.

Footnotes

[1 & 3] If you'd like to get some more background on this topic, I wrote an introduction to multiscale imaging and another article about the mathematical properties of an image pyramid using Microsoft's Deep Zoom as an example.

[2] From my own experience, I know that there are unfortunately still people out there who think that there is some magic going on behind multiscale imaging. To set this straight, if you've used any of the following, Google Maps, Yahoo Maps, Microsoft Virtual Earth, Silverlight Deep Zoom, Seadragon AJAX, Seadragon Mobile or Zoomify,⁴ you should know that all of them basically work the same, namely with off-the-shelf JPEG or PNG image files. These files are stored either on disk or in a database. Once requested, they are sent to and rendered on the client, which in the previous examples is either the browser, the Flash or Silverlight plugin or the iPhone. But you might ask: What about JPEG 2000? Indeed, there are some possible candidates for image file formats out there which would bring better support for multiscale imaging in the future. Two of them are JPEG 2000 and HD Photo. We won't see significant adoption of the first anytime soon because of legal issues such as this one. HD Photo originated at Microsoft and is being considered as a successor to the JPEG standard, dubbed JPEG XR. Again, widespread use won't happen overnight.

[4] By the way, OpenZoom supports most of these out of the box.

[5] In Inside Deep Zoom 2 I've explained the concept of tile overlap.

RTFM / Daniel Gasienica

December 19, 2008
