Hamburg 2004

Overview

TMHarvest gathers topic maps.

The seed is your existing data, watered and nourished. The topic map is the crop, and TMHarvest helps you reap it.

Features

Features of TMHarvest:

  1. TMHarvest creates topic maps from existing data sources. Currently it supports the following source types: SQL databases, CSV files, and XPath queries. In addition, it is possible to integrate custom model providers written in Java. See <customProvider> for details.
  2. The harvesting is driven by a model-file.

    The model may be processed as often as you like. Since the data sources are queried at processing time, it is possible to build topic maps periodically in order to reflect changing content.

  3. The model-file has a modular setup. The resulting topic map is populated step by step from distinct templateActions. The static parts of the topic map (the structural parts, for example the topics that define the ontology of the topic map) may be merged from static files and are therefore separated from the actual harvest.

With TMHarvest you can populate topic maps with real data, processed in real-time.

TMHarvest allows you to build topic maps that have a solid structure and reflect changing content.

Audience

The current version targets developers and information architects with a strong affinity for technical questions.

In the long run, it should be possible to develop interfaces that enable technophobic authors and surprised users to define and create topic maps with the help of TMHarvest.

The model-file

The model-file contains a set of actions that describe, step by step, the process of creating a topic map.

The file must be written in XML; a documented DTD exists.

You can find a more detailed description of the model-file here.

Modelling/Ontology

The harvesting process is defined by a set of building modules (templateActions). Such a module may describe, for example, the creation of a topic or of an association, or the addition of occurrences to some topics. Typically there is a strong correlation between one templateAction and one part of the ontology, such as a single topic type.
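The correlation between one templateAction and one topic type can be pictured with a small Java sketch. It is only an illustration under stated assumptions: the names TemplateSketch, Topic, TopicTemplate and sampleRows, as well as the sample rows, are invented here and are not part of TMHarvest or TM4J, and the real templateActions are declared in the XML model-file rather than in Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names come from TMHarvest or TM4J.
final class TemplateSketch {

    // A minimal stand-in for a topic: an id, a display name and a topic type.
    record Topic(String id, String name, String type) {}

    // A stand-in for one template action: it knows which topic type it produces
    // and which columns of a row supply the id and the name.
    record TopicTemplate(String type, String idColumn, String nameColumn) {

        // Apply the template to every row and return one topic per row.
        List<Topic> apply(List<Map<String, String>> rows) {
            List<Topic> topics = new ArrayList<>();
            for (Map<String, String> row : rows) {
                topics.add(new Topic(
                        type + "-" + row.get(idColumn),
                        row.get(nameColumn),
                        type));
            }
            return topics;
        }
    }

    // Made-up rows, shaped as they might come back from a SQL query or a CSV file.
    static List<Map<String, String>> sampleRows() {
        return List.of(
                Map.of("id", "1", "title", "Kind of Blue"),
                Map.of("id", "2", "title", "A Love Supreme"));
    }

    public static void main(String[] args) {
        TopicTemplate albums = new TopicTemplate("album", "id", "title");
        for (Topic t : albums.apply(sampleRows())) {
            System.out.println(t.id() + " (" + t.type() + "): " + t.name());
        }
    }
}
```

Each template in this sketch produces topics of exactly one type, which mirrors the idea that adding a new template usually means adding a new part of the ontology.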

Working on the model-file simultaneously shapes the development of the ontology. It is conceivable that this approach will influence graphical user-interface development some day.

The types that constitute the ontology may be created on the fly while a template is processed. A much better modus operandi, however, is to define them separately in an external topic map. This map can be included at the beginning of the harvesting process with the help of the mergeAction.
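Why merging the ontology first is the safer order can be shown with a minimal Java sketch. It assumes a drastically simplified topic map (topics keyed by id) and invented names (MergeFirstSketch, TopicMap, addType, addInstance); it does not use the TM4J API and says nothing about how the mergeAction is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: these names are not TMHarvest or TM4J API.
final class MergeFirstSketch {

    // A topic with an id and the id of its type (null for the types themselves).
    record Topic(String id, String typeId) {}

    // A topic map reduced to its bare minimum: topics keyed by id.
    static final class TopicMap {
        private final Map<String, Topic> topics = new LinkedHashMap<>();

        // Define a topic that serves as a type in the ontology.
        void addType(String id) {
            topics.put(id, new Topic(id, null));
        }

        // Merge step: copy every topic of another map into this one.
        // Topics that already exist under the same id are kept as they are.
        void merge(TopicMap other) {
            other.topics.forEach(topics::putIfAbsent);
        }

        // Harvest step: add an instance, but only if its type is already defined.
        void addInstance(String id, String typeId) {
            if (!topics.containsKey(typeId)) {
                throw new IllegalStateException("Unknown topic type: " + typeId);
            }
            topics.put(id, new Topic(id, typeId));
        }

        boolean contains(String id) {
            return topics.containsKey(id);
        }

        int size() {
            return topics.size();
        }
    }

    // Build a result map: merge the static ontology first, then add harvested instances.
    static TopicMap build() {
        TopicMap ontology = new TopicMap();
        ontology.addType("album");
        ontology.addType("artist");

        TopicMap result = new TopicMap();
        result.merge(ontology);
        result.addInstance("album-1", "album");
        result.addInstance("artist-7", "artist");
        return result;
    }
}
```

Because the types are in place before the first instance is added, a misspelled or missing type shows up as an error instead of silently creating a new type on the fly.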

Pros and Cons

The process that creates the topic map is serialized in the model-file. Once the process is defined, it can be run as often as you like. This enables the creation of topic maps that are refreshed periodically in an automated way.
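One way to automate such a periodic refresh from Java is the standard ScheduledExecutorService. The following is a sketch under assumptions: PeriodicHarvestSketch and runPeriodically are invented names, and the harvest argument is a placeholder for whatever actually starts a TMHarvest run, which is not shown here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: the harvest job is a placeholder Runnable.
final class PeriodicHarvestSketch {

    // Runs the harvest job at a fixed rate, waits until it has been attempted
    // `runs` times, stops the scheduler and returns the number of successful runs.
    static int runPeriodically(Runnable harvest, int runs, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(runs);
        AtomicInteger completed = new AtomicInteger();

        scheduler.scheduleAtFixedRate(() -> {
            try {
                harvest.run();
                completed.incrementAndGet();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all further executions,
                // so a failed harvest is only reported here.
                System.err.println("Harvest failed: " + e.getMessage());
            } finally {
                remaining.countDown();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            remaining.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return completed.get();
    }
}
```

A long-running service would normally leave the scheduler running and choose a period of minutes or hours; the fixed number of runs is only there to keep the sketch finite.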

Working with the templates is an iterative, interactive experience: you shape the model, run TMHarvest, review the results, and reshape the model until you converge on the final form.

The fact that the model-file must be written by hand excludes many users who lack the necessary technical affinity. The development of graphical interfaces or wizards would be really helpful.

Context

TMHarvest is a subproject of TM4J, an open source topic map engine written in Java and mainly developed by Kal Ahmed.

TM4J, along with its subprojects, is hosted at SourceForge.

TMHarvest, like TM4J, uses a wide range of existing open source libraries, mainly developed by the Apache Jakarta and Apache XML projects.

Reading metadata from MP3 files is done with the help of javamusictag.

Team

TMHarvest is currently developed by Christoph Fröhlich (c_froehlich at users.sourceforge.net) and Harald Kuhn (harald_kuhn at users.sourceforge.net).

Very valuable contributions came from Niko Schmuck (niko_schmuck at users.sourceforge.net).

Anyone who wants to step in will be received with cheers.