A Converter to RDF is a tool which converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application. See also: RDFImportersAndAdapters
Please add converters as you make them or hear of them.
Formats
in alphabetical order:
BibTeX
BibTeX is the format for bibliographic references in TeX.
bibtex2rdf transforms BibTeX files into RDF/XML. (Simile)
bibtex2rdf - A configurable BibTeX to RDF Converter by Wolf Siberski.
An online service set up at the Vrije Universiteit in Amsterdam, the Netherlands, following the OntoWeb portal vocabulary. The perl source can also be downloaded.
Java BibTeX-To-RDF Converter based on the SWRC terminology.
Bittorrent
http://www.inf.unideb.hu/~jeszy/rdfizers is alas now 404 (in 2007). This was a link from RDFizers but may be incorrect.
Debian & Fink dependencies
The package information in Debian and similar systems, with its general usefulness and its graph-like nature, is a clear candidate for conversion to RDF.
See VitaVoni blog about this.
finkn3.py takes Fink (Mac OS X port of Debian packaging) dependencies and converts them to RDF/N3. (SWAP) It is unclear whether this could be quickly adapted to export Debian data.
Email (RFC822 headers)
email2rdf transforms email mbox files into RDF/XML. (Simile)
aboutMsg.py converts email metadata to RDF. (SWAP)
SWAML transforms a mailing list into RDF/XML and XHTML+RDFa using SIOC.
Email::MIME::XMTP Perl extension to read and write XMTP
aperture.sf.net IMAP crawler
There are others in this vein which run over IMAP or mailbox files.
Excel
TopBraid Composer can convert Excel spreadsheets into instances of an RDF schema.
EXIF
See JPEG.
Flickr data
Dave Beckett's Flickcurl library can access Flickr information (including machine tags) and convert it to RDF.
Flat files
Unix systems store data (such as /etc/passwd) in flat text files whose fields are separated by a delimiter character; /etc/passwd, for example, uses colons.
flat2rdf converts classic unix text database files, like /etc/passwd, into RDF/N3 (Simile)
tab2n3.py takes tab-separated text (as typically output by all kinds of things, including Microsoft Outlook and spreadsheets) and converts it to N3, using the column headings to generate property URIs. (SWAP)
TopBraid Composer can convert tab-separated spreadsheet files into an RDF/OWL class with corresponding properties and instances.
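As a rough illustration of the "column headings become property URIs" idea, here is a minimal sketch. It is not the code of tab2n3.py or of any tool listed here; the function name tsv_to_n3 and the base namespace http://example.org/data# are made up for the example, and each row simply becomes a blank node.

```python
import csv
import io
import re


def escape(value):
    """Escape a string for use inside a double-quoted N3 literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def tsv_to_n3(text, base="http://example.org/data#"):
    """Convert tab-separated text to N3, one blank node per data row.

    The base namespace is a placeholder; a real conversion would pick
    a namespace that identifies the source of the data.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    headings = next(reader)
    # Turn each column heading into a local name (non-word runs become "_").
    props = [re.sub(r"\W+", "_", h.strip()) for h in headings]
    lines = [f"@prefix : <{base}> ."]
    for row in reader:
        # Empty cells produce no triple at all.
        pairs = [f':{p} "{escape(v)}"' for p, v in zip(props, row) if v != ""]
        if pairs:
            lines.append("[ " + " ;\n  ".join(pairs) + " ] .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(tsv_to_n3("name\thome page\nAlice\thttp://example.org/a\n"))
```

All values are emitted as plain string literals. Deciding which columns hold URIs, numbers or dates is exactly the kind of added information a real converter needs from its user.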
GPS
garmin2rdf.py reads a Garmin GPS receiver, dumping the contents in RDF/XML. (Matt Biddulph)
fromGarmin.py Downloads GPS data from a Garmin on a serial link to an RDF/N3 file. (SWAP)
iCalendar
iCalendar is an IETF standard for calendar (event and to-do list) data. iCalendar files are typically stored with a .ics extension.
fromIcal.py converts iCalendar form to RDF
toIcal.py converts RDF back into iCalendar.
aperture.sf.net includes a Java converter for iCalendar.
http://torrez.us/ics2rdf/ iCal to RDF Service
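To give a feel for what such a conversion involves, here is a small sketch that reads VEVENT blocks and writes N3 using the W3C RDF Calendar namespace. It is not the code of fromIcal.py or any other tool above. The function names are invented for the example, only a handful of properties are mapped, and date values are emitted as plain literals rather than modelled properly.

```python
CAL = "http://www.w3.org/2002/12/cal/ical#"

# iCalendar property name -> local name in the cal: namespace (small subset).
KEEP = {"SUMMARY": "summary", "LOCATION": "location",
        "DTSTART": "dtstart", "DTEND": "dtend", "UID": "uid"}


def unfold(text):
    """Join folded content lines: a continuation starts with a space or tab."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def events_to_n3(text):
    """Emit one blank node per VEVENT, with values as plain literals."""
    out = [f"@prefix cal: <{CAL}> ."]
    current = None  # property list of the VEVENT being read, if any
    for line in unfold(text):
        if line == "BEGIN:VEVENT":
            current = ["a cal:Vevent"]
        elif line == "END:VEVENT" and current is not None:
            out.append("[ " + " ;\n  ".join(current) + " ] .")
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            name = name.split(";", 1)[0].upper()  # drop parameters such as TZID
            if name in KEEP:
                # iCalendar's own backslash escapes are kept verbatim.
                value = value.replace("\\", "\\\\").replace('"', '\\"')
                current.append(f'cal:{KEEP[name]} "{value}"')
    return "\n".join(out)
```

Time zones, recurrence rules and nested components such as VALARM are ignored here, and they are where most of the real work lies.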
Java bytecode
java2rdf scans java bytecode for method calls and creates a description of the dependencies between classes and the package/archive encoded in RDF/N3. (Simile)
Javadoc
javadoc2rdf is a doclet that makes javadoc output metadata about your code (structure of the classes, methods, comments, etc.) encoded in RDF/N3. (Simile)
Issue tracking: Jira (http://www.atlassian.com/software/jira/)
jira2rdf transforms Atlassian Jira's events about bug reports and issue tracking into RDF/N3.
JPEG
The metadata within a JPEG photo is encoded in the EXIF standard.
jpeg2rdf scans a folder for JPEG files, parses the EXIF and IPTC metadata found in those files and dumps an RDF/N3 representation of it into a file. (Simile)
An adapted version of jhead extracts RDF data from the EXIF encoded in JPEG files within a directory. Generates RDF/N3. (SWAP)
LDIF
This is the format used for contact information in LDAP server systems. It is, for example, exported by Thunderbird's address book.
ldif2n3.py Very incomplete, but useful. Generates FOAF. Hides email addresses by hashing in the FOAF style if the -m command flag is given. (SWAP)
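"Hashing in the FOAF style" presumably refers to foaf:mbox_sha1sum, which is the SHA1 hash of the full mailto: URI rather than of the bare address. A minimal sketch of that calculation follows; the function names are invented for the example and this is not taken from ldif2n3.py.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the foaf:mbox_sha1sum value for an email address.

    The hash covers the whole mailto: URI, not just the address.
    """
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


def person_n3(name, email):
    """Describe a person in N3 without revealing the address itself."""
    safe_name = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"
        f'[ a foaf:Person ;\n  foaf:name "{safe_name}" ;\n'
        f'  foaf:mbox_sha1sum "{mbox_sha1sum(email)}" ] .'
    )


if __name__ == "__main__":
    print(person_n3("Alice Example", "alice@example.org"))
```

Two descriptions that carry the same hash can still be merged as the same person, which is the point of the property, but the address cannot be read straight off the data.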
Makefile
The unix Makefile syntax expresses dependencies between files in a software build.
make2n3.py converts the makefiles in several directories into RDF and merges them to get the big picture. (SWAP)
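The dependency graph in a Makefile maps onto triples quite directly. The sketch below is not make2n3.py: it only handles plain "target: prerequisite ..." rules on a single line, ignores variables, pattern rules and line continuations, and uses a made-up vocabulary (http://example.org/make#) with a hypothetical dependsOn property.

```python
def makefile_deps(text):
    """Return (target, prerequisite) pairs from simple one-line rules."""
    deps = []
    for line in text.splitlines():
        # Recipe lines start with a tab; whole-line comments start with '#'.
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        line = line.split("#", 1)[0]  # strip trailing comments
        if ":" not in line or "=" in line:
            continue  # not a rule, or a variable assignment (VAR = x, VAR := x)
        targets, prereqs = line.split(":", 1)
        prereqs = prereqs.lstrip(":")  # tolerate double-colon rules
        for target in targets.split():
            for prereq in prereqs.split():
                deps.append((target, prereq))
    return deps


def deps_to_n3(deps, vocab="http://example.org/make#"):
    """Write the pairs as N3, using file names as relative URIs."""
    lines = [f"@prefix mk: <{vocab}> ."]
    lines += [f"<{t}> mk:dependsOn <{p}> ." for t, p in deps]
    return "\n".join(lines)
```

Because the file names are relative URIs, parsing each output against the base URI of its own directory keeps the names apart, and the graphs from several directories can then be merged.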
MARC
transforms MARC records from Z39.2 format into MODS and then from MODS to an RDF representation of MODS.
Meteorological
Meteo is UK weather forecast data in RDF, extracted from NOAA's public domain GRIB files. Example: London.
Multimedia
Following the DRY principle, a pointer to tools in the realm of multimedia (origin: MMSEM-XG):
OAI-PMH
oai2rdf harvests an OAI-PMH repository and transforms the captured metadata into an RDF representation through pluggable XSLT stylesheets.
Outlook
Microsoft Outlook contains contact and event data, and so on in a proprietary format.
Lookout.py converts the Microsoft Outlook calendar and address format into RDF. (SWAP)
aperture.sf.net includes a Java crawler for MS Outlook.
Open Financial Exchange (OFX)
OFX is the format for downloaded bank statements and other financial information. There are various levels of OFX, the early ones being HTTP headers followed by SGML, the later ones being HTTP-like headers followed by XML.
OFX-to-n3.y converts OFX format to RDF/N3. The conversion is only syntactic. The OFX modeling is pretty well thought out, so taking it as defining an RDF ontology seems to make sense. Rules can then be used to define mapping into your favorite ontology.
Open CourseWare
ocw2rdf harvests metadata from the MIT OpenCourseWare web site and transforms it into an RDF representation of IEEE LOM.
Palm OS
Palmagent converts the calendar format of PalmOS into RDF. (SWAP)
plist
The Apple OS X property list (.plist) filetype is an XML format for arbitrary structured data. Numeric keys are used as local IDs. OS X applications store many kinds of data in these files, including configuration data, iPhoto album and photo data, iTunes metadata, and so on.
To convert plists well, added information is necessary, such as a namespace for the properties.
plist2rdf.xsl is an XSLT script to convert a plist file into RDF/XML. It does not add namespaces to the exported data.
Quicken Interchange Format (QIF)
qif2n3.py takes Quicken Interchange Format and converts it to RDF/N3. (SWAP)
Quick and Dirty CSV to RDF Converter (QUIDICRC)
quidicrc A perl script for rapidly transferring csv to RDF with some translation in the middle. (not actively being maintained, available open source -- SWAP)
Random
Seriously.
random2rdf generates synthetic random graphs encoded in RDF/N3.
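For anyone who just needs test data, the idea fits in a few lines. This sketch is not random2rdf; the function name, the namespace http://example.org/random# and the linksTo property are all made up for the example.

```python
import random


def random_graph_n3(nodes, edges, seed=None,
                    base="http://example.org/random#"):
    """Generate `edges` random links among `nodes` numbered resources as N3.

    Passing a seed makes the output repeatable, which helps when the
    graph is used as test data.
    """
    rng = random.Random(seed)
    lines = [f"@prefix : <{base}> ."]
    for _ in range(edges):
        subject = rng.randrange(nodes)
        obj = rng.randrange(nodes)
        lines.append(f":n{subject} :linksTo :n{obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(random_graph_n3(nodes=5, edges=8, seed=1))
```

Duplicate statements and self-links can occur; an RDF store collapses the duplicates when the data is loaded.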
Spreadsheet
Excel2rdf is a Microsoft Windows program (exe) that converts Excel files into valid RDF. It has been tested on Windows 98 and Windows 2000 Professional. (MindSwap)
Export can also be done via comma- or tab-separated values. See Flat files above.
aperture.sf.net includes a Java crawler for Excel and OpenDocument. It extracts only plain text and basic metadata, though.
RDF123 has Windows and Linux applications to download, a Java application and servlet.
SQL
SQL databases are rich stores of relational data ideal for exporting as RDF. Conference tracks and many papers cover this subject from different angles. See also: RdfAndSql
D2R Server provides a mapping from a SQL server (tested with several brands), producing both linked virtual RDF data files and a SPARQL service. Uses a configuration file in N3. (Bizer et al, Freie Universität Berlin)
dbview.py provides a mapping from a SQL server (tested with MySQL), producing linked virtual RDF data files. Uses a configuration file in N3. (SWAP)
Virtuoso's declarative N3/Turtle-based Metaschema Language enables the creation of RDF instance data for associated RDF ontologies via RDF VIEWs of SQL data accessible through ODBC, JDBC, ADO.NET, and OLE DB. These VIEWs also apply to native Virtuoso data and to heterogeneous data from other Web Services, HTTP/WebDAV, NNTP, and other data sources known to Virtuoso. This is an enhancement of the traditional SQL VIEW concept that enables multiple uses of the same base SQL data from a variety of data access points.
Many RDF Triple stores are implemented using SQL databases, but that is not covered here.
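The simplest possible mapping treats each row as a resource, each column as a property and each cell as a literal. The sketch below shows that naive mapping against an SQLite database from the Python standard library. It is not how D2R Server, dbview.py or Virtuoso work (they are driven by mapping files and serve the data on demand), and the function name and base URI are made up for the example.

```python
import sqlite3


def table_to_ntriples(conn, table, key, base="http://example.org/db/"):
    """Dump one table as N-Triples: row -> subject, column -> property.

    `table` and `key` are trusted identifiers here; a real tool would
    take them from a mapping file rather than from user input.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{base}{table}/{record[key]}>"
        for col, value in record.items():
            if col == key or value is None:
                continue  # the key is in the subject URI; NULL means no triple
            literal = (str(value).replace("\\", "\\\\")
                                 .replace('"', '\\"')
                                 .replace("\n", "\\n"))
            triples.append(f'{subject} <{base}{table}#{col}> "{literal}" .')
    return triples


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO person VALUES (1, 'Alice')")
    print("\n".join(table_to_ntriples(db, "person", "id")))
```

Foreign keys are where this gets interesting: a useful mapping turns them into links between resources instead of literals, which is what the configuration files of the tools above are for.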
Subversion
Subversion is a code-management system.
svn2rdf A pair of scripts; one can be used in a post-commit subversion hook to generate RDF/N3 with each commit, the other on a working copy. (Simile)
Tab Separated Text
See flat files.
Talis SW Format Converter
Talis' converter converts between various formats (including RDF->RDF with various serializations, RDF->HTML, etc.)
UML
TopBraid Composer can convert UML Class Diagrams (XMI format) into RDF/OWL models.
vCard, Addressbook, …
vCard is a standard for interchange of contact data, such as business cards and address books.
"Representing vCard Objects in RDF/XML" is a W3C note defining an ontology for vCard. FOAF is a widely used ontology covering some of the domain.
* code to convert your Apple Addressbook into FOAF file (Richard Newman)
* ab-foaf does the same.
* XML::FOAFKnows::FromvCard, Perl extension to create FOAF dumps from vCards. Does not attempt to create a full model, just foaf:knows. It also has some privacy features. In addition to the module, which conforms with the Formatter API specification, comes with a command-line tool.
Weather
weather2rdf Given a US city or ZIP code, retrieves weather report data from weather.com and returns it in RDF. (Simile)
XML
GRDDL: Any XML files can be marked up with pointers to XSLT files which convert them to RDF. The standard for this is GRDDL. A GRDDL pointer can even be put in an XML schema, so that automatically all XML documents written to that schema will have a defined RDF mapping which any GRDDL-aware processor will benefit from. Several XSLT conversion transformations can be found linked from MicroModels
TopBraid Composer can convert XML Schema (and their XML instance files) into RDF/OWL models.
Rhizomik ReDeFer includes XSD2OWL and XML2RDF plus MPEG-7 to RDF (all XSLT-based)
XHTML: Convert existing pages to RDF. For example, see HtmlToRdf.
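For a plain XML document, the GRDDL pointer is a grddl:transformation attribute on the root element, holding a space-separated list of transformation URIs. The sketch below only finds those pointers, using the Python standard library. Fetching and running the XSLT needs an XSLT processor and is left out, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def grddl_transformations(xml_text):
    """Return the transformation URIs named on the root element, if any."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attribute names as "{namespace}local".
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    return value.split()


if __name__ == "__main__":
    doc = (
        '<order xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
        'grddl:transformation="http://example.org/order2rdf.xsl"/>'
    )
    print(grddl_transformations(doc))
```

A GRDDL-aware processor would also look for pointers in the namespace document or schema, which is how a single pointer can cover every document written to that schema.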
XMP
XMP is an Adobe-sponsored specification for putting RDF metadata in virtually any form of file, including binary formats. XMP metadata is RDF data in fact, but it has to be extracted from the file.
xmpextractor extracts XMP data. (Jeszenszky Péter)
A python script to extract XMP. There is also a service to do that on-line, see separate page
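Because the XMP packet is embedded as text, a crude but common way to get at it is to scan the file's bytes for the x:xmpmeta element. The sketch below does just that. It is not the code of either tool above, it assumes a UTF-8 packet (XMP also allows UTF-16 and UTF-32), and it returns only the first packet found.

```python
def extract_xmp(data):
    """Return the first <x:xmpmeta> element found in `data` (bytes), or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    close = b"</x:xmpmeta>"
    end = data.find(close, start)
    if end == -1:
        return None
    return data[start:end + len(close)].decode("utf-8", errors="replace")


def extract_xmp_from_file(path):
    """Read a whole file into memory and scan it; fine for typical photos."""
    with open(path, "rb") as handle:
        return extract_xmp(handle.read())
```

The string that comes back wraps an rdf:RDF element, so it can be handed to an RDF/XML parser as it is.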
Frameworks
The following are general tools which provide conversion from many formats.
Aperture
aperture.sf.net Aperture is a project written in Java gathering RDF extractors for many formats, mentioned in the list above.
Aperture also supports crawling, so it is less a single converter than a framework for picking up changes to data sources incrementally (like rsync).
PiggyBank
Piggy-bank is a Simile project which allows the Firefox-based client to automatically load "RDFizers", JavaScript-based converters to RDF.
Piggy-bank associates given scraping scripts with given web sites. (How?)
Triplr
There is also a general “Stuff in, triples out” system by Dave Beckett, not bound to one specific format only, handles GRDDL, RSS, Atom, etc.
OpenLink Software
OpenLink Software via the "Sponger" component of Virtuoso's SPARQL Processor and Proxy Web Service (used by default by OpenLink Data Explorer) provides RDFization for:
- RDFa
- GRDDL
- Amazon Web Services
- eBay Web Services
- Freebase Web Services
- Facebook Web Services
- Yahoo! Finance
- XBRL Instance documents
- DOI (includes a custom resolver for HTTP)
- OAI
- RSS/Atom Feeds
- Digital Music Files (various formats via ID3 Tags)
- Image Files
- vCard
- iCalendar
- Microformats - hCard, hCalendar
- HR-XML Resumes
- Flickr
- Del.icio.us
- Bugzilla
- ODBC or JDBC accessible SQL Data
- Others
Notes
Historically, this list was made from lists of RDFizers and SWAP converters. It has grown significantly from community input since then.
This should be in a data format like Semantic Media Wiki or in N3 -- TimBL
> Would there be an advantage to having this kind of list in an RDF file, specifically to make queries on it? Maybe if we add a format on how to declare it here, we could create a converter to RDF. -- KarlDubost
> The task force InfoGathering from SWEO works on such a vocabulary, if you want to rewrite this list using this vocab, look here: DataVocabulary or contact me -- LeoSauermann on 22.1.2007