Outwit (software)

The OutWit platform is a Web harvester and download management environment developed by OutWit Technologies. It was first released as a public beta in May 2008.

The central module of the platform, the OutWit Kernel, includes a library of recognition and extraction functions, packaged as a free extension for Mozilla Firefox. Specific applications can be built around the kernel using its application programming interface (API). The platform's license allows advanced users to build and distribute their own original tools, called outfits, which take advantage of the Kernel's features for specific applications. Each outfit is a small XUL extension with its own user interface, features, scripts, scrapers, and directory of Web sources, among other components.

The technology is presented as a step towards a semantic browser that would recognize data and media elements, using metadata when present and inferring semantic information when possible. The software automatically browses through Web sources to harvest information objects and organize them into reusable and sharable collections or mashups.

OutWit Hub is the first tool based on the OutWit platform. The beta version gathers a series of features to ease Web searches and organize collections. By breaking down the elements of a Web page into different types of data, such as images, links, email addresses, text, and tables, the program allows users to manipulate only the desired data and use it in a variety of applications. The application automatically browses through Web sources in full screen, analyzing each page's navigation links and guessing the most pertinent next page URL. This way, with or without programming skills or technical knowledge, users can create automatic agents and scrapers to gather and format the information they seek.
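The idea of sorting a page's elements by type can be illustrated with a minimal Python sketch. This is not OutWit's code or API: the names PageSorter and sort_page are hypothetical, only the Python standard library is used, and the "next page" guess is reduced to the assumption that a link carries rel="next", which is much simpler than analyzing navigation links in general.

```python
import re
from html.parser import HTMLParser

# Deliberately simple e-mail pattern; an assumption for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PageSorter(HTMLParser):
    """Hypothetical helper: sort the elements of one HTML page into typed lists."""

    def __init__(self):
        super().__init__()
        self.links = []       # href values of ordinary <a> tags
        self.images = []      # src values of <img> tags
        self.emails = []      # addresses from mailto: links and from page text
        self.next_url = None  # first link explicitly marked rel="next", if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            href = attrs["href"]
            if href.startswith("mailto:"):
                self.emails.append(href[len("mailto:"):])
            else:
                self.links.append(href)
                # Assumption: the page labels its next-page link with rel="next".
                if attrs.get("rel") == "next" and self.next_url is None:
                    self.next_url = href
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_data(self, data):
        # Collect e-mail addresses that appear in visible text.
        self.emails.extend(EMAIL_RE.findall(data))


def sort_page(html):
    """Parse an HTML string and return the populated PageSorter."""
    sorter = PageSorter()
    sorter.feed(html)
    sorter.close()
    return sorter
```

Calling sort_page on a page's source returns an object whose links, images, and emails lists can then be handled separately, which is the kind of separation by data type described above.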

While some of the data extraction functions are traditional web/screen scraping features, requiring the creation of specific extraction masks for a page, others act more as intelligent filters, eliminating all data not specifically requested.
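The difference between the two approaches can be sketched in Python. This sketch assumes that an extraction mask can be modeled as a pair of literal marker strings surrounding the wanted value, and that a filter is a predicate applied to data already collected. The function names apply_mask and keep_only are hypothetical and do not come from OutWit.

```python
import re


def apply_mask(page_source, before, after):
    """Return every piece of text found between the two literal markers.

    The (before, after) pair plays the role of a page-specific mask.
    """
    pattern = re.escape(before) + r"(.*?)" + re.escape(after)
    return [m.strip() for m in re.findall(pattern, page_source, flags=re.DOTALL)]


def keep_only(items, wanted):
    """Filter approach: drop every item the predicate does not accept."""
    return [item for item in items if wanted(item)]
```

A mask such as apply_mask(source, '<td class="price">', '</td>') has to be written for the markup of one particular page, whereas a filter such as keep_only(urls, lambda u: u.endswith(".png")) can be applied to links gathered from any page.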


The OutWit kernel's basic feature library includes:

  • Data structure recognition
  • Automatic multi-page browsing
  • Full-screen browsing
  • Automatic slide show on image searches
  • Page & image link extraction
  • E-mail extraction (automatic extraction is limited)
  • Table and list extraction
  • Syntax-colored page source
  • Scraper editor for custom data extraction
