About Voyeurtools

Voyant Tools is a labour of love. Its ancestry includes HyperPo and TAPoRware and, more distantly, TACT.

Text analysis tools go back to the first ad hoc tools that Roberto Busa developed for his concordance of the works of Thomas Aquinas, and to Andrew Booth’s Mechanical Resolution of Linguistic Problems in the 1950s.

Voyant is a suite of tools for exploring and analyzing digital texts. Very few contributions to knowledge and technology are unrecognizable from what came before, and Voyant is no exception: it is largely built on the foundations of text analysis tool design and methodology laid by more than 50 years of humanities computing research. The following are some of the tools that have most influenced text analysis tool development, and Voyant in particular:

Software Libraries

Voyant Tools is made possible by several open source libraries (many of these libraries use additional libraries not listed here):

  • back-end system (Trombone):
    • Apache Hadoop for running scalable distributed processes
    • Apache PDFBox for reading PDF documents
    • Apache POI for reading Microsoft Office documents
    • Apache Commons Math, Collections, FileUpload, IO, Compress
    • CyberNeko HTML Parser for reading (less than valid) HTML
    • Google Translate API for automatic translation of documents
    • JAMA: Java Matrix Package for principal component and correspondence analysis in ScatterPlot
    • MAchine Learning for LanguagE Toolkit (MALLET), especially for topic clustering
    • Oracle Berkeley DB Java Edition for data storage
    • Stanford Core Natural Language Processing, especially for named entity recognition in RezoViz
    • XStream used to produce XML or JSON results
  • front-end interface (Voyant):
    • arbor.js, a graph visualization library used by RezoViz
    • Google Closure Compiler to compress JavaScript files
    • Highcharts for interactive graphs, as in Word Trends
    • JavaScript InfoViz Toolkit, a visualization framework used by some tools
    • jQuery, another JavaScript framework used by some tools
    • Mandala Browser
    • Protovis, a graph visualization library
    • Sencha Ext JS, the main JavaScript framework used