Certiv Analytics

Innovative Legal Analysis Tools

CodeChase

CodeChase is a desktop search tool optimized for the Eclipse work environment.

CodeChase enables free-text search (Lucene syntax) over content hosted on the locally accessible filesystem. Content specializations are provided for indexing and searching Java, PDF, HTML/XML, and plain-text documents. It can also index and search documents located in jar, zip, tar, and the various tgz and tbz2 type archives.

The design intent is to enable direct searching of the platform SDK source, local workspaces, and archive code collections. Files listed in the search results can be opened directly in content-appropriate platform editors.

Architecture

CodeChase is implemented as:

  • search form -- editor-based, allowing multiple search forms to be open concurrently; and
  • repository definition form -- view-based, enabling multiple content repositories to be defined, edited, and indexed.

Features

Indexed content can be hosted on the platform in plain files or contained within archives. CodeChase can open and read jar, zip, and tar archives, as well as the tgz and tbz2 variants.
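
As an illustration of how an indexer might tell plain files from the archive types listed above, here is a minimal sketch. It is not CodeChase code: the function name archive_kind and the suffix table are hypothetical, and the exact extensions covered by the "tgz variants" and "tbz2 variants" are an assumption.

```python
# Hypothetical suffix table. Which extensions CodeChase actually treats as
# "tgz variants" and "tbz2 variants" is an assumption made for this sketch.
ARCHIVE_SUFFIXES = {
    ".jar": "zip",          # a jar is a zip container
    ".zip": "zip",
    ".tar": "tar",
    ".tgz": "tar+gzip",
    ".tar.gz": "tar+gzip",
    ".tbz2": "tar+bzip2",
    ".tar.bz2": "tar+bzip2",
}


def archive_kind(name):
    """Return the container kind for a file name, or None for a plain file."""
    lowered = name.lower()
    for suffix, kind in ARCHIVE_SUFFIXES.items():
        if lowered.endswith(suffix):
            return kind
    return None


if __name__ == "__main__":
    for sample in ("src.zip", "rt.JAR", "lib.tar.gz", "Foo.java"):
        print(sample, "->", archive_kind(sample))
```

The zip-based kinds allow random access to entries, while the tar-based kinds are stream-oriented, which is consistent with the slower tar retrieval noted under Development State.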

Screen Shot

Top: Search Results form
Bottom: Repository Configuration and Indexing view form.

Development State

  1. Current version is stable.
  2. Initial version targets function only. Performance is decent even though no attention has been paid to optimization.
  3. No indexing job cancel button (yet).
  4. Retrieving content from tar-based archives is decidedly slow, as should be expected of stream-oriented archives.
  5. Indexing rates are highly dependent on disk speed and the ratio of indexable to non-indexable documents encountered in the index scan. Expect speeds of 50 to 250 documents/sec. The CodeChase index job will appear to pause whenever it encounters a large number of consecutive non-indexable documents.
  6. CodeChase will only read into a single level of archives -- 'tar.gz' is considered a single level.
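
The single-level archive rule in item 6 can be pictured with the following sketch. It is an illustration only, using Python's zipfile module on an in-memory archive. The function names are hypothetical, and whether CodeChase silently skips nested archives or handles them some other way is an assumption here.

```python
import io
import zipfile

# Assumed set of archive suffixes, used only to spot nested archives.
ARCHIVE_SUFFIXES = (".jar", ".zip", ".tar", ".tgz", ".tar.gz", ".tbz2", ".tar.bz2")


def list_single_level(zip_bytes):
    """Split a zip's entries into readable files and nested archives left unopened."""
    indexable, skipped = [], []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to index
            if name.lower().endswith(ARCHIVE_SUFFIXES):
                skipped.append(name)    # second level: not read into
            else:
                indexable.append(name)  # first level: readable content
    return indexable, skipped


def build_sample():
    """Build an outer zip holding one source file and one nested jar."""
    inner = io.BytesIO()
    with zipfile.ZipFile(inner, "w") as z:
        z.writestr("Hidden.java", "class Hidden {}")
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w") as z:
        z.writestr("src/Visible.java", "class Visible {}")
        z.writestr("lib/inner.jar", inner.getvalue())
    return outer.getvalue()


if __name__ == "__main__":
    print(list_single_level(build_sample()))
```

In this sketch, Hidden.java is never seen because it sits inside a second archive level.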

Use

To get started,

  1. open the repository view form:
    Window->Show View->Other->Certiv->CodeChase
  2. import the sample content repository specification.
  3. edit the repository root directories to match your desired content roots.
  4. open the CodeChase preference page and select a local directory for the Lucene generated index.
  5. start the index job. Indexing can be constrained to checked repositories. Note: none checked is the same as all checked.

Search Constraints

The four search constraint boxes are combined using a logical AND. The internal logic of the individual boxes is defined as follows:

  1. terms entered in the search box are, by default, combined by a logical AND -- standard Lucene term modifiers and boolean operators are supported (click the Syntax link for more info).
  2. terms in the repository box are combined with a logical OR, except that none checked is the same as all checked.
  3. terms in the content type box are combined with a logical OR, except that none checked is the same as all checked.
  4. terms in the tags box are combined with a logical AND, except that none checked is the same as true.
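
The combination rules above can be sketched as a small predicate. This is an illustration under stated assumptions, not the CodeChase implementation: the document is modeled as a plain dict with hypothetical keys, and search terms are matched by simple substring test rather than by Lucene, so term modifiers and boolean operators are not modeled.

```python
def matches(doc, terms=(), repositories=(), content_types=(), tags=()):
    """Apply the four constraint boxes to one document record (a dict)."""
    text = doc["text"].lower()
    # Search box: terms are ANDed by default.
    terms_ok = all(term.lower() in text for term in terms)
    # Repository box: OR over checked entries; none checked means all.
    repo_ok = not repositories or doc["repository"] in repositories
    # Content type box: OR over checked entries; none checked means all.
    type_ok = not content_types or doc["content_type"] in content_types
    # Tags box: AND over checked entries; all() of nothing is True.
    tags_ok = all(tag in doc["tags"] for tag in tags)
    # The four boxes themselves are combined with AND.
    return terms_ok and repo_ok and type_ok and tags_ok


if __name__ == "__main__":
    doc = {
        "text": "public class IndexJob extends Job",
        "repository": "workspace",
        "content_type": "java",
        "tags": {"core"},
    }
    print(matches(doc, terms=["indexjob", "job"], repositories=["sdk", "workspace"]))
    print(matches(doc, tags=["core", "ui"]))
```

Leaving every box empty accepts every document, which mirrors the "none checked" rules.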

Inclusion/Exclusion Filters

The recommended approach to designing filters is to:

  1. include files and filetypes; and
  2. exclude directory paths, files, and filetypes.
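
A rough sketch of that include/exclude approach follows. The filter syntax CodeChase actually accepts is not described here, so shell-style glob patterns, the function name accept, and the rule that an exclusion overrides an inclusion are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def accept(path, includes, excludes):
    """Return True if a '/'-separated path passes the include and exclude globs."""
    name = path.rsplit("/", 1)[-1]
    # Exclusions may target directory paths, files, or file types, so both
    # the full path and the bare file name are tested.
    if any(fnmatchcase(path, pat) or fnmatchcase(name, pat) for pat in excludes):
        return False
    # Inclusions target files and file types, so only the file name is tested.
    return any(fnmatchcase(name, pat) for pat in includes)


if __name__ == "__main__":
    includes = ["*.java", "*.txt"]
    excludes = ["*/test/*", "*.bak"]
    for p in ("src/main/Foo.java", "src/test/FooTest.java", "docs/data.bin"):
        print(p, "->", accept(p, includes, excludes))
```

Keeping directory patterns on the exclude side only, as the list above suggests, means a file is selected by what it is and rejected by where it lives.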

Installation & Requirements

  • CodeChase 0.9.0
    • Update site: http://www.certiv.net/updates
    • Eclipse 3.5.x and JDT
    • Eclipse VM: JDK 1.6+

  • License

    Available only under these terms:

    • License:  A personal right to use, as is, and nothing more; distribution of any component or part to any other person or entity is specifically prohibited. 
    • Warranties:  Absolutely none of any cognizable nature including MERCHANTABILITY and/or FITNESS FOR ANY GENERAL OR SPECIFIC PURPOSE OR USE.  In fact, be forewarned that use of this software will fail, will likely cause failures in other software, and will likely result in the loss of data.
    • If you use this software, then you agree to these terms.