The Lexbe Uber Index
Is Your eDiscovery Search Finding All of the Critical Evidence?
It’s easy to assume that the search tools included in popular e-discovery and litigation document management platforms are all about the same, but that could be a costly mistake. In reality, quality varies greatly. Most document review tools in use today regularly exclude important data from their search indices and, as a result, from your search results. They miss key information for a variety of reasons detailed below. Missing search results matter because if you don’t find the key evidence in your case and the opposition does, you and your client can be at a tremendous disadvantage. As electronically stored information (ESI) collections and the number of custodians increase, the need for comprehensive and accurate search results is more critical than ever.
With the Uber Index feature set, part of the Lexbe eDiscovery Platform, we’ve moved search to the next level in terms of accuracy, quality, and completeness. We capture extracted text from native files, metadata, pseudo-metadata, the OCRed text from paginated versions of the same files, and an English translation of foreign-language documents, if available, all in a comprehensive, combined index for fast and easy searching. By utilizing Lexbe’s Auto-Translation+ feature, you can return documents in other languages with your English search queries. This means that a search for “Office Building” in English will return a German document with “Bürogebäude”.
Shortcomings of Traditional ‘Print-Driver’ based Search Approaches
Most search tools in use today index only the text that is created using a print-driver version of a document (TIFF or PDF), similar to what you would see if you printed the document. This ‘virtual printed’ version of a document can be OCRed, but it misses a surprising amount of common real-world data, including hidden sheets, filtered cells, and hidden cells in Excel worksheets, speaker notes in PowerPoint presentations, and revisions, notes and comments in Word documents. This pseudo-metadata is generated by the user but is not evident in the document in most document review tools.
Commonly Cited in Litigation, Pseudo-Metadata is Required to Achieve a Comprehensive Keyword Search for Evidence
While there are multiple types of metadata generated from applications, systems and users, it is the user-generated pseudo-metadata that is often most challenging for e-discovery and document review platforms to capture. User-generated pseudo-metadata is not readily visible in documents because it is found in hidden cells and document areas that are rarely included in the print selection. The Lexbe Uber Index captures all pseudo-metadata for inclusion in your keyword searches.
How to Avoid Missing Critical Evidence in Hidden Cells and Spreadsheets During eDiscovery
Microsoft Excel, with over 750 million licenses in use worldwide, along with Google Sheets and Apple’s Numbers, supports business-critical functions by analyzing and storing data in tabular, graphical and functional forms. The spreadsheets generated by these applications frequently grow very large when they are used to analyze and calculate large data sets. Microsoft Excel 2010 supports worksheets of up to 1,048,576 rows and 16,384 columns. To view a large spreadsheet, custodians frequently hide cells so that the area of interest fits on a single screen. This poses a problem for eDiscovery tools that rely only on OCR of the printed view, because they will miss this evidentiary data. The Lexbe Uber Index solves this problem by extracting all characters from native documents, including those in hidden cells and hidden workbooks, so that your keyword searches return hits that reside there.
This is an Example of How the Lexbe Uber Index Unlocks Critical Evidence from Hidden Cells
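To make the hidden-cell problem concrete, here is a minimal sketch in Python. It is not Lexbe's implementation. It assumes a hypothetical two-row worksheet part written in the SpreadsheetML format used by .xlsx files, where a hidden row carries the attribute hidden="1". The sample data and the function names extract_rows and search are illustrative only. Real workbooks usually keep cell strings in a separate shared-strings part and can also hide columns and whole sheets, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by .xlsx worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical worksheet: row 2 has been hidden by the custodian.
# Inline strings are used for brevity (an assumption of this sketch).
SAMPLE_SHEET_XML = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1"><c r="A1" t="inlineStr"><is><t>Quarterly summary</t></is></c></row>
    <row r="2" hidden="1"><c r="A2" t="inlineStr"><is><t>Side payment to vendor</t></is></c></row>
  </sheetData>
</worksheet>"""


def extract_rows(sheet_xml):
    """Return a (text, is_hidden) pair for every row in a worksheet part."""
    root = ET.fromstring(sheet_xml)
    rows = []
    for row in root.findall("m:sheetData/m:row", NS):
        text = " ".join(t.text or "" for t in row.findall(".//m:t", NS))
        rows.append((text, row.get("hidden") in ("1", "true")))
    return rows


def search(rows, keyword, include_hidden):
    """Case-insensitive keyword match, optionally skipping hidden rows."""
    keyword = keyword.lower()
    return [
        text
        for text, hidden in rows
        if keyword in text.lower() and (include_hidden or not hidden)
    ]


rows = extract_rows(SAMPLE_SHEET_XML)
print(search(rows, "vendor", include_hidden=False))  # print-style view: []
print(search(rows, "vendor", include_hidden=True))   # native view: ['Side payment to vendor']
```

The first search imitates a tool that only sees what would be printed, and the second imitates native text extraction. The same keyword produces a hit only in the second case.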
How to Avoid Missing Critical Evidence by Including PowerPoint Speaker’s Notes in Your Keyword Search Index
Microsoft PowerPoint, Google Slides, other presentation applications and web conference platforms include the ability for presenters to craft notes for reference during presentations. These notes often contain evidence that should be included in keyword searches. While many eDiscovery platforms miss this important data, the Lexbe Uber Index captures presenters’ notes and makes them instantly searchable.
This is an Example of How the Lexbe Uber Index Reveals Critical Evidence from Speaker’s Notes
Other search tools, including the latest generation of early case analysis (ECA) tools, use only the native extracted text for search. This has the advantage of avoiding many or even all of the above problems, but it comes with its own set of challenges. Information in a document that requires OCR, or that cannot be extracted from the native document, will be lost. Separate or embedded image files that need OCR remain a common and important component of ESI collections, and ECA tools often fail to capture this critical data.
Microsoft Word, Google Docs and several collaboration platforms provide the ability to create and track document revisions, comments and notes that generate key evidence in litigation. Many e-discovery and document review platforms fail to capture this key data in their indices, and therefore you run the risk of missing evidence. The Lexbe Uber Index captures all document revisions, comments and notes, ensuring that you can find that needle in the haystack with a simple keyword search.
This is an Example of How the Lexbe Uber Index Captures Document Revisions, Comments and Notes Through Extraction and OCR
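As an illustration of why tracked revisions are easy to miss, the following Python sketch reads a hypothetical fragment of a Word document part. It is not Lexbe's implementation. It relies on the WordprocessingML convention that visible run text is stored in w:t elements, while text removed under Track Changes is stored in w:delText elements inside a w:del wrapper. The sample paragraph, the author name and the function names are assumptions made for this example. Comments live in a separate part of the .docx package and are not handled here.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx document parts.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical paragraph: one visible sentence and one tracked deletion.
SAMPLE_DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body><w:p>
  <w:r><w:t>We will deliver by June.</w:t></w:r>
  <w:del w:id="1" w:author="A. Custodian">
    <w:r><w:delText>Penalty clause waived.</w:delText></w:r>
  </w:del>
</w:p></w:body></w:document>"""


def visible_text(document_xml):
    """Text a reader sees in the final view (w:t elements)."""
    root = ET.fromstring(document_xml)
    return [t.text or "" for t in root.findall(".//w:t", NS)]


def deleted_text(document_xml):
    """Text removed under Track Changes (w:delText elements)."""
    root = ET.fromstring(document_xml)
    return [t.text or "" for t in root.findall(".//w:delText", NS)]


def build_search_text(document_xml):
    """Combine visible and deleted text so that both are searchable."""
    return " ".join(visible_text(document_xml) + deleted_text(document_xml))


print(visible_text(SAMPLE_DOCUMENT_XML))   # ['We will deliver by June.']
print(deleted_text(SAMPLE_DOCUMENT_XML))   # ['Penalty clause waived.']
print("penalty" in build_search_text(SAMPLE_DOCUMENT_XML).lower())  # True
```

A search built only from the visible text would never match "penalty" in this sample, even though the deleted sentence is still stored in the file.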
More and more documents in complex cases today may be in multiple languages, sometimes within a single email string! Allowing you to search and review an English translation in-line with your review is a challenge that, if not competently addressed, can result in missing key evidence. There are many translation services available to the legal industry, but these services can be slow and expensive. From a document review perspective, their output is not usually included in timely search results. Even the most advanced review tools on the market today do not have a solution that integrates search results from documents in multiple languages.
Advantage of ESI Search Multi Index+
Lexbe solves the search problems of traditional eDiscovery review and litigation document management programs with the Uber search index. The Uber index includes:
- Text extracted from the native file. This picks up all text and any meta-data linked to the
- Text from an OCRed version of the same file
- Text from an English translation of foreign-language documents, if available.
Our search returns results from both text extracted from native files and text from a PDF-paginated version we create, including OCRed images. You get the best of both worlds and better search results. Lexbe’s language translation index rounds out the most robust, multi-layered index on the market. Hidden text, comments, revisions, notes, hidden sheets, foreign language and more, will be available to you and your team. Don’t risk your case by missing key evidence or inadvertently releasing privileged documents!
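The layered approach described above can be sketched as a small inverted index in Python. This is a simplified illustration under stated assumptions, not a description of how the Uber Index is built. The layer names (native, ocr, translation), the document IDs, the sample text and the hand-written English translation are all hypothetical, and no translation or OCR service is called.

```python
import re
from collections import defaultdict

# Hypothetical documents, each with one or more text layers.
DOCS = {
    "DOC-001": {
        "native": "Lease terms for the site",
        "ocr": "Scanned exhibit: office building floor plan",
    },
    "DOC-002": {
        "native": "Das Bürogebäude wird verkauft",
        "translation": "The office building is being sold",
    },
}


def tokenize(text):
    """Lowercase word tokens; \\w also matches accented letters."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each token to {doc_id: set of layers that contain it}."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, layers in documents.items():
        for layer, text in layers.items():
            for token in tokenize(text):
                index[token][doc_id].add(layer)
    return index


def search(index, query):
    """Return {doc_id: matching layers} for documents containing every term."""
    terms = tokenize(query)
    if not terms:
        return {}
    postings = [index.get(term, {}) for term in terms]
    doc_ids = set(postings[0]).intersection(*postings[1:])
    return {
        doc_id: sorted(set().union(*(p[doc_id] for p in postings)))
        for doc_id in sorted(doc_ids)
    }


index = build_index(DOCS)
print(search(index, "Office Building"))
# {'DOC-001': ['ocr'], 'DOC-002': ['translation']}
print(search(index, "Bürogebäude"))
# {'DOC-002': ['native']}
```

In this sample, the English query finds the first document only through its OCR layer and the German document only through its translation layer. An index built from the native text alone would return neither.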