Version 5.0 Beta is Finally in Final Stages of Internal Testing

This has been a long time coming, but the release is getting closer.

The most significant change in this release is a new database query and extraction feature. Here is a screenshot of a test query: I am looking for all 8-Ks filed between 2/11/2014 and 7/16/2021 that included ITEM 4.02 (Non-Reliance on Previously Issued Financials) as one of the reason codes.

This was brutally fast. The query finished pretty much before I could blink. Notice the Save Results button. If you hit that, you get a regular file dialog to name a CSV file for the results.
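For readers who think in code, here is a rough Python sketch of what a query like this and the Save Results step amount to. It is not how the application is built. The table name `filings_8k`, its columns, the comma-separated `items` field and the sample rows are all assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical schema: one row per 8-K, item codes stored as a
# comma-separated string. The real database layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filings_8k (cik INTEGER, filing_date TEXT, items TEXT)")
conn.executemany(
    "INSERT INTO filings_8k VALUES (?, ?, ?)",
    [
        (1001, "2015-03-02", "4.02,9.01"),
        (1002, "2013-12-30", "4.02"),       # filed before the date window
        (1003, "2019-08-14", "2.02,9.01"),  # no ITEM 4.02
        (1004, "2021-07-16", "4.02"),       # last day of the window
    ],
)


def find_8ks(conn, start, end, item):
    """Return 8-Ks filed in [start, end] that list `item` as a reason code."""
    sql = """
        SELECT cik, filing_date, items
        FROM filings_8k
        WHERE filing_date BETWEEN ? AND ?
          AND ',' || items || ',' LIKE '%,' || ? || ',%'
        ORDER BY filing_date
    """
    return conn.execute(sql, (start, end, item)).fetchall()


def save_results(rows, handle):
    """Write the query results as CSV with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["cik", "filing_date", "items"])
    writer.writerows(rows)


# ISO dates (YYYY-MM-DD) sort correctly as text, so BETWEEN works on them.
rows = find_8ks(conn, "2014-02-11", "2021-07-16", "4.02")

# An in-memory buffer stands in for the file chosen in the dialog;
# use open("results.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
save_results(rows, buffer)
print(buffer.getvalue())
```

Wrapping the item list in commas before the LIKE test keeps a search for 4.02 from matching a longer code that merely contains those characters.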

We have two strategic goals with this new version. One is to rationalize the introduction of document metadata. Last year I tried adding metadata to individual documents, and while it worked, I heard from some users that it was cumbersome. Frankly, we found it painful too, because every time we wanted to add new metadata to a document we had to do a ton of work, change the document and then re-index the document collection. Now we can separate the metadata from the document but still give you a link to get from a db query to the original documents (and vice versa).

In plainer English: we intend for you to be able to interact with the databases and, if you wish, use a database query to identify specific documents that interest you. Run the query, save the output, then run a full-text search and filter that search strictly on whether each document appears in the csv file from the db query.
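The filtering idea is easier to see in a small Python sketch. This is only an illustration of the logic, not the application's code. The `accession` column used as the link between a query row and a document, and the shape of the search hits, are assumptions.

```python
import csv
import io


def ids_from_csv(handle, column="accession"):
    """Collect the document identifiers from a saved query-results CSV."""
    return {row[column] for row in csv.DictReader(handle)}


def filter_hits(hits, allowed_ids, key="accession"):
    """Keep only full-text hits whose document appears in the query results."""
    return [hit for hit in hits if hit[key] in allowed_ids]


# Stand-in for the CSV saved from a database query (hypothetical columns).
saved_csv = io.StringIO("accession,cik\nA-001,1001\nA-003,1003\n")

# Stand-in for full-text search results (hypothetical structure).
hits = [
    {"accession": "A-001", "snippet": "restatement"},
    {"accession": "A-002", "snippet": "restatement"},
    {"accession": "A-003", "snippet": "non-reliance"},
]

kept = filter_hits(hits, ids_from_csv(saved_csv))
print([hit["accession"] for hit in kept])
```

Loading the identifiers into a set keeps each membership check cheap, which matters once the saved CSV holds thousands of rows.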

We also intend to let you go the other way: run a full-text search, save those results, and then use them to pull the metadata you want with a database query. Like you, I am getting a headache as I write this because it is complicated, but it is also powerful.
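Going from search results back to metadata might look something like the sketch below. Again, this is an assumption-laden illustration: the `doc_meta` table, its columns and the `accession` identifiers are hypothetical, and the final interface may work differently.

```python
import sqlite3

# Hypothetical metadata table keyed by a document identifier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_meta (accession TEXT PRIMARY KEY, cik INTEGER, form TEXT)")
conn.executemany(
    "INSERT INTO doc_meta VALUES (?, ?, ?)",
    [("A-001", 1001, "8-K"), ("A-002", 1002, "10-K"), ("A-003", 1003, "8-K")],
)


def metadata_for(conn, accessions):
    """Look up metadata rows for the documents a full-text search returned."""
    ids = sorted(set(accessions))
    if not ids:
        return []
    # One "?" per identifier keeps the values out of the SQL text itself.
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT accession, cik, form FROM doc_meta "
        f"WHERE accession IN ({placeholders}) ORDER BY accession"
    )
    return conn.execute(sql, ids).fetchall()


# Stand-in for identifiers taken from saved full-text search results.
search_hits = ["A-003", "A-001", "A-999"]
found = metadata_for(conn, search_hits)
print(found)
```

Identifiers with no metadata row (A-999 here) simply drop out of the result.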

The second strategic goal is to give you better access to the other data we have available on the platform. In this second image you see a new db listed as available. The application inspects the db folder when it starts and lists every db available at that moment. You don’t have to do anything but open the tool. We will start moving all of our existing tabular data to this platform so you no longer have to use the ExtractionPreprocessed interface. In the screenshot below, I am querying our Executive Compensation data for any officer who has the word counsel or legal in their title, earned a salary greater than $350,000 and was female. Two key points here. First, we are going to shift data availability to data year rather than document year. Second, you can do some advanced filtering in the query interface to select your sample.

Query of EC Data using new query tool
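As a rough sketch of the two ideas above (scanning a folder for databases at startup, then filtering Executive Compensation rows), here is some Python. The `.db` extension, the `exec_comp` table, its column names and the sample rows are all invented for illustration and will not match the real data exactly.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_databases(folder):
    """Return the names of the SQLite files found in `folder` right now."""
    return sorted(path.name for path in Path(folder).glob("*.db"))


def legal_officers(conn, min_salary):
    """Female officers with 'counsel' or 'legal' in their title above a salary floor."""
    sql = """
        SELECT cik, data_year, title, salary
        FROM exec_comp
        WHERE (title LIKE '%counsel%' OR title LIKE '%legal%')
          AND salary > ?
          AND gender = 'F'
        ORDER BY salary DESC
    """
    return conn.execute(sql, (min_salary,)).fetchall()


with tempfile.TemporaryDirectory() as folder:
    # Build a throwaway database so the sketch runs on its own.
    conn = sqlite3.connect(Path(folder) / "exec_comp.db")
    conn.execute(
        "CREATE TABLE exec_comp "
        "(cik INTEGER, data_year INTEGER, title TEXT, salary REAL, gender TEXT)"
    )
    conn.executemany(
        "INSERT INTO exec_comp VALUES (?, ?, ?, ?, ?)",
        [
            (1001, 2020, "General Counsel", 420000, "F"),
            (1002, 2020, "Chief Legal Officer", 390000, "M"),      # not female
            (1003, 2020, "Chief Legal Officer", 340000, "F"),      # salary too low
            (1004, 2020, "Chief Financial Officer", 500000, "F"),  # title does not match
            (1005, 2021, "VP, Legal Affairs", 365000, "F"),
        ],
    )
    conn.commit()

    available = list_databases(folder)
    officers = legal_officers(conn, 350_000)
    conn.close()

print(available)
print(officers)
```

SQLite's LIKE ignores case for ASCII letters by default, so the lowercase patterns still match titles such as "General Counsel".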

There are other changes, most of them relatively minor. I was helping a client who had not used the Options feature to update the indexed documents. Because we now control more of your experience, we made updating the document index libraries automatic when you restart your session. We will still announce new document collections through the blog, but you will not need to do anything special to access the indexes.

All of this is a process. We have some folks working on the steps required to move our existing JSON-based data into the SQLite databases. I personally am working on getting institutional trading data (from 13F-HR filings) into a format that makes it accessible through this interface.

My guesstimate right now is that we will switch you over to the new version around the 4th of July. Your search history, etc., will not be affected by the transition.

Switching to a db model is really exciting for us. Once we are comfortable that the process of converting the existing data is solid, we plan to experiment with developing simpler but powerful ways to JOIN data across databases. For now, if you want the effect of a join, you will have to make two queries and merge the results on matching CIK and YEAR.
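The two-query merge can be done in a few lines of Python once both result sets are loaded. This is a minimal sketch, assuming each result row is a dict with `CIK` and `YEAR` keys; the other column names and the values are made up.

```python
def merge_on_cik_year(left, right):
    """Inner-join two lists of row dicts on matching CIK and YEAR."""
    # Index the right-hand rows by their (CIK, YEAR) key.
    index = {}
    for row in right:
        index.setdefault((row["CIK"], row["YEAR"]), []).append(row)

    merged = []
    for row in left:
        for match in index.get((row["CIK"], row["YEAR"]), []):
            # Columns from both rows end up in one combined row.
            merged.append({**row, **match})
    return merged


# Stand-ins for the results of two separate queries (hypothetical columns).
compensation = [
    {"CIK": 1001, "YEAR": 2020, "salary": 420000},
    {"CIK": 1002, "YEAR": 2020, "salary": 390000},
]
filings = [
    {"CIK": 1001, "YEAR": 2020, "n_8k": 3},
    {"CIK": 1001, "YEAR": 2021, "n_8k": 1},
]

joined = merge_on_cik_year(compensation, filings)
print(joined)
```

Only rows that agree on both CIK and YEAR survive, which mirrors an inner join. If the two result sets share a non-key column name, the value from the second set wins, so rename columns first if that matters.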
