Skip to main content
How Can We Help?

Search for answers or browse our knowledge base.

Return to Support Front Page

Categories
Print

Query Tool

The Search, Extraction and Normalization Engine has a built in database query tool. This was added to allow us to transition all of the data that is stored remotely to the drive where our archived filings are stored and to allow us to develop some more unique datasets.

To access the tool select Query Databases from the main menu bar – the tool will open.

Basic Operation

When the tool opens it checks the defined database storage area on the platform and lists all of the databases that are available in the Database to Query section on the interface.

When you select a database from the list the Display Column and the Criteria Column sections of the interface will populate with the names of all of the fields in the particular database. Only a limited number of fields will be initially selected. Only fields that have been selected can be viewed or saved. However, you can query on any field. To select more fields, click on individual fields, to select all fields use the Select All button below the field list. The definitions of the fields can be found in the Database Field Guides section of the Knowledge Base.

You can see I have selected the 10QMETA database. If I simply want a listing of all of the records in any particular database I can just hit the Execute button on the interface and it will load all records and the values for the fields I have selected. Truthfully, the phrase “load all records” is a bit deceitful. The Query Tool is designed to load only the first 1,000 records in the display panel. Our presumption was that you want this data for analysis, not to read and that there are better tools to view and otherwise interact with the data. Thus, even though all of the records that meet your criteria cannot be viewed in this interface, all records that met your criteria will be saved. In the image below I hit the Execute button to load all records from the 10QMETA database and as you can see in the lower right corner, on the day I ran that query there were 834,960 records. I also pulled the slider down as far as I could and confirmed that the last record was the 1,000th record.

To save the results, use the Save Results button to open a control that will allow you to name a location and file where the entire set of results that met your criteria will be saved as a CSV file. The default file name will be populated as DBSaveResults but you are free to modify that. Do not try to modify the Save as type value – these can only be saved in CSV format.

Important reminder – if the field is not selected in the Display Columns section – it will not be included in the output from Save Results.

Caution Regarding Database Size

Some of these databases are huge (20 million plus) and thus may not load completely into your session. If you need access to all records for the larger databases then we suggest you use the Path and Table Name provided in the Database Field Guides to write Python code to access this data. I ran a test on one of our larger databases – without any filtering and after 30 minutes I gave up trying to load the entire database. I wrote some code and in ten minutes had 45 CSV files that had all of the observations.

CIK Filtering

We presumed that most research is going to be CIK constrained (how often do you want the filing times for all 10-Q/10-K/8-K filings made by Nissan Auto Lease Trust 2004-A). To use CIK filtering on a database first create a csv file that has your CIKs in a column named CIK. Note – some other databases use a left padded CIK (0000001800), our platform only works with CIKs stored as numbers (1800). There can be other columns/fields in the CIK file and there is no requirement that the CIK column be the left-most column.

Once your CIK file is prepared, select the Use CIK button and then click the CIK Filter to activate the control that lets you select the csv file that has your CIK list.

Once the CIK file has been selected the Okay button becomes active – hit that and the Set CIK Filter will close. If that is the only filter you need for this search hit the Execute button for the application to process the Query. Below is an image of the results of the Query. I had about 4,400 CIKs and the application indicates that there were 168,696 records in the 10QMETA database for my list of CIKs.

Setting Other Query Constraints

You can also query on any other field(s) in the database by selecting the field from the list in the Criteria Columns, When a field is selected in the Criteria Columns control the available constraints in the Comparison Operator list will change to reflect the nature of the underlying data. In the example below I have selected the CIK_ACC field, the Contains operator and specified the value -19-.

Once you hit the Add Criteria button the application will do any transformation needed to transform the entry into the structured language to use for the query and display the actual query string that will be submitted in the Current Criteria box. The application will also clear the value previously selected that was displayed in the Current Column box.

Notice – in the image above that the CIK Filter control is blue. The CIK Filter is persistent once set until the Use CIK box is unchecked.

Using Date Fields

When any field name that ends with DATE (case insensitive) is selected from the Criteria Columns The operator list will list date relevant operators and the Value to Compare will change to Date to Compare and there will be a date picker control. If Between is selected as the Comparison Operator the controls will change to facilitate selecting a From Date and a To Date. Their is an important error in the image below.

It looks like I am ready to Add Criteria. However, I have not selected an operator yet (there visibility is blocked to capture the date picker. There must be an operator between each of the query constraints.

Not Operator

While the NOT operator is not listed as an available operator – it can be used in conjunction with the AND operator. It cannot be used alone. For that matter, any SQL Logic error will generate an error message like the following.

When you get an Unhandled Exception message it is just best to hit the Quit button and restart the application.

Below is an image of the intended query with the correct syntax after execution.

IN Operator

The IN allows you to specify a comma separated list of values to be used in the query. You can paste or key in the values, the only restriction is that each value must be separated by a comma. In the image below I am adding a list of specific SIC codes to my existing query.

The application will prepare the values and then add the additional criteria to the Current Criteria box as seen below – the image is from after execution.

Case Insensitive

All queries using letters from A-Z are case insensitive – searching for APPLE is the same as searching for apple or any other case variant of the word APPLE.

Filtering Based on Integer Values

When values are stored as integers the Comparison Operator choices change to reflect the expected types of operations/queries that would be used on integers. Below I have selected the DIRECTOR_COMP database, selected all columns for display and I have specified Current Column as CASH, selected the Comparison Operator Between 100000 and 250000.

Using commas in numeric values will lead to an SQL Query Error.

Here are the results of the query. I discuss the appearance of the date anomaly in an aside at the bottom of this page.

Random Aside

I think it is a valid question – Why is director compensation for fiscal year 2019 coming from a filing made in February 2021?. I was curious myself – the company is Granite Construction CIK 861459. The did not file a proxy in 2020 (for FYE 12/31/2019 because in February 2020 they discovered:

In February of 2020, the Audit/Compliance Committee of the Company’s Board of Directors (the “Audit Committee”), assisted by independent counsel, began an investigation of prior-period reporting for the Heavy Civil Operating Group, and the extent to which these matters affect the effectiveness of its internal control over financial reporting. The investigation is in its early stages and the Company cannot predict its scope, duration or outcome. Following the conclusion of the investigation, the Company anticipates it will be in a position to file its Form 10-K.

Ultimately the investigation was not resolved until August 2020 as evidenced by an 8-K filing with an Item 4.02 and the 10-K finally filed on 2/22/2021.

Using Python

The Database Field Guides section of this Knowledge Base lists databases – as a matter of course – as we further complete that section and add new databases we will be including the path to the sqlite3.db file as well as the name of the table in the file.

A review of the fields and the path and table name should get you started on writing Python code to create more sophisticated queries that will probably execute faster than those through the Query Tool. It is important to remember that if you want the query to read or write to files you provide you will generally be working from “D:\PhotonUser\My Files\Temporary Files\SOME_FILE.ext“.

Table of Contents