More of “We don’t know what we don’t know”!

I had an interesting week. The inside joke at our house is my inability to do any home project with only one trip to Lowe’s (my preferred home store). I think the AcademicEDGAR+ LLC inside joke needs to be my inability to predict how long a goal will take. I left one of our machines running last night to consolidate the holdings data after deciding to use the most frequently reported price to determine the value we report. I was really excited this afternoon as I logged into the server to start a visual review of the data. There are a ton of different errors in the results, and we need more time to work out which ones we can fix and which are not going to be addressable. Before I get into the weeds: I still intend to make the current database available. As a matter of fact, I will place it on our platform right now and use images from our software interacting with this database in my comments.

My first observation is that we did parse the source files correctly, but more than a handful of filers included data for derivative securities (mostly options) without using the proper tags to indicate that the disclosure was about a put, call, or other option. For the image below I searched for all rows that had the word put or call in the securityclass field. There should not have been any such rows, but I found 5,734. (Note: by the time this hits the wire, those will be out of the final data.)
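A search like the one described above can be sketched in a few lines of Python. This is only an illustration, not the query our software runs: the sample rows are made up, the field names securityname and securityclass are borrowed from the database, and the whole-word match is my assumption about how to avoid flagging names that merely contain those letters.

```python
import re

# Whole-word match, so "COMPUTER" or "CALLAWAY" are not flagged by accident.
OPTION_WORDS = re.compile(r"\b(put|call)s?\b", re.IGNORECASE)


def looks_like_option(security_class):
    """Return True when the class text mentions a put or a call."""
    return bool(OPTION_WORDS.search(security_class or ""))


# Hypothetical rows for illustration only; not taken from the actual data.
rows = [
    {"securityname": "APPLE INC", "securityclass": "COM"},
    {"securityname": "APPLE INC", "securityclass": "CALL"},
    {"securityname": "SPDR S&P 500 ETF TR", "securityclass": "PUT OPTION"},
    {"securityname": "CALLAWAY GOLF CO", "securityclass": "COMPUTER SHS"},
]

flagged = [row for row in rows if looks_like_option(row["securityclass"])]
for row in flagged:
    print(row["securityname"], "|", row["securityclass"])
```

Rows flagged this way still deserve a visual check before they are dropped from the summary.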

Screenshot of Option Errors in 13F Summary Database

Perhaps even more surprising are the CUSIP errors. With the reorganization of Google to become a subsidiary of Alphabet, the publicly traded stock of Google was retired and replaced with shares of Alphabet. I have over-simplified the legal details, but as a small shareholder I noticed that the securities in my account were replaced. The CUSIP of Google Class A was 38259P508. The CUSIP of Alphabet Class A is 02079K107. And according to the SEC’s official list of 13F Securities (find the archive here), the reporting should be under the Alphabet CUSIP(s) beginning with reports covering activity in Q4 2015.

Google’s securities as reported in the official 13F Securities List for Q4 2015
Addition of the securities of Alphabet to the Official 13F Securities List for Q4 2015

If we examine the holdings summary, there are a significant number of investors reporting holdings in GOOGLE INC, and I suspect that these are actually holdings in ALPHABET INC. I am pretty confident that these CUSIPs are not properly matched and that the value in that field should be the CUSIPs assigned to Alphabet.

Uncovering holdings of GOOGLE that should be reported as holdings of Alphabet.

You can see that the investor count declines over time, but this kind of error in the source data is still very surprising to me, as the reporting entities have fiduciary responsibilities over more than $100 million.

There are other types of CUSIP errors as well. When we started looking closely at the holdings in Apple Inc, more of those came to light than I would have expected. To build the result list I used the following query statement:

(securityname LIKE '%apple%') AND NOT (securityname LIKE '%maui%') AND NOT (securityname LIKE '%snapple%') AND NOT (securityname LIKE '%nicholas%') AND NOT (securityname LIKE '%appleton%')

Notice the use of AND NOT in the query. We perhaps should add a NOT button to the operator list. Anyway, I did not want to select on CUSIP so I had to go through some effort to get the list that is displayed. Notice that the UNIFIED SER TR . . . is still in the list.

Searching for summary results for APPLE INC to Review CUSIP errors.

Apple’s CUSIP is 037833100. As you can see above, there is a not insignificant number of data rows where the leading 0 is missing (reported CUSIP = 37833100), and another set of results with the check digit missing (reported CUSIP = 03783310). Not visible in the image, but other reported variants include 037833101 and 378331003. Since 0 is the check digit for 03783310, 037833101 should not validate as a CUSIP (the same first eight characters cannot have two different check digits).
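For anyone who wants to test reported values themselves, here is a minimal sketch of the standard CUSIP check digit calculation (double the value in every second position, sum the digits of each result, and take the tens complement of the total). The function names are mine and this is not code from our platform.

```python
def cusip_check_digit(base8):
    """Compute the check digit for the first eight characters of a CUSIP."""
    if len(base8) != 8:
        raise ValueError("expected exactly 8 characters")
    special = {"*": 36, "@": 37, "#": 38}
    total = 0
    for position, ch in enumerate(base8.upper(), start=1):
        if ch in "0123456789":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10, B=11, ..., Z=35
        elif ch in special:
            value = special[ch]
        else:
            raise ValueError("invalid CUSIP character: %r" % ch)
        if position % 2 == 0:  # every second character is doubled
            value *= 2
        total += value // 10 + value % 10  # add the digits of the value
    return (10 - total % 10) % 10


def is_valid_cusip(cusip):
    """Return True for a 9-character string whose last digit checks out."""
    cusip = cusip.strip().upper()
    if len(cusip) != 9 or cusip[-1] not in "0123456789":
        return False
    try:
        return cusip_check_digit(cusip[:8]) == int(cusip[-1])
    except ValueError:
        return False


for reported in ["037833100", "037833101", "37833100", "03783310", "378331003"]:
    print(reported, is_valid_cusip(reported))
```

One caution: 378331003 passes this test, because 3 is the check digit for 37833100. A check digit test catches truncated or mistyped values, but it cannot tell you that a well-formed CUSIP points at the wrong security.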

The bottom line is that this is going to take more work to refine than I had believed. I am actually surprised that the SEC does not force conformity to their mandated list of 13F Securities. However, while the current data is not as usable as I had hoped it would be, it is much further along than the raw data available from the DERA website, and so I have made the summary data, the raw data, and the header file data available through the Query Database tool on our platform.

I need to be clear: the summary data consolidates all reported holdings by the reported CUSIP (by quarter). We have not yet attempted to fully analyze the CUSIP errors, so when registrants report holdings in CUSIP 37833100, those will not be consolidated with holdings for 037833100. We are not ready to confidently make that claim.
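To show why we are cautious, here is a rough sketch of how candidate repairs for an 8-character value could be generated for review. It assumes only the two error types described above (a dropped leading zero or a dropped check digit); the helper names are hypothetical, and nothing here is applied to the data we publish.

```python
from typing import List


def check_digit(base8: str) -> int:
    """Standard CUSIP check digit for an 8-character base."""
    total = 0
    for position, ch in enumerate(base8.upper(), start=1):
        if ch in "0123456789":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch in "*@#":
            value = 36 + "*@#".index(ch)
        else:
            raise ValueError("invalid CUSIP character: %r" % ch)
        if position % 2 == 0:
            value *= 2
        total += value // 10 + value % 10
    return (10 - total % 10) % 10


def candidate_repairs(reported: str) -> List[str]:
    """Suggest 9-character CUSIPs that an 8-character value might stand for."""
    reported = reported.strip().upper()
    if len(reported) != 8:
        return []
    candidates = []
    try:
        # Assumption 1: a leading zero was dropped (e.g. by a spreadsheet).
        padded = "0" + reported
        last = padded[-1]
        if last in "0123456789" and check_digit(padded[:8]) == int(last):
            candidates.append(padded)
        # Assumption 2: the check digit was left off.
        candidates.append(reported + str(check_digit(reported)))
    except ValueError:
        return []
    return candidates


print(candidate_repairs("37833100"))
print(candidate_repairs("03783310"))
```

For 03783310 there is one candidate, but 37833100 yields two well-formed candidates, 037833100 and 378331003. Choosing between them needs something more, such as the security name or the official 13F list, which is why these rows are not merged yet.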

We have added the CIK to both the summary data and the raw data. Again, we are only pushing out the matches for which we have a very high degree of confidence that the match is correct. Of course, the CIK-CUSIP matching leads to another interesting problem to solve. When we have registrants who have changed their CIK over time, how do we make sure those are available correctly? One thought I have is that for companies like Alphabet/Google, Disney, and others we might include duplicate rows for their duplicate CIKs. In other words, every summary row that contains information about Alphabet would be duplicated with the CIK and CUSIP of Google (and vice versa). Of course, then you have the GMs. The new GM is not the same as the old by any stretch, so making these mappings is complicated. It makes sense to make the time series of holdings of Alphabet include the time series of Google, but it does not make sense to merge the time series of the old GM stock with the new GM stock. Clearly, if we have a duplicate row there needs to be a flag.

I should also note that there is one other issue relating to amended (/A) reports that we need to address. The instructions to the 13F explicitly state “Amendments to a Form 13F report must either restate the Form 13F report in its entirety or include only holdings entries that are being reported in addition to those already reported in a current public Form 13F report for the same period.” I added the bold for emphasis. After reviewing a limited number of 13F-HR/A filings, I concluded that when the adds new holding entries checkbox is selected, the filing only adds new holdings to the previously filed report. That is not always the case, as we have recently discovered.

One example of this relates to filings made by Morgan Stanley to report on their portfolio holdings for the quarter ended 3/31/2017. Here is a link to the original information table (MS Q1 2017 HR). Here is a link to the information table in the amended filing (MS Q1 2017 HR/A). The amended filing clearly indicates that it adds new holding entries. However, it only takes a cursory review to understand that the amended filing is meant to supersede the original because it is a restatement. This was a long-winded way to report that the amended filing was not labeled properly. However, because of the labeling we have double counted some holdings. That addresses a small fraction of the concern I raised a couple of days ago. That has to be the next issue we address. I did think about delaying the availability of this data until we can address this issue, but what is the fun of that!
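One way to screen for mislabeled amendments like this is to measure how much of the amendment repeats the original. The sketch below is an idea, not what we have implemented: the holdings are invented, matching on CUSIP alone is a simplification, and the 0.5 threshold is an arbitrary assumption.

```python
def overlap_share(original, amendment):
    """Fraction of amendment entries whose CUSIP is already in the original."""
    if not amendment:
        return 0.0
    original_cusips = {holding["cusip"] for holding in original}
    repeated = sum(1 for h in amendment if h["cusip"] in original_cusips)
    return repeated / len(amendment)


def classify_amendment(original, amendment, threshold=0.5):
    """Guess whether an 'adds new holdings' amendment is really a restatement."""
    if overlap_share(original, amendment) >= threshold:
        return "likely restatement"
    return "likely new holdings"


# Hypothetical information tables for illustration only.
original = [
    {"cusip": "037833100", "shares": 1000},
    {"cusip": "02079K107", "shares": 500},
    {"cusip": "38259P508", "shares": 250},
]
restated = original + [{"cusip": "378331003", "shares": 10}]
additions = [{"cusip": "378331003", "shares": 10}]

print(classify_amendment(original, restated))
print(classify_amendment(original, additions))
```

A filing flagged as a likely restatement would then replace the original rather than be added to it, which avoids the double counting. A legitimate amendment could still repeat CUSIPs, so flagged filings need a manual look.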

I’ll close this out by reporting that we have about finished making another 1,000 CUSIP-CIK matches and will update both the 13FSUMMARY data and the 13FSHAREDATA files with these new matches by next weekend.
