13F Summary & Raw Data – Available Soon

Some of you may have noticed that the SEC’s Division of Economic and Risk Analysis (DERA) has consolidated the data from 13F-HR filings and made it available on a new website (DERA 13F). While that is generally a good thing, I was disappointed at first, because one of the projects we have been working on is a database of the 13F data organized in a way that makes it more accessible.

However, after reviewing what DERA has provided, I am glad we started this project. All they have provided is the raw data and, as you will understand from reading below, the raw data is not useful without significant work.

When we started looking at the 13F filings it looked like a simple problem: consolidate the holdings by reporting quarter and security CUSIP. We have been working on another project mapping CUSIPs to CIKs (Central Index Keys) and have made significant progress. The idea is to push out a database with both CUSIPs and CIKs as identifiers, with a summary table that includes the total value and shares as well as the number of institutional investors by CUSIP/CIK and quarter. I also want to create tables of the derivative holdings (PUT/CALL) and then the debt holdings. In addition, we have been strategizing on how to make the complete data set available for those of you with the skills and interest to ask more nuanced questions of the data.
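To make the summary table concrete, here is a minimal sketch of the consolidation step using a Python default dictionary. It is an illustration only, not our production code: the function name `summarize_holdings` and the field names (`quarter`, `cusip`, `filer_cik`, `value`, `shares`) are hypothetical, and it assumes derivative (PUT/CALL) and debt rows have already been filtered out.

```python
from collections import defaultdict


def summarize_holdings(holdings):
    """Total value, total shares and investor count per (quarter, CUSIP).

    `holdings` is an iterable of dicts with the hypothetical keys
    quarter, cusip, filer_cik, value and shares.
    """
    totals = defaultdict(lambda: {"value": 0, "shares": 0, "filers": set()})
    for h in holdings:
        row = totals[(h["quarter"], h["cusip"])]
        row["value"] += h["value"]
        row["shares"] += h["shares"]
        # A filer can list the same security on several lines,
        # so count distinct filers rather than rows.
        row["filers"].add(h["filer_cik"])
    return {
        key: {
            "total_value": row["value"],
            "total_shares": row["shares"],
            "n_investors": len(row["filers"]),
        }
        for key, row in totals.items()
    }
```

The set of filer CIKs is what keeps the investor count from being inflated when one manager reports a security on more than one line.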

As I noted above, this looked like a simple problem. But I am discovering that, as with everything we do, we don’t know what we don’t know. The code was ready to run the first pull to organize the data. We had sorted out how to identify those filings that were superseded by an amendment (this filing (JP MORGAN 1) was superseded by this one (JP MORGAN 2)). We made sure to add securities that were reported in amendments that just added to the filing’s holdings list, including ones that were disclosed because of the expiration of confidential treatment.
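One way to express the amendment handling is sketched below. This is a simplified illustration under stated assumptions, not our actual logic: the `Filing` class, its fields and the `filings_to_keep` function are hypothetical, and the rule it applies (keep the most recently filed original or restatement for each filer and period, plus any additive amendments filed on or after it) is an assumption about how supersession should work.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Filing:
    accession: str   # placeholder identifier for the filing
    filer_cik: int
    period: str      # reporting quarter, e.g. "2014Q1"
    filed: date
    kind: str        # "original", "restatement" or "new_holdings"


def filings_to_keep(filings):
    """Drop superseded filings; keep amendments that only add holdings."""
    groups = {}
    for f in filings:
        groups.setdefault((f.filer_cik, f.period), []).append(f)

    keep = []
    for group in groups.values():
        bases = [f for f in group if f.kind in ("original", "restatement")]
        if not bases:
            # Only additive amendments were found; keep them all.
            keep.extend(group)
            continue
        # Assumption: the latest original/restatement supersedes earlier ones.
        base = max(bases, key=lambda f: f.filed)
        keep.append(base)
        # Assumption: additive amendments filed before the base are already
        # reflected in it, so only later ones are kept.
        keep.extend(
            f for f in group
            if f.kind == "new_holdings" and f.filed >= base.filed
        )
    return keep
```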

However, while doing some sanity checks after running the consolidation code, we are seeing results that do not make sense. I am not sure how prevalent these issues are yet, because our testing right now focuses on the first security that completes processing in the loops. Since we are using Python default dictionaries as the intermediate storage container, the first security tends to be Abercrombie & Fitch because of their CUSIP (002896207). The total value of holdings for Q1 2014 from summing all of the holdings listed in the relevant filings was 14,019,406. But this number is supposed to be in thousands of dollars, so it represents about $14.0 billion. When I saw that I thought the number made no sense: institutional holdings of more than 14 billion would imply a market cap greater than 14 billion. A review of their 10-K cover pages for 2013 to 2015 supported my questioning this value, as the reported market cap was in the 2 to 4 billion dollar range.

Initially I just tried to look at individual entries in the raw value database to compare to the source file and I could not find any discrepancies – the data we collected matched what was in the filings.

Finally I just dumped all of their data into individual CSV files by quarter to see if I could get some insight. I did identify one factor that was contributing to this issue: some filers are/were reporting actual dollar values, not values scaled to thousands.

Here is an image of the summary dump after I computed an imputed_price variable to see if I could identify values outside the range. As you can see there are thirteen entries with an imputed price of about $38,500/share. The median price for the end of Q1 2014 was around $38.50/share. So that explains part of the discrepancy.

Small Section of Abercrombie & Fitch Holdings Reported by 13F Filers for Q1 2014.

If I re-scale those, the value of the total reported holdings by 13F filers drops to around 3 billion. This is more reasonable, but it still seems too large, because the total number of shares represented in this summary is over 78 million. According to the 10-K they filed on 3/31/2014, the total number of shares reported on the cover page as of March 21, 2014 was 73,403,751. In other words, 13F filers appear to hold more shares than were outstanding.
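The imputed-price check and the re-scaling can be sketched as follows. This is a hedged illustration and not necessarily the heuristic we will ship: the function names, the field names and the `low`/`high` ratio bounds are all assumptions. It treats a row as unscaled when its imputed price is roughly 1,000 times the median for that security and quarter, which only works when fewer than half of the rows are unscaled.

```python
from statistics import median


def imputed_price(value_thousands, shares):
    """Dollar price per share implied by a value reported in thousands."""
    return value_thousands * 1000 / shares


def rescale_unscaled(rows, low=500, high=2000):
    """Return copies of rows, dividing value by 1,000 where it looks unscaled.

    `rows` holds the holdings for ONE security in ONE quarter, as dicts
    with the hypothetical keys `value` and `shares`.
    """
    prices = [imputed_price(r["value"], r["shares"])
              for r in rows if r["shares"] > 0]
    mid = median(prices) if prices else 0

    fixed = []
    for r in rows:
        r = dict(r)  # do not modify the caller's data
        r["rescaled"] = False
        if r["shares"] > 0 and mid > 0:
            ratio = imputed_price(r["value"], r["shares"]) / mid
            if low <= ratio <= high:
                # Looks like actual dollars rather than thousands.
                r["value"] = r["value"] / 1000
                r["rescaled"] = True
        fixed.append(r)
    return fixed
```

Keeping a `rescaled` flag on each row means the adjustment can be reviewed or reversed later instead of being baked silently into the totals.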

Let me continue to get into the weeds. This is solvable, but it means we have to apply a heuristic (there are over 23,000 securities; we can’t possibly check them all by hand). We have the heuristic sorted out and are about ready to move forward. One of the things I thought is that we could do this by filer. If you look in the report above, the filer value is the CIK of the reporting filer. The value 5272 is the CIK of American International Group (AIG), so I wondered whether AIG flubbed all of the data in that filing; if so, we could simply flag the whole filing. But no, that is not the case. Look at the image below, a snapshot of the holdings table included in AIG’s Q1 2014 filing:

AIG 13F-HR for Q1 2014 Holdings Reported 5/12/2014

The average price for ABBVIE shares is about $51.40/share. This is consistent with what I am seeing on the web for this time period (3/31/2014 closing price from one source is $50.98).

I think this inconsistency is what surprises me the most. I would have thought that this data would be trivially pulled from some internal system and there would not need to be any price adjustments. But it is hard to imagine that an internal system could be allowed to report these inconsistent values. So why the discrepancy?

We still have the issue that the total holdings seem too large relative to the reported shares. The only thought I have is that this is somehow related to short selling. I am simply baffled right now. The holdings for Abercrombie were pulled from 255 separate 13F filings. I checked 65 of those filings and confirmed that we captured the reported values. I also confirmed that the other reporting managers listed in 10 of the filings do not have any holding reports of their own. I also walked through the process for kicking out holdings because of restatements. None of these explain the results.

We will of course look at ways to continue to refine this data. However, if we wait until it is perfect, it will never be available. I expect that we will post the initial database over this weekend. When we do, I will send a message to your registered user account.

If you are not one of our users – be careful with the raw data.
