I fully expected to begin making the Director-Relationship data available by now. However, we have run into some really interesting problems that we are having to sort through. We had assumed a one-to-one relationship between a Central Index Key (CIK) and a person whenever that person has an SEC reporting obligation. However, as we were aggregating our director data to organize it for the relationship data presentation, our data guru (Manish Pokherel) discovered this was not true.
Manish was creating various integrity tests before we made the final merge, and in one of the scenarios he tested he discovered that approximately 40 people have multiple CIKs. Here is a screenshot of the SEC landing page for Dr. Glimcher (who was on the board of Bristol-Myers Squibb from 1997 to 2017).

Clearly these look to be the same person – if you follow the links and read her biographies in the related filings, it becomes clear that yes, Dr. Glimcher ended up with two distinct CIKs.
The problem is that one CIK is associated with her compensation (and ownership) data in some filings and the other CIK is associated with it in others. For the compensation data and the relationship data to have the most value, we need to standardize on a single CIK per person.
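To make the idea concrete, here is a minimal sketch of the kind of integrity test that can surface these cases. It is not Manish's actual test. It assumes the director data can be reduced to (name, CIK) pairs and that a normalized name is a reasonable first-pass key. The function name, the sample names, and the CIK values are all made up for illustration.

```python
from collections import defaultdict

def find_multi_cik_people(records):
    """Return {normalized_name: sorted CIKs} for names tied to more than one CIK.

    records: iterable of (name, cik) pairs. Matching on name alone is an
    assumption; two different people can share a name, so every hit still
    needs a manual review against the filings.
    """
    ciks_by_name = defaultdict(set)
    for name, cik in records:
        key = " ".join(name.upper().split())  # collapse case and spacing
        ciks_by_name[key].add(cik)
    return {n: sorted(c) for n, c in ciks_by_name.items() if len(c) > 1}

# Hypothetical sample data, not real CIKs.
sample = [
    ("Doe Jane A", "0000000001"),
    ("DOE  JANE A", "0000000002"),
    ("Roe Richard", "0000000003"),
]
suspects = find_multi_cik_people(sample)
```

A name match only flags a candidate. As with Dr. Glimcher, confirming that two CIKs belong to the same person still means reading the biographies in the related filings.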
The decision we made last night is to use the most recent CIK for each of these individuals. This means we have to go back through the compensation data and replace every instance where the older CIK value appears as the PERSON-CIK. I will observe that other cases are not as clear-cut as Dr. Glimcher’s.
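The replacement step could be sketched roughly as follows. This is only an illustration under stated assumptions: that each confirmed person has some internal key (here a hypothetical `person_key`), that "most recent CIK" means the CIK on that person's latest-dated filing, and that compensation rows are dictionaries with a `PERSON-CIK` field. The function names and sample values are invented.

```python
def build_cik_map(filings):
    """Map every CIK to the canonical CIK for the same person.

    filings: list of (person_key, cik, filing_date) with ISO dates
    (YYYY-MM-DD), so plain string comparison orders them correctly.
    Assumption: the canonical CIK is the one on the latest filing.
    """
    latest = {}  # person_key -> (filing_date, cik)
    for person, cik, date in filings:
        if person not in latest or date > latest[person][0]:
            latest[person] = (date, cik)
    return {cik: latest[person][1] for person, cik, _ in filings}

def standardize(rows, cik_map):
    """Return copies of rows with PERSON-CIK replaced by the canonical CIK."""
    return [
        {**row, "PERSON-CIK": cik_map.get(row["PERSON-CIK"], row["PERSON-CIK"])}
        for row in rows
    ]

# Hypothetical sample data, not real CIKs.
filings = [
    ("p1", "0000000001", "2005-03-01"),
    ("p1", "0000000002", "2016-03-01"),
    ("p2", "0000000003", "2010-01-01"),
]
cik_map = build_cik_map(filings)
rows = [{"PERSON-CIK": "0000000001", "YEAR": 2005},
        {"PERSON-CIK": "0000000003", "YEAR": 2010}]
cleaned = standardize(rows, cik_map)
```

Keeping the mapping as its own table, rather than editing rows in place, means the same lookup can be applied to every other dataset that carries a PERSON-CIK.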
This has really been an interesting exercise. This is the first time we have pulled all of our compensation data at one time and tried to do some deep analysis. All of our previous integrity analysis has focused on a single company and a fairly limited time series at a time. We have over 69,000 unique directors identified (NAME-PERSON-CIK). So, as you can imagine, it is a special challenge to find ways to cross-validate the data.
The bottom line is that we need to do some more testing – not too much more, but we are still trying to identify ways to make sure the resulting data is clean. We also have to sort out how to propagate a single CIK for each person through our system. I want to make sure that when you download our ownership transaction data, director votes data, or beneficial ownership data (I can’t remember where else we use the PERSON-CIK), you get clear links across time and between entities.