I think the headline tells the story: we have added two new fields to the Executive Compensation data, CEO and CFO. Each takes the value of 1 or 0. A 1 indicates that we have identified that person in that role; a 0 indicates that we have not.
We had to make some data collection decisions that we might or might not revisit later. One is that if there is a transition during the year we identify the CEO/CFO as the person in that role on the last day of the fiscal year. For example, Brian Niccol was appointed CEO of Starbucks on September 9, 2024. Their FYE was 9/30. Prior to Mr. Niccol’s appointment, Rachel Ruggeri served as the interim CEO from August 12 until September 9. Earlier in the 2024 fiscal year, Laxman Narasimhan was the CEO from 9/30/2023 until August 11, 2024. Under our rule, Mr. Niccol is therefore the person identified as CEO for 2024. Our compensation data has Ms. Ruggeri’s title for 2024 as EXECUTIVE VICE PRESIDENT, CHIEF FINANCIAL OFFICER AND FORMER INTERIM CHIEF EXECUTIVE OFFICER. Mr. Narasimhan’s title for 2024 is reported as FORMER CHIEF EXECUTIVE OFFICER.
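For readers who think in code, the last-day-of-fiscal-year rule can be sketched roughly as follows. This is only an illustration of the rule as described here, not our production code. The TENURES structure and the role_flag function are hypothetical names, and the dates are the ones quoted in this post.

```python
from datetime import date

# Hypothetical tenure records: (name, role, first day, last day).
# A last day of None means the person was still in the role.
# Dates are taken from the Starbucks example in the text.
TENURES = [
    ("Laxman Narasimhan", "CEO", date(2023, 9, 30), date(2024, 8, 11)),
    ("Rachel Ruggeri", "CEO", date(2024, 8, 12), date(2024, 9, 9)),
    ("Brian Niccol", "CEO", date(2024, 9, 9), None),
]


def role_flag(name, role, fye, tenures):
    """Return 1 if `name` held `role` on the fiscal year-end date, else 0."""
    for person, held_role, start, end in tenures:
        if person != name or held_role != role:
            continue
        # The person counts only if the FYE date falls inside their tenure.
        if start <= fye and (end is None or fye <= end):
            return 1
    return 0


FYE_2024 = date(2024, 9, 30)
flags = {name: role_flag(name, "CEO", FYE_2024, TENURES)
         for name, _, _, _ in TENURES}
```

With these inputs only Mr. Niccol receives a 1, since he is the only one whose tenure covers the fiscal year-end date.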
Here is a screenshot after using the Query feature to select all CFOs with a salary greater than 650,000.
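If you prefer to work with a saved CSV rather than the Query feature, the same selection could be sketched in Python along these lines. The rows below are made up for illustration, and the PERSON-NAME column name is an assumption; CFO and SALARY are the field names used in this post.

```python
import csv
import io

# Made-up sample rows; real EC exports have more columns.
# PERSON-NAME is an assumed column name used only for this sketch.
SAMPLE_CSV = """CIK,YEAR,PERSON-NAME,CEO,CFO,SALARY
1001,2024,Person A,1,0,1200000
1001,2024,Person B,0,1,700000
1002,2024,Person C,0,1,500000
"""


def select_cfos(csv_text, min_salary):
    """Return rows flagged CFO=1 whose SALARY exceeds min_salary."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row for row in rows
            if row["CFO"] == "1" and float(row["SALARY"]) > min_salary]


highly_paid_cfos = select_cfos(SAMPLE_CSV, 650_000)
```

To run this against a real export, replace the io.StringIO wrapper with an open file handle.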
As a reminder – our EC data is generally updated each week.
Data collection can be tedious – but there are ways to leverage our platform to really speed up even the most mundane task. I am working on a project with some colleagues and I was tasked with identifying the Lead Independent Director and/or Chair of the board as well as their tenure. This data needed to reflect these positions in calendar year 2024 and the sample is those companies that self-reported as Large Accelerated Filers with a 12/31 (approximate) FYE. The particular challenge for this is to identify the tenure as Lead Independent Director. I am finding that this is not directly disclosed in over 30% of the filings. There is a really easy way to address that – I describe below how to run a second search in another instance of directEDGAR to identify the tenure.
Who wants to type names as well as the critical metadata that is required? Not me – so my first step was to pull the executive and director compensation data for my sample firms. Since these were 12/31 firms I would need to pull 2024 EC and DC data. Below is a screenshot of my pull of the EC data using our Query tool. If you look closely, the CIK Filter button is blue. I have a file that has the 1,297 CIKs in my sample and that file was selected. You can also see that I have the Criteria set as YEAR=2024. I saved the results as a CSV file and then repeated the same steps to collect the DC data.
I then combined the two files. To do so I had to account for some minor differences in the column headings. For example, the EC data has PERSON-TITLE, SALARY and BONUS, whereas the DC data has none of those and instead has CASH. Despite these differences it took just a few minutes to arrange the columns. I also added a new column defining the nature of the data (EC/DC) because I wanted to be able to sort these and keep the EC and DC data separate.
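I did this step by hand in a spreadsheet, but the same combination could be scripted. Here is a minimal sketch, assuming both files were saved as CSV. The sample rows, the PERSON-NAME column and the SOURCE column name are placeholders for illustration, not the actual export layout.

```python
import csv
import io

# Made-up one-row samples; PERSON-NAME is an assumed column name.
EC_CSV = """CIK,YEAR,PERSON-NAME,PERSON-TITLE,SALARY,BONUS
1001,2024,Person A,CHIEF EXECUTIVE OFFICER,1000000,0
"""
DC_CSV = """CIK,YEAR,PERSON-NAME,CASH
1001,2024,Person D,90000
"""


def combine(ec_text, dc_text):
    """Stack EC and DC rows under the union of their column headings."""
    fieldnames = []
    combined = []
    for source, text in (("EC", ec_text), ("DC", dc_text)):
        reader = csv.DictReader(io.StringIO(text))
        # Keep the first-seen order of headings across both files.
        for name in reader.fieldnames:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            row["SOURCE"] = source  # marks the nature of the data (EC/DC)
            combined.append(row)
    fieldnames.append("SOURCE")
    # Columns missing from one file are left blank for its rows.
    rows = [{name: row.get(name, "") for name in fieldnames}
            for row in combined]
    return fieldnames, rows


columns, rows = combine(EC_CSV, DC_CSV)
```

The result can be written back out with csv.DictWriter using the returned column list.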
There is a little bit of a challenge here. The order of the search results will not match the order of my compensation data unless we force it. That problem is easily solvable. What I am going to do is run a search in the 2025 Proxy archive for (DOCTYPE contains(DEF*)) filtered on my CIK list. I am then going to use the SummaryExtraction file to sort my EC/DC data to MATCH the search result order. Here is something important: any later search filtered on the same CIK list will return its results in this same order (skipping any company without a matching document).
After downloading the SummaryExtraction file I inserted a new column to the left of the CIK – ORDER and just numbered the results in order.
I then copied those two columns to my combined compensation file and used Excel’s VLOOKUP function to copy the value for ORDER into the compensation data. Here is a screenshot of some of the rows in that file after I added the ORDER value.
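For anyone who would rather script the lookup than use a spreadsheet, a rough Python equivalent of the numbering and VLOOKUP steps is below. It assumes the CIKs from the SummaryExtraction file are available as a list in search-result order. The attach_order function and the sample values are hypothetical.

```python
def attach_order(summary_ciks, comp_rows):
    """Add an ORDER value to each compensation row and sort by it.

    summary_ciks: CIKs in the order they appear in the search results.
    comp_rows: list of dicts, each with a "CIK" key.
    """
    order_by_cik = {}
    for position, cik in enumerate(summary_ciks, start=1):
        # Like VLOOKUP, keep the first match if a CIK appears twice.
        order_by_cik.setdefault(cik, position)
    for row in comp_rows:
        row["ORDER"] = order_by_cik.get(row["CIK"])  # None if no match
    # Rows with no matching CIK sort to the end; sorted() is stable, so
    # rows for the same company keep their original relative order.
    return sorted(comp_rows,
                  key=lambda r: (r["ORDER"] is None, r["ORDER"] or 0))


# Made-up CIKs for illustration.
search_order = ["3003", "1001", "2002"]
compensation = [
    {"CIK": "1001", "PERSON-NAME": "Person A"},
    {"CIK": "9999", "PERSON-NAME": "Person X"},
    {"CIK": "3003", "PERSON-NAME": "Person B"},
    {"CIK": "1001", "PERSON-NAME": "Person C"},
]
ordered = attach_order(search_order, compensation)
```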
This took me about ten minutes but it will save me significant time as I go forward. My next step is to identify those companies that have a lead independent director so I can then identify the name of that person. Not every company will have a lead independent director – particularly those with an independent chair. So I am not going to look at the initial search results. The first search was just to define the order I want to use for my data collection. I am going to run a new search. I am still limiting the results by my CIK list but now I am searching for (DOCTYPE contains(DEF*)) and (lead director or lead independent director) andany since. Here is a screenshot of the result of that search.
I want to explain this search. It only returns documents that:
1. Are DEF 14A
2. Have either
   a. The phrase Lead Director, or
   b. The phrase Lead Independent Director
If the word Since is present in a document that meets both 1 & 2 above then all instances of the word Since will be highlighted as well.
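To make the logic of that search concrete, here is a small Python sketch of the same conditions. It is not how directEDGAR evaluates a search. It only mimics the rules described above, and it assumes that DEF* means the document type begins with DEF. The evaluate function is a hypothetical name.

```python
import re

# Either "lead director" or "lead independent director", any case.
PHRASES = re.compile(r"\blead\s+(?:independent\s+)?director\b", re.IGNORECASE)
# The optional term: highlighted when present, never required.
SINCE = re.compile(r"\bsince\b", re.IGNORECASE)


def evaluate(doctype, text):
    """Return None if the document is not a hit.

    Otherwise return the (start, end) positions of every "since" so
    that they could be highlighted. The list may be empty.
    """
    if not doctype.upper().startswith("DEF"):  # assumed meaning of DEF*
        return None
    if not PHRASES.search(text):
        return None
    return [match.span() for match in SINCE.finditer(text)]


sample = "Mr. X has served as Lead Director since May 2024."
highlights = evaluate("DEF 14A", sample)
```

A document with one of the phrases but no "since" is still a hit; it just returns an empty list of highlights.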
If you compare these results with the image above from the search of (DOCTYPE contains (DEF*)) you will notice we have fewer results. That just means that the filings from companies not listed in this search contain neither of the phrases we were searching for. That is fine. I am focused on identifying whether or not there is a person with the title LEAD DIRECTOR or LEAD INDEPENDENT DIRECTOR and I want their tenure.
My goal at this stage is to identify the Lead Independent Director and their tenure in that position. I will sometimes get lucky because there will be cases where the chair is also identified. If you scan the results you can see that Ellen Gordon is the chair and has been since 2015. That is a lucky accident for me. Since I included both EC and DC data I will go ahead and record that information.
The filing expressly reports that they (Tootsie Roll) do not have a lead independent director – so I very quickly completed my data collection for Tootsie Roll.
In a world optimized for my data collection needs the disclosures would all fall into a pattern like this one for BOSTON BEER – very clear indication that Mr. Nemeth is the Lead Director and has served in that position since May 2024.
As many of you know, the world is not optimized for our data collection. Sometimes we need a boost. Below is a screenshot of the location where I identified the name of the Lead Independent Director in Seacoast Banking Corp of Florida’s proxy.
After careful review I could not find the year Mr. Fogal was first elected to that position. So here is an important trick. I suspected that the company has probably treated that information as a boilerplate disclosure. So I started another instance of the directEDGAR application in my AppStream session and used the clues in the disclosure above to search prior filings to see if I could identify when he started as Lead Independent Director. In the next screenshot you can see I have selected proxy filings back to 2005.
Based on the disclosure above the search I ran was elect* w/5 fogal w/10 lead independent. No need to filter on DOCTYPE or do anything more complicated. Here is a screenshot of that search and the evidence I need to record 2018 as the beginning of his tenure as Lead Independent Director.
You can see that my original search is partially visible in the background. I am using the second instance to help identify the tenure once I have identified the Lead Independent Director if their start date is not mentioned in the filing. This keeps me focused and moving forward without having to agonize over how to collect this data. Names are fairly unique, especially when we limit on phrases like Lead Independent Director. I did not run into much noise when I constructed these secondary searches.
We are going to reconstruct our filing archive. Today I was testing the code that will process the accession.txt files on 2024 proxy filings. Rather than pull new filings I was working with the 2024 files we pulled during our production run in 2024. It is a bit tedious but necessary to identify all of the issues that we did not think of when we were writing the code.
Here is the link to the index page for a DEF 14A filing made by Biora Therapeutics. Notice that the filing date is reported as 2024-09-16. If you look carefully at the index page you will not see the date 9/20/2024. However, ultimately the filing will end up in the directEDGAR archive with a DEID of 1580063-R20240920-C20241009-F36. The reason for that is kind of interesting if you are an SEC wonk.
Here is a little extract from the accession.txt file that was in our archive and originally pulled at 3:45 PM (CDT) on 9/16:
Notice that there is no content between the SGML open and close TEXT tags despite other clues present that there should have been an HTML file with the filename d48307ddef14a.htm between those tags. Based on these clues I concluded that either the EDGAR processor ripped out the original filing during the submission process or the filer somehow failed to map in the HTML file when constructing the accession.txt file.
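A check for this kind of problem can be sketched in a few lines of Python. To be clear, this is not the code we are testing for the rebuild, just an illustration of the idea. The SAMPLE string is a mock-up of the relevant SGML structure rather than the actual extract, and empty_documents is a hypothetical function name.

```python
import re

DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
FILENAME_RE = re.compile(r"<FILENAME>([^\n<]+)")
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)

# Mock-up of a document block with nothing between the TEXT tags.
SAMPLE = """<DOCUMENT>
<TYPE>DEF 14A
<SEQUENCE>1
<FILENAME>d48307ddef14a.htm
<TEXT>
</TEXT>
</DOCUMENT>
"""


def empty_documents(accession_text):
    """Return the filenames whose TEXT section is missing or blank."""
    missing = []
    for block in DOCUMENT_RE.findall(accession_text):
        filename = FILENAME_RE.search(block)
        body = TEXT_RE.search(block)
        # A named file with no content is the case described above.
        if filename and (body is None or not body.group(1).strip()):
            missing.append(filename.group(1).strip())
    return missing


flagged = empty_documents(SAMPLE)
```

Running a check like this at pull time would flag a filing such as this one for a later re-pull.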
If we look at the associated header file from EDGAR – the current version differs from the one we originally pulled. I have pasted the first few lines from each of the header files below:
The date next to the header ‘declaration’ differs between the two filings. However, what I find really interesting is that when this filing was originally submitted a DATE-OF-FILING-DATE-CHANGE tag was added with the date of the original submission. This tag is not found in every filing. Per the EDGAR filer manual it is supposed to indicate the “Date when the last Post Acceptance occurred.” It is an optional tag and we generally see it when the SEC-HEADER associated date does not equal the filing date. I should clarify though – we see it much more frequently than expected. We see it even when the complete filing does not appear to be changed and the date next to the header tag matches the FILING-DATE. The existence of that tag is not a clear signal that something might be amiss.
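The comparison I am describing can be sketched as below. The MOCK_HEADER string is illustrative only: the accession number is a placeholder and the values are loosely modeled on the dates discussed in this post, not copied from the actual header file. The summarize_header function is a hypothetical name.

```python
import re

# Date that follows the colon on the SEC-HEADER line.
HEADER_DATE_RE = re.compile(r"^<SEC-HEADER>[^\n]*?:\s*(\d{8})", re.MULTILINE)
TAG_RE = re.compile(
    r"^<(FILING-DATE|DATE-OF-FILING-DATE-CHANGE)>(\d{8})", re.MULTILINE)

# Illustrative header lines with a placeholder accession number.
MOCK_HEADER = """<SEC-HEADER>0000000000-24-000000.hdr.sgml : 20240920
<ACCEPTANCE-DATETIME>20240916154500
<TYPE>DEF 14A
<FILING-DATE>20240916
<DATE-OF-FILING-DATE-CHANGE>20240916
"""


def summarize_header(header_text):
    """Collect the dates of interest and note whether they disagree."""
    match = HEADER_DATE_RE.search(header_text)
    tags = dict(TAG_RE.findall(header_text))
    header_date = match.group(1) if match else None
    filing_date = tags.get("FILING-DATE")
    return {
        "header_date": header_date,
        "filing_date": filing_date,
        # None when the optional tag is absent.
        "change_date": tags.get("DATE-OF-FILING-DATE-CHANGE"),
        "dates_differ": (header_date is not None
                         and filing_date is not None
                         and header_date != filing_date),
    }


summary = summarize_header(MOCK_HEADER)
```

As noted above, the presence of the change tag alone proves nothing, which is why the sketch reports the dates side by side rather than treating the tag as a flag.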
Chasing these kinds of things is probably a waste of time but I am hugely curious about all things EDGAR so I decided to pull the feed archive for 9/16/2024. I drilled through it and found the original filing and it matched the one we pulled on 9/16/2024 – specifically the HTML content was missing. Based on my experience with these I then pulled the feed archive for 9/20/2024 and found another version of the accession.txt file (saved as an accession.nc file) and it matched the current version that is available from the index page.
What can we conclude? I do not believe the DEF 14A was available to the public on 9/16. If you were running an event study and this filing was in your sample you would be introducing error by using the Filing Date that is reported on the landing page, in the EDGAR indexes and in the filing itself.
What I find particularly interesting is that it is not clear to me that we can rely on the file dates as reported in the EDGAR archive for the filings. Here is a screenshot of the accession archive directory for this filing.
Based on everything above, I doubt that the LAST MODIFIED date for this filing is correct. The htm file was not included in the dissemination feed on 9/16 but was present on 9/20. Thus, I suspect that the filing was just not available until 9/20. I find it interesting that the EDGAR code altered the LAST MODIFIED date to match the acceptance date-time.
I did warn you that this was a bit wonky. More on the archive rebuild later.