Unexpected Benefit of Newly Structured Compensation Data

I was reading a recent paper in Accounting Horizons that showed up in my Google Scholar feed with a directEDGAR citation. Professors Bright and Kim from Lehigh University and Professor Koo from Kyungpook National University collaborated on CEO outside Board Service and Managerial Short-Termism. They report finding that suggest CEO’s limited outside board service is associated with the CEO seeming to be more willing to focus on a longer performance horizon. Congratulations to all for having their research recognized and published. I have reached out to Professors Bright and Kim they earned some ice-cream from eCreamery and the shirt off my back!

As I was reading the paper I noted that they needed to identify CEOs with outside directorships. They used BoardEx to identify their sample. I read further that they used our platform to collect stock compensation for their CEOs. This got me wondering – since we use the PERSON-CIK as a key identifier when processing compensation data could I create a sample to identify all executives with outside directorships using directEDGAR? I think I can.

Here is one pass on the code to do so – the results are saved to a CSV file. Because I agonize over every little detail there are too many lines in the output file. I decided to match if the officer served as an outside director in the year before, the year of, or the year after the compensation year.

import sqlite3
import csv


# Paths to the databases
exec_db_path = "S:/directEDGAR_DATA/EXECUTIVE_COMP.db"
dir_db_path = "S:/directEDGAR_DATA/DIRECTOR_COMP.db"

# Establish database connections
with sqlite3.connect(exec_db_path) as exec_conn, sqlite3.connect(dir_db_path) as dir_conn:
    # Cursor to retrieve distinct years in EXECUTIVE_COMP table
    exec_cursor = exec_conn.cursor()
    exec_cursor.execute("SELECT DISTINCT YEAR FROM EXECUTIVE_COMP")
    years = exec_cursor.fetchall()  # List of years in EXECUTIVE_COMP

    output = []  # Collect results here

    for year_tuple in years:
        year = year_tuple[0] # Extract year value

        # Fetch all executives for the given year, filtering for numeric PERSON_CIK
        exec_cursor.execute("""
            SELECT CIK, YEAR, PERSON_TITLE, PERSON_NAME, PERSON_CIK 
            FROM EXECUTIVE_COMP
            WHERE YEAR = ? AND PERSON_CIK NOT LIKE '%AVAI%' 
              AND CAST(PERSON_CIK AS INTEGER) IS NOT NULL
        """, (year,))
        exec_records = exec_cursor.fetchall()


        # Fetch all director records for YEAR-1, YEAR, and YEAR+1, with numeric PERSON_CIK
        for yr in [year - 1, year, year + 1]:   #this checks the year before and after
            dir_cursor = dir_conn.cursor()
            dir_cursor.execute("""
                SELECT CIK AS CIK_DIRECTORSHIP, YEAR AS DIRECTORSHIP_YEAR, PERSON_CIK, AGE, SINCE 
                FROM DIRECTORCOMP
                WHERE YEAR = ? AND PERSON_CIK NOT LIKE '%AVA%' 
                  AND CAST(PERSON_CIK AS INTEGER) IS NOT NULL
            """, (yr,))
            dir_records = dir_cursor.fetchall()

            # Build a dictionary for fast lookup of directors by PERSON_CIK
            dir_dict = {int(record[2]): record for record in dir_records if record[2].isdigit()}

            # Match executives to directors based on PERSON_CIK
            for exec_record in exec_records:
                exec_cik, exec_year, person_title, person_name, person_cik = exec_record
                if person_cik.isdigit():  # Only process if PERSON_CIK is numeric
                    person_cik = int(person_cik)  # Convert to integer for matching
                    # Check if the executive PERSON_CIK matches a director entry
                    if person_cik in dir_dict:
                        dir_cik, dir_year, _, dir_age, dir_since = dir_dict[person_cik]
                        if dir_cik == exec_cik:
                            continue
                        output.append((exec_cik, exec_year, person_title, person_name, person_cik, dir_cik, dir_year,
                                       dir_age, dir_since))
            print(year, len(output))

# Display or save the output as needed
print(len(output))

# Specify the output CSV file path


# Column headers for the CSV
headers = ["CIK", "YEAR", "PERSON_TITLE", "PERSON_NAME", "PERSON_CIK", "CIK_DIRECTORSHIP", "DIRECTORSHIP_YEAR",
           "DIRECTOR_AGE", "DIRECTOR_SINCE"]

# Write the output list to a CSV file
output_csv_path = r"D:\PhotonUser\My Files\Temporary Files\MATCHED_EXECS.csv"
with open(output_csv_path, mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(headers)  # Write the headers
    writer.writerows(output)  # Write the data rows

print(f"Results have been saved to {output_csv_path}")

Tim Cook has been an outside director at Nike (CIK: 320187) since 2005. Here is a cut of the output file (that has more than 100,000 rows) with some of Mr. Cooks service reported:

CIKYEARPERSON_TITLEPERSON_NAMEPERSON_CIKCIK_DIRECTORSHIPDIRECTORSHIP_YEARDIRECTOR_AGEDIRECTOR_SINCE
3201932011CHIEF EXECUTIVE OFFICERTIMOTHY D. COOK12141563201872011502005
3201932012CHIEF EXECUTIVE OFFICERTIMOTHY COOK12141563201872012512005
3201932013CHIEF EXECUTIVE OFFICERTIMOTHY COOK12141563201872013522005
3201932014CHIEF EXECUTIVE OFFICERTIM COOK12141563201872014532005
3201932015CHIEF EXECUTIVE OFFICERTIM COOK12141563201872015542005
3201932016CHIEF EXECUTIVE OFFICERTIM COOK12141563201872016552005
3201932017 CHIEF EXECUTIVE OFFICERTIM COOK12141563201872017562005
3201932018CHIEF EXECUTIVE OFFICERTIM COOK12141563201872018572005
3201932019CHIEF EXECUTIVE OFFICERTIM COOK12141563201872019582005
3201932020CHIEF EXECUTIVE OFFICERTIM COOK12141563201872020592005
3201932021CHIEF EXECUTIVE OFFICERTIM COOK12141563201872021602005
3201932022CHIEF EXECUTIVE OFFICERTIM COOK12141563201872022612005
3201932023CHIEF EXECUTIVE OFFICERTIM COOK12141563201872023622005

Mr. Cook’s PERSON-CIK is 1214156 – so by checking for his CIK in the DC data by year we have him in our output.

Of course – we are not perfect – in the Bright, Kim and Koo paper they focused on CEOs. We have yet to normalize the titles of the officers. Now that we finally have a complete pipeline to make the compensation data more easily available I think I am going to have one of our interns start working on an officer title mapping dictionary. This should not be challenging, especially when considering key positions like the Chief Executive Officer and the Chief Financial Officer.

As I was writing the last paragraph I started wondering – well suppose I wanted that now – how hard would it be to start cleaning that up. Excel has a great feature – remove duplicates and so I copied the titles to a new sheet and ran that function – out of the more than 106,000 total matches there were only 7,471 unique titles

In this reduced set there were 1,521 titles with the phrase CHIEF EXECUTIVE OFFICER. Of course it is not that simple – if you look at the next screenshot you will see that title is part of the title of an officer who is the CEO of a business unit or subsidiary of the reporting company.

Based on this quick review I am convinced of two things. First, if I wanted to recreate their sample and test another hypothesis I feel confident I could do so starting with that matching code from above. The one thing I have not yet accounted for is a merge with say Compustat. Limiting my sample to those firms covered in Compustat would further reduce the number of titles I would have to analyze. I suspect 15-20% of the executive we have from the first pass would not match to companies available in Compustat. Further, others would be excluded once we started controlling for the usual (for example they excluded all financial firms). Frankly, I think if I wanted to do this for a project today I am confident I could identify all of the CEOs from this file in relatively short order – perhaps a couple of hours. And while this may seem daunting – I began this journey by collecting acquisition purchase price allocations from micro-fiche 10-Ks and annual reports. This is nothing!

A final thought as I wrap this up – so initially it might seem a bit tedious to identify the CEOs by analyzing the titles. I was sharing this discovery/directEDGAR extension with one of my colleagues and she reminded me that she was a co-author on another paper 6/7 years ago. They also needed to match data from BoardEx to our compensation data. She described the process I helped her set-up to match by names across the two samples. That process took much more time and was hugely more tedious then it would take to normalize these titles.

Leave a Reply