In pursuit of evidence: Sampling procedures for locating new businesses

Howard Aldrich, Arne Kalleberg, Peter Marsden, James Cassel

Research output: Contribution to journal › Article

  • 68 Citations

Abstract

Research on entrepreneurship encompasses all stages in the life cycle of businesses, with the period around the initial founding arguably the most important. Decisions made in the crucial early days regarding products, markets, funding, and personnel substantially shape the subsequent course of the business. Despite the obvious importance of this period, there are few studies based on representative samples of newly forming businesses. Our project sought to fill this gap in entrepreneurship research by identifying large representative samples of new businesses from three potential sampling sources. Our results document the advantages and disadvantages of each source and suggest that more attention should be paid to the direct enumeration method for the most up-to-date information on new businesses.

We used the Dun's Market Indicator (DMI) file, the North Carolina Employment Security Commission's ES202 file, and direct enumeration in the field by trained interviewers to collect data on the total business population of Durham County, North Carolina. Supplemental information was collected in telephone interviews with owners and managers. Each source involves trade-offs between its advantages and disadvantages.

The DMI File. On our first criterion, generalizability, we found limits to the DMI file's representativeness: it missed almost three-quarters of the apparently new businesses listed in the ES202 file and over 90% of those found in the enumeration/phone book matching process. The DMI does not do well on timeliness; by May 1987, it included only 12 businesses with a birth year of 1987 in a county containing approximately 5,000 businesses. Units of analysis are fairly easy to identify in the DMI, as it has a code for headquarters, branches, and subsidiaries. On the second criterion, practicality, the DMI is quite good: the source is machine readable, allowing easy processing. We found no duplicates, and extensive auxiliary information is included, such as number of employees. Unfortunately, "year founded" is missing from about a third of its listings overall and from over half the listings in some industries. A large portion of those with missing birth years are branches, none of which have birth years. On the third criterion, cost, the DMI would be very costly to acquire for anything other than a fairly delimited area, such as one city or county.

The ES202 File. On the first criterion, generalizability, the ES202 file covers all businesses with a payroll, thus missing firms without employees. Overlap between it and the DMI file was not high. The ES202 fared little better when compared to the enumeration/phone book match sample, as it missed a high proportion of the enumeration-based cases, but the ES202 file was approximately one year behind the enumeration-based sample. The ES202 does not do well on timeliness, as there is a lag of almost one year between the period covered by a file and its availability to researchers. Analysis also showed that owners evidently wait two to four years, on average, before listing their businesses with the Employment Security Office. Units of analysis are difficult to identify in the ES202; although it does indicate single versus multiestablishment firms, the address given in the file does not necessarily refer to the firm for which it was filed. On the second criterion, practicality, the file is machine readable, but the only auxiliary information available is number of employees. Cost, the third criterion, is decidedly in the ES202's favor. Access to ES202 files varies by state.

The Enumeration/Phone Book Match. In theory, the generalizability of the enumeration/phone book matching method should be superior to the other two, as all businesses with a visible on-street location should be identified, including those in office buildings. Of course, businesses with no on-street location, typically small firms operated out of houses, would be missed. The enumeration method located most of the businesses identified by the DMI as new, but a smaller fraction of those identified as new by the ES202. The size distribution of the confirmed new businesses in the enumeration file is remarkably similar to that of the other two primary sources: 83% have under 10 employees and 89% under 20. The enumeration method is clearly the most timely, allowing the identification of new businesses in their first few months of operation. Enumeration is a fairly practical method, as it eliminates duplicate listings and allows the on-site determination of eligible organizations. However, it must be put in machine-readable form, a very labor-intensive process, and provides no auxiliary information unless combined with an interview or follow-up phone call. Cost is a major consideration in evaluating the enumeration method.

Practitioners might draw two lessons from our findings. First, our results should make readers more cautious in accepting empirical generalizations and advice based on existing studies of business start-ups. Valid representative studies of the start-up process should capture new businesses as early as possible. The gestation period for start-ups is often long, and discovering businesses at the intention or resource mobilization stage is very difficult. Research findings on the determinants of start-up success and survival become increasingly biased as the time lag between a founding and its appearance in a source increases, because only successful businesses are well represented in the source. Second, practitioners and policymakers desiring information about the fate of newly formed businesses in their area must carefully evaluate their priorities in deciding on a source of data.

None of our methods captures firms early in the founding process, when intentions are still ill-formed and resource mobilization still in doubt. The source that comes closest to providing us with a timely, generalizable, and practical sample of new businesses, however, is the enumeration/phone book match. On a number of criteria, it appears to be a better source than the DMI or ES202 files. Thus, for the moment, we have identified the likely alternatives and recommend that researchers and policymakers consider each carefully before beginning their next study of new business formation.

Language: English (US)
Pages: 367-386
Number of pages: 20
Journal: Journal of Business Venturing
Volume: 4
Issue number: 6
DOI: 10.1016/0883-9026(89)90008-6
State: Published - Jan 1 1989

ASJC Scopus subject areas

  • Business and International Management
  • Management of Technology and Innovation

Cite this

In pursuit of evidence: Sampling procedures for locating new businesses. / Aldrich, Howard; Kalleberg, Arne; Marsden, Peter; Cassel, James.

In: Journal of Business Venturing, Vol. 4, No. 6, 01.01.1989, p. 367-386.

@article{0f9025670fc148b9aa39d74534eb7a2a,
title = "In pursuit of evidence: Sampling procedures for locating new businesses",
author = "Howard Aldrich and Arne Kalleberg and Peter Marsden and James Cassel",
year = "1989",
month = "1",
day = "1",
doi = "10.1016/0883-9026(89)90008-6",
language = "English (US)",
volume = "4",
pages = "367--386",
journal = "Journal of Business Venturing",
issn = "0883-9026",
publisher = "Elsevier Inc.",
number = "6",

}

TY - JOUR

T1 - In pursuit of evidence: Sampling procedures for locating new businesses

T2 - Journal of Business Venturing

AU - Aldrich, Howard

AU - Kalleberg, Arne

AU - Marsden, Peter

AU - Cassel, James

PY - 1989/1/1

Y1 - 1989/1/1

UR - http://www.scopus.com/inward/record.url?scp=38249006233&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38249006233&partnerID=8YFLogxK

U2 - 10.1016/0883-9026(89)90008-6

DO - 10.1016/0883-9026(89)90008-6

M3 - Article

VL - 4

SP - 367

EP - 386

JO - Journal of Business Venturing

JF - Journal of Business Venturing

SN - 0883-9026

IS - 6

ER -