At a glance
- Researchers must provide National Center for Health Statistics (NCHS) public-use data and other non-NCHS data to their Research Data Center (RDC) analyst.
- Compiling the public-use dataset helps you become familiar with the data.
- Follow the process to provide public-use and non-NCHS data to your RDC.

Please follow these steps when providing your Research Data Center (RDC) analyst with public-use data, data from outside the National Center for Health Statistics (NCHS), or both.
Step one
Create a dataset that includes only the variables specified in your approved application. To add variables, you must first update your application and discuss the change with your RDC analyst. Variables in your dataset that are not listed in your approved application will require additional review, which may delay the start of your project.
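One way to carry out this step is sketched below in Python. It is only an illustration: the variable names (`RESP_ID`, `AGE_YEARS`, `SEX`, `INCOME_CAT`) and the helper functions are hypothetical, and your own list should come from your approved application.

```python
# Hypothetical list of variables named in the approved application.
APPROVED_VARIABLES = ["RESP_ID", "AGE_YEARS", "SEX"]


def variables_present(rows):
    """Collect every variable name that appears in any row."""
    return set().union(*(row.keys() for row in rows)) if rows else set()


def unapproved(rows, approved):
    """List variables in the data that the application does not name."""
    return sorted(variables_present(rows) - set(approved))


def keep_approved(rows, approved):
    """Return copies of the rows restricted to the approved variables."""
    missing = [name for name in approved if name not in variables_present(rows)]
    if missing:
        raise KeyError(f"Approved variables not found in the data: {missing}")
    return [{name: row[name] for name in approved} for row in rows]


# Made-up records standing in for a public-use file.
raw_rows = [
    {"RESP_ID": 1, "AGE_YEARS": 34, "SEX": 2, "INCOME_CAT": 5},
    {"RESP_ID": 2, "AGE_YEARS": 61, "SEX": 1, "INCOME_CAT": 3},
]

extra = unapproved(raw_rows, APPROVED_VARIABLES)      # variables to drop or get approved
subset = keep_approved(raw_rows, APPROVED_VARIABLES)  # dataset to send
```

Listing the unapproved variables before dropping them makes it easier to decide whether any of them should be added to your application instead.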
Step two
Use original NCHS variable names in your dataset so that variable names match those given in the NCHS public-use dataset. If you would like to rename these variables, include the original variable name in the variable description. For non-NCHS data, make sure the variable names you listed in your application's data dictionary match the variable names in the non-NCHS data file.
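The following Python sketch shows one possible way to keep the original name in the description of a renamed variable, and to compare the names in a data dictionary against the names in a non-NCHS file. All variable names and both helper functions are hypothetical.

```python
def describe_renamed(renames, descriptions):
    """Build description entries keyed by new name that record the original name."""
    entries = {}
    for original, new_name in renames.items():
        text = descriptions.get(original, "")
        entries[new_name] = f"{text} (original variable name: {original})".strip()
    return entries


def name_mismatches(dictionary_names, file_names):
    """Return names found only in the data dictionary and names found only in the file."""
    only_in_dictionary = sorted(set(dictionary_names) - set(file_names))
    only_in_file = sorted(set(file_names) - set(dictionary_names))
    return only_in_dictionary, only_in_file


# Hypothetical rename of a public-use variable.
renamed = describe_renamed(
    {"AGE_YEARS": "age"},
    {"AGE_YEARS": "Age in years at interview"},
)

# Hypothetical non-NCHS file whose header differs from the dictionary by letter case.
mismatches = name_mismatches(["county_code", "pm25"], ["county_code", "PM25"])
```

The comparison is case-sensitive on purpose, so a difference such as `pm25` versus `PM25` is reported rather than silently accepted.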
Step three
If you choose to create derived variables prior to working with the data onsite, make sure to clearly define these variables. The variable description should include the original variable name(s) from its source and any arithmetic code or algorithm used. Please save the code you used to create these variables because your RDC analyst may request it.
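As an illustration, the Python sketch below derives a body mass index variable and records its source variables and formula in the description. The names `WEIGHT_KG`, `HEIGHT_CM`, and `BMI_DERIVED` are hypothetical and are not taken from any NCHS file.

```python
def add_bmi(row):
    """Return a copy of the row with BMI derived from weight (kg) and height (cm)."""
    height_m = row["HEIGHT_CM"] / 100
    derived = dict(row)
    derived["BMI_DERIVED"] = round(row["WEIGHT_KG"] / height_m ** 2, 1)
    return derived


# Description that names the source variables and the arithmetic used.
DERIVED_DESCRIPTIONS = {
    "BMI_DERIVED": (
        "Body mass index. Source variables: WEIGHT_KG, HEIGHT_CM. "
        "Formula: WEIGHT_KG / (HEIGHT_CM / 100) ** 2, rounded to 1 decimal place."
    ),
}

# Made-up record for demonstration.
example = add_bmi({"RESP_ID": 1, "WEIGHT_KG": 70.0, "HEIGHT_CM": 175.0})
```

Keeping the derivation in a saved script like this means the code is available if your RDC analyst asks for it.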
Step four
Discuss with your RDC analyst the preferred file format for any merge files. This is especially important for complex merges that involve multiple datasets and multiple merge variables. Please work with your RDC analyst to create these merge files as requested. This helps expedite the merge process and improves data quality.
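Before sending a merge file, it can help to check how its merge variables behave. The Python sketch below looks for combinations of merge variables that appear on more than one row. Whether such duplicates are a problem depends on the merge your RDC analyst plans, so treat this as an optional check. The variable names and the helper function are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_variables):
    """Return merge-key combinations that appear on more than one row."""
    counts = Counter(tuple(row[name] for name in key_variables) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up merge file keyed on two variables.
merge_rows = [
    {"STATE": "01", "COUNTY": "001", "pm25": 8.1},
    {"STATE": "01", "COUNTY": "003", "pm25": 7.4},
    {"STATE": "01", "COUNTY": "003", "pm25": 7.9},
]

dupes = duplicate_keys(merge_rows, ["STATE", "COUNTY"])
```

Here the combination of `STATE` "01" and `COUNTY` "003" is reported because it occurs twice.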
Step five
Email your datasets along with a list of the variables to your RDC analyst. If your datasets are too large to email, please discuss this with your RDC analyst.
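The variable list can be generated from the datasets themselves so that it always matches what you send. The Python sketch below is one possible layout, with one line per file and variable. The file name, variable names, and function are hypothetical, and your RDC analyst may prefer a different format.

```python
import csv
import io


def variable_list_csv(datasets):
    """Return CSV text with one line per (file, variable) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", "variable"])
    for file_name, variables in datasets.items():
        for name in variables:
            writer.writerow([file_name, name])
    return buffer.getvalue()


# Hypothetical dataset and its variables.
listing = variable_list_csv({"public_use_subset.csv": ["RESP_ID", "AGE_YEARS"]})
```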
Important notes about submitting public data
You do not need to provide a public-use dataset for:
- National Hospital Discharge Survey
- National Ambulatory Medical Care Survey
- National Hospital Ambulatory Medical Care Survey
- National Survey on Drug Use and Health
Your RDC analyst will provide an extract from the restricted-use files that includes all the variables specified in your approved application.
The RDC considers any attempt to include variables that may lead to re-identification of study participants or establishments a disclosure violation. Disclosure violations will result in the cancellation of your project and possible legal action.
If you are requesting access to the restricted-use mortality files, you cannot include any public-use mortality variables or variables derived from the public-use mortality data.
Non-NCHS data includes data collected by the researcher, another government agency, or a private institution that the researcher wishes to merge with NCHS data.