About the Data
Fatal (Death) Data, Nonfatal Patient Discharge (Hospitalization) Data, and Nonfatal Emergency Department (ED) Data
Information about fatal injuries comes from the California Department of Public Health’s Death Statistical Master file. These data come
from death certificates that are registered in California each year. The SAC
Branch uses this file to describe California residents who die as a result of
injury (that is, whose death certificate includes an external cause of injury code).
Prior to 1999, the cause of death was coded using the
International Classification of Diseases, Ninth Revision (ICD-9). Beginning in
1999, deaths are coded using the Tenth Revision of the ICD (ICD-10). These two
revisions are significantly different. Users need to be aware that changes in
the number of specific injuries observed over time may be due to changes in
coding practices rather than true changes in causes of death. More information
about ICD-10 and the effects of the change in coding can be found in our FAQ or
at the National Center for Health Statistics.
Information about nonfatal injuries comes from the California Office of Statewide Health Planning
and Development Patient Discharge Data (PDD) and Emergency Department (ED)
Data. The PDD data set contains information on patients discharged
from all non-Federal hospitals in California, and the ED data set contains
information on patients who were seen in an emergency department in
California and then treated and released or transferred to another facility. SAC
uses these data to describe people who are hospitalized or treated in the ED as a result
of an injury (that is, whose discharge diagnosis includes an external cause of injury code).
Records for PDD and ED data represent the first
hospitalization or ED visit for the injury in question, but may not be the only
record for an individual person. Repeat visits for the same injury are not
included in the file, so each record represents an incident injury event.
However, two separate injury events that require a hospitalization or ED visit
would be counted twice in the PDD data or ED data. For example, a person who was
hospitalized for a fall, was discharged to go home, and then fell again two
weeks later would be counted in two separate records in the PDD data.
This Excel file gives more details about the death, hospitalization, and ED
variables included in our
fatal and nonfatal injury data sets, including what we exclude from the data on
California Electronic Violent Death Reporting System (CalEVDRS) Data
California’s violent death data come from two separate data
systems – California’s Violent Death Reporting System (CalVDRS) and California’s
Electronic Violent Death Reporting System (CalEVDRS). The former
was administered by CDPH from 2005-2008 as part of CDC’s National Violent Death
Reporting System (NVDRS). The latter system is funded by The
California Wellness Foundation and was created by CDPH in response to issues
with CDC’s system. CalEVDRS has been functioning and expanding
CalEVDRS was built to be compatible with NVDRS by using the
same data specifications. It does not use the same methodology,
however, and that is why this data query makes a point to separate the two
systems. CalVDRS data were manually abstracted from hardcopy
records into CDC software by CDPH and county health department staff who were
trained in abstracting for NVDRS. CalEVDRS data are mostly entered
by coroner staff from participating counties. Although these staff
were trained in abstracting according to NVDRS definitions, CalEVDRS funding has
not been sufficient to ensure ongoing training and quality assurance of
CDPH used Santa Clara County data from both systems to evaluate
the data quality of CalEVDRS. This evaluation showed that the data are
comparable overall and gave us insight into where further training was needed.
In addition to the greater efficiency of the CalEVDRS system and the
need for ongoing training and data quality assurance, there are some differences between
the two systems to keep in mind when using the data query:
CDC does not
consider Supplementary Homicide Reports (SHR) a primary data source for NVDRS, so
its software did not contain data fields for many SHR data elements that are
in CalEVDRS data.
- For example, “drive-by shootings” is a circumstance in SHR but there is no place to code it in NVDRS software. Thus, CalVDRS cases of “drive-by shooting” could be underreported, compared to CalEVDRS data, because CalVDRS does not have the benefit of SHR detail.
- SHR also often contains more detail than coroner records on firearm type (i.e., whether it was a handgun or a long gun). Again, the NVDRS software used by CalVDRS did not have a place where this information could be entered, so detail on firearm type may be lacking in CalVDRS compared to CalEVDRS. Abstractors noted the SHR firearm detail in a text field, and this text field was searched and recoded, so some of these cases may be captured. However, because different abstractors may have written this note in many different ways, this information is much more difficult to capture than if a consistent data field were available.
Some other things to consider when interpreting these
CalVDRS data are not yet available for 2008. CalEVDRS data are available through 2009, so there is a considerable gap in 2008 where much of the combined data are missing.
Violent deaths in these systems are reported by the county where the injury occurred. This means that if an injury occurred outside one of the participating counties and the victim was transported to a hospital in one of the participating counties and died there, that victim would not be reported in this data query. Where a victim was injured in one of the participating counties and died outside the participating counties, that person would be included in these data, to the extent we were able to identify injury location. The injury location of a small percentage of these deaths was unknown. In these cases, the county of injury was assumed to be the same as where the death occurred.
In 2005, Alameda County only reported violent deaths where the injury occurred in the City of Oakland or to residents of Oakland (regardless of where the injury occurred). This means Alameda County violent deaths, as reflected by occurrence in the data query, contain only those that occurred in Oakland or those victims who resided in either Oakland, San Francisco, or Santa Clara County and who were injured anywhere in Alameda County.
A few peculiarities of the CalEVDRS system –
The toxicology module was inadequately developed initially, so data from 2007 through 2009 are
very conservative. Positive toxicology results for these years
should be interpreted as a minimum. The actual number of positive
drug tests is likely higher. This module was fixed in 2010
to capture more accurate toxicology data.
The weapon module is separate from the rest of the CalEVDRS data elements, and this causes data
entry staff to overlook entering this information. Coroner staff
have been notified of this and asked to go back and enter weapon
information. Data will be updated periodically, but that is the
reason for a high number of “unknown” weapon types.
These data are compiled for the purpose of better understanding the circumstances of violent deaths. Hopefully, these data can be used to inform homicide and suicide prevention efforts and policies. However, care must be taken in interpreting these data. As much as the definitions, training, and data quality assurance are standardized, these data, like violent death reporting data in all states, are not perfect. They are documented initially by death investigators, each with their own methods and biases, from interviews with friends and family members of the victims, also each with their own biases. The information is then abstracted by different people, depending on the county. These people are trained to reduce bias and report data consistent with other abstractors but human variation is inevitable.
If you have any further questions, please visit California’s
Electronic Violent Death Reporting System website or contact Steve Wirtz at (916)
About the Linked Crash Medical Outcomes (CMOD) Data
California’s Crash Medical Outcomes Data (CMOD) project
is modeled on the National Highway Traffic Safety Administration (NHTSA) Crash
Outcome Data Evaluation System (CODES). The CMOD project uses probabilistic
linkage software, LinkSolv, to link data from police traffic crash records
(i.e., scene investigations) to medical data (from emergency departments,
hospitals, and, in a future update, death files). Probabilistic record linkage is useful when the data of interest come
from two or more sources that do not have a common identifier for the same
individual. Using information common to both the crash and medical files (like
age, sex, date of injury) the linkage software mathematically decides whether
two records are likely to refer to the same person.
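The idea behind probabilistic linkage can be sketched in a few lines of Python. This is not how LinkSolv is implemented, since its internals are not described here. It is a minimal Fellegi-Sunter style illustration in which the field names, the agreement probabilities, and the threshold are all hypothetical values chosen only for the example.

```python
import math

# Hypothetical probabilities, for illustration only (not CMOD or LinkSolv values).
# m = P(field agrees | records refer to the same person)
# u = P(field agrees | records refer to different people)
FIELD_PROBS = {
    "age": (0.95, 0.02),
    "sex": (0.98, 0.50),
    "injury_date": (0.90, 0.003),
}


def match_weight(crash, medical, probs=FIELD_PROBS):
    """Sum the log2 likelihood ratios over the fields common to both files."""
    total = 0.0
    for field, (m, u) in probs.items():
        if crash[field] == medical[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def is_likely_match(crash, medical, threshold=5.0):
    """Treat a pair as a likely link when its weight reaches the (assumed) threshold."""
    return match_weight(crash, medical) >= threshold


# Made-up records: one crash record and two candidate medical records.
crash = {"age": 34, "sex": "F", "injury_date": "2009-06-14"}
same = {"age": 34, "sex": "F", "injury_date": "2009-06-14"}
other = {"age": 71, "sex": "M", "injury_date": "2009-02-01"}

print(is_likely_match(crash, same))   # True
print(is_likely_match(crash, other))  # False
```

Fields that rarely agree by chance, such as date of injury, contribute more weight than fields like sex, which agree for roughly half of all unrelated pairs.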
The hospital outcome data include persons classified as an
injured driver, passenger, pedestrian or bicyclist on a collision report.
Persons who died as a result of their injuries are not included in either the
hospital or emergency department dataset.
Description of Variables
Outcome – Nonfatal emergency
department (treat and release or transferred) refers to patients treated in
emergency departments but not admitted. The vast majority are
treated and released. A small number are transferred to another
hospital for in-patient admission. Nonfatal Hospitalized refers to
persons admitted as in-patients, whether or not they had been treated in an
Age (available in two formats):
Single year of age - Each year of age will appear on its own
line (for example: 0, 1, 2, 3, 4 … up to 90+)
5-year age groups - These start with "0-4" and go to "85-89".
Persons over 89 years old are included in the category "90+"
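As an illustration of the 5-year grouping described above, the following Python sketch assigns a single year of age to its group. The function name is hypothetical and is not part of the EpiCenter query.

```python
def five_year_age_group(age):
    """Return the 5-year age group label for a whole-number age in years."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 90:
        return "90+"  # persons over 89 years old share one category
    low = (age // 5) * 5
    return f"{low}-{low + 4}"


print(five_year_age_group(3))   # 0-4
print(five_year_age_group(89))  # 85-89
print(five_year_age_group(97))  # 90+
```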
Race/Ethnicity - We combine
two separate categories, race and Hispanic ethnicity, into a single
race/ethnicity category. We also combine some categories together (such as
combining Asian sub-groups into a single "Asian" category). We do this so that
we have comparable groups both across time and between fatal and nonfatal data.
If you need more detail than we provide, please contact us and we can discuss
what we have available in our data.
Sex - This is the gender of the
Drug/Alcohol Diagnosis – Whether
the victim was diagnosed (primary or secondary) with alcohol or drug effects during
the hospitalization or emergency department visit.
Crash temporal variables - The
year, month, day of week, and time of day refer to when the collision
Role - As indicated on the
collision report: motor vehicle driver, motor vehicle passenger,
motorcyclist (includes motorcycle passenger), pedestrian, or bicyclist. The
motorcyclist category in the CMOD query includes riders of motorized scooters
and mopeds. Self-propelled scooter riders are classified as
pedestrians, as are users of wheelchairs and similar mobility chairs.
Vehicle type - The type of vehicle
the injured person was traveling in when the collision occurred. CMOD categories
- passenger car (includes minivans and SUVs)
- motorcycle (includes motorized scooters and mopeds)
- pick-up/panel truck
- truck/truck tractor: a truck with two or more axles, or truck tractor, operated singly or with one or more semi-trailers or trailers
- bus (includes school bus)
- all other vehicles (includes emergency vehicles, highway construction and other vehicles)
- not stated
Type of collision - The
general type of collision which was the first event.
Primary collision factor - The one circumstance or driving action
which, in the officer’s opinion, best describes the primary or main cause of the
Safety equipment use - For vehicle occupants this refers
to use of safety restraints such as seat belts and child passenger safety
seats. For motorcycle and bicycle riders, safety equipment refers
to helmet use.
Seat position - Indicates whether the vehicle occupant was in a front
versus any rear seat.
Region - County of collision is
grouped into one of seven regions of the state developed by the UCLA Center for
Health Policy Research. Northern and Sierra Counties: Butte,
Shasta, Humboldt, Del Norte, Siskiyou, Lassen, Modoc, Trinity, Mendocino, Lake,
Tehama, Glenn, Colusa, Sutter, Yuba, Nevada, Plumas, Sierra, Tuolumne,
Calaveras, Amador, Inyo, Mariposa, Mono, and Alpine; Greater Bay Area:
Santa Clara, Alameda, Contra Costa, San Francisco, San Mateo, Sonoma, Solano,
Marin, and Napa; Sacramento Area: Sacramento, Placer, Yolo, and El
Dorado; San Joaquin Valley: Fresno, Kern, San Joaquin, Stanislaus,
Tulare, Merced, Kings, and Madera; Central Coast: Ventura, Santa Barbara,
Santa Cruz, San Luis Obispo, Monterey, and San Benito; Los Angeles County:
Los Angeles; Other Southern California:
Orange, San Diego, San Bernardino, Riverside, and Imperial.
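For users who want to reproduce the regional grouping in their own analyses, the county lists above can be expressed as a simple lookup. The Python sketch below copies the county names from the text. The variable and function names are hypothetical.

```python
# County-to-region grouping as listed in the Region description above.
REGIONS = {
    "Northern and Sierra Counties": [
        "Butte", "Shasta", "Humboldt", "Del Norte", "Siskiyou", "Lassen",
        "Modoc", "Trinity", "Mendocino", "Lake", "Tehama", "Glenn", "Colusa",
        "Sutter", "Yuba", "Nevada", "Plumas", "Sierra", "Tuolumne",
        "Calaveras", "Amador", "Inyo", "Mariposa", "Mono", "Alpine",
    ],
    "Greater Bay Area": [
        "Santa Clara", "Alameda", "Contra Costa", "San Francisco",
        "San Mateo", "Sonoma", "Solano", "Marin", "Napa",
    ],
    "Sacramento Area": ["Sacramento", "Placer", "Yolo", "El Dorado"],
    "San Joaquin Valley": [
        "Fresno", "Kern", "San Joaquin", "Stanislaus", "Tulare", "Merced",
        "Kings", "Madera",
    ],
    "Central Coast": [
        "Ventura", "Santa Barbara", "Santa Cruz", "San Luis Obispo",
        "Monterey", "San Benito",
    ],
    "Los Angeles County": ["Los Angeles"],
    "Other Southern California": [
        "Orange", "San Diego", "San Bernardino", "Riverside", "Imperial",
    ],
}

# Invert the grouping so each county name points to its region.
COUNTY_TO_REGION = {
    county: region for region, counties in REGIONS.items() for county in counties
}


def region_for_county(county):
    """Look up the region for a county name (raises KeyError if unknown)."""
    return COUNTY_TO_REGION[county.strip().title()]


print(region_for_county("Alameda"))      # Greater Bay Area
print(region_for_county("los angeles"))  # Los Angeles County
```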
collision - Traffic collision where any
driver, pedestrian, or bicyclist involved in the crash had been
Drug involved collision -
Traffic collision where any driver,
pedestrian, or bicyclist involved in the crash was under the influence of one or
Primary diagnosis (available in two formats): The primary (or principal) diagnosis is the chief reason the patient was
admitted to the hospital or treated in the emergency department.
The primary diagnosis may be the patient's most serious problem, but
sometimes it is not.
Nature of injury - The type of
injury, such as burn, fracture, or open wound.
Body part injured -
The general region of the body injured, such as lower
extremity, torso, or vertebral column.
Disposition on discharge – Where the patient is
sent upon discharge. Common dispositions are released to home and
transferred to another facility.
Length of stay – The number of days
an in-patient stayed in the hospital. There are five categories, ranging from
same day/overnight to more than one week in the hospital. Length
of stay does not apply to patients treated in the emergency
Expected source of payment -
The expected source of payment is the type of payer that is
expected to pay or did pay the greatest portion of the bill for the hospital
stay. Examples are private insurance, Medicare, and
If you have any further questions, please visit California’s Crash
Medical Outcomes Data website or
contact Steve Wirtz at (916) 552-9831.
Alcohol and Other Drug Consequences (AOD) Data
the ICD-10 and ICD-9-CM codes used in the Alcohol and Other Drugs (AOD) query on
EpiCenter for deaths and hospital and ED data, respectively. Health consequences
include AOD poisoning (overdoses), mental disorders, and physical diseases 100%
attributable to AOD. Multiple cause of death diagnoses are used to capture drug
overdoses by screening the underlying cause of death for drug poisoning codes
and then scanning the multiple causes of death for the T codes of interest. For
deaths, specific substance groupings are available for alcohol and drug
poisonings only. The underlying cause of death is used to identify mental
disorders and physical diseases for deaths. For hospital and ED data, grouping
by several specific substances is available
for poisonings, mental disorders, and physical diseases.
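The two-step screening of deaths described above (underlying cause first, then the multiple-cause T codes) might look roughly like the Python sketch below. The code lists are assumptions: the underlying-cause codes follow the commonly used NCHS drug poisoning definition (X40-X44, X60-X64, X85, Y10-Y14), and the two T codes are only examples. Neither list is confirmed to be the exact set used in the EpiCenter AOD query.

```python
# Assumed underlying-cause codes for drug poisoning (NCHS-style definition);
# the EpiCenter query may use a different list.
DRUG_POISONING_UNDERLYING = (
    {f"X{n}" for n in range(40, 45)}
    | {f"X{n}" for n in range(60, 65)}
    | {"X85"}
    | {f"Y{n}" for n in range(10, 15)}
)

# Example T codes of interest (illustrative subset only).
T_CODES_OF_INTEREST = {"T401": "heroin", "T405": "cocaine"}


def normalize(code):
    """Strip the decimal point and whitespace so 'T40.1' and 'T401' compare equal."""
    return code.replace(".", "").strip().upper()


def is_drug_poisoning_death(underlying_cause):
    """Step 1: screen the underlying cause of death for a drug poisoning code."""
    return normalize(underlying_cause)[:3] in DRUG_POISONING_UNDERLYING


def substances_involved(underlying_cause, multiple_causes):
    """Step 2: scan the multiple causes of death for the T codes of interest."""
    if not is_drug_poisoning_death(underlying_cause):
        return set()
    found = set()
    for code in multiple_causes:
        label = T_CODES_OF_INTEREST.get(normalize(code)[:4])
        if label:
            found.add(label)
    return found


# Made-up record: accidental drug poisoning with two substances listed.
print(sorted(substances_involved("X42", ["T40.1", "T40.5", "F11.2"])))
# ['cocaine', 'heroin']
```

A death whose underlying cause is not a drug poisoning code returns an empty set even if a T code appears among its multiple causes, which mirrors the screening order described in the text.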
Population data are from the California Department of Finance (DOF)'s Demographic Research Unit. The data files
used are the “Estimates of Race/Ethnic Population with Age and Gender Detail”
data sets for 1990-1999 and 2000-2010, available on the DOF website.
Demographic variables included on EpiCenter are: County, Age, Sex, and
Race/Ethnicity. The data are used on EpiCenter to generate rates in some of our
queries, and they also have their own query, if you would like further detail on
California's demographics. Note that the race/ethnicity categories in the
population data differ from the categories used in our fatal and nonfatal injury
data sets. Population data include "Unknown/Other" race/ethnicity in with
"White" and also include a "Multirace" category. These categories are
included when you run the Population Data query. However, when rates are
generated for our injury queries, the race/ethnicity categories displayed will
include "White/Unknown/Other" but will not include a "Multirace" category, to make
the injury data categories as comparable as possible to the population data
Also, the California Department of Finance (DOF) did not start collecting and
reporting data on "Multirace" until 2000. Therefore, the population numbers for
specific races/ethnicities will differ substantially for years prior to
2000, compared to 2000 and later, since the population that was "multi-race"
would have been categorized as one of the other races/ethnicities prior to
2000. For this reason, use caution when comparing population numbers or
rates by race/ethnicity between years prior to 2000 and years from 2000 onward.
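As a rough illustration of how a population-based rate is derived from an injury count and a population estimate, consider the Python sketch below. The text does not state the rate base used on EpiCenter, so the per-100,000 convention here is an assumption, and the numbers are made up.

```python
def rate_per_100k(events, population):
    """Crude rate per 100,000 population (the base of 100,000 is an assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * 100_000 / population


# Made-up example: 250 injury events in a population of 1,250,000.
print(rate_per_100k(250, 1_250_000))  # 20.0
```

Because the denominator comes from the population estimates, any revision to those estimates changes the resulting rates even when the injury counts stay the same.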
The data included on EpiCenter incorporate updates based on the 2010 U.S.
Census. This change was made in November 2012, so if you looked at
the population numbers or used them to develop rates before November 2012, your
numbers/rates may differ from those based on the updated numbers. If
you need more information about the population data, please contact the California Department of Finance's Demographic Research Unit.