Data Standards and Inventory
The Educational Data Governance Program (EDGO) establishes and implements data standards and manages its data inventory in an effort to increase data quality, foster common understanding, and ensure data use best practices.
Data quality depends on having a common understanding of what data mean and represent. Standards for data element names, definitions and code sets are the foundation for establishing common understanding.
The California Department of Education (CDE) has established data standards for more than 858 data elements and 102 code sets. The CDE sometimes refers to these standards as Preferred Variations, because the approach we use to establish data standards starts with looking at the variations in how a data element or code set is named and defined. We convene stakeholders (e.g., data collectors, researchers, accountability) for a data subject area and strive to gain consensus. A Preferred Variation conveys an expectation that, in the future, if these data are collected, stored, and reported, they will be referred to by their preferred name and carry their preferred definition.
In any case, Preferred Variations are the CDE’s data standards.
A list of common (standard) data names, data definitions, and data code sets that the CDE has designated as preferred by various department-wide working groups can be found in the following documents:
Language Data Standards
The language option set the CDE uses across systems was derived from the International Organization for Standardization’s (ISO) 639-2 language codes. The language option set includes the majority of the languages spoken by students and teachers in the State of California, with an “other” option for languages not included in the set and an “unknown” option for instances when the language spoken is not available.
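The structure described above (a fixed set of languages plus “other” and “unknown” fallbacks) can be sketched in Python. This is only an illustration under stated assumptions: the four entries are a made-up subset keyed by ISO 639-2 codes, the function name `resolve_language` is hypothetical, and the CDE’s actual option set and codes may differ.

```python
# Hypothetical subset of a language option set, keyed by ISO 639-2 codes.
# The CDE's real option set is larger and may use different codes.
LANGUAGE_OPTIONS = {
    "eng": "English",
    "spa": "Spanish",
    "vie": "Vietnamese",
    "chi": "Chinese",
}

OTHER = "Other"      # language is known but not in the option set
UNKNOWN = "Unknown"  # language spoken is not available


def resolve_language(code):
    """Map a reported language code to an option in the set."""
    if code is None or not code.strip():
        return UNKNOWN
    return LANGUAGE_OPTIONS.get(code.strip().lower(), OTHER)
```

Keeping “other” and “unknown” separate preserves the distinction between a language that was reported but is not in the set and a language that was never reported.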
Other Language Resources
- U.S. Census Bureau Language Use
Details about language use and coding employed by the United States Census Bureau.
- U.S. Census Bureau 2011 Language Mapper
2007-2011 American Community Survey data tool to map languages spoken across the United States based on U.S. census responses.
Data governance professionals agree that an accurate inventory of the organization’s data resource is vital to effective data management. In 2004, the CDE launched the Data Resource Guide (DRG), an electronic catalog that contains an inventory of all of the Department’s educational data assets. The DRG helps the CDE to be transparent about what data are available within its data resource while helping the CDE protect personally identifiable information (PII) and sensitive data. Similar in concept to an electronic catalog of the information resources in a library, which does not contain the actual books, the DRG does not contain actual data; it is a guide to what data are in the CDE’s data resource. Although some of the CDE’s data systems contain PII or sensitive data, the DRG itself is publicly available. Its search function lets users select several criteria to narrow their search.
Below are resources on well-known data standards and guides that the CDE references in managing its data assets.
National Center for Education Statistics
The U.S. Department of Education’s National Center for Education Statistics (NCES) is committed to improving the quality and utility of education data. NCES provides the following resources for data standards:
Common Education Data Standards
The Common Education Data Standards (CEDS) are intended to facilitate a common understanding of data standards across the United States by providing alignment and mapping tools for educational data. The CEDS were developed and are maintained by various working groups of education stakeholders. Initially, the CEDS reflected standard names and definitions for data within the K-12 domain. Over the years, the CEDS scope has expanded to include common names and definitions for data in other domains (e.g., postsecondary education, early education, and workforce). The CEDS include powerful tools that organizations can use for a variety of data management initiatives, such as becoming more transparent about the data their organization collects and stores; examining how similar or different their organization’s data names and definitions are when compared with another organization’s names and definitions (also known as aligning); and clarifying what data go into producing a report and what rules went into deriving the information in that report. The CDE has shared a map of the California Longitudinal Pupil Achievement Data System (CALPADS) in CEDS.
- School Courses for the Exchange of Data
The School Courses for the Exchange of Data (SCED) offers a course coding structure that can accommodate diverse course offerings and curricula. The elements that make up a SCED code and the information included in course descriptions were designed to be specific enough to identify the course’s topic and to distinguish it from other courses without defining every aspect of a course, such as course objectives, methods of delivery, or prerequisites. The CDE’s code set ‘Course Group State’ is our state’s equivalent of SCED. When the NCES updates SCED, the CDE compares California’s ‘Course Group State’ code set to the revised SCED and engages internal stakeholders, as well as local educational agencies and schools, to determine whether the CDE should revise the ‘Course Group State’ code set.
- Education Data Standards and Codes - Free Publications
Free publications from the NCES, including a Guide to School Courses for the Exchange of Data (SCED) Classification System and Forum Guide to Metadata – The Meaning Behind Education Data.
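The comparison step described above for SCED (checking the ‘Course Group State’ code set against a revised SCED) amounts to finding the differences between two code sets. The Python sketch below illustrates that idea only; it is not the CDE’s actual process. The helper name `compare_code_sets` is hypothetical, and the codes and titles are made-up placeholders, not real SCED or CDE values.

```python
def compare_code_sets(state_codes, revised_codes):
    """Return codes added, removed, and redefined between two code sets.

    Both arguments map a code to its title or definition.
    """
    added = sorted(revised_codes.keys() - state_codes.keys())
    removed = sorted(state_codes.keys() - revised_codes.keys())
    changed = sorted(
        code
        for code in state_codes.keys() & revised_codes.keys()
        if state_codes[code] != revised_codes[code]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Placeholder data for illustration only.
course_group_state = {
    "A100": "Algebra I",
    "A200": "Geometry",
    "B100": "Biology",
}
revised_sced = {
    "A100": "Algebra I",
    "A200": "Geometry (Integrated)",
    "C100": "Computer Science Principles",
}

differences = compare_code_sets(course_group_state, revised_sced)
```

A report like `differences` gives stakeholders a concrete list to review: new codes to consider adopting, codes that no longer have a counterpart, and codes whose definitions have diverged.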
Access for Learning
Access for Learning (A4L) operates in a collaborative, global community with a focus on using educational data to positively impact learning. A4L offers a number of resources, including white papers on a range of topics such as Data Privacy, Security and Interoperability (PDF).
The Ed-Fi Alliance
The Ed-Fi Alliance has established the Ed-Fi Data Standard to allow multiple systems to share educational data.