Well-structured metadata not only supports the long-term discovery and preservation of research data, but also allows research data from tens, hundreds, or thousands of researchers to be aggregated and searched simultaneously. This is why domain-specific repositories typically require highly structured metadata with your data submissions: it enables highly granular searches across their aggregated content, which in turn makes your data easier to find.
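The kind of field-level search that structured metadata makes possible can be sketched in a few lines of Python. This is only an illustration: the records, the field names (`creator`, `subject`, `year`, `format`) and the `search` helper are hypothetical and do not follow any particular repository's schema.

```python
# Hypothetical metadata records aggregated from several researchers.
# Field names are illustrative only, not taken from a formal standard.
records = [
    {"creator": "Researcher A", "subject": "soil moisture", "year": 2015, "format": "CSV"},
    {"creator": "Researcher B", "subject": "soil moisture", "year": 2017, "format": "NetCDF"},
    {"creator": "Researcher C", "subject": "bird counts", "year": 2015, "format": "CSV"},
]


def search(records, **criteria):
    """Return the records whose fields match every given criterion exactly."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# A granular query: one subject, one year, across all contributors.
matches = search(records, subject="soil moisture", year=2015)
```

With free-text descriptions alone, the same query would depend on every researcher having phrased the subject and date in the same way.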
Metadata can take many different forms, from free text to standardized, structured, machine-readable, extensible content. Specific disciplines, repositories or data centers may guide or even dictate the content and format of metadata, possibly using a formal standard. Because creating standardized metadata can be difficult and time-consuming, another consideration when selecting a standard is the availability of tools that help generate the metadata (e.g. Morpho for creating EML, Nesstar for DDI).
The Digital Curation Centre provides a catalog of common metadata standards, organized by discipline: http://www.dcc.ac.uk/resources/metadata-standards.
Some specific examples of metadata standards, both general and domain-specific, are:
- Dublin Core - domain agnostic, basic and widely used metadata standard
- DDI (Data Documentation Initiative) - common standard for social, behavioral and economic sciences, including survey data
- EML (Ecological Metadata Language) - specific for ecology disciplines
- ISO 19115 and FGDC-CSDGM (Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata) - for describing geospatial information
- MINSEQE (MINimum Information about a high-throughput SEQuencing Experiment) - genomics standard
- FITS (Flexible Image Transport System) - Astronomy digital file standard that includes structured, embedded metadata
- MIBBI - Minimum Information for Biological and Biomedical Investigations
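To give a sense of what structured, machine-readable metadata looks like in practice, the following minimal sketch builds a small Dublin Core record with Python's standard library. The element names (`title`, `creator`, `date`, `format`) and the namespace are part of the Dublin Core element set; the wrapper element `metadata`, the helper `build_dc_record` and all of the values are assumptions made for illustration. A real repository will have its own required elements and wrapper format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one Dublin Core child per field.

    The <metadata> wrapper is a hypothetical container chosen for this
    sketch; repositories usually define their own.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Illustrative values only; they do not describe a real dataset.
record = build_dc_record({
    "title": "Example soil moisture measurements",
    "creator": "Example Researcher",
    "date": "2015-06-01",
    "format": "text/csv",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Richer standards such as EML or DDI follow the same idea but define many more elements and stricter rules, which is why dedicated authoring tools are helpful.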