Deuri is the URL scheme of the Open Data Portal [[!ODP]] of the European Union. Once minted, deuris must exist forever, though redirection techniques could be used.

http://data.europa.eu/eurovoc/4532

This document is a draft specification of the Persistent URI Task Force of the European institutions. Send comments to the editor.

Syntax

Components

The Deuri components are:

http://data.europa.eu/foo/bar
\___________________/\___/\_/
         |             |   |
       head            | local
                   collection

The following strings are placeholder names (metasyntactic variables): foo and bar.

Head is the scheme and the authority. It is the fixed string http://data.europa.eu. Any part of this document stating http must be read as stating http and https.

Collection is the first path segment. A collection is allocated by the Collection Registrar to a Local Minter. Synonym: collection id.

Local is the path excluding the collection. A local can be composed of several path segments. Locals are minted by Local Minters. Synonym: local id.
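
The decomposition above can be illustrated with a minimal Python sketch. The function name split_deuri and the constant HEAD_AUTHORITY are hypothetical and not part of this specification; the sketch assumes that both http and https are accepted, as stated for the head.

```python
from urllib.parse import urlsplit

HEAD_AUTHORITY = "data.europa.eu"  # authority of the fixed head (hypothetical constant name)


def split_deuri(uri):
    """Split a Deuri into (head, collection, local)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    authority = parts.netloc.lower()
    if scheme not in ("http", "https") or authority != HEAD_AUTHORITY:
        raise ValueError("not a Deuri: " + uri)
    head = scheme + "://" + authority
    # The first path segment is the collection; the rest of the path is the local.
    collection, _, local = parts.path.lstrip("/").partition("/")
    if not collection:
        raise ValueError("missing collection: " + uri)
    return head, collection, local
```

For http://data.europa.eu/eurovoc/4532 this yields the head http://data.europa.eu, the collection eurovoc, and the local 4532. A collection home such as http://data.europa.eu/foo yields an empty local.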

Characters

It must conform to the following:

Clients can send deuris in any case combination, but the returned deuris must be in lower case. Note that in a URI the scheme and authority are case-insensitive and the path is case-sensitive.
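
As an illustration of the case rule, a server could derive the returned form by lower-casing the received deuri. This is only a sketch: the function names are hypothetical, and how a server reacts to a non-canonical request is not specified in this document.

```python
def returned_form(uri):
    """Lower-case a received deuri, giving the form that must be returned."""
    return uri.lower()


def is_returned_form(uri):
    """True if the deuri is already entirely in lower case."""
    return uri == returned_form(uri)
```

With this sketch, HTTP://Data.Europa.EU/Foo/Bar is accepted from a client and maps to http://data.europa.eu/foo/bar.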

Variants

The format and language variants can be indicated in the URI using dot extensions. These and other variants can also be requested using content negotiation.

http://data.europa.eu/foo/bar                 # Resource - variant as per the negotiation
http://data.europa.eu/foo/bar.de.html         # German, HTML
http://data.europa.eu/foo/bar.de.pdf          # German, PDF
http://data.europa.eu/foo/bar.en.pdf          # English, PDF
http://data.europa.eu/foo/bar.de              # German - format variant as per the negotiation
http://data.europa.eu/foo/bar.html            # HTML - language variant as per the negotiation
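
The dot extensions in the examples above could be separated from the last path segment as in the following sketch. It rests on assumptions that are not part of this specification: KNOWN_FORMATS is a hypothetical stand-in for the format tags, a language extension is taken to be a two-letter code, and the language extension is taken to precede the format extension, as in the examples.

```python
import re

KNOWN_FORMATS = {"html", "pdf"}          # assumption: placeholder for the real format tags
LANGUAGE_RE = re.compile(r"^[a-z]{2}$")  # assumption: two-letter language code


def split_variants(last_segment):
    """Return (name, language, format) for the last path segment."""
    name, *extensions = last_segment.split(".")
    language = fmt = None
    # A format extension, if present, comes last.
    if extensions and extensions[-1] in KNOWN_FORMATS:
        fmt = extensions.pop()
    # A language extension, if present, comes before the format extension.
    if extensions and LANGUAGE_RE.match(extensions[-1]):
        language = extensions.pop()
    # Any remaining dot-separated parts are treated as part of the name.
    if extensions:
        name = ".".join([name] + extensions)
    return name, language, fmt
```

A variant left as None in the result is one that remains to be selected by content negotiation. Since the dot is also a legal character inside a name, a real implementation would need a rule for names whose last part looks like a language or format extension.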

Collection

Collection naming

Collection naming should follow current practices; for example, a collection for EuroVoc should be named eurovoc. For new collections, the Collection Registrar will allocate by default an opaque collection name; Local Minters could request a mnemonic collection name.

Reserved collection

Reserved collections are those reserved for the functioning of the Open Data Portal.

Collection home

Collection home is the resource associated with a collection.

http://data.europa.eu/foo                     # this returns the collection home for the collection "foo"

Collection home format is the format for collection home. This format must be human and machine friendly: one format covering both functionalities. It should be available in several language variants.

Collection list

Collection list is the list of collections in the Open Data Portal.

Collection list URI is the URI that returns the collection list. Corollary: some collection identities could be reserved for the functioning of the Open Data Portal.

http://data.europa.eu/list

Local

The Local should be mnemonic, compact, and as non-hierarchical as possible. Future versions of this document should further specify the Local.

http://data.europa.eu/foo/34                  # Publication "34"         - non-hierarchical - one segment local
http://data.europa.eu/foo/2013/34             # Publication "34" of 2013 - hierarchical     - two segments local
http://data.europa.eu/foo/2014/34             # Publication "34" of 2014 - hierarchical     - two segments local
http://data.europa.eu/foo/14/34               # Publication "34" of 2014 - hierarchical     - two segments local
http://data.europa.eu/foo/14/m5               # Publication "m5" of 2014 - hierarchical     - two segments local
http://data.europa.eu/foo/14m5                # Publication "m5" of 2014 - non-hierarchical - one segment local

The number of local segment(s) is the number of path segment(s) minus one.
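
The counting rule can be sketched as follows. The function name local_segment_count is hypothetical, and empty segments (for example from a trailing slash) are ignored as a simplifying assumption.

```python
from urllib.parse import urlsplit


def local_segment_count(uri):
    """Number of local segments: path segments minus the collection segment."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    return max(len(segments) - 1, 0)
```

Under this sketch, http://data.europa.eu/foo/2013/34 has two local segments and http://data.europa.eu/foo/14m5 has one.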

Governance

Collection Registrar is the entity with the mandate to allocate collections to Local Minters. There is only one Collection Registrar.

Local Minter is an entity with the mandate to mint locals for its collection. Corollary: minting a local means minting a URI. There are many Local Minters.

Committees and similar entities:

Future directions

The following should be addressed:

http://data.europa.eu/foo/bar             # get the resource "bar"
http://data.europa.eu/foo/bar?            # get the metadata of the resource "bar"
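
Distinguishing the two requests above needs some care, because common URI parsers report an empty query string both for a URI ending in "?" and for a URI with no query at all (Python's urllib.parse.urlsplit behaves this way). The following sketch therefore inspects the raw string; the function name wants_metadata is hypothetical, and the behaviour is only the proposal illustrated above, not a settled rule.

```python
def wants_metadata(uri):
    """True if the URI carries an empty query: a "?" followed by no data."""
    # Ignore any fragment, which is not part of the query.
    without_fragment = uri.split("#", 1)[0]
    # The query starts at the first "?", so it is empty only when that
    # "?" is also the last character before the fragment.
    return without_fragment.endswith("?") and without_fragment.count("?") == 1
```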

Terminology

New terms

DEURI
Abbreviation of Data Europa Uniform Resource Identifier. A noun that follows the appropriate language morphology. As a proper noun it must be written as Deuri, as a common noun as deuri. For example, Deuri, Deuris, deuri, deuris.
Format tag
is a string from File types NAL [[!FTYPE]]. This is a restriction of the term from [[!COMURI]], as the format tags are only from [[!FTYPE]].

Terms from URI

[[!RFC3986]]

Uniform Resource Identifier (URI)
A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource.
Uniform Resource Locator (URL)
the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").
Syntax Components
The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.
  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
   |   _____________________|__
  / \ /                        \
  urn:example:animal:ferret:nose
Resource
This specification does not limit the scope of what might be a resource; rather, the term "resource" is used in a general sense for whatever might be identified by a URI. Familiar examples include an electronic document, an image, a source of information with a consistent purpose (e.g., "today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other resources. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, and bound books in a library can also be resources. Likewise, abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).
Scheme
Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme.
Authority
The authority component is preceded by a double slash ("//") and is terminated by the next slash ("/"), question mark ("?"), or number sign ("#") character, or by the end of the URI.
Path
The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component, serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") or number sign ("#") character, or by the end of the URI.
Path segment
A path consists of a sequence of path segments separated by a slash ("/") character.
Query
The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The query component is indicated by the first question mark ("?") character and terminated by a number sign ("#") character or by the end of the URI.
Fragment
The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. A fragment identifier component is indicated by the presence of a number sign ("#") character and terminated by the end of the URI.

Terms from HTTP

[[!RFC2616]]

resource
A network data object or service that can be identified by a URI. Resources may be available in multiple representations (e.g. multiple languages, data formats, size, resolutions) or vary in other ways.
content negotiation
The mechanism for selecting the appropriate representation when servicing a request, as described in section 12. The representation of entities in any response can be negotiated (including error responses).
variant
A resource may have one, or more than one, representation(s) associated with it at any given instant. Each of these representations is termed a 'variant'. Use of the term 'variant' does not necessarily imply that the resource is subject to content negotiation.
variant list
A list containing variant descriptions, which can be bound to a transparently negotiable resource.
variant description
A machine-readable description of a variant resource, usually found in a variant list. A variant description contains the variant resource URI and various attributes which describe properties of the variant.
variant resource
A resource from which a variant of a negotiable resource can be retrieved with a normal HTTP/1.x GET request, i.e. a GET request which does not use transparent content negotiation.
list response
A list response returns the variant list of the negotiable resource, but no variant data. It can be generated when the server does not want to, or is not allowed to, return a particular best variant for the request.

Terms from COMURI

[[!COMURI]]

Reduced URI character set
are the characters "0-9" "a-z" "-_."
Base 36 character set
are the characters "0-9" "a-z"
Visual separator character set
are the characters "-_"
Dot separator character
is the character "."
Unnecessary trailing strings
are strings such as /, php, jsp, asp or cgi.
Language code in two characters
is a code from [[!ISO639-1]]. This subset is included in BCP47.
Dot extensions
are the end strings in the last path segment separated by the dot separator character.
Language extension
is the dot extension that indicates the language with language tag or language code in two characters.
Format extension
is the dot extension that indicates the format with format tag.
Empty query
is a query consisting only of the character "?"; i.e., without data. Its function is to obtain the URI metadata.
URI metadata
See the section URI metadata.
Mnemonic
See the section Mnemonic.
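
The character sets defined above can be checked with regular expressions, as in this sketch. The names are hypothetical; the sketch only tests membership in a set and does not decide which Deuri components must use which set.

```python
import re

REDUCED_URI_RE = re.compile(r"[0-9a-z\-_.]+")  # "0-9" "a-z" "-_."
BASE36_RE = re.compile(r"[0-9a-z]+")           # "0-9" "a-z"


def uses_reduced_uri_set(text):
    """True if text is non-empty and uses only the reduced URI character set."""
    return REDUCED_URI_RE.fullmatch(text) is not None


def uses_base36_set(text):
    """True if text is non-empty and uses only the base 36 character set."""
    return BASE36_RE.fullmatch(text) is not None
```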