Dublin Core is a set of metadata elements that provide a fundamental group of text elements through which most resources can be described and catalogued. Dublin Core metadata records are intended for cross-domain information resource description and have become a standard in the fields of library science and computer science.
The Dublin Core Metadata Initiative (DCMI) was incorporated as an independent entity, providing an open forum for developing interoperable metadata standards encompassing a broad range of purposes and business models.
The Dublin Core standard includes two levels: Simple and Qualified. Simple Dublin Core comprises 15 elements, while Qualified Dublin Core includes three additional elements and qualifiers that refine the semantics of the elements so they are useful in resource discovery. This blog post focuses on the Simple Dublin Core Metadata Element Set in conjunction with a related link element.
Utilizing Dublin Core meta elements allows a document to be indexed using several distinct, searchable fields. The document can then be searched by those fields, so it can be located more precisely than documents not using Dublin Core, which increases findability, searchability and locatability.
Simple Dublin Core Metadata Element Set
The Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata elements. Each entry below gives the element's title, an example of its markup, and its description.
Each Dublin Core element is optional and may be repeated. The DCMI has established standard ways to refine elements and encourages the use of encoding and vocabulary schemes. There is no prescribed order in Dublin Core for presenting or using the elements.
Contributor
<meta name="DC.Contributor" content="publisher-name" />
An entity responsible for making contributions to the resource. Examples include a person, organization or service. Typically, the name of a Contributor should be used to indicate the entity.
Coverage
<meta name="DC.Coverage" content="World" />
Spatial or temporal topic of the resource, spatial applicability of the resource or the jurisdiction under which the resource is relevant. Spatial topic and spatial applicability may be a named place or a location specified by geographic coordinates. Temporal topic may be a named period, date or date range. Jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Best practice here is to use a controlled vocabulary such as the Thesaurus of Geographic Names (TGN). Where appropriate, named places or time periods can be used in preference to numeric identifiers (such as sets of coordinates or date ranges).
Creator
<meta name="DC.Creator" content="Administrator" />
An entity primarily responsible for making the resource, like a person, organization or service. Typically, the name of a Creator should be used to indicate the entity.
Date
<meta name="DC.Date" scheme="W3CDTF" content="2004-01-01" />
Point or period of time associated with an event in the lifecycle of a resource. May be used to express temporal information at any level of granularity. Best practice is to use an encoding scheme for Date.
Description
<meta name="DC.Description" lang="en" content="publisher-name" />
Account of the resource; may include (but is not limited to) an abstract, a table of contents, a graphical representation, or a free-text account of the resource.
Format
<meta name="DC.Format" scheme="IMT" content="text/html" />
File format, physical medium or dimensions of the resource. Examples of dimensions include size and duration. Best practice is to use a controlled vocabulary, as in this example using MIME types.
Identifier
<meta name="DC.Identifier" scheme="uri" content="http://catalog.loc.gov/67-26020" />
An unambiguous reference to the resource within a given context. Best practice is to identify said resource by means of a string conforming to a formal identification system.
Language
<meta name="DC.Language" content="en" />
Language of the resource. Best practice is to use a controlled vocabulary.
Publisher
<meta name="DC.Publisher" content="publisher-name" />
Entity responsible for making the resource available, including a person, service or organization. Typically the name of a Publisher should be used to indicate the entity.
Relation
<meta name="DC.Relation" content="publisher-name" />
Related resource; best practice is to identify by means of a string conforming to a formal identification system.
Rights
<meta name="DC.Rights" content="http://example.org/legal-terms-of-use.html" />
Information about rights held in and over the resource, typically including various property rights (including intellectual property rights) associated with the resource.
Source
<meta name="DC.Source" content="/meta-tags/" />
Related resource from which described resource is derived; may be derived from the related resource in whole or in part. Best practice is to identify the related resource by means of a string conforming to a formal identification system.
Subject
<meta name="DC.Subject" lang="en" content="Heart Attack" />
Topic of the resource, typically represented by keywords, key phrases or classification codes. Best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element.
Title
<meta name="DC.Title" lang="en" content="Hamlet in Iceland; being the Icelandic romantic Ambales saga" />
The resource's given name, typically the name by which the resource is formally known.
Type
<meta name="DC.Type" scheme="DCMIType" content="Text" />
Nature or genre of the resource. Best practice is to use a controlled vocabulary; this example uses the DCMI Type Vocabulary. To describe the file format, physical medium or dimensions of the resource, use the Format element.
Note: as a general rule, element names may be mixed-case but should always have a lower-case first letter, commonly called camelCase.
Note: the value of the content attribute is defined to be CDATA.
Authors need to use the link element to reference the definitions comprised by the element schema.
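As a sketch of what that reference usually looks like: the DCMI's guidelines for HTML declare the DC. prefix with a link element whose rel value is schema.DC. The title and the two meta values below are placeholders made up for illustration.

```html
<head>
  <title>Example page</title>
  <!-- Declares that names prefixed with "DC." come from the Dublin Core element set -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <!-- Placeholder values, for illustration only -->
  <meta name="DC.Title" lang="en" content="Example page" />
  <meta name="DC.Creator" content="Example Author" />
</head>
```

The prefix in each meta name (DC.) has to match the suffix of the rel value (schema.DC), which is how a parser ties the names back to the schema.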
Authors can express Dublin Core metadata using simple meta and link elements. The DCMI guidelines describe how a Dublin Core metadata description set can be encoded in (X)HTML; that encoding is also an HTML meta data profile as defined by the HTML specification. Utilizing Dublin Core is an extremely easy and extensible way for authors to make their content more findable, searchable, and locatable, thus increasing their content’s overall accessibility.
The only reason I’ve ever been against an AJAX-driven site is that the content is not accessible to the bots, which seriously limits your SEO and findability. Google has come up with a solution to make AJAX-driven sites crawlable, and if you follow their easy guide, my reasoning against AJAX sites no longer holds. Below is my interpretation of the guide, but the guide itself is worth reading in its entirety.
Supporting the AJAX Crawlability Scheme
You need to let the bots know that you support the AJAX crawlability scheme, and you do this simply by adding a ! immediately after the # that begins the hash fragment. Adding the ! after the # in your URLs completes your site's adoption of the scheme, and your site is now considered AJAX crawlable.
Add _escaped_fragment_ Support to Your Server
You need to provide HTML snapshots of your URLs so the bots can see your content. Essentially, we want the bots to see the URL www.example.com/ajax.html?_escaped_fragment_=key=value instead of what users see, www.example.com/ajax.html#!key=value. We accomplish this using one of the following methods:
Recreate your content using a headless browser
Recreate your content by replacing your client-side JavaScript with server-side code
Enable Crawlability in Pages Without Hash Fragments
In order to make pages without hash fragments crawlable, simply add this meta element to the document’s head section. It tells the bots to crawl the ugly (_escaped_fragment_) version of this URL.
<meta name="fragment" content="!" />
Update Your Sitemap
Update your existing sitemap so that all of the crawlable URLs that you want indexed are listed there. See, I told y’all this was easy.
The meta element allows authors to attach additional information (representing metadata) to their documents. This information can be used by the user agent for rendering content, it can provide meta information for indexing, it can be used to simulate HTTP response headers, and it can also be used to reload the document. The additional information also represents various kinds of metadata that cannot be expressed using the title, base, link, style and script elements.
The meta element belongs in a document’s head and has no content; its attributes define name/value pairs associated with the document. The meta element's information is machine-parsable (read: machine tags), and all browsers expose meta information via the DOM. These name/value pairs can be used by the server to further define the document type to the user agent.
The meta element is a void element: an element whose content model never allows it to have contents under any circumstances. Exactly one of the name, http-equiv and charset attributes must be specified. If the name or http-equiv attribute is specified, then the content attribute must also be specified; otherwise, omit content.
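To make those attribute rules concrete, here is a minimal sketch of the three forms described above. The description text and the refresh interval are placeholder values chosen for the example.

```html
<head>
  <!-- charset on its own: no content attribute -->
  <meta charset="utf-8" />
  <title>Example</title>
  <!-- name is paired with content -->
  <meta name="description" content="A placeholder description of the page." />
  <!-- http-equiv is paired with content -->
  <meta http-equiv="refresh" content="300" />
</head>
```

Each element carries exactly one of charset, name or http-equiv, and content appears only alongside the latter two.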
HTML5, the WHATWG and the DCMI are all working to create a standard list of meta properties; however, none currently exists.
name="" Attribute
The name="" attribute is the name in the name/value pair and represents document-level metadata. It specifies what aspect of metadata is being set. If the content="" attribute is omitted, then the value of the name/value pair is an empty string. If the name="" attribute is not provided, the name of the name/value pair is taken from the http-equiv="" attribute.
The value of the name="" attribute must be a defined metadata name or a registered metadata name. The name and content IDL attributes must reflect the respective content attributes of the same name.
W3C Defined Metadata Names
application-name: the value of the content="" attribute must be a string representing the name of the Web App the page represents. If the document is not a Web App, don’t use it. A document cannot have more than one. User agents can use the application name in the UI instead of the title element, because the title could include status messages, etc. relevant to the status of the page at a particular time.
author: the value of the content="" attribute must be a string giving the name of the author(s) of the document.
description: the value of the content="" attribute must be a string describing the page. It must be appropriate for use in a search engine. A document cannot have more than one.
generator: the value of the content="" attribute must be a string identifying the software used to generate the document. Do not use this value on hand-authored pages (notepad denied!).
keywords: the value of the content="" attribute must be a set of comma-separated strings, where each string is a keyword relevant to the document.
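Put together, the defined names above might look something like this in a page's head. Every value here is invented for the example, including the hypothetical ExampleCMS generator; treat it as a sketch rather than a template.

```html
<head>
  <meta charset="utf-8" />
  <!-- The title carries a status message, so application-name gives the stable name -->
  <title>Inbox (3 unread)</title>
  <meta name="application-name" content="Example Mail" />
  <meta name="author" content="Example Author" />
  <meta name="description" content="A placeholder description suitable for a search result." />
  <!-- Only for pages produced by software; ExampleCMS is hypothetical -->
  <meta name="generator" content="ExampleCMS 1.0" />
  <meta name="keywords" content="mail, inbox, example" />
</head>
```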
W3C Other Defined Metadata Names
keyword: actual name being defined; should not be similar to any other defined name (like differing only in case).
brief description: short, non-normative description of what the metadata name’s meaning is, including the format the value is required to be in.
specification: link to a more detailed description of metadata name’s semantics and requirements.
synonyms: list of names that have exactly the same processing requirements. Authors should not use names defined to be synonyms; they’re only intended to allow user agents to support legacy content. Synonyms not used in practice can be removed; only names that need to be processed as synonyms for compatibility with legacy content are to be registered this way.
status: has three possible values: proposed, ratified, and discontinued
name="status" content="proposed": name has not received wide peer review and approval; name has been proposed and is or soon will be in use.
name="status" content="ratified": name has received wide peer review and approval. Its specification unambiguously defines how to handle pages using the name, including when it is used incorrectly.
name="status" content="discontinued": metadata name has received wide peer review and has been found wanting. It may be in use on existing pages, but new pages should not use it. The “brief description” and “specification” entries give details of what authors should use instead, if anything.
WHATWG Registered Metadata Names (technically none so far)
baiduspider – synonym for robots targeting Baidu only.
googlebot – synonym for robots targeting Googlebot only.
ia_archive – synonym for robots targeting Archive and Alexa only.
msnbot – synonym for robots targeting Bing only.
robots – comma-separated list of operators telling search engine bots how to treat content.
slurp – synonym for robots targeting Yahoo! only.
teoma – synonym for robots targeting Teoma and Ask.com only.
viewport – allows documents to specify the size, zoom factor and orientation of the viewport used as the base for the document’s initial containing block. The following properties can be used in the value of the content="" attribute: width, height, initial-scale, minimum-scale, maximum-scale, user-scalable.
audience – aids search engine classification and directory compilations by providing the audience most appropriate for the page. Values are case-insensitive and comma-separated (singular and plural values are equal).
bot- – represents all bots prefixed with bot-.
created – document creation datetime (ISO 8601 date); must follow the W3C ISO 8601 datetime profile with a granularity of “complete date” or finer.
creator – off-Web/pre-Web creator of a work for which an author authored a document, so the creator and author can be different people. One element represents one creator; multiple creators need to be represented by multiple creator elements.
datetime-coverage – value is a non-vague date or non-vague time (not a range) expressing which time frame is most relevant to the content.
datetime-coverage-end – identical to datetime-coverage except only representing the end. When used without datetime-coverage-start, it is interpreted as ending a range without a start.
datetime-coverage-start – identical to datetime-coverage except representing only the start; if used without datetime-coverage-end, it is interpreted as starting a range with no end.
datetime-coverage-vague – identical to datetime-coverage except that its value is vague; use when datetime-coverage, datetime-coverage-end and datetime-coverage-start are all inappropriate (example: Tuesdays).
DC – stands for Dublin Core, maintained by the DCMI, which reserves all strings that begin with the prefix DC.
dir-content-pointer – helps search engines organize results by identifying similar sections of pages in a directory with a standard vocabulary. Useful when using different conventions for displaying or printing content. Recognized values are pointer types to which numbers may be suffixed: start, toc (table of contents), intro, abstract, main, bibliography, index, afterword, update, credit and author bio. Number suffixes tell the search engine/directory to arrange like items in numerical order within the results. Each directory and subdirectory has its own sequence.
expires – defines expiration date of the document which can be used in preparation for an upcoming event (when you have a pre-set date when the document will no longer be valid, like a sign-up form for an event). Search engines should remove this page from their main search results after the expiration date or by telling the user the result is out of date.
format-print – informs the operating system/printer driver of the preferred print medium (like paper size). Recognized values are letter, A4, legal, A5, B5, monarch, envelope 10 and envelope 6-3-4, as well as values with integers and decimals, like 8.5 x 11; paper (default color); and weight (usually 20lb. stock). You can specify a medium of a given color or of mixed colors by using white, yellow, pink, blue, green, violet or multicolor. Other recognized values are letterhead, p2 letterhead (letterhead for all pages minus the first page), watermark and plain (not preprinted/not letterhead).
geographic-coverage – specify geographic relevance of content.
keywords-not – negative keywords that distinguish closely-related themes from a document’s theme. Supports Boolean not searches.
page-datetime – tells search engines recency or relevance to an event date.
page-version – Versions regardless of date may show consecutiveness and can replace vague dates; a version number can be more useful in this case. Versions 0 and 0.n signify beta versions, where .n is number of places and 1 or higher implies final-release versions.
publisher – provides the document’s publisher, which often differs from creator/author when the publisher is an institution.
referrer – controls the sharing of referrer information with linked resources and followed links at the meta level. The content attribute accepts three values: always, default (include referrer information in a non-secure context, or for HTTPS resources with the same origin) and never. The never value overlaps with rel="noreferrer" on individual links, which provides more fine-grained control.
resolutions – specifies high-resolution versions of images that the browsers should use in place of lower-resolution default images if a high-resolution screen is detected to be in use.
rights – used to assert intellectual property rights in source code.
rights-standard – enables search engines to compile the types of rights allocated to the document; for example, a value can indicate that the page code is licensed under Creative Commons Attribution-ShareAlike.
subj-... – To classify a page’s content by subject, it helps to use a standard subject taxonomy that will be recognized by a search engine or directory. Because many such high-quality taxonomies exist, only a prefix is proposed. Over time, particular taxonomies, in print or online, may be recognized here and keywords assigned for each.
MSSmartTagsPreventParsing – an IE6 beta feature allowed the browser to add information that wasn’t supplied by the document; this prevents that.
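As a small illustration of one of the names above, here is a commonly seen viewport declaration. The particular property values are an assumption for the example, not a recommendation taken from the registry.

```html
<head>
  <title>Example</title>
  <!-- Match the viewport width to the device and start at a zoom factor of 1 -->
  <meta name="viewport" content="width=device-width, initial-scale=1" />
</head>
```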
meta Pragma Directives
The http-equiv attribute is an enumerated attribute (taking one of a finite set of keywords). If a meta element with an http-equiv attribute is present in the document, the user agent must run the algorithm appropriate for that state. The keywords and their states are listed below:
The Content Language state maps to the keyword content-language (http-equiv="content-language"): this pragma sets the pragma-set default language; until it is successfully processed, there is no pragma-set default language.
Note: this will trigger a warning in validators; use the lang attribute instead.
The Encoding Declaration state maps to the keyword content-type (http-equiv="content-type"): an alternative form of setting the charset="" attribute; it is a character encoding declaration. The Encoding Declaration state’s user agent requirements are entirely handled by the parsing section of the specification.
The Default Style state maps to the keyword default-style (http-equiv="default-style"): this pragma sets the name of the default alternative style sheet.
The Refresh state maps to the keyword refresh (http-equiv="refresh"): this pragma acts as a timed redirect.
The Cookie Setter state maps to the keyword set-cookie (http-equiv="set-cookie"): this pragma sets an HTTP cookie. It is non-conforming; you should use real HTTP headers instead.
Pragma Directive http-equiv="" Attribute
The http-equiv="" attribute is the name in the name/value pair; it tells the server to include said pair in the MIME document header that is passed to the browser before the document itself is sent. When the http-equiv="" attribute is used, the server adds these name/value pairs to the content header it sends to the browser.
In HTML5, the http-equiv="" attribute is restricted to these values: refresh, default-style and content-type.
http-equiv="refresh" Pragma Directive
An http-equiv="" attribute whose value is refresh represents a pragma directive specifying either a number of seconds after which to reload the current page, or a number of seconds after which to load a different page, along with the URL of the replacement page.
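Both forms of the refresh pragma can be sketched as follows. The intervals and the page4.html URL are placeholders, and a real page would use only one of the two elements.

```html
<!-- Form 1: reload the current page every 300 seconds -->
<meta http-equiv="refresh" content="300" />

<!-- Form 2: after 20 seconds, load a different page (placeholder URL) -->
<meta http-equiv="refresh" content="20; URL=page4.html" />
```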
http-equiv="default-style" Pragma Directive
A meta element whose http-equiv="" attribute has the value default-style represents a pragma directive that specifies the document’s preferred stylesheet. The content="" attribute gives the name of the preferred stylesheet, which must match either the title="" value of a link element whose href="" attribute references the location of that stylesheet, or the title="" value of a style element whose contents are a CSS stylesheet.
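A minimal sketch of the default-style pragma, assuming two hypothetical stylesheets (default.css and contrast.css) with made-up titles:

```html
<head>
  <title>Example</title>
  <!-- The content value must match the title of one of the stylesheets below -->
  <meta http-equiv="default-style" content="High contrast" />
  <link rel="stylesheet" href="default.css" title="Default" />
  <link rel="alternate stylesheet" href="contrast.css" title="High contrast" />
</head>
```

Here the pragma asks the browser to prefer the stylesheet titled "High contrast" even though it is marked up as an alternate.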
http-equiv="content-language" Pragma Directive
A meta element whose http-equiv="" attribute has the value content-language represents a pragma directive specifying a document-wide default language. This is obsolete; specify the language on the root element instead.
http-equiv="content-type" Pragma Directive
A meta element whose http-equiv="" attribute has the value content-type, and which has an accompanying content="" attribute and value, represents a character encoding declaration; such an element is said to be in the encoding declaration state.
In meta http-equiv="content-type" content="meta-charset string", the content value is a specially formatted string providing a character encoding name.
Non-standard meta http-equiv Values
Allow
Content-Encoding
Content-Language
Content-Length
Content-Type
Date
Expires
Last-Modified
Location
Set-Cookie
WWW-Authenticate
content Attribute
The content attribute is the value in the name/value pair. The value can be any valid string and should always be used in conjunction with a name or http-equiv attribute. It gives the value of the document metadata or pragma directive when the meta element is used for these purposes; that is, it contains the actual metadata defined by the name attribute.
charset Attribute
The charset attribute specifies the character encoding used by the document (aka the character encoding declaration). It has no effect on XML documents, and is only allowed there to facilitate migration to and from XHTML. The charset attribute represents a document’s character encoding declaration when an HTML document is serialized to string form. There must be no more than one meta element with a charset attribute per document.
The charset attribute represents a character encoding declaration, where charset="name" specifies a character encoding name from the IANA registry: the Name or Alias field labeled “preferred MIME name”, or, if none of the Alias fields are so labeled, the Name field.
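For comparison, here are the two ways of writing a character encoding declaration discussed above, using utf-8 as an example encoding name. A document should contain only one of them.

```html
<!-- Short form: the charset attribute -->
<meta charset="utf-8" />

<!-- Pragma form: a meta element in the encoding declaration state -->
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
```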
scheme Attribute
The scheme attribute defines the scheme that should be used to interpret the property’s value; the scheme should be defined within the profile specified by the profile attribute of the head element. It provides additional information about the meta information and is used in conjunction with the profile attribute of the head element (which provides a pointer to instructions about the contents of the profile that are directly related to meta name attribute values). The value of the profile attribute is a URI specifying the location of the profile information on the server.
The scheme attribute is obsolete; use only one scheme per field, or make the scheme declaration part of the value.
The meta name="robots" element provides instructions to all search engines.
The meta name="robots" element can be used to tell robots not to index the content of a page, not to follow the links in a page, or both.
Robots can and will ignore the meta name="robots" element, so do not expect it to stop malware bots; the element is for standards-compliant, well-behaved bots.
Multiple values are allowed in a single meta name="robots" element. If you are not keeping bots away from your content, don’t worry about the meta name="robots" element; the default (not using one) is content="index, follow".
<meta name="robots" content="noindex,nofollow" />
Tells the bots not to index the page’s contents nor follow its links.
<meta name="robots" content="noindex,nofollow" />
<meta name="robots" content="noodp" />
content="noodp" stands for No Open Directory Project. Google sometimes uses descriptions from the Open Directory Project to generate the title and description snippets seen in SERPs; this value tells all search engines not to use anything from the Open Directory Project.
To prevent all search engines (that support the meta tag) from using this information for the page’s description, use the following:
<meta name="robots" content="noodp" />
<meta name="robots" content="noindex" />
content="noindex" tells search engines not to index the page.
<meta name="robots" content="noindex" />
<meta name="robots" content="nofollow" />
content="nofollow" tells search engines to not follow any links on the page.
<meta name="robots" content="nofollow" />
<meta name="robots" content="noarchive" />
content="noarchive" tells search engines to not provide a cached copy of page in search results.
<meta name="robots" content="noarchive" />
<meta name="robots" content="nosnippet" />
content="nosnippet" tells search engines not to include a description of the page in search results, and also prevents caching of the page.
<meta name="robots" content="nosnippet" />
<meta name="robots" content="none" />
content="none" is the same as using content="noindex, nofollow". Note: don’t mistake content="none" for an indicator of no robots restrictions. It will block all search engines from your content.
<meta name="robots" content="none" />
<meta name="robots" content="noimageindex" />
<meta name="robots" content="noimageindex" /> tells search engines that you do not want your page to appear as the referring page for images in Google Images search results. Note: images on the page may still be included in the image index if they are linked to by other pages.
<meta name="robots" content="noimageindex" />
meta name="googlebot"
The meta name="googlebot" element provides instructions to Googlebot only.
Googlebot by default indexes pages and follows links, so you don’t need to add content="index" or content="follow". What about other search engines/bots?
If you use conflicting values, Googlebot will follow the most restrictive.
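As a hedged illustration of that rule, suppose a page carries both of the following elements (the combination is made up for the example). Since noindex is more restrictive than index, Googlebot would be expected to leave the page out of its index.

```html
<!-- General instruction for all bots -->
<meta name="robots" content="index, follow" />
<!-- Conflicting, more restrictive instruction aimed at Googlebot -->
<meta name="googlebot" content="noindex" />
```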
<meta name="googlebot" content="noodp" />
This tells Google specifically not to use anything from the Open Directory Project.
To specifically prevent Googlebot from using this information for a page’s description, use the following:
<meta name="googlebot" content="noodp" />
If you use the robots meta element for other directives, you can combine those in a single content value.
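A combined element could look like the sketch below; pairing noodp with nofollow is just an example choice of directives.

```html
<!-- Two directives in one comma-separated content value -->
<meta name="googlebot" content="noodp, nofollow" />
```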
<meta name="googlebot" content="noimageindex" /> tells Googlebot that you do not want your page to appear as the referring page for images in Google Images search results. Note: images on the page may still be included in the image index if they are linked to by other pages.
<meta name="googlebot" content="noimageindex" />
<meta name="msnbot" content="noindex,nofollow" />
Tells MSNbot not to index your page content and not to follow your links.
<meta name="msnbot" content="noindex,nofollow" />
<meta name="robots" content="noydir" />
Tells all bots not to use a Yahoo! Directory title or Yahoo! Directory description.
<meta name="robots" content="noydir" />
<meta name="slurp" content="noydir" />
Specifically tells Yahoo!’s bot (name="slurp") not to use a Yahoo! Directory title or Yahoo! Directory description.
And that’s all! The address bar is hidden until the user swipes down near the top bar of the application. With the address bar hidden, your web app can look just like a native app!
If you want to modify your web page for different densities by using the -webkit-device-pixel-ratio CSS media query and/or the window.devicePixelRatio DOM property, then you should set the target-densitydpi property in the content attribute of the meta name="viewport" element to device-dpi. This stops Android from performing scaling in your web page and allows you to make the necessary adjustments for each density via CSS and JavaScript.
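In markup, that setting would look roughly like this. It is a sketch of the Android-specific viewport property described above, and other viewport properties could be added to the same content value.

```html
<!-- Ask Android not to scale the page for the screen's density -->
<meta name="viewport" content="target-densitydpi=device-dpi" />
```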
Amongst IE8’s stellar accomplishments (nice logo) comes the proprietary header X-UA-Compatible, which grants magical control over the rendering mode the browser uses.
Don’t use it! For reals! But if you find yourself in that fun situation that you have no control over, set the header on your server instead of in a meta element so you do not get an HTML5 validation error.