Golden Retrievers

Finding things on the Web got you chasing your tail? Bone up on metadata: it gets searchers out of the doghouse.

Deborah Christensen -- School Library Journal, 11/1/1999

Deborah Christensen (deborah@ericir.syr.edu) is a trainer and cataloger for the GEM Project at Syracuse (NY) University.

It's Monday morning and a teacher's just told you her class is coming to the library in 30 minutes to work on history reports. She was hoping you could find some Web sites for them. You hop on the Internet and do a search for Civil War sites--and receive over 200,000 hits. Why is it so hard to find specific information on the Net? It's partly because the Net's not cataloged the way library resources are. But there's hope for the future. Metadata--which lets us "tag" online resources so search tools can find them--is a way to actually "catalog" the Net.

Metadata is a confusing concept for many librarians and media specialists--yet at its heart it's easy for library people to understand. The most common definition of metadata is that it is "data about data"--which still leaves people confused. Library staff should think of it as the "data" in a catalog record. Information included in a catalog record--the title, author, subject, and description of a book--is metadata: data about the book that is on the shelf.

For the last two years I've worked for the Gateway to Educational Materials (GEM) project--a "cataloged library" of sites. GEM uses metadata to make online educational resources easier to find by creating the equivalent of catalog records for them. A GEM record describes, manages, and organizes Internet education resources the same way a card catalog record describes, manages, and organizes library resources. Although metadata does not refer solely to Internet-based materials, my focus here will be that aspect of the term.

We embed metatags, mainly into the headers of HTML documents, for retrieval and for documentation. Using metadata for retrieval helps searchers when they submit queries to search engines. Without a metatag describing the content, most search engines use the first two lines of text on the page as the description. As you know, this method produces pretty mixed results.

As for documentation, metadata can provide information on intellectual property rights and the acceptable uses of a resource, prices and terms of payment for electronic commerce, as well as authentication information such as digital signatures.

Metadata can be simple or complex. A MARC record, for instance, is an example of complex metadata. MARC records generally include such information as the Library of Congress Control Number, the author's name, the title of the book, and a description of the work, as well as other information. Unfortunately, there are too many resources on the Internet and too few professionals to catalog them at this level of complexity.

Then there is simple metadata. Metatags can be added to the headers of HTML pages to provide more information about a resource. Many pages on the Internet already use two kinds of metatags, for description and keywords. Below are the description and keywords metatags for this article:

<HTML>
<HEAD>
<TITLE>Golden Retrievers</TITLE>
<META name="description" content="This article discusses metadata and the ways in which it can help organize the Internet.">
<META name="keywords" content="metadata, media specialists, Internet, libraries, cataloging">
</HEAD>

In the example above, we see both the name of the metatag and the content of the metatag. The first metatag contains the description of the article while the second lists keywords that could be used by search engines. This is the simplest form of metadata. Web site managers, ideally, should create their own metadata to make the Internet more organized; however, most don't because they don't see the need or benefit. In fact, a recent study in Nature magazine concluded that only 34.2 percent of server home pages include metadata.
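To make the name/content pairing concrete, here is a minimal Python sketch of how a program might read those two metatags out of the page header. It uses the standard library's html.parser module; the class name MetaTagParser and the variable names are my own, and real search engines certainly do more than this.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the <TITLE> text and the name/content pairs of <META> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag and attribute names in lowercase.
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# The header shown above, copied as a string.
PAGE = """<HTML>
<HEAD>
<TITLE>Golden Retrievers</TITLE>
<META name="description" content="This article discusses metadata and the ways in which it can help organize the Internet.">
<META name="keywords" content="metadata, media specialists, Internet, libraries, cataloging">
</HEAD>
"""

parser = MetaTagParser()
parser.feed(PAGE)
parser.close()

# The keywords metatag is a comma-separated list; split it into terms.
keywords = [k.strip() for k in parser.meta["keywords"].split(",")]

print(parser.title)
print(parser.meta["description"])
print(keywords)
```

A search tool that reads the description this way can show it in a hit list instead of falling back on the first lines of text on the page.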

Dublin Core

The next level of metadata is called the Dublin Core. The Dublin Core, created by a group of librarians and other scholars in 1995 (and so named because the original workshop was held in Dublin, OH), consists of 15 elements. These elements help describe electronic materials in a range of formats such as HTML documents, images, sound files, etc. The Dublin Core consists of the following elements:

Coverage: Location and time of the topic covered by the resource.

Creator: The person or organization primarily responsible for creating the intellectual content of the resource.

Date: The date the resource was made available in its present form.

Description: A textual description of the content of the resource, including abstracts in the case of document-like objects, such as an online newspaper article, or content descriptions in the case of visual resources.

Format: The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource.

Identifier: Data string or number used to identify the resource.

Language: Language(s) of the intellectual content of the resource.

Other Contributor: A person or organization not specified in the creator element who has made significant contributions to the resource (for example, editor, transcriber, or illustrator).

Publisher: The person or entity responsible for making the resource available in its present form, such as a publishing house, a university department, or a corporation.

Relation: The relationship of this resource to other resources. For example: preface or table of contents.

Resource Type: The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary.

Rights Management: A link to a copyright notice, to a rights-management statement, or to a service that would provide information about terms of access to the resource.

Source: A string or number used to uniquely identify the work from which this resource was derived, if applicable. For example: an ISBN or ISSN.

Subject: The topic of the resource.

Title: The name given to the resource by the creator or publisher.

More information can be found on the home page at purl.org/dc/.
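As a rough illustration of how a handful of these elements could be turned into metatags, here is a short Python sketch. It assumes the common convention of prefixing each element name with "DC." in the name attribute; the function name dublin_core_tags is hypothetical, and the record fills in only some of the 15 elements, using details of this article (the language code "en" is my own addition).

```python
from html import escape

# A partial Dublin Core description of this article (illustrative values).
record = {
    "Title": "Golden Retrievers",
    "Creator": "Deborah Christensen",
    "Subject": "metadata; cataloging; Internet",
    "Description": "This article discusses metadata and the ways in "
                   "which it can help organize the Internet.",
    "Publisher": "School Library Journal",
    "Date": "1999-11-01",
    "Language": "en",
}


def dublin_core_tags(elements, prefix="DC"):
    """Render each element as one <META> tag, escaping the content value."""
    lines = []
    for name, value in elements.items():
        content = escape(value, quote=True)
        lines.append('<META name="%s.%s" content="%s">' % (prefix, name, content))
    return "\n".join(lines)


tags = dublin_core_tags(record)
print(tags)
```

The printed lines could be pasted into the header of an HTML page alongside the description and keywords metatags.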

Leaving a scent for searchers

So, how does all of this affect you as a librarian? Currently most search engines look for all instances of a word without regard to context. Searchers are then forced to wade through hundreds or thousands of hits that may or may not be relevant. In his article "Developing a Card Catalog for the Expansive Web" (PC Week, Aug. 25, 1997, p. 34), Eamonn Sullivan wrote that "with most search engines, pages on Barney and the Smithsonian's dinosaur pages have equal weight." Metadata could help solve this problem if more documents contained metatags--and search engines used the tags to match queries. Fortunately, metatags are growing more important in search tools. According to Danny Sullivan's Search Engine Watch site (www.searchenginewatch.com), as of September 1 all engines but Google, Lycos, and Northern Light look for metatags when searching. Metatags can provide more context about a resource, which in turn would give search results more relevance. Go and Inktomi (the engine behind HotBot, MSN Search, and others) are two search engines that look for metatags when ranking pages in hit lists.
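The Barney problem can be shown with a toy example. The Python sketch below is not how any real search engine ranks pages; the scoring rule, the tag_weight parameter, and the two sample pages (with made-up addresses) are all assumptions, chosen only to show how a keywords metatag could break a tie between two pages that mention the same word.

```python
def score(page, query_terms, tag_weight=3):
    """Count query terms in the body text, plus a bonus for each term
    that also appears in the page's keywords metatag."""
    body_words = set(page["body"].lower().split())
    keywords = {k.strip().lower() for k in page["keywords"].split(",") if k.strip()}
    body_hits = sum(1 for term in query_terms if term in body_words)
    tag_hits = sum(1 for term in query_terms if term in keywords)
    return body_hits + tag_weight * tag_hits


# Two hypothetical pages that both contain the word "dinosaur".
pages = [
    {"url": "barney.example/fan-page",
     "body": "barney is a purple dinosaur who sings",
     "keywords": ""},
    {"url": "museum.example/dinosaurs",
     "body": "the dinosaur hall displays fossil skeletons",
     "keywords": "dinosaur, paleontology, fossil"},
]

query = ["dinosaur"]
ranked = sorted(pages, key=lambda p: score(p, query), reverse=True)
for page in ranked:
    print(score(page, query), page["url"])
```

With tag_weight set to 0 the two pages score the same, which is the "equal weight" situation described above; once the metatag counts, the museum page moves to the top.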

Another use of metadata is seen in PICS, or Platform for Internet Content Selection. The PICS standard enables Web content producers to "rate" their sites, indicating, for instance, the level of sexual and violent content. PICS itself does not apply the ratings and it does not specify particular filtering software. It merely provides the structure for placing ratings information within a resource. This ratings information would then be interpreted by filtering software.

How can you take advantage of metadata's benefits? One way would be to add metatags to any HTML documents you create for your school or library. Another would be to take advantage of search engines that use metatags in their indexing and ranking. Check Search Engine Watch to see which search engines use metatags for descriptions, keywords, or to boost the rankings of results. As metadata becomes more prevalent, more search engines will make use of the information these records contain.

The Gateway to Educational Materials Project

Unfortunately, it's time-consuming to add metadata to a resource by hand. We need to create more tools to help automate this task. That's where GEM comes in. The Gateway to Educational Materials Project was set up in response to President Clinton's mandate that federal agencies help support technology in all schools. GEM's goal is to use metadata to provide easy access to the thousands of educational resources available on the Internet. Funded by the U.S. Department of Education's National Library of Education, GEM is a special project of the ERIC Clearinghouse on Information & Technology.

How many of you have used a search engine to try to find educational materials such as lesson plans or activities? You may have had some luck but you were more likely presented with pages of irrelevant hits. When you looked at the results, were you able to tell from the descriptions if the resources would meet your needs? How much time do you think it took you to wade through these results? In order to solve these problems, GEM took the Dublin Core and added eight additional elements tied to education. We call this the "GEM element set," and it allows users to search by grade level, subject, keywords, and eventually, resource type and standards. The eight additional elements that GEM adds are:

Audience: The specific audience of the resource being described.

Cataloging Agency: Basic information about the agency that created the GEM catalog record.

Duration: The duration of the activity or lesson.

Essential Resources: Materials (such as graph paper or rulers) needed for the effective use of the resource by the teacher, student, or parent.

Grade Level: Grade, grade span, educational level, or age of the target audience.

Pedagogy: The student instructional groupings, teaching methods, assessment methods, and learning prerequisites of a resource.

Quality Indicators: Describes the overall quality of the resource.

Standards: State and/or national academic standards mapped to the entity being described.
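The kind of fielded search these elements make possible can be sketched in a few lines of Python. The three records, the field names, and the search function below are hypothetical and only loosely modeled on the Dublin Core and GEM element names; this is not the Gateway's actual software.

```python
# Hypothetical catalog records with subject, grade level, and keyword fields.
records = [
    {"title": "Civil War Letters Lesson", "subject": "history",
     "grade_levels": {7, 8}, "keywords": {"civil war", "primary sources"}},
    {"title": "Fractions with Graph Paper", "subject": "mathematics",
     "grade_levels": {3, 4}, "keywords": {"fractions"}},
    {"title": "Civil War Timeline Activity", "subject": "history",
     "grade_levels": {4, 5}, "keywords": {"civil war", "timelines"}},
]


def search(records, subject=None, grade=None, keyword=None):
    """Return titles of records matching every field that was supplied."""
    results = []
    for rec in records:
        if subject is not None and rec["subject"] != subject:
            continue
        if grade is not None and grade not in rec["grade_levels"]:
            continue
        if keyword is not None and keyword not in rec["keywords"]:
            continue
        results.append(rec["title"])
    return results


# A teacher looking for eighth-grade Civil War material.
hits = search(records, subject="history", grade=8, keyword="civil war")
print(hits)
```

Searching on the keyword alone returns both Civil War resources; adding the grade level narrows the list to the one meant for eighth graders, which a plain full-text search cannot do.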

GEM adds metatags that identify a resource's title, description, grade level, and so on. The resources we catalog are found on the sites of GEM Consortium Members--educational institutions affiliated with the project. Media specialists who would like to apply to have their resources added to the Gateway should send a message to geminfo@geminfo.org. Catalogers use GEMCat, a simple Java program created by GEM, to produce and embed metadata into resources. It helps if the person has some knowledge of cataloging, but it isn't necessary to have technical services experience to use GEMCat. The cataloger simply enters information from the resource into the element fields in GEMCat and then saves the metadata record, preferably within the header of the resource. We gather all the metadata records into what we call the Gateway site index. Users see the individual records when searching or browsing the site.

The Gateway, a free service for educators, contains lesson plans, activities, and other educational resources in many subject areas. To see how GEM uses metadata, you can use the Gateway at www.thegateway.org/simple1.html as you would any search tool. The GEM record that you see when you click on a title from the hits page includes a description of the resource, keywords, subject information, grade levels, and other information. A user only has to click on the title at the top of the record to go to the resource itself. Most of our records link to full-text resources freely available on the Internet. An example of GEM metadata can be found at www.geminfo.org/Workbench/training/INT0008.html#meta.

Just as GEM uses metadata to organize educational resources, other projects use metadata in similar ways. In January, the OCLC Cooperative Online Resource Catalog (CORC) became available, allowing libraries to create and improve guided access to Web resources. Over 100 libraries are currently engaged in a cooperative effort to select, catalog, and provide access to Web resources using MARC21, Dublin Core, and several automated tools. More than 0,000 librarian-selected records are in the database today. For more information, visit purl.oclc.org/corc.

Too many Internet users suffer from "information overload." Just as libraries are organized according to Dewey or LC subject headings, metadata will organize the Net, making Internet resources easier to find. There will undoubtedly be more projects that make use of metadata to help organize information on the Internet. For now I would suggest that librarians add metatags to their resources and take advantage of search engines and resources such as GEM that use metadata. One day we will have greater precision when we search; we'll be able, for example, to locate materials throughout the entire Web intended for education and targeted at specific grade levels. Metadata is a step in that direction.
