An Interagency Model for Collaboration and Operation

Background - Relationship to CENDI - Funding - Operations - Milestones

Slide 1: An Interagency Model for Collaboration and Operation
Background - Relationship to CENDI - Funding - Operations - Milestones

CENDI Meeting, Nov. 4, 2010

Sharon Jordan
Assistant Director
DOE Office of Scientific & Technical Information
(Operating Agent for

Slide 2: What Is

A Unique Collaboration with Tangible Results!

  • An interagency science discovery tool, providing single-query access to multiple government-sponsored R&D results and other S&T information
  • A cross-agency search that integrates and simplifies access to 200 million pages of content from 14 U.S. science agencies
  • The "" science portal (formerly "FirstGov for Science")
  • A voluntary large-scale collaboration of U.S. government agencies

Drills down to selected databases and websites in parallel, then presents relevancy-ranked search results

Slide 3: How Did It Begin?

  • Two workshops spawned origin:
    • 2000: Blue-ribbon panel explored concept of a physical science information infrastructure. This prompted interagency involvement.
    • 2001: "Strengthening the Public Information Infrastructure for Science." Here the interagency Alliance was formed.
  • Participants included federal agencies, academia, information professionals and science experts.
  • gained approval as "FirstGov for Science" in early 2002
  • was launched in December 2002.

Slide 4:

Founding Agencies in 2001
  • Department of Agriculture
  • Department of Commerce
  • Department of Defense
  • Department of Education
  • Department of Energy
  • Department of Health and Human Services
  • Department of Interior
  • Environmental Protection Agency
  • National Aeronautics and Space Administration
  • National Science Foundation
New Alliance Members
  • Department of Transportation
  • Library of Congress
  • United States Government Printing Office
  • National Archives and Records Administration
Alliance only
  • United States Forest Service
  • National Institute of Standards and Technology

Support and coordination by CENDI – an interagency forum of senior information managers

Slide 5:

Shared Premises

  • Science is not bounded by agency, organization or geography
  • Each agency has vast stores of information that fulfill its mission
  • A single web gateway is the tool of choice*
  • A commitment to voluntary collaboration is necessary

*In OCLC Perceptions of Library and Information Resources, it was reported that 84% of public began search using search engines; only 1% began with online databases. Thus a "Google-like" easy search of authoritative sources with relevant results was desired.

Slide 6:

Integration Challenges

  • Broad scope of Federal science and technology research and development missions
  • Wide-ranging interest of potential audiences
  • Information organization (taxonomy) issues given the broad scope of disciplines and audiences
  • Blending information resources from different agencies into cohesive functionality and page design
  • Politics, human resources, funding, sustainability

Slide 7: Guiding Principles for Content

√ Select authoritative web-based government-sponsored information resources
√ Rich science content, not merely organization pages
√ Databases contain primarily R&D results in the form of STI (bibliographic data and/or full documents)
√ Supplemented by websites for currency
√ Only freely available content that is well maintained
√ Our audience is "the science-attentive citizen"!

Slide 8: Agency Potluck

  • Agencies brought to the Internet table their unique information specialties and resources
  • Flagship service a commitment
  • Notable contributions of many:
    • Alliance and CENDI - seized opportunity without mandate
    • - supported the early stages with advice and two grants
    • Member agencies - provided participation of 200 staff members to working teams
    • NLM – provided usability testing prior to initial launch
    • USGS – managed original website search engine (surface web search)
    • NTIS - created initial catalog of S&T websites
    • IIa Inc. – provided secretariat support (CENDI special task)
    • DOE/OSTI - conceived idea, developed technologies/deep web search and hosted website
    • NAL and USGS – provided Alliance co-chairs

Slide 9: Collaboration Is Key

  • Alliance enjoyed extraordinary voluntary collaboration
  • Vision and strategic direction provided by Alliance principals
  • Administration provided by Chair(s) selected from Alliance
  • Technical team provided original technical direction and recommendations
  • Major support provided by CENDI
  • Additional task groups formed as needed
    • Taxonomy
    • Content guidance and development
    • Website management and redesign
    • Outreach activities
    • Enhancement development
    • Subject expansion
    • Image library

Slide 10: The Funding Approach

  • Built and maintained with "in-kind" contributions: each agency's staff time and existing information resources
  • Initial development benefitted from CIO Council e-gov grants for catalog + initial deep web search
  • Alliance annual dues help fund routine operations
  • CENDI support leverages resources
  • In-kind contributions supported special events
  • SBIR R&D resulted in innovations that were implemented in subsequent versions
  • "Pass the hat" contributions to take advantage of an opportunity, such as Version 3.0 development

Slide 11: Funding

Doing "a lot with a little" by implementing creative funding methods

  • 2001: Cross agency portal grants: $170,000
  • 2002: DOE SBIR conducts relevancy ranking research
  • 2003-2004: Voluntary Pass-the-Hat contributions: $200,000
  • 2001-Present: Participating agencies and in-kind support develop and maintain
    • Average since 2005 = approx. $180K annually (fees plus in-kind support)

Slide 12: CENDI

  • CENDI promotes the productive intersection of science content, technology and interrelationships
  • The Alliance, made up of CENDI agencies plus others, provides direction and support for this intersection in the form of
  • Through financial and in-kind commitments from its agencies, CENDI provides the ongoing infrastructure needed to offer a large-scale collaboration across organizational boundaries

Slide 13: Overview of CENDI Finances

Total Membership Funds Are Combined into One "Pot"

CENDI Reserve

Executive Secretariat for CENDI Includes Support

Maintenance Costs include Alliance Only dues*

A portion of Secretariat effort is used for Tasks

* Alliance Only dues are deposited into the CENDI treasury, with option of being used for direct costs/purchases for (such as exhibit expenses) or being included in funding for overall Secretariat support of

Slide 14: Content Management Is Distributed

  • NTIS developed the original "catalog" with input from agencies
  • CENDI Secretariat now maintains catalog with agency participation
  • Agency content managers submit and edit their information via a web form
  • Websites identified in the catalog were indexed by USGS; now done by OSTI
  • Deep web databases are identified by agencies and reviewed by team for suitability
  • Real-time search of content in large databases is maintained by OSTI, which continues to host the website and serve as operations manager

Slide 15: The Alliance Members' Page

Provides administrative information, meeting minutes, usage statistics, content selection and cataloging guidelines, subject category information, and outreach materials such as presentations and flyers.

Slide 16: Metadata Input System: For Websites in Searchable Index ("Surface Web" portion of

Provides Alliance members and content managers a secure tool to quickly retrieve Agency metadata, add or edit resource records, and expedite the maintenance and quality control of the metadata and URLs.

Slide 17: Development Milestones

  • Phase 1 (2001-2002)
    • Established policy & governance, technical design teams
    • Agreed on goals, policies, website look & feel
    • Created taxonomy
    • Selected, cataloged and indexed agency resources
  • Version 2.0 launched May 2004
    • Introduced relevancy ranking of metasearch results
    • One-step search across ALL databases
    • Added advanced search
  • Version 3.0
    • Enhanced precision searching, metarank & boolean/fielded searching
    • Other types of science content explored
  • Version 4.0
    • Enhanced relevancy ranking, also full-text relevancy ranking

Slide 18: Development Milestones

  • Version 5.0 (Sept 2008)
    • Clustering of results by subtopics or dates to help target your search
    • Wikipedia results related to your search terms
    • EurekAlert News results related to your search terms
    • Mark-and-send option to email results to friends and colleagues
    • More science sources for a more thorough search
    • Enhanced information related to your real-time search
    • New look and feel
    • Updated Alerts Service
    • Standardized citation formats available for download

  • Version 5.1
    • Aggregated news feeds from 11 science agencies
    • Internships and Fellowships section made searchable
    • Image Search Library (Coming soon!)

Slide 19: Today

Finds Content from 200 Million Pages at 2100+ Websites and 42 Databases with One Query

  • Searches selected websites ("surface web") and databases ("deep web") from one search point
  • Combines results from all sources, ranks and displays by relevance and clusters
  • Sends weekly "alerts" for user-defined topics of interest
  • Displays related Wikipedia and EurekAlert items
  • Provides browsing of selected websites
  • Displays an integrated news feed from science agencies
  • Links to special collections and other information
  • Featured search and sites highlight hot topics
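
The clustering of combined results mentioned above can be illustrated with a minimal sketch. It groups result records by publication year so a user can drill down by date. The record fields (`title`, `year`) and the sample records are hypothetical, not taken from the actual service.

```python
from collections import defaultdict

# Hypothetical result records; the field names are illustrative only.
results = [
    {"title": "Wind turbine blade fatigue", "year": 2008},
    {"title": "Offshore wind resource assessment", "year": 2010},
    {"title": "Wind energy grid integration", "year": 2008},
]


def cluster_by_year(records):
    """Group record titles by publication year, newest year first."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["year"]].append(record["title"])
    return dict(sorted(clusters.items(), reverse=True))


clusters = cluster_by_year(results)
for year, titles in clusters.items():
    print(year, len(titles))
```

Topic clusters could follow the same pattern with a subject label in place of the year, assuming each record carries one.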

Slide 20: 42 Large Scientific Databases

Agriculture & Food
  • AGRICOLA
  • Center for Food Safety and Applied Nutrition (CFSAN)
  • Technology Transfer Automated Retrieval System (TEKTRAN)
  • USDA Food and Nutrition Center

Applied Science & Technologies
  • DefenseLINK Website
  • DOT National Transportation Library Integrated Search
  • DTIC S & T Database
  • National Institute of Standards and Technology Data Gateway
  • U.S. Patent & Trademark Office Database

Astronomy & Space
  • NASA Technical Reports Server (NTRS)
  • NASA Website
  • SAO/NASA Astrophysics Data System (ADS)

Biology & Nature
  • NBII National Biological Information Infrastructure

Earth & Ocean Sciences
  • NOAA Photo Library
  • USGS Publications Warehouse

Energy & Energy Conservation
  • DOE Alternative Fuels and Advanced Vehicles Data Center
  • DOE Information Bridge
  • EIA Publications
  • Energy Citations Database
  • National Renewable Energy Lab Website

Environment & Environmental Quality
  • EPA Pesticides Factsheets
  • EPA Science Inventory
  • HSDB Hazardous Substances Databank
  • National Service Center for Environmental Publications (NSCEP)

General Science
  • Code of Federal Regulations
  • Federal Register
  • National Technical Information Service (NTIS)
  • Websites

Health & Medicine
  • Center for Biologics Evaluation and Research (CBER)
  • Center for Drug Evaluation and Research (CDER)
  • MedlinePLUS
  • PubMed Central
  • TOXLINE Toxicology Bibliographic Information

Math, Physics & Chemistry
  • DOE Information Bridge
  • DOepatents
  • DOE R&D Accomplishments Database
  • Energy Citations Database
  • Eprint Network

Natural Resources & Conservation

Science Education
  • ERIC Education Resources Information Center
  • NSDL National Science Digital Library
  • NSF Publications Database

Slide 21: Easy-to-Use Search

Get the simplicity of a "Google-type" search box; get results that are not "Google-like" at all.

Less than 1% overlap with Google; approximately 3.2% overlap with Google Scholar

Slide 22: Precise, Accurate Results

Slide 23: More About You May Not Know


  • Goes where traditional search engines cannot go. Full-text documents if searchable on the target site are searchable via
  • Real-time search: If a target database adds a document or record, it is available on immediately
  • During the query, the most-relevant documents or records from each source are gathered – approx 100-200 from each source – and then the combined set is relevancy ranked
  • Topic and date clusters for search results – subtopics, publication years displayed on-the-fly to enable efficient drilling down
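
The gather-then-rank step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production system: the two in-memory sources, the term-count scoring function, and all names are hypothetical stand-ins. The sketch queries each source in parallel, keeps a capped number of best hits per source, and then ranks the combined set.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for agency databases; a real source
# would be queried over the network in real time.
SOURCES = {
    "source_a": [
        "diabetes research in adults",
        "solar energy storage",
        "diabetes and diet",
    ],
    "source_b": [
        "climate change and diabetes risk",
        "ocean temperature records",
    ],
}

PER_SOURCE_LIMIT = 100  # the slide cites roughly 100-200 records per source


def score(query, text):
    """Toy relevance score: number of query-term occurrences in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def search_source(name, query):
    """Return up to PER_SOURCE_LIMIT best-scoring hits from one source."""
    hits = [(score(query, doc), name, doc) for doc in SOURCES[name]]
    hits = [hit for hit in hits if hit[0] > 0]
    hits.sort(key=lambda hit: -hit[0])
    return hits[:PER_SOURCE_LIMIT]


def federated_search(query):
    """Query every source in parallel, then rank the combined result set."""
    with ThreadPoolExecutor() as pool:
        per_source = list(pool.map(lambda name: search_source(name, query), SOURCES))
    combined = [hit for hits in per_source for hit in hits]
    # Highest score first; ties broken by source name, then text.
    combined.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return combined


results = federated_search("diabetes")
for relevance, source, doc in results:
    print(relevance, source, doc)
```

Capping the hits taken from each source keeps the combined set small enough to rank quickly, at the cost of never seeing lower-ranked records from any single source.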

Slide 24: Usage Continues to Grow

Page View Totals (Dec 02 - Sep 10)

FY10 - 5,166,126
FY09 - 4,074,747
FY08 - 2,946,801
FY07 - 2,591,717
FY06 - 2,593,449
FY05 - 1,793,483
FY04 - 965,146
FY03 - 751,180
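
As a quick check on the growth these totals imply, the following sketch computes the year-over-year percentage change and the cumulative total from the figures above. The variable names are illustrative. Note that FY03 is a partial year, since the site launched in December 2002.

```python
# Page-view totals by fiscal year, copied from the slide.
page_views = {
    "FY03": 751_180,
    "FY04": 965_146,
    "FY05": 1_793_483,
    "FY06": 2_593_449,
    "FY07": 2_591_717,
    "FY08": 2_946_801,
    "FY09": 4_074_747,
    "FY10": 5_166_126,
}

years = list(page_views)

# Percentage change relative to the previous fiscal year.
growth = {
    current: 100.0 * (page_views[current] - page_views[previous]) / page_views[previous]
    for previous, current in zip(years, years[1:])
}

total = sum(page_views.values())

for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}%")
print(f"Total page views: {total:,}")
```

On these figures, FY07 is the only year with a slight decline relative to the prior year.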

Slide 25: Notable Achievements

  • Large voluntary collaboration between agencies is often cited as a model
  • Collaboration AND infrastructure served as model for; then became U.S.'s contributed content
  • Also a model for
  • A top 10 Google result for "science" with other major science outlets
  • Provides core project for spin-offs such as Science Internships, Aggregated Science News, Science Image Search – and more!

Slide 26: In the News

is among 10 government websites "meeting and exceeding" the Obama Administration's transparency goals, according to a special report by Government Computer, released July 27, 2009.

Slide 27: U.S. Department of Energy Office of Science

Real Time Search?
Relevancy Ranked?
All Govt. Science?
Known Sources?
Scholarly Info?
Ads?

5.0 X X X X X X X X X X
Google Scholar BETA   X     X X
Google   X       X

Slide 28: Content and Purpose: vs

  • Searches for science topics at the full record level
  • Ease of searching, with immediate, useful results
  • For the science-attentive citizen including researchers, teachers, students, business people, and the general public
  • A Google-like interface with an advanced option for power users
  • Drills down into the "deep web"


  • 2668 results for diabetes from 35 sources;
  • 2772 results for climate change from 38 sources
  • Searches at the source level only, not at the record level
  • Interface with search results pointing only to sources or databases
  • Emphasizes machine-readable datasets, available in raw formats; some files are quite large, ranging up to hundreds of megabytes
  • Data generally requires additional manipulation; of limited use to general public. Expect public interest groups, reporters, academics, and others to review information, build interfaces, and report on findings


  • Zero results for specific terms such as diabetes
  • One result (database pointer) for climate change

Slide 29: U.S. Department of Energy Office of Science
Ready to use info. with user friendly interface?
Record level information?
Science research and results only?
Information from multiple agencies?
Repository of datasets and tools?
Provides pointer to database/source?

Slide 30:

√ A perfect platform on which to launch new technologies

  • Access to new forms of STI
  • Translation
  • Precision searches
  • Image searching

Current Prototype

Slide 31: Future Opportunities

What will 10.0 look like?