SAGE is a repository of information on experts
within the State of Florida. The SAGE Expert-Finder Knowledge Management
(KM) System creates an integrated database by presenting multiple databases
as if they were one.
SAGE has been on-line since August 1999 and can be found at http://sage.fiu.edu.
(See figure 1.) Originally, SAGE unified the researcher databases from
the State of Florida University System. SAGE now also has access
to sponsored research data from private institutions, such as
Florida Institute of Technology, Florida Memorial College, and
Embry-Riddle Aeronautical University.
SAGE gives university researchers more visibility and simultaneously
allows interested parties to identify available expertise within
Florida universities.
This application helps to identify a researcher’s proficiency within
a discipline and to facilitate a point of contact.
Benefits that SAGE provides include the following:
- SAGE helps locate researchers in Florida for collaboration with
industry and Federal agencies, thereby increasing potential research
funding to the Florida universities.
- SAGE combines and unifies existing data from multiple sources into
one Web-accessible interface.
- SAGE incorporates a File Transfer Protocol (FTP) client, an
application that will be resident at each of the participating
universities to automate data transfer to the SAGE server.
In October 2001, the SAGE graphical user interface was completely
redesigned to maintain uniformity across all KM Lab-hosted Web
sites. The new interface provides an enhanced Flash capability
as well as a basic HTML version. SAGE also includes a thesaurus,
a collection of concepts forming an ontology that, upon request,
can search on words similar to the keyword in use. A consistent
taxonomy is applied to improve upon the usual keyword- and
full-text-based techniques.
It allows an end user to retrieve information using appropriate
terminology and avoids problems of poor selectivity and quality of
results caused by missing, inconsistent, or conflicting vocabulary.
The use of a thesaurus can extend the capability of the Web
site by generating new keywords from an existing input provided
by the user.
The thesaurus provides a standardized means of organizing many kinds
of information, including both conceptual and taxonomic. In other
words, the thesaurus is a tool designed to aid users in finding their
way around a vocabulary database. In addition to its traditional use
as an authority for the terms used in indexing the database, it offers
suggestions of terms the user might not even have considered.
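The keyword expansion described above can be illustrated with a minimal sketch. It assumes a small in-memory dictionary standing in for the thesaurus; the `THESAURUS` contents and the `expand_query` name are hypothetical and are not taken from the SAGE implementation.

```python
# Hypothetical sample data standing in for the real thesaurus.
THESAURUS = {
    "aerospace": ["aeronautics", "astronautics"],
    "polymer": ["plastic", "resin"],
}


def expand_query(keyword, thesaurus=THESAURUS):
    """Return the normalized keyword followed by its related terms."""
    key = keyword.strip().lower()
    terms = [key]
    for related in thesaurus.get(key, []):
        # Skip duplicates so each term is searched only once.
        if related not in terms:
            terms.append(related)
    return terms
```

A keyword with no thesaurus entry is returned unchanged, so the search falls back to ordinary keyword matching.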
The thesaurus is constructed using the Perl programming language
because of its powerful text-processing capabilities.
The script uses an existing pool of information from the Wordsmyth
Educational Dictionary-Thesaurus (WEDT), which includes over 50,000
headwords and very precisely defined and hyperlinked synonyms. It
retrieves an extended set of related terms or a set of synonyms.
ColdFusion 4.5 is used to cache the results of the query. In addition,
the new output is used in conjunction with the Verity Search
Engine, which uses a stoplist and stemming.
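To show what a stoplist and stemming do to a query, here is a rough sketch. The stoplist, the suffix list, and the function names are assumptions made for illustration; the suffix stripping is deliberately naive and is not the algorithm used by the Verity Search Engine.

```python
# Hypothetical stoplist: common words that carry no search value.
STOPLIST = {"a", "an", "and", "of", "the", "in", "for"}

# Naive suffixes to strip, checked in order (an assumption, not Verity's rules).
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(query):
    """Lowercase the query, drop stoplist words, and stem the rest."""
    words = query.lower().split()
    return [stem(w) for w in words if w not in STOPLIST]
```

With this kind of normalization, queries such as "testing" and "tests" reduce to the same term and can match the same indexed documents.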
To obtain these results, we use interprocess communication (IPC).
When a user submits a search, the script issues an
HTTP request to a remote server
by communicating through a socket (connection). The HTTP
request queries an external search engine that resides
on the remote server.
The script
retrieves the HTML document generated by the search engine,
parses the document by using regular expressions, and
retrieves the desired information.
In short, because it issues a request, the script
acts as a Web client.
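The request-and-parse cycle can be sketched in Python rather than the Perl used by the actual script. Everything specific here is an assumption made for illustration: the `q` query parameter, the `<a class="syn">` markup, and the function names are hypothetical and do not describe the real remote search engine.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical markup: assumes each related term appears as
# <a class="syn" ...>term</a> in the returned HTML document.
SYNONYM_PATTERN = re.compile(r'<a class="syn"[^>]*>([^<]+)</a>', re.IGNORECASE)


def extract_terms(html):
    """Pull the related terms out of a raw HTML document with a regex."""
    return [match.strip() for match in SYNONYM_PATTERN.findall(html)]


def fetch_related_terms(base_url, keyword, timeout=10):
    """Act as a Web client: request the results page and parse it.

    base_url and the "q" parameter are placeholders for the remote
    search engine's real address and query syntax.
    """
    url = base_url + "?" + urlencode({"q": keyword})
    # urlopen opens the socket connection and sends the HTTP request.
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_terms(html)
```

Keeping the parsing step in its own function means the regular expression can be checked against a saved HTML page without contacting the remote server.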

Current SAGE Architecture
The script relies on the structure of the document, a form of Web
structure mining. In this case the document is a raw HTML document.
To retrieve data from an HTML document efficiently, the document must
have a uniform format or a “cue” for where to start looking for the
necessary data. Given such a format or cue, the programming task is
trivial.
To make it easier to keep the SAGE database updated, the latest
development is the SAGE Automatic
FTP Client version 3.0 (figure 2). SAGE Auto FTP Client version 3.0
allows the universities that are part of the SAGE community to transfer
their researchers’ information files directly from their servers
to the SAGE server residing at the KM Lab in an easy and efficient
manner. This data can be received in a variety of formats,
such as Excel or Access, and even as a tab-delimited text file. Once
this data is obtained, the KM Lab members convert it to a standard
format to be displayed on the SAGE Web site.
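The conversion step for a tab-delimited file can be sketched as follows. The text does not give the standard format or any university's column headers, so the field names and the `COLUMN_MAP` mapping below are hypothetical placeholders.

```python
import csv
import io

# Hypothetical standard field names; the actual SAGE schema is not given.
STANDARD_FIELDS = ["researcher", "project", "funding"]

# Hypothetical mapping from one university's headers to the standard names.
COLUMN_MAP = {
    "PI Name": "researcher",
    "Project Title": "project",
    "Award Amount": "funding",
}


def normalize_rows(tab_text, column_map=COLUMN_MAP):
    """Convert tab-delimited text into rows keyed by the standard fields."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    rows = []
    for record in reader:
        # Start with every standard field empty so all rows share one shape.
        row = {field: "" for field in STANDARD_FIELDS}
        for source_name, value in record.items():
            target = column_map.get(source_name)
            if target is not None:
                row[target] = (value or "").strip()
        rows.append(row)
    return rows
```

Each participating university would need its own mapping, while the output rows always share the same set of fields.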
The SAGE Auto FTP Client version 3.0 offers a simple, user-friendly interface.
It also contains a Help File where the user can find answers to frequently
asked questions. Every university is provided with a unique password to
ensure the accuracy of the transfer. The SAGE Auto FTP Client version 3.0
guarantees the SAGE database will have the most up-to-date information
pertaining to researchers, their projects, and their funding all year long.
The current version of SAGE includes a newsletter that keeps users up to
date on current and upcoming SAGE news. To subscribe, the user clicks the
SAGE Newsletter icon on the SAGE home page and provides a name and e-mail
address. The user then periodically receives the latest information
concerning SAGE (figure 3).
SAGE is hosted at the KM Lab and is available 24 hours a day, 7 days
a week. Currently SAGE receives an average of 40 users per day. About 54
percent of the total hits that SAGE gets daily originate from commercial
sites (.com and .net), 39 percent are from educational institutions (.edu),
and 2 percent are from Government and military organizations (.gov and
.mil). Visitors originate from the United States and around the world,
including Japan, France, Austria, Switzerland, the Bahamas, Mexico, and the
United Kingdom.
Contacts: Dr. O. Melendez (Orlando.Melendez-1@ksc-nasa.gov),
YA-F2-C, (321) 867-9407; and S.H. Chance, BA-C, (321) 867-4194
Participating Organization: Florida International University (Dr. I. Becerra-Fernandez)