Open Source Intelligence:
Access All Intelligence,
All Languages, All the Time
Presented by
Abe Lederman, President and CTO
Deep Web Technologies, LLC
IOP ’06
Sheraton Premier, Tysons Corner, Virginia
January 16-20
About Deep Web Technologies (DWT)
DWT is a New Mexico-based company focused on providing
state-of-the-art software solutions that search, retrieve,
aggregate, and analyze content.
• Deployed first “federated search” portal in the
Federal Government, 1999
• Major clients include:
– DOE Office of Scientific & Technical Information
– Defense Technical Information Center
– Science.gov Alliance
– DOE Office of Science
– National Agricultural Library
Open Source Intelligence
The Problem:
• Collecting and analyzing enormous
quantities of information in any language,
in myriad formats, located anywhere,
accessible through a large variety of
means, with a majority not accessible
through the Internet
Shared Challenge:
OSINT and Knowledge Discovery/Diffusion
OSINT Challenges
Knowledge Discovery/Diffusion Challenges
For the past six years, DWT has been the lead technical
organization addressing these challenges, in collaboration
with the DOE Office of Scientific & Technical Information
The DWT Proposition
To apply DWT’s technology, expertise
and ongoing innovations* to address
the challenges of OSINT
*Developed in partnership with DOE/OSTI
Challenges in Working with
Thousands of Data Sources
Locate Reliable Sources
Categorize Sources by Content
Configure Sources for Searching
Maintain Sources
Challenges in Searching
Thousands of Sources
Automatically Select Sources to Search
Perform Many Searches in Parallel
Translate, Analyze and Organize Results
Relevance Rank
Extract Key Information
Cluster/Visualize
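
As an illustration of "Perform Many Searches in Parallel", the following is a minimal Python sketch and not DWT's implementation. It assumes each source is represented by a hypothetical callable that takes the query string and returns a list of results; the function name `search_all` and the `max_workers` parameter are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_all(sources, query, max_workers=8):
    """Send one query to every source concurrently.

    `sources` maps a source name to a callable taking the query string
    (a hypothetical connector interface, assumed for this sketch).
    Returns {source name: list of results}. A source that raises is
    recorded with an empty list so one failure does not abort the search.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every search up front so they run side by side.
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        # Collect each answer as soon as its source responds.
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = list(future.result())
            except Exception:
                results[name] = []
    return results
```

Threads suit this case because each search mostly waits on a remote source; a slow or failing source delays or drops only its own entry.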
ResearchAssistant™
DWT’s State-of-the-art
Federated Search Engine
• Scalable, grid-computing based federated
search engine
• Sophisticated Search Conductor
• Supports custom connectors
• Multi-tier relevance ranking
• Framework accepts integration of advanced
linguistic, analysis, and visualization
modules
Grid Computing:
Distributing the Workload
Search Conductor
Select sources to search
Perform search
Enough good results? (NO / YES)
Can I get more results from “good” sources? (NO / YES)
Deliver results to user
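
The Search Conductor flow above can be read as a loop. The Python sketch below is one possible interpretation and not DWT's code. The slide does not show how the decision boxes connect, so the branch wiring here is an assumption: if there are enough good results, deliver; otherwise ask only the sources that just returned good results for another page; if none can supply more, deliver what has been collected. The names `conduct_search`, `is_good`, `enough`, and `max_pages`, and the `fetch(query, page)` source interface, are hypothetical.

```python
def conduct_search(query, sources, is_good, enough=10, max_pages=5):
    """Loop modeled on the Search Conductor slide (assumed wiring).

    `sources` maps a name to a hypothetical fetch(query, page) callable
    that returns a list of results for that page.
    """
    good = []
    active = dict(sources)       # "Select sources to search" (here: all of them)
    page = 0
    while active and page < max_pages:
        still_good = {}
        for name, fetch in active.items():   # "Perform search"
            hits = [r for r in fetch(query, page) if is_good(r)]
            good.extend(hits)
            if hits:
                # This source is "good": it may have more to offer.
                still_good[name] = fetch
        if len(good) >= enough:  # "Enough good results?" YES
            break
        active = still_good      # NO: ask only the "good" sources again
        page += 1
    return good                  # "Deliver results to user"
```

The `max_pages` cap is an added safeguard so that a source which always returns good results cannot keep the loop running indefinitely.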
Multi-tier Relevance Ranking
• QuickRank™ – Ranks results based on
occurrence of search terms in title and
snippet
• MetaRank™ – Ranks results utilizing
custom algorithms applied to metadata
• DeepRank™ – Downloads and indexes
full-text documents
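
To make the first tier concrete, here is a minimal Python sketch of ranking by occurrence of search terms in title and snippet. It is not the QuickRank algorithm itself, which the slides do not describe in detail. The scoring formula, the extra weight given to title matches (`title_weight`), and the result format (dicts with `title` and `snippet` keys) are all assumptions made for illustration.

```python
import re


def quick_score(query, title, snippet, title_weight=2.0):
    """Count query-term occurrences in title and snippet.

    Title matches count `title_weight` times as much as snippet matches;
    that weighting is an assumption of this sketch.
    """
    terms = set(re.findall(r"\w+", query.lower()))
    title_hits = sum(1 for w in re.findall(r"\w+", title.lower()) if w in terms)
    snippet_hits = sum(1 for w in re.findall(r"\w+", snippet.lower()) if w in terms)
    return title_weight * title_hits + snippet_hits


def quick_rank(query, results):
    """Return results sorted by descending score (stable for ties)."""
    return sorted(
        results,
        key=lambda r: quick_score(query, r["title"], r["snippet"]),
        reverse=True,
    )
```

Because only the title and snippet are inspected, a tier like this can run as soon as a source responds, without downloading any documents.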
Science.gov Alliance Consortium of
12 Federal Government Agencies
Dept of Agriculture
Dept of Interior
Dept of Commerce
Environmental Protection Agency
Dept of Defense
NASA
Dept of Education
National Science Foundation
Dept of Energy
US Government Printing Office
Dept of Health/Human Services
National Archives & Records Administration
Sponsoring
Science.gov Portal
(Access to most of Federal Government R&D)
Science.gov Advanced Search Page
Science.gov Results Page
A Science.gov Document
Next Steps
Identify sponsors and development
partners that can collaborate on the
development of a pilot that integrates
best-of-breed technologies of value to OSINT.
This pilot will result in a portal that
aggregates content of different types,
generating actionable intelligence.
Contact Us
Abe Lederman
122 Longview Drive
Los Alamos, NM 87544
[email protected]
www.deepwebtech.com
http://www.deepwebtech.com/talks/IOP.ppt