The NBII’s Digital Image Library – Aggregating diverse images means QC and standards
TDWG, October 24th 2008
Annette Olson, PhD
Biodiversity Scientist, United States Geological Survey
Grapevine Beetle (Pelidnota punctata), © 2008 Sam Houston

Definition: Biological Media
• Media, in this case, means audio and visual media: Moving Image, Still Image, Sound.
• "Media," "multimedia," and "audiovisual" all have connotations that don’t completely overlap with this definition.

NBII
The National Biological Information Infrastructure
www.nbii.gov

Initial History of the DIL
• Began as an in-house library in 2002.
• Went online in 2004, with 200 images, all public domain.
• 2004: partners saw it as a platform to serve, showcase, and store their images. Two partners began putting their images in. These were copyrighted, but allowed nonprofit use.
• In 2005 the library held 1,000 images and had been offered 100,000.
Broad mission – to serve diverse, quality-controlled biological images.

Audience and Objectives
• Users: natural resource managers, researchers, and decision-makers.
• Subject scope is defined by what partners deem useful (for resource managers, species and ecology).
• Images are public domain, or copyrighted with a statement or Creative Commons license allowing most nonprofit uses (no fully copyrighted images allowed).
• Serve as a repository, but also be a public gateway to other credible biological image galleries.
http://images.nbii.gov

Jaguar Track in Pantanal, Andrea Grosse and John Mosesso
Amerafrican House Gecko (Hemidactylus mabouia), whole body dorsal view, © 2004 Yuri Huta/Finding Species
Rufous-collared sparrow nest (Zonotrichia capensis), © 2005 Guyra Paraguay
John J. Mosesso, Fish ladder
Giant armadillo (Priodontes maximus), © 2005 Guyra Paraguay
Japanese stilt grass (Microstegium vimineum) invasion, © 2004 Elizabeth A. Sellers

Media as Biological Records
Each image is dynamically linked to metadata.
Required:
• Photographer
• Description
• Keywords
• Cataloger
• Copyright
• Scientific and common names, when an organism is shown (based on ITIS)
Strongly encouraged:
• Date
• Geographic location
• Physical description

Diagram labels:
PRE-SET METADATA / WILL CREATE METADATA
Discover Life, NatureServe, ReGAP, Morphbank, Finding Species, NPN, FRAMES, US FWS, Smithsonian Dept. Botany, USGS Personnel, Nature, USGS Projects, Photographer, Indiv. Biologist, Hobby Photographer, WDIN, NBII Personnel, SAIN, USGS Projects
DIL + STANDARDS + QC
Portlets, RSS
Discover Life, GBIF, EoL, GAPServe

Andrea Grosse, Wasp Nest in Gallery Forest, Paraguay

Solutions
• Original images
• Mention in metadata
• Automatic tools, link checks,…
• Feedback mechanisms
• Embedding metadata
• Signed statement of authenticity and ownership – images on sets
• Image resizing tool
• Guiding documents
• Metadata templates
• QA/QC procedures
• Devoting staff to screening, tracking

Issues
• Quality
• Digital editing (restrictions, original, info in metadata)
• Artificial settings (info in metadata)
• Species identification (feedback)
• Metadata – burden of creation
• (progress being made here: extraction of EXIF info, pattern searches…, automatic validation tools)
• Change in the resource over time
• Broken links (auto validation tools)
• Name changes (auto validation tools)
• Technical aspects of linking
• Copyright/privacy/publicity – global
• Standards – or lack of…

Solutions
• Watermarks – not lost, hard to alter
• Limited in information; info changes
• Watermarks can potentially cover information useful for later research
• Embedding metadata via Adobe XMP, EXIF, JPEG2000… – can be stripped.
• GUIDs (LSIDs) …..
Do not give information immediately, but resolvable

Metadata issues
• No one standard fits – digital resource, original, subject
  Dublin Core – most common among image galleries, doesn’t handle biological info…
  Darwin Core – specimen oriented
  EML – dataset oriented (has potential, but a media module still needs to be created)
  MODS, PREMIS, DIG35, NISO, “EXIF,” FGDC Biological Profile, OGC, …
• Encoding schemes/data values

Multiple subjects within an image
• Describing relationships
• Repeatability – multiple species, subjects, time periods…
• Pairing of fields – species names with authorities, common names, and taxon levels.

Red-crested and Yellow-billed cardinals (Paroaria coronata, Paroaria capitata), © 2004 Guyra Paraguay
© 2008 Bruce Avera Hunter, Red-eared Sliders on American alligator's back
© Ronald Hoff, White-eared Puffbird (Nystalus chacuru)

Related images, issues
• Different resolutions for display, dissemination
• Rapid sequences (time lapse)
• Related images of the same specimen, and of the habitat where it was trapped.
• Related via other subject matter (habitat changes at a location over time).
• Images derived from other images… (montages)
• Illustrations published in a book (dc:Source)
• Images linked to a dataset, specimen

California broomrape (Orobanche californica), © 2007 Ted Niehaus/Smithsonian

Metadata – other issues standards don’t quite cover
• History – image derived from an image that is derived from video…
• Precision of geolocations
• Methods
• Language of metadata
• Listing which authority is behind a classification
  Scientific name source (Catalogue of Life, a published paper, Sibley’s Guide to Birds….)
  Habitat, agriculture (Bailey’s Ecoregions)
• Others…

Many communities asking for:
• Best Practices
• Quality control
• Process
• Schema
• Flat, simple Definitions docs
• XML

Ruby-crowned Kinglet (Regulus calendula), © 2006 Charles H. Warren

Why another image gallery and not Flickr, others?
Always more images than there will be galleries to serve.
Mirroring doesn’t hurt; it supports preservation.
Capacity building.
Mandate to help serve US Dept. of Interior images, which includes long-term preservation.
Add value with controlled vocabularies, quality control, standards, and interoperability.
Treat the images as biological records.

Copyright Issues…
• Global library – learning global copyright law.
• Tied to “Publication.” Disseminating online is publishing. Giving presentations for dissemination on a website can be considered a form of publication. For us, probably fair use, but….
• Images travel!
• Downloaded from websites
• Can be selected from PDFs
• Presentations – slides can be copied

People Issues
1. Right to privacy – a person's name, or even just their image, is not to be used without permission, in some cases even if the photo was taken in a public space.
2. Right to not have their image/face used for any commercial use/publicity without their permission, even if the photo was taken in a public space and is in the public domain.
Signed waivers, or statements in prominent places online.
Photos by John J. Mosesso, Bird Banding
Facial recognition software is changing this environment. It’s not just about names. Especially for kids!!