Plenary Sessions: Abstracts
Featured Speaker and Open Access Plenary Panel Session
Thursday, June 12 • 9:30 a.m.–10:45 a.m.
Publishing, Open Access & ETD: A Panel Session on Student, Faculty & Publisher Perspectives
The programming language Java, which transformed the Internet, was introduced to the world at the Netscape Developer’s Conference in San Francisco over thirteen years ago, on March 5-7, 1996. At the time, those of us present were told that the future would require three dimensions for every resource we produced: everything would have to be interactive, ubiquitous, and distributed. A resource is said to be “interactive” when the user provides significant input or direction and the resource reacts dynamically and appropriately. A resource is said to be “ubiquitous” when it (or a major component of it) is both available everywhere and recognized everywhere as the best means of addressing the problem it is designed to handle. A resource is “distributed” when its components and the responsibility for them are variously located, and not required to reside on a single server. The full acquisition of these three properties still drives the development of the Internet, and the same properties should drive the development of online dissertations more surely than they now do. It is important, however, to keep all three dimensions in perspective, to remember what a dissertation is for, and to understand the variety of needs tied to dissertations, if we are to advance the development of ETDs and bring dissertations to the next level. The NDLTD is striving to be ubiquitous, but it has not reached that point, nor has it neared the tipping point that would precede it. Certainly, we can say that ETDs are distributed via the NDLTD, but a certain amount of fear among dissertation writers and directors has worked against fully open access and distribution. When a dissertation is embargoed to a single institution or campus, it is not distributed. Finally, I come back to the first term, interactive.
Our theses and dissertations are, by and large, digitized paper documents in PDF form, and every theorist of the future of textuality that I know of will argue for the advent of interactive dimensions that we have not tried to develop. I see the shortcomings in these dimensions as symptoms of a problem that derives from two related situations. First, we dissertation-producing faculty seem agreed only in seeing a dissertation or thesis as the production that will certify a single student’s degree, which diminishes collaboration to even less than the amount a good typist provided before the days of word processing. Second, we use the term “to publish” as loosely as, if not more loosely than, we use the term “to edit,” to the degree that, most of the time when we speak to each other about these things, we’re talking about entirely different concepts. We need to remedy both of these openly and in an organized fashion, not only to advance the production of ETDs generally and the NDLTD specifically, but also to advance the forms scholarship should be taking at our universities, colleges, institutes and laboratories.
Thursday, June 12 • 3 p.m.–4 p.m.
OpenThesis.org: A Universal ETD Database with Search, Organization, Collaboration & Print Capabilities
Theses and Dissertations have long been inaccessible due to lack of a free, powerful interface to search all of these documents in a single location.
OpenThesis.org, a new web site from the creators of FreePatentsOnline.com, aims to expose all theses and dissertations, making them highly accessible, useful and shareable — at absolutely no cost.
Patents Online is the company behind www.freepatentsonline.com, the #1 worldwide website in terms of audience in the patent space. With its worldwide patent database, freepatentsonline.com averages 10 million unique monthly users, and 1 million users have registered (registration is free, and gives additional functionality).
Having started in the patent space, Patents Online’s mission has expanded to include offering free access to all types of technical and academic documents. Theses and dissertations form an important body of documents in this space.
OpenThesis.org will allow authors to upload their theses and dissertations, and the documents will then be vaulted for posterity and made searchable anywhere in the world. If an author (or an author’s family) has only a paper copy of a thesis or dissertation, the site will give guidance on how to get it scanned into computer-readable form for subsequent uploading.
OpenThesis.org will also work directly with Universities to ensure as comprehensive a database as possible. The existence of a comprehensive ETD database will benefit Universities in licensing efforts, and authors via prospective job opportunities.
As is currently true for www.freepatentsonline.com, visitors to OpenThesis will be able to register in about 1 minute, for free, and gain access to special features such as the ability to organize documents into folders, annotate documents, share documents with other users, and set up alerts for automatic notification of new documents of interest.
A few other important aspects of OpenThesis.org:
Theses and Dissertations represent an important compendium of vetted research. OpenThesis.org will ensure that these works, which authors created with formidable investments of inspiration and time, and which universities have ratified as part of awarding graduate degrees, are instantaneously available anywhere in the world via the web at no cost.
This presentation will discuss in depth the structure and use of the NDLTD Union Catalog. Information will be provided on how the data is harvested, indexed and made available for access. Statistical information will be provided on the size of the database (in excess of 750,000 ETDs) and its sources, broken down by continent, language and country.
Usage statistics will be provided showing the source, frequency and pages viewed. Finally, pointers will be provided on what to do with the metadata to make dissertations from your institution more accessible to the world at large.
Objectives: The NDLTD Union Catalog has metadata for over 600,000 Electronic Theses and Dissertations (ETDs) in diverse languages from universities around the world. Users can access these ETDs through various search and browse web interfaces reachable through the NDLTD website (for example, those from Scirus and VTLS). We aim to improve those services in two ways. First, we develop approaches to build larger collections of ETDs, consisting not only of ETDs collected via NDLTD’s Union Catalog, but also of those collected through focused crawling of many universities’ webpages. Second, we develop approaches to make these large collections more amenable to use by students and researchers.
Methods and Results: We have identified repositories at some universities that host ETDs but are not yet part of NDLTD. We have developed custom crawlers to crawl some of these repositories, as well as the NDLTD Union Catalog, and to harvest ETDs and their metadata (where permissible). Our current collection has about 40,000 ETDs from the Union Catalog for our initial experimentation, and we actively continue to collect more ETDs.
We also have developed a categorization system, based on the Library of Congress categorization system and Wikipedia, that is more suitable for categorizing ETDs, and have categorized ETDs into the resulting category tree. Users can first browse this category tree based on their needs and then can either browse a particular node, or search it for items of interest.
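As a rough illustration of the browse-then-search pattern described above, the following Python sketch files ETD titles into a small category tree and then either browses a node (listing everything beneath it) or searches within it. The class name, category names and titles are hypothetical examples chosen for illustration; this is not the authors' system, and it does not reproduce the Library of Congress or Wikipedia category data.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryNode:
    """One node of a category tree; all names here are illustrative only."""
    name: str
    children: dict = field(default_factory=dict)  # child name -> CategoryNode
    etds: list = field(default_factory=list)      # titles filed directly at this node

    def add(self, path, title):
        """File a title under the given category path, creating nodes as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, CategoryNode(part))
        node.etds.append(title)

    def find(self, path):
        """Walk down the tree to the node at the given path."""
        node = self
        for part in path:
            node = node.children[part]
        return node

    def browse(self):
        """Yield every title at this node and in all of its descendants."""
        yield from self.etds
        for child in self.children.values():
            yield from child.browse()

    def search(self, term):
        """Case-insensitive substring search restricted to this subtree."""
        term = term.lower()
        return [title for title in self.browse() if term in title.lower()]


# Hypothetical sample data.
root = CategoryNode("root")
root.add(["Science", "Chemistry"], "Semantic markup of crystallography data")
root.add(["Science", "Physics"], "Lattice models of spin glasses")
root.add(["Social Sciences"], "Survey methods in rural education")

science = root.find(["Science"])
science_titles = list(science.browse())     # browse a particular node
lattice_hits = science.search("lattice")    # search within that node
```

Restricting the search to a subtree is the design point here: the user narrows the collection by category first, so a simple search only has to look through a small set of items.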
Conclusions: Through focused crawling, we have been able to increase the content available to users and to make it available in a single place. Categorization has helped organize the ETDs semantically, making it easier to find relevant information. As part of future work, we will improve our methods to collect as many ETDs as possible from the NDLTD Union Catalog and from various universities around the world, categorize them, and provide a web interface facilitating access.
New Trends Plenary Panel Session
Thursday, June 12 • 4:15 p.m.–5:45 p.m.
ProQuest Dissertations and Theses Database User Survey: The first large-scale survey of dissertation information-seeking behavior
OBJECTIVE: To develop a clearer profile of researchers who use dissertations and a better understanding of how dissertations are used in the research cycle in order to present dissertations in more effective ways to support research.
METHODS: “Users” are those who accessed the ProQuest platform and/or ProQuest Dissertations and Theses during the course of the survey (April 20 – May 15, 2008), and responded to an invitation link to the survey instrument. An incentive drawing was included. 3,034 individuals took part in the survey.
The survey instrument, consisting mostly of structured single-response questions, included two multi-element questions using a 1-10 rating scale, and two open-ended questions on the object of the search and on usage of competitive databases like PQDT.
RESULTS: Graduate students account for nearly half of database users; undergraduates about a third. Nearly half of all those who searched the dissertation database are either studying for doctorates or working on master’s theses. The corollary is that a majority of those accessing PQDT are not doing so.
While librarians in this survey were not themselves frequent users of PQDT, the college or graduate school library website is an “extremely important” influence on accessing the PQDT or ProQuest platforms.
PQDT is likely to be accessed quite specifically with the intention of reviewing dissertations or theses – and often. About one in five users accessed PQDT at least five times in the month immediately prior to the survey.
Social sciences, business and education are the three leading disciplines associated with accessing PQDT. Other important areas are the arts and humanities generally, and medical sciences.
CONCLUSIONS: This survey suggests that there is more to learn about non-student researchers and a need to find ways to provide them access to dissertation research. It also supports the importance of dissertations as primary source material in current research.
There are many challenges in talking with students about ETDs. Some of the most important topics concern open access and author rights. Reaching out to students to engage them in these issues can be a complicated process. An increasing number of student groups have, however, become interested in these issues, and finding ways to connect with them can lead to increased understanding of the value of ETDs.
The Scholarly Publishing and Academic Resources Coalition (SPARC) has been working with a variety of student groups on open access issues. The techniques in working with students are to find common areas of concern and leverage collaborations. In addition, taking advantage of technologies that appeal to the students increases the reach of the programs.
This paper will provide information on several SPARC student activities. The Right to Research campaign responded to a growing demand from the college student community for tools and resources to express their support for Open Access to research. Students are also interested in Open Educational Resources and SPARC is working with student groups to raise awareness about them as well. The annual SPARKY contest provides awards to innovative videos expressing student perspectives on sharing information. A student open access blog provides a means for students to communicate among themselves.
SPARC’s work with students and student groups suggests many students have embraced the concept of open access and are receptive to raising issues they see as impeding their ability to access information. These activities suggest ways others might consider in their work with students and ETDs.
We have developed a range of tools and protocols that allow the creation, validation, and re-use of “born digital” theses in scientific domains, especially disciplines reporting chemical information and results. The primary authoring tools in science are Word and LaTeX, both of which create documents with structures (chapters, sections, etc.) and semantics (annotated paragraphs, tables, graphs, etc.). In many cases the theses also contain raw and processed scientific data which can be at least as important as the natural language text.
We have developed vocabularies and ontologies to describe such theses and, for example, are developing an authoring tool for creating semantic chemistry in a Word environment. We urge that institutions encourage semantic theses and have been developing a proof-of-concept (ICE-TheOREm). Here a student can assemble a thesis from components which can be managed locally or on a server and create either Word2007 or ODT-compatible documents. Such theses preserve all the semantics and data.
In practice current theses are deposited as PDFs, and to re-use their contents we must resort to natural language processing (OSCAR3) or semi-structured data tools (OSCAR-DATA). PDF has no semantics and reconstruction is seriously lossy, but we can often extract meta/data by machine. We urge institutions always to deposit the native Word2007 or LaTeX documents as well as any PDF; in this way they will capture far more of their science.
We also demonstrate lightweight semantic repositories which provide an embargo mechanism for all or part of the thesis (TheOREm, using ORE). The metadata for the documents (including structuring) are converted to RDF which can be queried with SPARQL providing great flexibility.
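To make the idea of querying structured thesis metadata concrete, here is a minimal Python sketch that stores metadata as subject-predicate-object triples and matches simple patterns against them, which is the basic operation a SPARQL query performs over RDF. It is a plain in-memory stand-in, not a real RDF store or SPARQL engine, and it is not the TheOREm implementation; the identifiers and predicates (`thesis:1`, `ex:hasPart`, `ex:embargoed`) are hypothetical.

```python
# Hypothetical thesis metadata as (subject, predicate, object) triples.
TRIPLES = [
    ("thesis:1", "dc:title", "Crystal structures of organic salts"),
    ("thesis:1", "ex:hasPart", "thesis:1/ch1"),
    ("thesis:1", "ex:hasPart", "thesis:1/ch2"),
    ("thesis:1/ch1", "dc:title", "Introduction"),
    ("thesis:1/ch2", "dc:title", "Unpublished results"),
    ("thesis:1/ch2", "ex:embargoed", "true"),  # assumed way to flag an embargoed part
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def visible_parts(triples, thesis):
    """List the titles of the parts of a thesis that are not embargoed."""
    embargoed = {s for s, _, _ in match(triples, p="ex:embargoed", o="true")}
    titles = []
    for _, _, part in match(triples, s=thesis, p="ex:hasPart"):
        if part not in embargoed:
            titles.extend(o for _, _, o in match(triples, s=part, p="dc:title"))
    return titles


public_titles = visible_parts(TRIPLES, "thesis:1")
```

Because the document structure is itself expressed as triples, embargoing part of a thesis only requires one extra statement about that part, and the same pattern matching that answers ordinary metadata questions can also filter out what should not yet be shown.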
Our toolkit and examples are based on the premise that all software, protocols and content should be Open.
We thank JISC and Microsoft Research for support.
Global Outreach Plenary Panel Session
Friday, June 13 • 9:30 a.m.–10:45 a.m.
Bridging the Knowledge Divide: Expanding Global Open Access
The International Oceanographic Data and Information Exchange (IODE) of the Intergovernmental Oceanographic Commission of UNESCO (IOC) has developed, in the past five years, the tools for a modern information policy focused on increasing accessibility of scientific literature in the field of marine science and oceanography with the help of partners such as Hasselt University Library and EBSCO.
With the OceanDocs repository, originally an African project called OdinPubAfrica, the IOC/IODE community, and especially developing countries, received a platform to make publications available worldwide. The Open Science Directory, a directory of scientific journals freely available to institutes and scientists in developing countries, has a wider target group than the oceanographic community, but fits completely in the information policy of IOC-IODE. The Open Science Directory is developed with the support of EBSCO.
The World Digital Library (WDL) launched on April 21, 2009. The site is a collaboration between the Library of Congress, UNESCO, and other national libraries and cultural institutions around the world. The objective of the site is to bring historical treasures (maps, manuscripts, photographs, rare books, etc.) representing the contributions of all cultures together on one Web site. On its first day the WDL site, www.wdl.org, received over 7 million page views and over 600,000 visitors. The site received traffic from every country in the world. The WDL project also has the goal of building digitization capacity in partner institutions to narrow the digital divide within and between countries.
Michelle Rago is the Technical Director of the World Digital Library. She has a background in librarianship, information architecture, and Web development and has worked for the Library of Congress since 2002.
JSTOR is an independent not-for-profit organization dedicated to helping the scholarly community discover, use, and build upon a wide range of intellectual content in a trusted digital archive. The JSTOR archive includes over 800 leading academic journals across the humanities, social sciences, and sciences, as well as conference proceedings, transactions, select monographs and other materials valuable for academic work. More than 5,200 academic and other institutions in 143 countries and over 600 learned societies, university presses, cultural heritage, and other content contributors participate in JSTOR. As part of JSTOR’s mission, access to the archive is extended for no cost or for low cost to countries included in the Developing Nations Access Initiative (DNAI), including free access to any not-for-profit institution on the continent of Africa.
In order to unite efforts to serve the scholarly community, JSTOR and Ithaka recently announced that they had merged their organizations. The new combined enterprise will be dedicated to helping the academic community use digital technologies to advance scholarship and teaching and to reducing system-wide costs through collective action.
During 2008, the Ithaka-incubated resource Aluka was integrated into JSTOR as an initial step, further strengthening ties between the organizations. Aluka, a digital library of scholarly resources from and about Africa, offers three collections, also available as part of the DNAI: African Cultural Heritage Sites and Landscapes, African Plants, and Struggles for Freedom in Southern Africa. These collections bring together in one place more than 370,000 objects from leading archives, cultural institutions, and individual scholars around the world. Types of content include: manuscripts, letters, oral histories, government documents, pamphlets, images, 3-D models, and more.
The presentation will discuss the details of the DNAI, and discuss ways that institutions can collaborate with JSTOR on these important endeavors.
Lessons Learned Plenary Panel Session
Friday, June 12 • 3:45 p.m.–5 p.m.
Success & Challenges for ETD Programs: An Open Conversation
If your organization is starting an ETD program or expanding an existing one, there are challenges whose answers may be found in the experiences of others. Sharing experiences (successes, concerns, and solutions) will help ETD leaders negotiate the unique political, social and organizational environment at their own organization. This session’s panelists were chosen for their knowledge of beginning and growing ETD programs, and they can speak authoritatively about what they have found to be essential to successful ETD leadership. The featured panelists will represent perspectives from different regions, and all attendees are encouraged to participate in the conversation. The interactive, conversational approach of this presentation has been a proven success in other venues; it will spark ideas and help focus on the issues that are most pertinent to attendees. Among the expected topics of conversation are:
All those in the ETD community, even people who are not attending the conference, will have the opportunity to add their ideas prior to the session by visiting this session’s posting at the official conference blog at http://etd2009.blogspot.com/