The open access principle requires that scientific information be made widely and readily available to society. Defined in 2003 as a “comprehensive source of human knowledge and cultural heritage that has been approved by the scientific community”, open access implies that content be openly accessible and this needs the active commitment of each and every individual producer of scientific knowledge. Yet, in spite of the growing success of the open access initiative, a significant part of scientific and technical information remains unavailable on the web or circulates with restrictions. Even in institutional repositories (IRs) created to provide access to the scientific output of an academic institution, more or less important sectors of the scientific production are missing. This is because of lack of awareness, embargo, deposit of metadata without full text, confidential content etc. This problem concerns in particular electronic theses and dissertations (ETDs) that are disseminated with different status – some are freely available, others are under embargo, confidential, restricted to campus access (encrypted or not) or not available at all. While other papers may be available through alternative channels (journals, monographs etc.), ETDs most often are not. Our paper describes a new and unexpected effect of the development of digital libraries and open access, as a paradoxical practice of hiding information from the scientific community and society, while partly sharing it with a restricted population (campus). The study builds on a review of recent papers on ETDs in IRs and evaluates the availability of ETDs in a small panel of European and American academic IRs and networks. It provides empirical evidence on the reality of restricted access and proposes a model of independent variables affecting decisions on embargo and on-campus access, together with a table of different degrees of (non) open access to ETDs in IRs.

The context

Scientific grey literature stands for intellectual works not controlled by commercial publishers, of sufficient quality to be collected and preserved, but often difficult to obtain. The difficulty of acquisition and collection building was one of the main characteristics of grey literature in the past. The Web changed the situation. Dissemination of scientific information and access to the full text of all kinds of documents became easy. For grey literature, the Web was seen both as a solution and as the final destination. The idea was simple and convincing: increasing availability and accessibility would change the nature of grey literature and, in the end, make it disappear. Grey would turn into white (Artus 2003).

This belief was strongly supported by the success of the movement towards open access to scientific information (Suber 2012). The open access principle requires that scientific information be made widely and readily available to society (Willinsky 2005). Defined in 2003 as a “comprehensive source of human knowledge and cultural heritage that has been approved by the scientific community”1, open access implies that content be openly accessible and this needs the active commitment of each and every individual producer of scientific knowledge.

The reality is different. Not only can (and will) the definition of grey literature survive the Web and open access (Schöpfel 2010), but, contrary to all expectations and hopes, the Web sometimes increases barriers to scientific information. In spite of the growing success of the open access initiative, a significant part of scientific and technical information remains unavailable on the web or circulates with restrictions. Even in institutional repositories created to provide access to the scientific output of academic organizations, more or less important sectors of the scientific production are missing. The reasons are multiple: lack of awareness, embargoes, deposit of metadata without full text, confidential content, privacy concerns etc.

This problem concerns in particular electronic theses and dissertations (ETDs). Many are freely available, but others are under embargo or confidential, restricted to campus access (encrypted or not) or not available at all. While some professionals and scholars are increasingly concerned about the situation (Owen et al. 2009), others welcome the protection of copyright (Hawkins et al. 2013).

Our paper provides empirical evidence on restricted access to American and European ETDs, reviews some published explanations, and then proposes a conceptual model of independent variables affecting decisions on embargo and on-campus access, together with a table of different degrees of (non) open access to ETDs in institutional repositories (IRs).

The paper builds on a study conducted in Lille between January and April 2013 (Schöpfel & Prost 2013) and contributes to a French-German survey on ETD embargoes carried out by the Institute for Science Networking at the University of Oldenburg and the University of Lille 3.


A small but growing number of empirical studies on ETDs reveal figures on access restriction. A survey conducted in winter 2013 produced complementary figures from France, Europe and the United States. Table 1 presents figures from fifteen institutions and service providers, with the surveyed number of theses, the percentage of documents without access restriction, and the part of documents under embargo or restricted to campus-only access.

1 Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities of 22 October 2003


 Table 1. Empirical evidence on restricted access to electronic theses and dissertations (ETDs)


Taken together, about 10% of these roughly 550,000 electronic theses are not freely available on the Internet. Without the ProQuest figures, the share with limited access rises to 26%, ranging from 10% to more than 50%. 17% are embargoed for six months to two years or longer while the other 9% can only be accessed on campus. This panel may not be representative and the results should be interpreted with caution. Nevertheless, they indicate that the problem is not limited to one country or region but concerns all institutions with ETD infrastructures and IRs. Some examples2:

At Amherst College, Massachusetts, 32% of PhD theses cannot be accessed from outside the campus and 20% are under embargo for at least six months (Banach 2011). At the University of Maryland, 68% of ETDs are available without any restrictions. The other theses are under embargo, 21% for up to one year and 11% for one to six years (Owen et al. 2009).

In 2012, ProQuest Dissertation Publishing conducted a study of ten-year embargo trends (2000-2010) in the ProQuest Dissertations and Theses (PQDT) database. The surveyed corpus of 500,000+ print and electronic theses contained about 25,000 embargoed items (5%). Most of the embargoes are short-term, lasting from six months to five years, but a small share of theses are under permanent (long-term) embargo.

In Brazil, Pavani & Mazzeto (2009) describe access restrictions for 11% of ETDs on the campus of the Pontifícia Universidade Católica in Rio de Janeiro. About 21% of these files are under embargo for five years or longer.

The University of Liège (Belgium) document server lists 191 PhD theses for 2012. Of these, 108 are freely available on its IR, called ORBi (57%). For 33%, access is limited to the campus; the remaining 10% are embargoed for an unspecified period.

Since 2006, French universities have progressively switched from the traditional handling of print PhD theses to the new infrastructure of ETDs called STAR, linked to a national gateway “” run by ABES at Montpellier3. From 2006 to 2012, the STAR system processed 10,631 ETDs. 8,737 theses were available on the web without any restrictions (82%) while access to the other 1,894 theses was limited to on-campus availability (18%). STAR does not provide information about embargoes.

Another example from France: from 2008 to 2011, the University of Lille 1 processed 833 ETDs in Science and Technology. Nearly 80% are in open access on their IR. 15% are available on the campus only while the other 5% are under unlimited embargo, based on a decision of the faculty to protect intellectual property and innovation.
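Where the text reports absolute counts, the quoted shares can be recomputed directly. The following is a minimal sketch in Python that uses only the counts given in the examples above; the dictionary labels and the helper name `share` are illustrative assumptions, and the ProQuest counts are themselves approximate ("500,000+" and "about 25,000").

```python
# Counts exactly as reported in the text; no new data is introduced.
# The ProQuest figures are approximate in the source ("500,000+", "about 25,000").
reported_counts = {
    "ProQuest PQDT, embargoed": (25_000, 500_000),
    "Liege ORBi 2012, freely available": (108, 191),
    "STAR 2006-2012, open on the web": (8_737, 10_631),
    "STAR 2006-2012, on-campus only": (1_894, 10_631),
}


def share(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Map each label to its recomputed percentage.
shares = {
    label: share(part, total)
    for label, (part, total) in reported_counts.items()
}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Working from counts rather than from rounded percentages avoids compounding rounding errors when figures from several institutions are compared or aggregated.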

Only a few data on long-term trends have been published. Based on figures from ProQuest, Hawkins et al. (2013) identified an increasing number of embargoed ETDs. The findings by Owen et al. (2009) can be interpreted in the same way, especially for short-term one-year embargoes. On the other hand, the embargo statistics at West Virginia appear to be relatively stable over time (Hagen 2010), just like the figures between 2008 and 2011 from Lille.

2 For more details and examples, see Schöpfel & Prost (2013). 

3 Gateway to French theses at  run by the French Bibliographic Agency of Higher Education ABES


According to our review and survey data, experts and professionals explain the access restrictions in different ways, with arguments based on statistics, experience and anecdotal evidence. In a UK survey on mandates for ETDs, 88% of the universities indicated that they allow authors of theses to impose restrictions on access to their work, i.e. the electronic file, for many different reasons. Students, with the agreement of their supervisor, can request an embargo for the following reasons: commercial contract (for instance, funding by an external organisation), patent pending, ethical confidentiality and/or sensitive material (data protection), publication pending and third party copyright (Brown et al. 2010). The same study reveals that restrictions on grounds of third party copyright, data protection or potential risks to personal safety were reported only for ETDs (not for print copies) and that only 60% of the universities allow students to impose restrictions on print theses.

At Brunel University, “while every effort has been made to ensure that embargoing access to theses is not used as ‘a panacea against all ills’, students are offered the option of a 3-year embargo if they have a publication or patent pending” (Brown & Sadler 2010). Academics of the University of Maryland mention future publication, protection of data or work, student request, proprietary data and patent application as primary reasons for approving of embargoes (Owen et al. 2009).

In France, PhD theses are considered administrative documents and (except for confidential research) must be disseminated, at least on the campus (Schöpfel & Lipinski 2012). Yet, according to our survey at Valenciennes (France), PhD students sometimes appear confused by the embargo, confidentiality and on-campus options.

In Italy, Arabito et al. (2008) justify embargo options as indispensable for the same reasons: “(…) the free availability of doctoral theses on the web can be jeopardized by thorny copyright issues, which arise in the following cases: use of third party owned materials (…), third parties involved (possible infringement of privacy), patentable discoveries (…), and ongoing publication of data (according to the publisher policy)”.

This last argument – expected publication – is by far the most common reason and explains between 1/3 (Owen et al. 2009) and 3/4 (Pavani & Mazzeto 2009) of all embargo decisions. The role of faculty appears to be crucial. At Virginia Tech, nearly half of the students’ embargo decisions were taken on the advice of faculty while requests by publishers are insignificant (McMillan et al. 2012). Ramirez et al. (2013) confirm that “scholars continue to doubt the viability of publishing opportunities after a dissertation or thesis becomes available electronically in an open access repository. Perceptions and fear, not data, inform many graduate advisors’ and graduate students’ decisions to restrict access to their ETDs”.

Each graduate school has its own guidelines. A recent survey of more than 150 American graduate schools shows that nearly 30% of all institutions “either don’t allow an embargo at all, or don’t tell students (about it at all) where they can find that information readily (…) In their enthusiasm for OA, universities and libraries across the U.S. are cajoling, arm-twisting, or even coercing students into in effect surrendering the copyright to their dissertations and theses, sometimes with the threat that students cannot graduate if they disagree” (Hawkins et al. 2013).

Florida State University Graduate School implemented access restriction – on-campus only access – for older, digitized PhD theses: “Since retrospective digitized theses and dissertations did not include retrospective digitized access agreement forms, senior leadership recommended IP restriction for all FSU retrospective digitized theses and dissertations in 2009” (Smith 2009). Kleister et al. (2013) report how changing the embargo policy at the University of North Texas dramatically reduced the number of embargoed ETDs, from 80-100 to 20 or fewer per year. Asking for an embargo had always been possible, but the burden was on the PhD student to initiate the discussion. From the moment (2007) when this “burden” was replaced by a simple option on the agreement form (check boxes), the number of embargo decisions multiplied by more than five. Their conclusion is clear: “The needs of students must be balanced against the institution’s needs and goals. Justification for embargo should not be especially onerous, but needs to be more than a mere checkbox on a form…”

At West Virginia University, Hagen (2010) reports that for the period 1998-2010, 85% of the more than 4600 theses are disseminated without any restriction. The part of theses with restricted access decreased from 47% (1998-2000) to 15% in 2010, because the option of encrypted on-campus only access was phased out in 2009 while the part of embargoed ETDs remained stable.

Smith (2009) describes how the Florida State University Graduate School requested campus-community and PDF document security options starting in Fall 2008, in addition to the IP restriction recommended in 2009 for all retrospectively digitized theses and dissertations. According to the published figures, this share of restricted access can be estimated at about 16%.

Only three studies present detailed embargo statistics broken down by scientific discipline (Owen et al. 2009, Pavani & Mazzeto 2009, ProQuest 2012). Yet, these survey results are not really reliable. Some disciplines appear to be relatively consistent, such as life and chemical sciences, agriculture and environment, business, some domains of engineering (applied sciences) and public health, all with medium or high rates of embargoes. Pavani & Mazzeto (2009) show that in Science and Technology, pending publications as a reason for embargo mostly concern articles (73%) while in Social Sciences students intend above all to publish a book (57%). Yet, we must be careful with these statistics because the samples are rather small.

People, institutions, reasons and objectives

At first glance, the situation appears rather simple. PhD theses being intellectual work, the student is the only person holding the right to decide about dissemination. Of course, this view is far too simplistic. Different actors – people and institutions – can be distinguished who influence the process of decision-making to a greater or lesser degree, with different reasons, motivations and objectives. A non-exhaustive list may be helpful to distinguish the different participants in this decision-making process:

  • PhD student: may want to keep the rights to his/her intellectual work; receives advice or orders from the different actors of his/her scientific community.
  • Director of PhD thesis: concerned by quality and reputation, fear of plagiarism.
  • Jury: concerned by quality and reputation, protection of results.
  • Community (discipline, staff): supportive or indifferent attitudes towards open access.
  • Other PhD students: shared concerns about career, evaluation, and plagiarism…
  • Graduate school: favourable or indifferent towards open access.
  • University presidency (dean): supporting or not open access policy; concerned with third party rights (confidentiality, copyright infringement).
  • Academic library: often in favour of open access and running an institutional repository.
  • Service provider: supportive or indifferent towards open access.
  • Publishers: opposed or not to open access and publishing of OA theses.

Figure 1 tries to map these players in a system of decision-making of dissemination and access to ETDs. Each player sets his own goals, fulfils specific functions, plays his particular role, sometimes consistent with others, sometimes in opposition. For instance, PhD students may deposit their non-reviewed papers in open archives “off-campus”, outside of their institution and without any validation or authorization, even when the jury rejects the disclosure.

All these people, groups and institutions act in different ways, for different reasons, with different objectives and strategies. The literature review and survey results reveal the following components that may be understood as independent variables of the final decision:

  • A publishing project (article, book): if the student intends to publish his results with a scientific publishing house, he/she may be reluctant to disseminate the thesis on the Internet.
  • Individual knowledge (or ignorance) of publishers’ policies towards publishing papers that are already available on the Internet.
  • Individual attitude toward open access (awareness, ethics, risk avoidance).
  • Institutional decision on confidentiality and dissemination.
  • Legal environment (copyright, intellectual property, disclosure of PhD theses).
  • Institutional open access policy (awareness, risk avoidance).
  • Institutional workflow of processing ETDs (reference points, opt-ins or opt-outs, easiness).
  • Protection of third parties’ rights (intellectual property, confidentiality, privacy).
  • The jury members’ advice (quality and excellence, awareness of open access).
  • Tradition and attitudes of the scientific community.
  • Publishers’ acceptance of open access papers: If the publisher does not accept papers or books based on theses openly available on the Internet, his attitude may foster decisions in favor of embargoes.

Each of these aspects acts in a different way. Some elements may decide on dissemination or non-dissemination, while others are limited to embargo or on/off-campus decisions. Moreover, some are case-by-case decisions while others reflect general attitudes and stable behaviours. Again, a schema may be helpful for global understanding (figure 2):

This model may need empirical confirmation and perhaps more detail. Yet, its central characteristic is the multi-factorial or multivariate approach to the prediction of decisions on dissemination or concealment of ETDs. Even if individual publishing strategies and attitudes towards open access may play a major role, other variables such as personal advice from the PhD director, ease of decision and references should not be neglected, in particular when discussing ways of improving accessibility and availability of PhD theses.

A model of openness

With regard to accessibility and availability of PhD theses, our analysis has shown so far that openness is not a simple, binary concept but that the documents can be more or less open, depending on different variables. Some of those variables are similar to those for articles published in journals or books, but others are specific to PhD theses. In October 2012, the Scholarly Publishing and Academic Resources Coalition (SPARC), “an international alliance of academic and research libraries working to create a more open system of scholarly communication”4, released a guide called “How Open Is It” that outlines the core components of open access (e.g., reader rights, reuse rights, copyrights, author posting rights, etc.) across the continuum from “open access” to “restricted access”. Compared with our multivariate approach, this Open Access Spectrum (SPARC 2012) helps to get a realistic view on the problems of openness, disclosure and concealment of theses. Table 2 shows a possible adaptation of the SPARC guide to the specific conditions of the dissemination of PhD theses.

Reader rights: On site only, or also at distance, via authentication? What about interlibrary loan or document delivery?

Reuse rights: Generous reuse rights (CC-BY licensing) or full copyright protection?

Copyrights: No third party claim or complete concealment (confidentiality) because of sensitive or protected results?

Institution rights: In France, by decree, the PhD theses must be disclosed, at least on the campus, except for confidential projects.

Institution policy: In fact, at least two different levels must be distinguished, the global policy of the university (or faculty/department), and the approach of the jury which may, in some cases and sometimes for political reasons, reject open access disclosure via institutional repository even for non-confidential theses.

Posting workflow: Following empirical studies on changes in ETD workflows, we adapt the SPARC component “Automatic posting” to the specific ETD environment. The continuum between open and closed ranges from procedures without embargo options, i.e. where an embargo decision needs a specific individual action (written and argued request), to workflows where open access is available only as an opt-in option while on-campus dissemination is the default option.

Machine readability: The last component is about automatic access and exploitation of the full text and the related data and metadata. Exploitation means: text or data mining, harvesting, or crawling. Our table summarizes the SPARC scale but specifies the existence of supplementary data files (tables, videos, images etc.) that may have been submitted together with the thesis.
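The reader-rights end of this continuum can be illustrated with a small data model. The sketch below is only an illustration under stated assumptions: the names `Access`, `Thesis` and `can_read` are hypothetical and come neither from the SPARC guide nor from any repository software, and the rule that an active embargo blocks all readers, including those on campus, is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    """Hypothetical access levels, named after the statuses discussed in the text."""
    OPEN = "open"
    CAMPUS_ONLY = "campus_only"
    CONFIDENTIAL = "confidential"


@dataclass
class Thesis:
    title: str
    access: Access
    embargo_until: Optional[date] = None  # None means no embargo


def can_read(thesis: Thesis, on_campus: bool, today: date) -> bool:
    """Return True if a reader may open the full text today."""
    # Confidential theses are not disseminated at all.
    if thesis.access is Access.CONFIDENTIAL:
        return False
    # Assumption: an active embargo blocks every reader, on or off campus.
    if thesis.embargo_until is not None and today < thesis.embargo_until:
        return False
    # Campus-only theses require the reader to be on campus.
    if thesis.access is Access.CAMPUS_ONLY:
        return on_campus
    return True


# Usage: an open thesis under embargo until 1 January 2014.
embargoed = Thesis("Example thesis", Access.OPEN, embargo_until=date(2014, 1, 1))
print(can_read(embargoed, on_campus=True, today=date(2013, 4, 1)))
```

Keeping the embargo date separate from the access level reflects the observation that embargo and on-campus restriction are distinct decisions that may be combined.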

Back to grey?

A good idea does not necessarily guarantee success. The Internet is not synonymous with openness, and the creation of institutional repositories and ETD workflows does not make all items more accessible and available. Sometimes, the new infrastructure even appears to increase barriers to PhD theses.

Different reasons contribute to this unexpected (and most often unwanted) development, and in a certain way, new technologies and digital infrastructures trigger the tendency towards access restrictions. In our first paper, we discussed empirical data in terms of ethics, law, legitimate interests and policy, trade secrets, individual and institutional strategies and workflow-biased decision-making. Our present communication adds a conceptual framework and a differential description of the specific conditions of this part of scientific communication.

Open access is without doubt a valuable and important goal for scientific communication. Yet, scientific and technical information, considered as a part of research behaviour and an object of strategic decisions (Roosendaal et al., 2010), has always included decisions on concealment and elements of secrecy. Together with copyright and technologies, these individual and institutional decisions contribute today to an unsatisfying and inefficient situation where one part of digital PhD theses is easy to find and to obtain while others remain hidden, embargoed and/or limited to on-campus access. As for open access and institutional repositories in general, one part of the research community is (so far) indifferent or hostile to unprotected dissemination of theses. From the moment the decision on dissemination of ETDs moves from the institution to the individual author, we have to deal with these attitudes and opinions.

Openness is not enough for scientific communication. The Internet does not change grey literature into white in a mechanical way. Without a minimum of quality and standardization (Dobratz & Scholze 2006), without metadata, referencing, long-term preservation, discovery tools etc. (Schöpfel et al. 2011), and without raising awareness, thorough decision aids and redesigned workflows, perhaps even changes in the legal status of theses, institutional repositories will only provide a partial answer to the question of grey literature in the digital environment. So, back to grey?




Disclosure and concealment of electronic theses and dissertations

The open access principle requires that scientific information be readily available to a wide range of users. In 2003, open access was defined as “a comprehensive source of human knowledge and cultural heritage that has been approved by the scientific community”. The definition implies that the content of scientific works should be publicly accessible, which requires the active commitment of each individual producer of scientific knowledge. Despite the growing success of the open access initiative, a significant part of scientific and technical information remains unavailable on the web or is accessible only with restrictions. Even in institutional repositories created to provide access to the scientific output of academic institutions, more or less important sectors of scientific research are missing. The reasons are lack of awareness, embargoes, deposit of metadata without full text, confidential content and the like.

This problem concerns in particular electronic theses and dissertations, which are disseminated under different statuses: some are freely available, others are under embargo, confidential, restricted in access or not available at all. While other publications may be available through alternative channels (journals, monographs etc.), electronic theses and dissertations often are not.

The paper describes a new and unexpected effect of the development of digital libraries and open access, which manifests itself as a paradoxical practice of hiding information from the scientific community and society while making it only partly available to a restricted academic community on campus. The study draws on a review of recently published papers on electronic theses and dissertations in institutional repositories and evaluates the availability of these works in selected European and American institutional repositories and networks. It provides empirical evidence of the real existence of restricted access and proposes a model of independent variables affecting decisions on embargo and on-campus access. The study includes a table showing different degrees of (non) open access to theses in institutional repositories.