Digital Preservation - Preservation Issues
"Digital materials are especially vulnerable to loss and destruction because they are stored on fragile magnetic and optical media that deteriorate rapidly and that can fail suddenly from exposure to heat, humidity, airborne contaminants, or faulty reading and writing devices." (Hedstrom and Montgomery 1998)
Digital media are subject to destruction and deterioration in new ways, though unintended loss can be avoided if procedures are adapted to the needs of the technology. Precautions which significantly reduce the danger of loss include:
- Storing in a stable, controlled environment.
- Implementing regular refreshment cycles to copy onto newer media.
- Making preservation copies (assuming licensing/copyright permission).
- Implementing appropriate handling procedures.
- Transferring to "standard" storage media.
However, while the media on which the information is stored may or may not fail, what is certain is that technology will change rapidly, so that even if the media are retained in pristine condition, it may still not be possible to access the information they contain. No matter how exemplary the care of the media is, it will not remove the requirement to deal with changes in technology, though responsible care should make it easier to manage technology changes.
Changes in technology
"Unlike the situation that applies to books, digital archiving requires relatively frequent investments to overcome rapid obsolescence introduced by galloping technological change." (Feeney 1999)
Because digital material is machine dependent, it is not possible to access the information unless there is appropriate hardware, and associated software which will make it intelligible. Technological advances in the past decade alone illustrate this point:
- 5¼ inch floppy disks have been superseded by 3½ inch floppy disks;
- There have been several upgrades to Windows software since it was first introduced and it would now be very difficult to convert from earlier versions to the current versions;
- Thousands of software programs common in the early 1990s are now extinct and unavailable.
The certainty that there will be frequent technological change poses a major challenge and it is therefore not surprising that collection managers quoted in the RLG survey cited technological obsolescence as the greatest threat to successful digital preservation. Precautions can, and should, be taken which will greatly reduce the risk of inadvertently losing access to a resource because of changes in technology. These include:
- Using standard file and media formats, as recommended by reputable sources.
- Providing detailed documentation to enable context to be determined and to facilitate successful management. (Guides to good practice are available; some of these are provided in Metadata and Documentation.)
Authenticity and context
"At each stage of the cycle, electronic records need to be actively managed according to established procedures, to ensure that they retain qualities of integrity, authenticity and reliability." (PRO 1999)
While it is technically feasible to alter records in a paper environment, the relative ease with which this can be achieved in the digital environment, either deliberately or inadvertently, has given this issue more pressing urgency. The Public Record Office (PRO) mandates mechanisms in accordance with BSI/PD0008 for its own records (BSI 1999). The PRO draft strategy suggests that the best way to achieve authenticity is through a combination of proper processes (as outlined in their guidance documents) with data integrity mechanisms such as MD5 signatures generated at the time of ingest and the establishment of audit trails for all actions. Duranti makes a useful distinction between authentication (the means used to prove that a record is what it purports to be at a given time) and authenticity (a concept already familiar in archival science and which refers to the quality of the record itself and its essential contextual information). Records management systems need to be able to link to essential contextual information regarding the business procedures of the creating agency. Authenticity and integrity of digital resources can be equally important in other sectors. For example, scholars will need to feel confident that references they cite will stay the same over time, courts of law will need to be assured that material can withstand legal evidential requirements, government departments may well have legally enforceable requirements regarding authenticity, and so on. This issue overlaps with both legal and organisational issues and it may be one which is best resolved within individual sectors rather than through generic procedures.
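The combination of an ingest-time signature and an audit trail mentioned above might be sketched as follows. This is only an illustration under stated assumptions, not the PRO's own procedure: the function names, the JSON-lines log layout and the action labels are all hypothetical. MD5 is used because the text names it; it detects accidental change but is not resistant to deliberate collisions, so a stronger digest could be substituted.

```python
import hashlib
import json
from datetime import datetime, timezone


def md5_digest(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_event(audit_log, action, path, digest):
    """Append one JSON line describing an action to the audit trail.

    The log layout is an assumption made for this sketch.
    """
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "file": str(path),
        "md5": digest,
    }
    with open(audit_log, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def ingest(path, audit_log):
    """Compute the signature at the time of ingest and log it."""
    digest = md5_digest(path)
    record_event(audit_log, "ingest", path, digest)
    return digest


def verify(path, expected_digest, audit_log):
    """Recompute the signature, log the check, and report whether it matches."""
    current = md5_digest(path)
    matches = current == expected_digest
    record_event(audit_log, "verify-ok" if matches else "verify-failed", path, current)
    return matches
```

A repository could call `ingest` once when a record is accepted, store the returned digest with the record's documentation, and call `verify` at each later action. Note that this addresses authentication in Duranti's sense only; it says nothing about the record's contextual authenticity.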
Although computer storage is increasing in scale and its relative cost is decreasing constantly, the quantity of data and our ability to capture it with relative ease still matches or exceeds it in a number of areas. Some repositories still face significant challenges in developing and maintaining scaleable architectures and procedures to handle huge quantities of data generated from sources such as satellites or the web. The technical and managerial challenges in accessioning, managing and providing access to digital materials on this scale should not be underestimated.
"Three... approaches to digital preservation have been developed:
- Preserve the original software (and possible hardware) that was used to create and access the information. This is known as the technology preservation strategy. It also involves preserving both the original operating system and hardware on which to run it.
- Program future powerful computer systems to emulate older, obsolete computer platforms and operating systems as required. This is the technology emulation strategy.
- Ensure that the digital information is re-encoded in new formats before the old format becomes obsolete. This is the digital information migration strategy." (Feeney 1999)
Strategies for some formats are well established and tested over time. For example, migration has been used for electronic text, image, and database applications by the computing industry and a number of data archives and centres for decades.
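As a small illustration of migration for electronic text, the sketch below re-encodes a text file from a legacy character encoding into a newer one and checks that no characters were lost. The function name and the default encodings (Latin-1 to UTF-8) are assumptions chosen for the example; migrating images or databases involves considerably more than this.

```python
from pathlib import Path


def migrate_text(source, target, source_encoding="latin-1", target_encoding="utf-8"):
    """Re-encode a text file and confirm the characters survived the move.

    Returns the number of characters migrated. The original file is left
    untouched so that it can be retained alongside the migrated copy.
    """
    # Work with bytes so that line endings are not silently translated
    text = Path(source).read_bytes().decode(source_encoding)
    Path(target).write_bytes(text.encode(target_encoding))

    # Read the new file back and compare characters rather than bytes,
    # since the byte sequences are expected to differ between encodings
    migrated = Path(target).read_bytes().decode(target_encoding)
    if migrated != text:
        raise ValueError("migrated text differs from the source")
    return len(text)
```

The check after writing is the important part of the design: a migration that cannot be shown to have preserved the content gives little assurance for long-term access.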
However, all three of the current strategies have potential drawbacks in some circumstances.
The need for further research has been recognised and appropriate strategies are being tested but technology will continue to evolve and will continue to raise new issues. It may well be that there will never be a single definitive strategy and a range of strategies appropriate to different categories of digital materials may need to be employed. In this way a parallel can be drawn with the paper environment which also utilises a range of preservation strategies (de-acidification, microfilming, appropriate storage and handling etc.). The major difference, and the major cause of concern in the digital environment, is that failure to address the long-term access requirements of digital materials at a very much earlier stage than for paper materials will almost inevitably result in their permanent loss.
See Storage and Preservation for more detailed discussion of digital preservation strategies.
Note: This section has been updated by Deborah Woodyard Robinson, [March 17 2006]
While technological issues are undeniably challenging, there are also numerous challenges which relate to the ability of organisations to integrate the management of digital materials into their organisational structure. In addition, there is an increasing need to go beyond the confines of individual organisations, or even countries, to maximise the benefits of the technology, address issues such as copyright, and also to overcome the challenges cost-effectively. Most organisations readily acknowledge the benefits of increased collaboration but also indicate the difficulties of what one case study interviewee described as "differing agendas and timescales", not to mention different funding mechanisms. The following issues are being faced, and in many cases systematically addressed, by organisations world-wide.
The cost of digital preservation cannot be easily isolated from other organisational expenses, nor should it be. As discussed in Strategic Overview, digital preservation is essentially about preserving access over time and therefore the costs for all parts of the digital life cycle are relevant. In that context even the costs of creating digital materials are integral in so far as they need to include cost elements which will ultimately facilitate their long-term preservation.
Another example of the overlap of Organisational Issues is discussed in the Expertise section. The ability to employ and develop staff with appropriate skills is made more difficult by the speed of technological change and the range of skills needed. It is also limited by resource constraints on organisations which may well need to retain the same level of ongoing commitment to and management of traditional collections and may need to integrate commitment to digital collections without additional resources.
Nonetheless the exercise of calculating costs, however complex, is a valuable and necessary task to establish cost-effective practices and a reliable business model. The cost of the labour required for digital preservation will be the most significant by far and includes not only dedicated experts but varying proportions of many staff such as administration, management, IT support, legal advisers etc.
Other major issues affecting costs include organisational mission and goals, including the type and size of collections, the level of preservation committed to, the quantity and level of access required, and the time frame proposed for action. These are discussed in detail in the section on Costs and Business Modelling. The relationship between costs and institutional strategies such as collaboration, third party services, rights management, training and standards is also discussed in the previous sections.
An approach to developing a successful business model which builds incrementally on:
- experience within the institution;
- collaboration with others who are confronting the same challenges;
- development of shared tools and services;
will all combine to reduce risk and help develop effective strategies and practices as well as contribute to driving down costs.
See also Collaboration for further discussion on the advantages of co-operation between creating and archiving organisations to reduce costs.
See also Creating Digital Material for references to models for specific aspects of costs, such as digitisation, and maintaining digital archives. Remember, however, that while there is a large amount of project-related data on costs, it may or may not have any bearing on the costs of managing digital materials long-term. The further reading and exemplars all support the view that costs for maintaining the digital copy also need to be considered from the beginning, whether those materials are produced as a result of digitising analogue materials or whether they are "born digital".
"The need for digital preservation expertise is high: asked to rate staff as expert, intermediate, or novice, only 8 of the 54 institutions considered their staff at the expert level." (Hedstrom and Montgomery 1998)
The dramatic speed of technological change means that few organisations have been able even fully to articulate what their needs are in this area, much less employ or develop staff with appropriate skills. In addition, there is little in the way of appropriate training and "learning by doing" can often be the most practical interim measure. The DLM Forum (see note) has been undertaking work in this area for records management and has made significant progress in terms of identifying a set of six core competencies which have been used as the basis for developing training programmes for records managers.
It will take time for these developments to filter through to the workplace, and in the meantime, organisations and professional organisations need to ensure their existing staff and members can develop, and continue to develop, the range of competencies they need to manage the digital materials in their care. In addition, continuous professional development will be at least as necessary for dealing with digital materials as it is for other developmental needs. Case study interviewees have stressed the need for focussed, tailor-made courses to meet their specific requirements. This handbook aims to help fill a gap between current needs and existing training courses by providing guidance and tools which can be used by individuals, institutions and trainers to meet current needs.
"In addition to redefining responsibilities of organisations, it may be necessary to redefine roles within organisations to ensure long-term access to digital information." (PADI)
The nature of the technology and dependencies in the preservation of digital materials are such that there are implications for organisational structures. Organisational structures tend to be segregated into discrete elements for the efficient processing of traditional collections, but will need to cross boundaries in order to draw on the full range of skills and expertise required for digital materials. Many of the activities converge, for example decisions about acquisition and preservation should sensibly be made at the same time. Even with clearly articulated policies in place, this is likely to place strains on resources which may be seen to be competing, at least in the interim. In the absence of strong policy development, it will be impossible to develop effective strategies for managing digital materials. In a worst case scenario, it may even result in a situation in which the management of both traditional and digital materials is placed at risk.
"Although there is continuity of purpose and value within cultural institutions, these exist alongside a fundamental examination of roles and practices." (Dempsey 1999)
There are some existing repositories which undertake responsibility for specific subject areas or specific formats. In the UK, for example, the Arts and Humanities Data Service and the Data Archive undertake responsibility for social science and humanities research data, while the National Sound Archive assumes responsibility for its collection of sound recordings. In addition, there is work going on in other countries to establish national co-operative models for digital preservation. Examples of these can be found in Collaboration. In time, it is expected that these efforts in individual countries will crystallise into clearly defined roles and responsibilities where it is as obvious which institution is likely to be the major preserver of specific digital materials as it is for non-digital materials. Despite these encouraging developments, at the present time the question of who should be responsible for ensuring long-term preservation is by no means as established in the digital environment as it is in the analogue environment.
Even when it has been determined which organisations will undertake to act on their long-term digital preservation responsibilities, the environment will demand far greater engagement with a much larger group of stakeholders than has previously been the case. Some will inevitably choose to contract out all or part of their digital preservation responsibilities to a third party provider. The lifecycle approach advocated by Beagrie and Greenstein has significant implications for the way organisations responsible for long-term preservation need to interact and collaborate with data producers and publishers and each other.
Roles are also changing within as well as between institutions. Assigning responsibility for preservation of digital materials acquired and/or created by an organisation will inevitably require personnel from different parts of the organisation to work together. This can potentially present difficulties unless underpinned by a strong corporate vision which can be communicated to staff. Similarly, staff working in an increasingly electronic environment need to modify their role to reflect the different demands of the technology.
Finally, creators of digital materials need to be able to understand the implications of their actions in terms of the medium to long-term viability of the digital material they create. Whether it be a record created during the day-to-day business of the department, a digital copy of analogue collection material, or a "born digital" resource, guidance and support as well as an appropriate technical and organisational infrastructure will greatly improve the prospects for efficient management and preservation.
"In the network environment, any individual with access to the Internet can be a publisher and the network publishing process does not always provide the initial screening and selection at the manuscript stage on which libraries have traditionally relied in the print environment." (National Library of Canada 1998)
The enormous quantity of information being produced digitally, its variable quality, and the resource constraints on those taking responsibility to preserve long-term access make selectivity inevitable if the objective is to preserve ongoing access. In the digital environment, it is possible to by-pass the traditional distribution channels, as well as filtering and quality control processes. While there are benefits for users in terms of swift access, there are also difficulties in terms of quality control. Selecting quality materials for long-term retention therefore places a burden on organisations in terms of resources and also in terms of the potential impact of selection.
With traditional collections, lack of selection for preservation may not necessarily mean that the item will be lost, allowing for a comfort zone of potential changes in criteria for selection at a later stage. No such comfort zone exists in the digital environment where non-selection for preservation will almost certainly mean loss of the item, even if it is subsequently considered to be worthwhile.
In cases where there may be multiple versions, decisions must be made in selecting which version is the best one for preservation, or whether more than one should be selected. Sampling dynamic resources, as opposed to attempting to save each change, may be the only practical option but may have severe repercussions if the sampling is not undertaken within a well-defined framework and with due regard to the anticipated contemporary and future needs of the users.
Some consideration also needs to be given in the selection to the level of redundancy needed to ensure digital preservation. A level of redundancy with multiple copies held in different repositories is inherent in traditional print materials and has contributed to their preservation over centuries. Although in a digital environment a single institution can provide world-wide access and accept preservation responsibility, it remains an issue of concern to many that a level of redundancy should exist in the digital environment. Such concerns need to be balanced against the potential cost in duplication of effort. Either scenario points to a greater level of overt collaboration in selection between institutions to preserve electronic publications. In any scenario, it will be critical to establish sustainability and unequivocal acceptance of responsibility to avoid the danger of losing access over time. There still needs to be assurance that preservation responsibility will be undertaken, and a clear understanding of who will undertake that responsibility and for what period of time. Otherwise, even if several copies are stored in various repositories, there can be no guarantee that those repositories will not all, for a variety of reasons, cease maintenance of the digital object at some point.
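Where redundant copies are held in several repositories, one way of confirming that they are still being maintained is to compare them periodically. The sketch below is a hypothetical illustration: the function names, the repository labels and the choice of SHA-256 as the digest are assumptions, and it presumes that each copy can be read as a local file.

```python
import hashlib
from collections import Counter


def file_digest(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copies(copies):
    """Compare redundant copies of one digital object.

    `copies` maps a repository label to the path of its copy. Returns the
    digest shared by the largest number of readable copies, and a sorted
    list of labels whose copy is missing, unreadable, or different.
    """
    digests = {}
    for label, path in copies.items():
        try:
            digests[label] = file_digest(path)
        except OSError:
            digests[label] = None  # copy is missing or cannot be read

    counts = Counter(d for d in digests.values() if d is not None)
    if not counts:
        return None, sorted(digests)  # no copy could be read at all

    reference, _ = counts.most_common(1)[0]
    suspect = sorted(label for label, d in digests.items() if d != reference)
    return reference, suspect
```

Agreement between a majority of copies shows only that they match one another, not that they are authentic, and a check of this kind does not replace a clear agreement on who holds preservation responsibility and for how long.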
Finally, in all successful preservation strategies it may well be necessary to repeat steps in the selection process, with appropriate documentation, as part of the long-term cycle of actions to maintain access in new technological environments.
See also Appraisal and Selection
"Compounding the technical challenges of migrating digital information is the problem of managing the process in a legal and organizational environment that is in flux as it moves to accommodate rapidly changing digital technologies." (Waters and Garrett 1996)
This section provides an overview of legal issues involved in digital preservation. As such it does not attempt to provide guidance on general legal issues which impact on the operations of libraries, archives and other repositories, as these are covered in a number of other reference works. It is written from a UK perspective and legislation in this area will vary from country to country. It is also an area, even in the UK, where forthcoming legislation such as the draft EU Copyright Directive may have a substantial future impact. Please note this section does not constitute legal advice. This is a complex and rapidly changing area and readers must seek legal advice for their specific circumstances and national legal frameworks.
Further information on implementation and further reading is listed in Rights Management.
Intellectual property rights (IPR) and preservation:
Copyright and other intellectual property rights (IPR) such as moral rights have a substantial impact on digital preservation. As outlined in Technological Issues, the preservation of digital materials is dependent on a range of strategies, which has implications for IPR in those materials. The IPR issues in digital materials are arguably more complex and significant than for traditional media and if not addressed can impede or even prevent preservation activities. Consideration may need to be given not only to content but to any associated software. Simply copying (refreshing) digital materials onto another medium, encapsulating content and software for emulation, or migrating content to new hardware and software, all involve activities which can infringe IPR unless statutory exemptions exist or specific permissions have been obtained from rights holders.
As both migration and emulation will involve manipulation and changing presentation and functionality to some degree (especially over any period of time) important issues of principle and practice are raised in negotiations. It is important to establish a dialogue with rights holders so that they are fully aware of these issues and the actions and rights required to ensure the preservation of selected items.
What is different about IPR and electronic materials?
Traditional materials are relatively stable and well established legal and organisational frameworks for preservation are in place. This is not the case for electronic materials. Digital materials need consideration of both content and also hardware and software, and require very different methods of preservation. In addition in the UK there are currently no similar legal provisions for prescribed libraries and archives permitting preservation activities on electronic items in their permanent collections: the necessary permissions must be obtained from copyright holders.
The duration of IPR in electronic materials will often extend well beyond commercial interests in them and the technology which was used to generate them. Long-term preservation and access may require migration of the material into new forms or emulation of the original operating environment, all of which may be impossible without appropriate legal permissions from the original rights owners of the content and underlying software.
Legal deposit of electronic publications
The position on legal deposit of electronic publications in the UK is different from that of print publications (the current Act refers only and specifically to print). Voluntary deposit arrangements were introduced in January 2000. Statutory provision for legal deposit of electronic publications may follow within two years. However, voluntary arrangements need to be negotiated on a case by case basis.
Other statutory requirements
Other statutory requirements may also apply and influence preservation of digital resources. The requirements of the Public Records Act will apply to government records including those in electronic form. Statutory retention periods will apply to many electronic records (e.g. for accounting and tax purposes). Although these are often of limited duration, it is notable that requirements for retention of electronic records in some sectors (e.g. the pharmaceutical industry), are of increasingly long duration. In such cases long-term preservation strategies will apply as technological change will almost certainly have affected access to such records.
Access and security
Some of the additional complexity in IPR issues relates to the fact that electronic materials are also easily copied and re-distributed. Rights holders are therefore particularly concerned with controlling access and potential infringements of copyright. Technology developed to address these concerns and provide copyright measures can also inhibit or prevent actions needed for preservation. These concerns over access and infringement and preservation need to be understood by organisations preserving digital materials and addressed by both parties in negotiating rights and procedures for preservation.
Business models and licensing
Consideration of the business models for dissemination of electronic materials and the range of stakeholders also impacts on IPR and preservation. In most cases electronic publications (particularly electronic journals) are not physically owned by the subscribers, who license access from the publisher. Subscribers are therefore concerned that publishers consider the archiving and preservation of these works and include archiving and perpetual access to back issues in licensing of these works.
Stakeholders, contract and grant conditions, and moral rights
Electronic materials are the result of substantial financial investment by public funds (e.g. research councils) and/or publishers and intellectual investment by individual scholars and authors. Each of these stakeholders may have an interest in preservation; the archiving organisation will need to acquire permissions from them to safeguard and maximise the financial investment or the intellectual and cultural value of the work for future generations. Such interests may be manifested through contract, licence, and grant conditions or through statutory provision such as "moral rights" for the authors.
Privacy and confidentiality
Information held within the repository may be subject to the Data Protection Act or similar privacy legislation protecting information held on individuals. Information may also be subject to confidentiality agreements. Privacy and confidentiality concerns may impact on how digital materials can be managed within the repository or by third parties, and made accessible for use.
Investment in deposited materials by the repository
Holders of the material over many decades will almost certainly need to invest resources to generate revised documentation and metadata and generate new forms of the material if access is to be maintained. Additional IPR issues in this new investment need to be anticipated and future re-use of such materials considered.
Where a depositor or licensor retains the right to withdraw materials from the archive and significant investment could be anticipated in these materials over time by the holding institution, withdrawal fees to compensate for any investment may be built into deposit agreements.