Die europäische Variante des Internet Archive:
Vision
Based in Amsterdam and Paris, the European Archive is a non-profit foundation working towards universal access to all knowledge. The archive will achieve this through partnerships with libraries, museums, other collection bodies, and through building its own collections. The primary goal of collecting this knowledge is to make it as publicly accessible as possible, via the Internet and other means.
Rationale
With the Internet, tremendously rich parts of our cultural heritage could be freely accessible online. But most of them are still dormant (non-digitized collections) or disappearing (Web history).
Massive digital collecting, digitizing, and storage techniques make it possible to preserve and give public access to this rich material. Mastering these techniques will be key in the coming years for the future of cultural heritage, both traditional and materials produced in digital form.
Europe, cradle of a unique cultural heritage has a special role to play to this regard. But even in a connected world, propinquity, specific legal and technical environment should not be underestimated and having a European-based institution in this domain makes a difference.
By developing a large-scale archiving architecture in Europe and competences that come with it, the European Archive intends to be a catalyser in the development of skills and know-how in the domain of preservation and access to digital collections. It also intend to bring in Europe a new type of Cultural Institution that focus to free public access to large rich digital collections.
Web archiving
As the web has grown in importance as a publishing medium, we are behind in bringing into operation the archiving and library services that will provide enduring access to many important resources. Where some assumed web site owners would archive their own materials, this has not generally been the case. If properly archived, the Web history can provide a tremendous base for time-based analysis of the content, the topology including emerging communities and topics, trends analysis etc. as well as an invaluable source of information for the future.
The foremost effort to archive the Web has been carried on in the US by the Internet Archive, a non-profit foundation based in San Francisco. Every two months, large snapshots of the surface of the web are archived by the Internet Archive since 1996.
This entire collection offers 500 terabytes of data of major significance in all domain that have been impacted by the development of the Internet, that is, almost all. This represent large amount of data (petabytes in the coming years) to crawl, organize and give access to.
By partnering with the Internet Archive, the European Archive is laying down the foundation of a global Web archive based in Europe.
Digitization
We have entered an era where digitization of all significant cultural artefacts will be completed. Within the next decade, most of the published cultural content (books, music, images and moving images) will have been digitized. Recent commercial announcements have fostered awareness and started this movement, but limited to a few major libraries which leaves an opportunity for an open system to be pursued. This entails digitizing, preserving and providing access to the rich public domain of books, music, images and moving images on a the large scale.
By fostering the development of a large scale, archiving platform, the European Archive intend to facilitate the mastering of processes and tools needed for digital public content archiving and distribution in Europe.
Infrastructure
With the technical support of the Internet Archive and XS4ALL, the European Archive has installed a repository with 250 terabytes (250 000 giga-bytes) capacity in Amsterdam via which a large collection digital material (text, music, moving images, software) can be accessed.
On average, the download rate has been over 350 Megabits per second in December 2005, already making EA a significant content provider in Europe.
The data organization is highly distributed (200 nodes on a cluster) to enable distributed processing.
This achievement represents already a significant step in establishing in Europe an archiving infrastructure to collect and archive digital material at large scale. We plan to extent it to 1 petabyte (1000 terabytes) within the coming years.
The European Archive should be accessible in Europe’s language. A multilingual web management system for large digital collections has been implemented. This system is based on a flat structure permanently indexed and updated. It allows light and flexible management of the web interface to collections, and can scale up unlimitedly in the future.
An opportunity for Europe
We expect the European Archive to become an essential piece in the European cultural heritage landscape.
In order to meet the goals of Lisbon 2010 for Europe to become “the most dynamic knowledge economy in the world” large-scale public archive is a key component.
It will enable public and free access to large portion of European cultural heritage and bring in the broadband network infrastructure tremendous quantity of rich and legal content.
It will bring to traditional heritage institutions a technology partner enabling them to make significant steps towards digitization and public accessibility of their collections, making Europe more visible and attractive in a globally networked world.
By developing cutting hedge technology application in the domain of massive digital collection acquisition, management and storage it will develop a centre of excellence in a key domain of tomorrow’s Internet.