LIS-S 686 Web Archiving and Preservation
3 credits
- Prerequisite(s): LIS-S 500, LIS-S 507, and LIS-S 503 OR LIS-S 581
- Delivery: Online
- Semesters offered: Spring (Check the schedule to confirm.)
Description
To comprehensively represent records created in the 21st century, select websites and other web-based resources should be captured, stored, managed, described, and made accessible as appropriate. Content available primarily or solely online is among the most at-risk born-digital material. Materials on the web tend to be ephemeral and subject to loss for reasons including intentional removal, technical errors, the evolution of software, and website redesigns, any of which can result in broken links and/or significant alteration of the look and feel of a site. Web archiving is an evolving practice that presents many opportunities and challenges. In this course we’ll cover core web archiving concepts and discuss how they apply in libraries, archives, and museums. We will start with fundamental concepts in web archiving, including defining terms, exploring ethical considerations, and understanding the main components of web archiving workflows. We will then cover how to plan (scope) a web collection, build it, and share what has been collected. Students will learn about collecting models for web resources and the context in which this work is being done, including lessons learned via case studies. The course will feature hands-on work with the leading suite of web archiving tools (Archive-It) and explore other approaches to smaller-scale collecting.
Program Learning Goals Supported
Instructors map their courses to specific LIS Program Goals. Mapped program goals drive the design of each course and what students can expect to generally learn.
- Connect Core Values and Professional Ethics to Practice
- Curate Collections for Designated Communities
Learning Outcomes
Instructors develop learning outcomes for their courses. Students can expect to be able to achieve the learning outcomes for a given course after successfully completing the course.
- Assess and analyze the characteristics of the web for archiving and preservation.
- Evaluate the potential of the web as an information and preservation medium.
- Analyze the challenges of acquiring, downloading, storing, and providing access to web-based content.
- Analyze ethical, legal and policy constraints on web archiving.
- Evaluate existing tools for web archiving and preservation.
- Create a focused archive of web content from an understanding of existing standards and best practices for sustainability of archived web content.
Course Overview
Instruction is in Canvas. Lessons are organized into Modules whose length may vary.
Module 1: Overview of the course; Introduction to web archiving
- Recorded webinar “Introduction to Web Archiving”: a high-level overview of core web archiving concepts. Viewers will learn about collecting models for web resources and the context in which this work is being done. Key takeaways will include how web archives can be part of a larger collecting program, as well as how web archives are so much more than a collection of static screenshots.
Module 2: Basic concepts in Web Archiving
- Following up on core concepts in web archiving
- Review recordings & course materials from a short professional development course on web archiving (developed by the instructor)
Module 3: Components and assumptions of web archiving
- Assumptions of web archiving, including some assumptions that might not match the realities of how people create, manage and use websites today
- Where does social media fit into all this?
- Understanding basic concepts in web design and characteristics of technologies used on the ‘live web’ so you can save an accurate representation of web content
Module 4: Web Archiving Tools and Associated Assumptions (part 2)
- Introduction to Archive-It (software) and the multi-faceted work of the Internet Archive
- Components of web archiving tools: collecting (making a functional copy), management, description, access / ‘replay’, long term stewardship
- Data from the web versus a fully functional copy of web content/websites (interactive, resembling the original resource)
Module 5: Accessing Web Archives
- Exploring the Wayback Machine and viewing Archive-It collections
- ‘Save page now’ feature via the Internet Archive
- Introduction to usability of web archives and Quality Assurance issues
Module 6: First steps (planning, expectation setting, advocacy)
- Case study 1: Getting started with web archiving at the Carnegie Hall Susan W. Rose Archives (recorded webinar)
- Beginning collection development planning
- Fractional staffing, time allocation, setting expectations
- Advocating for web archiving
Module 7: Ethics, collection development and planning
- Ethics, collection development and planning
- Intellectual property concepts
- Ethics and Archiving the Web
- Usage of web archives – current, recent and anticipated
Module 8: Archive-It!
- Tools To "Do" Web Archiving
- Archive-It basics and tutorial materials
Module 9: Alternate Approaches and Additional Issues with Complex Sites
- Technological components of web archiving processes
- Beyond the Internet Archive – other web archiving tools and approaches
- Tools that are web archiving adjacent or can supplement web collections
- Case study 2: Stanford University Press
Module 10: Wrap up
- Description & access: managing collections to facilitate use and sharing
- Case study 3: Pelican Bomb (setting priorities and sunsetting a web-based journal)
- Drafting a project plan (for final project)
Module 11:
- Test crawling and beginning to collect web content
- Collaborative collecting models and workflows
- Case study 4: Ivy Plus Confederation Collaborative Web Collecting Program (pilot to present)
Module 12:
- Crawling and features to use in final project
- How to do Quality Assurance testing and report errors
- Troubleshooting, significant properties & when revising expectations might be needed
- Case studies 5 & 6: the New York Art Resources Consortium (NYARC) & Columbia University Libraries
Module 13:
- Final project work – troubleshooting and peer feedback
- New analysis tools for web archives (WARC files)
Module 14:
- Summary of main points and key takeaways
- Wrap up final project related tasks and information sharing
Policies and Procedures
Please be aware of the following linked policies and procedures. Note that in individual courses instructors will have stipulations specific to their course.