
Update

Carleton now uses a Python-based Bagit batch processor, BagBatch, which can be found on GitHub.

https://github.com/CarletonArchives/BagBatch

The instructions below are still accurate, but the Python batch process can be applied to many bags in one operation.

Overview

Bagit Library is a Java-based program that creates archival bags of files following the BagIt standard developed by the Library of Congress.

http://en.wikipedia.org/wiki/BagIt

http://www.youtube.com/watch?v=l3p3ao_JSfo

It can be used to create, update, and validate bags, and it provides several other functions and options.

While the Library of Congress uses Bagit as a way to verify that files have been transferred properly from one computer to another, we also use bags to store our master copies of files, web copies of those files, and metadata about those files.

Note: While this program is written in Java and should be platform agnostic, the PC implementation described here is started by a DOS batch file (bag.bat) and therefore runs only on a PC. Mac users should follow the Mac sections below.

Note: In order to install and set up Bagit, you will need administrator rights on your computer.  If you do not have administrator rights, consult your IT department about either getting them or having an administrator set up the software for you.

Download

http://sourceforge.net/projects/loc-xferutils/

Installation for PC - Video Tutorial

For a video tutorial on how to install Bagit and how to install the Java Runtime Environment (JRE) and set its environment variable, see these videos.

Part 2.1-2.2 http://www.youtube.com/watch?v=OVMo03GYh54

Part 2.3 http://www.youtube.com/watch?v=JCfRPdhHsyo

Part 2.4 http://www.youtube.com/watch?v=4a9Qnw1Pa-E

Installation for PC - Written tutorial

Install Java runtime environment

Check to see if Java is already installed on your computer. 

Go to Start/Settings/Control Panel.

If Java is present in the Control Panel, proceed to the section on setting the JAVA_HOME environment variable.

If not, download and install Java from the Oracle website or from java.com.

Identifying the JAVA_HOME environmental variable path value

If you already know the value of the path for JAVA_HOME on your computer, proceed to the section on configuring this variable.

If you do not know this value, go to Start/Control Panel/Java

Select the Java tab in the Java Control Panel.

Select View.

The value of JAVA_HOME should be everything in the path field before the bin directory.  For example, if the value of the path field is C:\Program Files\Java\jre6\bin\javaw.exe, then the variable value should be C:\Program Files\Java\jre6
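The rule above (take everything before the bin directory) can be sketched in Python. This is only an illustration; the function name java_home_from_path is hypothetical and is not part of Bagit or Java.

```python
from pathlib import PureWindowsPath


def java_home_from_path(java_exe_path: str) -> str:
    """Return everything in the path before the bin directory."""
    path = PureWindowsPath(java_exe_path)
    # Walk upward from the executable; the parent of "bin" is JAVA_HOME.
    for parent in path.parents:
        if parent.name.lower() == "bin":
            return str(parent.parent)
    raise ValueError(f"No bin directory found in {java_exe_path!r}")


# Example path taken from the text above.
print(java_home_from_path(r"C:\Program Files\Java\jre6\bin\javaw.exe"))
```

PureWindowsPath is used so the sketch parses Windows-style paths the same way on any operating system.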

Configuring the JAVA_HOME environment variable

Right-click on My Computer and select Properties/Advanced/Environment Variables.

Scroll down and look for a variable named JAVA_HOME.

If a variable is present, make sure it matches the path from the section above on identifying the JAVA_HOME path.

If no JAVA_HOME variable is present, click New. 

The name of the variable should be: JAVA_HOME.

The value of the variable should be the path from the section above on identifying the JAVA_HOME path.

Hit OK.

Check that the JAVA_HOME variable is set properly 

Go to Start/Run 

Enter cmd in the Run window.

Once a command prompt is open, enter set and press Enter.

This will give a list of all the environment variables available to this user.

JAVA_HOME should be in this list and its value should be the path you identified and entered above. 

If JAVA_HOME is not present or has the wrong value assigned, make sure that all the steps above have been followed correctly.  
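If Python happens to be installed, the same check can be scripted. The following is a minimal sketch, not part of Bagit; the function name check_java_home and its messages are hypothetical, and it assumes that a correct JAVA_HOME directory contains a bin subdirectory, as described above.

```python
import os
from pathlib import Path


def check_java_home(env=None) -> str:
    """Report whether JAVA_HOME is set and points at a directory containing bin."""
    env = os.environ if env is None else env
    value = env.get("JAVA_HOME")
    if not value:
        return "JAVA_HOME is not set"
    if not (Path(value) / "bin").is_dir():
        return f"JAVA_HOME is set to {value}, but no bin directory was found there"
    return f"JAVA_HOME looks correct: {value}"


print(check_java_home())
```

Note that a command prompt opened before the variable was changed will not see the new value; open a new one before checking.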

Download and install Bagit

Go to the distribution site for Bagit

Download the -bin version of Bagit.  

Note: Do not download the src version of bagit since it is un-compiled source code and not covered in these instructions.  

Unzip the contents by either double clicking on the downloaded file, or right clicking on the file and selecting Open With.../Compressed Zip Folders Option.

Drag or copy the entire unpacked bagit directory, preferably to the root of the C: drive.

If you place bagit in another directory or in a subdirectory of the C: drive, you will need to navigate to it via DOS commands before using bagit.

Note regarding memory: We increased the memory available to the Java application to accommodate larger bags in the hundreds of gigabytes.  To do the same, follow the instructions in the README.txt file and set MAXMEM=2048m.

Installation for Mac 

Download and Placement

Download Bagit and unzip the contents.

Place the contents of the unzipped Bagit package somewhere in your home directory, such as HD/Users/[username]/bagit.

Setting $JAVA_HOME and increasing memory

Open a terminal by going to Applications/Utilities/Terminal

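Open bagit-4.4/bin/bag in a text editor by typing nano ~/bagit/bin/bag (adjust the path to match where you placed Bagit).

Edit the line "#JAVA_HOME=" so that it gives the direct path to the version of Java you are either running or want to run, for example "JAVA_HOME=/usr/lib/jvm/java-6-sun".  A leading # marks a line as a comment in a shell script, so remove it for the setting to take effect.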

Edit the line "MAXMEM=512m" and change it to "MAXMEM=2048m".

Hit Control O and Return to write your changes.  

Hit Control X to exit.
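If the same memory change has to be made on several machines, the MAXMEM edit can also be scripted. This is a hedged sketch only: the function name set_maxmem is hypothetical, and the sample text stands in for the real bin/bag script, whose full contents are not reproduced here.

```python
import re


def set_maxmem(script_text: str, new_value: str = "2048m") -> str:
    """Return script_text with its MAXMEM=... line set to new_value."""
    updated, count = re.subn(r"(?m)^MAXMEM=.*$", f"MAXMEM={new_value}", script_text)
    if count == 0:
        raise ValueError("No MAXMEM= line found")
    return updated


# Hypothetical two-line excerpt used only for illustration.
sample = "#JAVA_HOME=\nMAXMEM=512m\n"
print(set_maxmem(sample))
```

To apply it to the actual file, read the script's text, pass it through set_maxmem, and write the result back, keeping a backup copy of the original.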

Using Bagit PC - Written Instructions

There are user manuals online for Bagit here:

But here are descriptions of the most common uses.

Open a command prompt by selecting Start/Run, typing cmd, and pressing Enter, or by selecting Start/Programs/Accessories/Command Prompt.

Navigate to the directory containing the bag.bat file by entering cd \bagit-3.9\bin and pressing Enter.

Enter "bag baginplace " followed by the full path of the directory you want to bag.

This will take the directory above and restructure it into a bag with a data directory, a checksum manifest, and a bag version text document.  This instance of bagit also adds some other text files that can be left as is.
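To confirm that a directory now has the structure described above, a short Python check can help. This is a sketch, not a substitute for bag verifyvalid. The file names it looks for (data, bagit.txt, manifest-<algorithm>.txt) come from the BagIt standard rather than from this page, and the function name missing_bag_parts is hypothetical.

```python
import tempfile
from pathlib import Path


def missing_bag_parts(bag_dir) -> list:
    """List the basic bag components that are absent from bag_dir."""
    bag = Path(bag_dir)
    missing = []
    if not (bag / "data").is_dir():
        missing.append("data directory")
    if not (bag / "bagit.txt").is_file():
        missing.append("bagit.txt")
    if not any(bag.glob("manifest-*.txt")):
        missing.append("checksum manifest (manifest-<algorithm>.txt)")
    return missing


# Demonstration on a throwaway directory that lacks a manifest.
with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp)
    (bag / "data").mkdir()
    (bag / "bagit.txt").write_text("BagIt-Version: 0.97\n")
    print(missing_bag_parts(bag))
```

An empty list means the three basic pieces are present; it says nothing about whether the checksums are correct.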

Shortcut: to add the path of the directory without retyping it, select the directory and drag it into the command prompt after entering bag baginplace.  

Other commands that will be useful are:

    • bag verifyvalid --failmode FAIL_SLOW PathToBag
      • Verifies that the bag contents match the manifest.
      • The --failmode FAIL_SLOW operator allows bagit to continue checking the rest of the bag even if errors occur.
    • bag update PathToBag
      • Compares the bag contents to the values in the manifest and adds any new items as well as checksums.
    • bag baginplace PathToBag1 & bag baginplace PathToBag2
      • Bags two directories one after the other from a single command line.

For a list of available options in bagit, you can also just enter "bag" while in the directory containing bag.bat.  

Using Bagit PC - Video Tutorials

There are also video tutorials on using bagit to create, verify, validate and update bags here:

BagIt Tutorial #3 (3.1): Creating/Verifying Bag: Introduction (5 of 10) 5 minutes

http://www.youtube.com/watch?v=DmZ4nPQJHFs

BagIt Tutorial #3 (3.2): Creating/Verifying Bag: Demonstrate Create/Verify a Bag (6 of 10) 15 min.

http://www.youtube.com/watch?v=TW4LUPZTwNA

BagIt Tutorial #3 (3.3): Creating/Verifying Bag: Verification errors, Transfer a Bag (7 of 10)

http://www.youtube.com/watch?v=DPJtTsA-Fvs

 

Using Bagit Mac 

Useful commands 

Using Bagit on a Mac is almost identical, but the command syntax differs slightly.

Open a terminal by going to Applications/Utilities/Terminal

Useful commands include

    • ~/bagit/bin/bag verifyvalid --failmode FAIL_SLOW PathToBag
      • Verifies that the bag contents match the manifest.
      • The --failmode FAIL_SLOW operator allows bagit to continue checking the rest of the bag even if errors occur.
    • ~/bagit/bin/bag update PathToBag
      • Compares the bag contents to the values in the manifest and adds any new items as well as checksums.
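    • ~/bagit/bin/bag baginplace PathToBag1 && ~/bagit/bin/bag baginplace PathToBag2
      • Bags two directories one after the other; with &&, the second command runs only if the first succeeds.

For a list of available options in bagit, you can also just enter "~/bagit/bin/bag".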
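To make concrete what "verifies that the bag contents match the manifest" means, here is a simplified Python sketch of the idea. It is not Bagit's own code. It assumes the manifest format defined by the BagIt standard (one "checksum path" pair per line in manifest-<algorithm>.txt), and the function names are hypothetical. Like FAIL_SLOW, it collects every problem instead of stopping at the first one.

```python
import hashlib
import tempfile
from pathlib import Path


def parse_manifest(text: str) -> dict:
    """Map each listed file path to its expected checksum."""
    entries = {}
    for line in text.splitlines():
        if line.strip():
            checksum, path = line.split(None, 1)
            entries[path.strip()] = checksum.lower()
    return entries


def verify_payload(bag_dir, algorithm: str = "md5") -> list:
    """Return a list of problems found when comparing files to the manifest."""
    bag = Path(bag_dir)
    manifest_text = (bag / f"manifest-{algorithm}.txt").read_text(encoding="utf-8")
    errors = []
    for rel_path, expected in parse_manifest(manifest_text).items():
        target = bag / rel_path
        if not target.is_file():
            errors.append(f"missing: {rel_path}")
            continue
        # Reads the whole file into memory; fine for a sketch, not for huge files.
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected:
            errors.append(f"checksum mismatch: {rel_path}")
    return errors


# Demonstration on a throwaway bag with one payload file.
with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp)
    (bag / "data").mkdir()
    (bag / "data" / "a.txt").write_bytes(b"hello")
    digest = hashlib.md5(b"hello").hexdigest()
    (bag / "manifest-md5.txt").write_text(f"{digest}  data/a.txt\n", encoding="utf-8")
    print(verify_payload(bag))
    (bag / "data" / "a.txt").write_bytes(b"changed")
    print(verify_payload(bag))
```

The sketch only checks files that the manifest lists; a full validation also reports files in the data directory that are missing from the manifest, which is one reason to rely on bag verifyvalid for real work.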

 

Batch Processing of Bags

The Carleton College Archives has developed a batch process, BagBatch, for creating many bags in one step.  The program can be directed to process large numbers of proto-bags that all reside in the same directory, usually as part of a workflow.  Software requirements, installation instructions and downloads can be found at:

https://github.com/CarletonArchives/BagBatch
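The general idea of processing every proto-bag in one directory can be sketched as follows. This is not BagBatch's actual code, which should be consulted at the link above; it is a minimal illustration that assumes each subdirectory of a parent directory is a proto-bag and that the bag command is reachable under the hypothetical name given in bag_executable.

```python
import tempfile
from pathlib import Path


def baginplace_commands(parent_dir, bag_executable: str = "bag") -> list:
    """Return one baginplace argument list per subdirectory of parent_dir."""
    parent = Path(parent_dir)
    # Plain files are skipped; only directories are treated as proto-bags.
    proto_bags = sorted(p for p in parent.iterdir() if p.is_dir())
    return [[bag_executable, "baginplace", str(p)] for p in proto_bags]


# Demonstration with two made-up proto-bag directories and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("accession-001", "accession-002"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a proto-bag")
    for command in baginplace_commands(tmp):
        print(" ".join(command))
```

Each argument list could then be handed to a process runner one at a time, checking the result of each before moving on.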


Questions or Comments?

Nat Wilson

Digital Archivist

507.222.4265

nwilson@carleton.edu

 

 
