UPAA Blog 2020-21 #14 - 1/28/21 (photo above by Matt Cashore)
The 2021 UPAA DAM Survey results are now available and also posted at our website upaa.org. Thanks to all who participated in the survey. Digital asset management, or DAM, means the organization, accessibility and long-term storage of our photos, videos and other content. This article is a compilation of current experience and general knowledge of the last part of that trio: Storage.
“In my opinion, this is the most neglected phase of digital asset management.” -Jaren Wilkey, BYU
There is a surprising amount of energy and hardware devoted to making sure our photos just... sit there.
However, in the spirit of the UPAA motto, we are our institutions' visual historians, and part of our role is to make sure that the small slice of history that we witness and record is preserved well past our tenures. As BYU's Jaren Wilkey says, "From the point of creation to 100 years from now."
But it's tricky. Time and technology keep moving. Pixels get more mega- and more frames are captured every second. Just as we add storage we outgrow it, and nothing will spin at 7200rpm forever. Is there a storage strategy beyond simply: "more"...? Add to that the challenge that University IT departments might not be able to accommodate the volume and pace of photo/video storage.
(photo by Matt Cashore) In the same 6-inch wide space, CDs, DVDs and HDDs show the evolution of storage formats and capacities. The life of the media itself - as well as availability of equipment to read it - is unknown 10, 20 or 30 years from now.
UPAA, as always, is a great resource. Two UPAA-connected resources are just a mouse click away:
- A UPAA friend, Symposium presenter and the man who literally wrote the book on digital asset management: Peter Krogh. The most up-to-date version of his expertise can be found in The DAM Book 3.0, available as a hard copy or PDF.
- UPAA's own Jaren Wilkey put together this detailed multimedia guide. BYU Photo currently has an archive of over 20 million photos (100 TB!).
In this article several UPAA members detail their hardware and strategies for long-term storage. They cover what to keep, how to keep it and how to maintain it. If there is one single critical piece of information to take away with regard to digital storage, it's REDUNDANCY. To repeat: Redundancy. Most digital asset management experts (including UPAA's own!) recommend the 3-2-1 strategy for storage: three copies of every file, on at least two different types of media, with at least one copy kept off-site.
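For the script-minded, the 3-2-1 rule as it is commonly stated (at least three copies, on at least two kinds of media, with at least one off-site) is simple enough to check in a few lines of Python. This is only a sketch: the `Copy` class, its fields and the sample plan are made up for illustration and do not describe any member's actual setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the archive. All fields are illustrative assumptions."""
    label: str
    media: str      # e.g. "raid", "lto_tape", "cloud"
    offsite: bool


def meets_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical plan: a server, a tape backup and a cloud backup.
plan = [
    Copy("primary server", "raid", offsite=False),
    Copy("tape backup", "lto_tape", offsite=False),
    Copy("cloud backup", "cloud", offsite=True),
]
print(meets_3_2_1(plan))
```

Dropping any one copy from the sample plan makes the check fail, which is the point of the rule: losing a single device should never leave you with one copy.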
PART 1: TRASH OR TREASURE?
(photo by Matt Cashore) Serious storage! Movable stacks in an off-site warehouse are used to keep seldom-used books in the University of Notre Dame library system in high-density storage.
“Storage experts are hoarders.” -Marie Kondo
Ouch! That hurts. C'mon, Marie, we're historians! The choice to keep, delete--or something in between--is going to be unique to every situation. Budgets, philosophies and experience will be different for everyone. Many photographers know the story of photojournalist Dirck Halstead, and what's widely referred to as "The Monica Lesson." Another consideration: Is the time and energy involved in making the decision on what to delete more or less valuable than the cost of an extra hard drive or two? Here are three approaches:
“While I will delete images that are completely unusable (out of focus, eyes closed, etc.) I do not believe in deleting the bulk of a shoot just because I might not find immediate value in them...to another eye 5 days, 5 years or 50 years from now they might hold value.” –Phil Humnicky, Georgetown University
“For me, not every frame is sacred...I toss the garbage and have no regrets. However this only works if you generate a generous quantity of selects to cover lots of bases.” –Fred Zwicky, University of Illinois
“I store all raw images for a period of one year, after that any raw image that wasn’t used as a select is deleted. After 3 years all raw images are deleted and only edited jpg files are stored. There’s one exception: We store all raw images forever for events that are truly historic (groundbreakings/openings/VIP visits/etc.) So far, I haven’t had a single instance of a client asking for a file that no longer exists using this system.” –Steven Sobel, Valencia College
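A tiered schedule like the last one can be written down as a small rule, which is handy if you ever want a script to flag files for review. The Python below is a rough sketch of that schedule as quoted, not anyone's actual tooling: the function name and parameters are hypothetical, a year is approximated as 365 days, and the behavior on the exact boundary days is an assumption.

```python
from datetime import date


def keep_raw(shot_on, today, is_select, is_historic):
    """Decide whether a raw file is still kept under the quoted schedule."""
    if is_historic:
        return True                # historic events: raw files kept forever
    age_days = (today - shot_on).days
    if age_days <= 365:
        return True                # first year: every raw file is kept
    if age_days <= 3 * 365:
        return is_select           # years one to three: raw kept only for selects
    return False                   # after three years: edited JPEGs only


# A non-select frame from a routine shoot two years ago is no longer kept.
print(keep_raw(date(2019, 1, 1), date(2021, 1, 28), is_select=False, is_historic=False))
```

A script like this should only ever produce a list for a human to review; the delete key stays with the photographer.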
PART 2: WHICH SHOEBOX?
(photo by Matt Cashore) Analog asset management!
“I have far too much storage space.” -No one ever
Photo storage can broadly be put into three categories: Cloud, Server, and "Do-it-yourself" (DIY) via some combo of hard drives and enclosures. Again, the main idea is to find a solution with redundant backups. Many universities have made or are in the process of making deals for cloud storage with major names like Google or Amazon, or other providers such as Backblaze or CrashPlan. Other universities have campus-maintained servers. Your mileage may vary, so check with your IT department. Phil Humnicky, photographer at Georgetown University, uses box.com and offers this caution: “Speed is awful. At present, Box does not index anything other than file names and does not provide metadata search.” Steven Sobel of Valencia College uses the similarly named competitor Dropbox and says, “I like our system, it’s fast and easy to work with, but we have a limited number of users and clients, so it wouldn’t work for very large teams or those who provide unlimited searchable access to their entire institution.”
(photo by Matt Cashore) "Brother, can you spare a terabyte?" The University of Notre Dame server room. 480 terabytes of sweet, speedy online storage. Notre Dame built its own 3-2-1 storage plan. Primary online storage is via Ethernet to this server, with backups to LTO tape and cloud (AWS Glacier). There are multiple full-time employees dedicated to its maintenance.
There are two heavyweights in the "DIY" approach among those who contributed to this article: Drobo and Synology. Both offer an automated RAID (The "R" stands for "redundant," there's that word again!). There are also many other options for a RAID hard drive array.
Georgetown's Phil Humnicky explains why he's a longtime Drobo user:
"Before Drobo, I had a collection of LaCie RAID drives daisy chained together. One of the drives in one of the enclosures died. Thankfully, all of the lost images were on PhotoShelter and on DVDs. I had been researching larger drive enclosures and a Drobo purchase was in the works. My boss had switched over to all video work and was struggling with file management as well as repeatedly reimporting clips from tapes. We chose Drobo primarily because it was the only system that allowed for drives being added (or replaced with a larger drive) without having to move all the data elsewhere, reformatting the array and copying the data back. The ability to hot swap a bad drive was a big plus in our book as well. Our in-department IT team was fully onboard because they had very limited storage to offer us and no budget to set something up on the server end.
(photo by Phil Humnicky) 7 of 8 bays are filled in this unit. Drobo makes RAID as simple as it gets: If the drive is green, you're good. If it's any other color, replace it. No IT calls required!
What I like: Ability to add additional drives without moving the data elsewhere, ability to hot-swap a drive with a larger one, ability to hot-swap bad drives, and, at the time of purchase, the ability to use iSCSI over a Cat-5 cable. USB-C/Thunderbolt on the new enclosure.
What I don’t like: The subscription model for tech support, the loss of iSCSI on the older enclosure after an OS X upgrade several years ago, the inability to move drive sets to a new model enclosure (drive sets can only be moved to a new enclosure if the new one is identical to the old one)…not very helpful when your old enclosure hasn’t been manufactured for several years or if you are looking to move from a 4-bay enclosure to an 8-bay.
Would I pick Drobo if starting from scratch: Yes. I have yet to find another Direct Attached Storage solution that offers the ability to add and upgrade storage levels on the fly."
UPAA associate member and Indianapolis-based freelancer Marc Lebryk goes through some highlights of his Synology system:
"I have an 8 bay Synology 1817+ with two 5 bay expansion units making for 18 total bays for hard drives. I'm currently storing 35TB-ish worth of data from the last 15-18 years including my freelance work as well as some of my archives from when I worked at the Indianapolis Star. All on WD RED NAS drives as I find that not mixing and matching brands nets better results.
(photo by Marc Lebryk) A Synology 8-bay enclosure and two 5-bay expansion units. BYU also uses Synology as their primary online storage system.
I landed on Synology for a bunch of reasons. Those reasons are:
1. Expandability. The 8 Bay unit that I have is capable of taking TWO 5 bay expansions. With firmware updates the unit also can make larger volumes as larger drives come out. Eventually I'm sure I'll have to update the hardware (I actually did this once already a few years ago) but for now it's scalable for the future.
2. The Synology uses a semi proprietary file system. Semi proprietary because unlike the Drobo if the unit craps out, I CAN put the drives into a PC tower and with a little know-how recover the data without needing another Synology unit. I'm using the Synology proprietary RAID formatting, that's why it's a "semi-proprietary" file system. You have the option of using a standard RAID, but the Synology RAID provides speed and space benefits.
3. The Synology is self contained. My archive is no longer attached to a MacPro which has cut down on power consumption by only like eleventy thousand percent. The unit sits attached to the network and UPS Power backup and when there is a power fluctuation it shuts itself down, and then restarts once power comes back. It also runs a fidelity check every 30 days and emails me the results. There are many diagnostics you can set it to run and send you. Not only that but my archive is available to me ANYWHERE now. I can even access and search my archive on my phone while standing out in the middle of nowhere. As long as my home internet is active, I can access my archive.
(photo by Marc Lebryk) Remote access
4. The Synology is FULLY INDEXED. That means that when I ingest my take via Photo Mechanic I imprint the required metadata on the files automatically (usually). This way when I need to search for something, I can do so by location, by subject, by publication or etc. All I need to do is search in my Mac Finder when connected to the server, or via the Search option in the browser on the Synology.
The Synology itself is a much more powerful piece of hardware than I am using. It is capable of hosting my website, or even hosting my data as my own personal Dropbox if I wanted. This is helpful if a client loses something. I can literally just send them the link to the files they need on my server instead of downloading them, and then uploading them to Dropbox before sending them out. You can also install Carbonite or Backblaze or any number of these online off-site storage services on your Synology and it can back itself up online automatically. I never did this as my archive is too big, it would take years to upload all the data and some of it I can't put in the cloud by contract."
(photo by Ken Bennett) Above: Wake Forest's Ken Bennett uses an OWC ThunderBay 4 with four 6TB drives configured as a RAID 5 using SoftRAID. Ken gets 18TB of usable space with that setup. Ken does the 3-2-1 strategy one better, backing up onto two sets of 3.5-inch bare drives and keeping a third backup on university-provided cloud storage through CrashPlan. He hints at slow speeds with CrashPlan, though, saying, "That is still doing the initial upload and will take months or years to complete."
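The 18TB figure follows from how RAID 5 works: the equivalent of one drive's capacity goes to parity, so usable space is the drive count minus one, times the size of a drive (assuming all drives are the same size). A quick Python sketch of that arithmetic, with a function name of our own invention:

```python
def raid5_usable_tb(drive_count, drive_tb):
    """Usable RAID 5 capacity: one drive's worth of space is spent on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_tb


print(raid5_usable_tb(4, 6))  # four 6TB drives -> 18
```

The number your computer reports after formatting will usually be somewhat lower, because of file system overhead and the difference between how drive makers and operating systems count a terabyte.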
Several UPAA members mentioned difficulties with using Lightroom catalogs on networked storage such as Synology. At the time of writing, this still appeared to be a trouble spot. This article detailed some workaround possibilities, and UPAA member Bill Cotton of Colorado State University noted, "The catalog itself cannot be on a network, but the assets can. Backups can also be directed to a network location." Bottom line, though, no 'out of the box' solution was apparent.
PART 3: MOVING PICTURES
Unfortunately, we're not referring to the masterwork by rock gods Rush. We're talking about data migration, of course!
“At some point everything gets bigger and better and hardware gets old.” -Marc Lebryk
"Every single method of storing digital photos is only temporary," says BYU's Jaren Wilkey. Another way of saying that might be to ask: When is the last time you saw a computer with a CD/DVD drive? Formats change and anything that spins at 7200 rpm will eventually wear out. The main point experts consistently make is to stay ahead of the need.
(photo by Matt Cashore) The UPAA blog is not making any recommendations on products or services *with one exception*: This dual drive dock (or "toaster") by Other World Computing is the one to buy if that's a tool you need for your storage solution. Many dual drive docks don't allow independent operation of two drives at once, but this one does.
(photo by Matt Cashore) Bare hard drives, or HDDs, are currently the most economical storage media. Which hard drive to buy? Inevitably, any hard drive will...eventually...fail. Is there a particular brand that has a better track record than others? Cloud storage company Backblaze publishes a quarterly list of hard drive failure rate stats.
Data migration is, in a word: Inconvenient. A data transfer of hundreds of gigabytes can get derailed by something as simple as a lower case .xmp file. However, migration is also a chance to check storage integrity. "Hard drives are only a temporary storage device," says BYU's Jaren Wilkey. "Because they are mechanical devices they will always fail eventually. The only way to overcome failure is to have a rigid backup and migration strategy."
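One way to use a migration as an integrity check is to compare checksums of every file on the old and new volumes rather than eyeballing folder lists. The Python below is a minimal sketch of that idea using only the standard library. The function names are our own, SHA-256 is just one reasonable choice of checksum, and reading every file twice will be slow on a multi-terabyte archive.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of one file, read in chunks so large raw files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(source, destination):
    """Return (missing, changed): files absent from, or different on, the destination."""
    src = tree_digests(source)
    dst = tree_digests(destination)
    missing = sorted(set(src) - set(dst))
    changed = sorted(name for name in src.keys() & dst.keys() if src[name] != dst[name])
    return missing, changed
```

Calling `compare_trees("/Volumes/OldArchive", "/Volumes/NewArchive")` (both paths are placeholders) returns two lists; if both are empty, every file on the old volume exists on the new one with identical contents. Because paths are compared exactly, a file whose name changed only in upper or lower case shows up as missing, which is worth knowing before the old drives are retired.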
"As far as budgeted upgrades, I pretty much add a drive to it every year. In fact I just ordered a 6TB WD Red drive to add to it this morning. I find that holding ahead of the storage limit is a much more economical practice than waiting until the unit is almost full. $140 for a drive here and there sure beats $500 or $600 because I need to get multiple LARGE drives all at once." -Marc Lebryk, Freelance
"Apparently each and every model of (Drobo) enclosure have slightly different protocols for drive sets. Argh! I had to migrate 15TB of data in small data blocks…approx. 6 months of folders at a time. When I migrated in 2019 that meant nearly 30 hours-long copying sessions….some of those sessions were incomplete for mysterious errors that popped up. I couldn’t do anything else on the computer during these copy sessions for fear of locking up the Finder and thus killing a copy session. I closed out all apps, disconnected the computer from the network, and turned off the screen saver. I think it took just shy of a week to complete the full transfer, and additional days of a manual comparison of folder lists to make sure nothing was missing." -Phil Humnicky, Georgetown University
To end, we return to the question at the heart of storage: Could someone 75 or 100 years from now find a photo you made today?
"There is a reasonable chance that the edited, tagged, captioned, exported JPEG files will be findable and usable. There is exactly zero chance that any of my original raw files will survive -- no one except me cares about them. No one except me even really knows they exist. Once I retire, the RAID drive will end up in the back of a cabinet somewhere and no one will ever look at it again." -Ken Bennett, Wake Forest
"70%" -Jaren Wilkey, BYU
_____________________________________________
“That's the whole meaning of life, isn't it? Trying to find a place for your stuff.” -George Carlin | Thanks for reading the blog. Articles, article ideas and feedback always welcome. Email editor Matt Cashore, mcashore@nd.edu. Follow UPAA on Instagram!