Campus data storage solutions evolve with addition of Research Object Storage
In 2021, the Peery Wildlife Ecology + Conservation Lab in the Department of Forest and Wildlife Ecology initiated a bioacoustics monitoring program to study spotted owls across the Sierra Nevada ecosystem.
The California spotted owl, a mature-forest species that has been at the center of forest management and planning in the Sierra Nevada for several decades, is threatened by the harvesting of old trees, increasingly large and severe wildfires, and the recent invasion of a dominant competitor, the barred owl.

Spotted owls have been the subject of intensive, long-term population monitoring at a handful of study areas. While providing key insights into population trends and habitat use, these studies were not designed to monitor spotted owls at a regional scale or assess the impacts of individual wildfires and forest restoration activities intended to limit such disturbances. Photo by Sheila Whitmore.
Passive acoustic monitoring is an emerging approach to monitoring biodiversity that enables nearly continuous surveys of animal sounds across much broader scales than traditional, in-person surveys. Each spring and summer, a team of research scientists and technicians deploys 1,500 autonomous recording units (ARUs) in seven National Forests to capture the sounds of spotted owls and other vocalizing species, explains Jay Winiarski, ecological data analyst/manager for the Peery Lab.
Recording from dusk until dawn for five weeks, these devices collect nearly one million hours of recordings, comprising about 100 terabytes (TB) of data annually.
It’s a big research project over a big area that produces big data. The answer for storing this data? UW’s Research Object Storage (S3).
Research Object Storage (S3) is offered through DoIT Research Cyberinfrastructure and provides secure, shareable storage space for faculty PIs, permanent PIs and their research group members. It is suited for long-term archival storage, backup, integration with instruments and computing resources, and hosting static web content.
For big data, Research Object Storage (S3) can be less costly than ResearchDrive, a university-wide file storage option that the Peery Lab also uses. ResearchDrive is also suited for data backup and archiving and provides up to 25 TB of storage at no cost.
“UW–Madison has been an industry leader in offering no cost data storage to incoming faculty,” says Casey Schacher, research cyberinfrastructure storage lead and member of the Research Cyberinfrastructure Team for the Division of Information Technology. “Substantially reducing data storage costs means that researchers can focus more of their funding on their research. That is an attractive incentive for recruiting and retaining talent to the university.”
Today, there are 1,322 users storing 6,740 TB of data on ResearchDrive and 52 users storing 812 TB of data on Research Object Storage (S3).
Eligible researchers receive 25 TB of storage on ResearchDrive at no cost, with additional storage costing $120 per TB per year. They receive 50 TB of no-cost storage on Research Object Storage (S3) and pay $60 per TB per year for additional storage.
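As a rough illustration of how these two rate structures compare, the sketch below computes a yearly bill from the no-cost allocation and per-TB rate quoted above. It assumes simple linear billing on everything beyond the free allocation; the function and variable names are illustrative, and actual campus billing may work differently.

```python
def annual_cost(total_tb, free_tb, rate_per_tb):
    """Yearly cost in dollars for storing total_tb, given a no-cost allocation.

    Assumes every TB beyond free_tb is billed at a flat rate_per_tb.
    """
    billable_tb = max(0, total_tb - free_tb)
    return billable_tb * rate_per_tb


# Allocations and rates as quoted in the article.
RESEARCHDRIVE = {"free_tb": 25, "rate_per_tb": 120}
OBJECT_STORAGE = {"free_tb": 50, "rate_per_tb": 60}

for tb in (25, 100, 400):
    rd = annual_cost(tb, **RESEARCHDRIVE)
    s3 = annual_cost(tb, **OBJECT_STORAGE)
    print(f"{tb} TB: ResearchDrive ${rd:,}/yr, Research Object Storage (S3) ${s3:,}/yr")
```

Under these assumptions, the gap between the two services widens as the volume of stored data grows, because both the larger free allocation and the lower per-TB rate favor Research Object Storage (S3).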
Both ResearchDrive and Research Object Storage (S3) have been funded through central campus funding with support from the Office of the Vice Chancellor for Research.
“UW–Madison has made substantial investments in research computing and storage infrastructure over the last six years, and it has really made a difference,” says Jan Cheetham, director of research cyberinfrastructure for the Division of Information Technology. “We are more prepared to welcome new faculty who bring a lot of data with them when they join the University. We are better equipped to support the data- and computation-intensive research workflows of the RISE initiative and other research that is embracing machine learning, artificial intelligence, and data science approaches. These storage resources also enable data-sharing between UW-Madison researchers and their collaborators at other institutions.”
For the Peery Lab, partnerships are key to success.
“After the ARUs are deployed, we transfer the audio data from California to Madison, then upload it to ResearchDrive,” Winiarski explains. “From there, we work with our partners at the K. Lisa Yang Center for Conservation Bioacoustics at Cornell University to automatically detect and identify calls of spotted owls and over 100 other bird species using a state-of-the-art machine learning algorithm called BirdNET.”
To date, collaborative efforts led by the Peery Lab have generated over 400 TB of audio recordings and supported studies ranging from estimating spotted owl population trends and responses to severe fire to managing the spread of barred owls, as well as investigating the ecology of other forest owls, songbirds, mammals and amphibians.
Recommendations from these studies are used by agencies such as the U.S. Forest Service to inform forest management and maintain populations of spotted owls and other at-risk species.
ResearchDrive and Research Object Storage (S3) are hosted on premises, with equipment spread across three UW–Madison data centers, making the services highly redundant and fault tolerant.
Research Object Storage (S3) is based on a protocol related to Amazon Web Services’ Simple Storage Service. This means that many applications that run natively in the cloud, such as scientific instrument software and static websites, will work well with Research Object Storage (S3).
To transfer data into or out of Research Object Storage (S3), you can run scripts that use an application programming interface (API), or use a tool like Cyberduck, which provides an easy GUI for interacting with S3 storage containers, called buckets. Buckets are the basic storage units of S3-based storage services, and they can have different settings for access and permissions.
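A scripted transfer might look something like the following minimal sketch, which uses boto3, the AWS SDK for Python, on the assumption that the service accepts standard S3 API calls. The endpoint URL and bucket name are hypothetical placeholders, not the service's real values, and the helper functions are illustrative only; the actual connection details come with a Research Object Storage (S3) account.

```python
from pathlib import Path

# Hypothetical values for illustration only; the real endpoint, bucket name,
# and credentials come from your Research Object Storage (S3) account.
ENDPOINT_URL = "https://s3.example.edu"
BUCKET = "example-lab-audio"


def object_key(local_root: Path, file_path: Path, prefix: str = "") -> str:
    """Build an object key from a file's path relative to local_root."""
    relative = file_path.relative_to(local_root).as_posix()
    clean_prefix = prefix.strip("/")
    return f"{clean_prefix}/{relative}" if clean_prefix else relative


def upload_directory(local_root: Path, prefix: str = "") -> int:
    """Upload every file under local_root to BUCKET; return the number uploaded."""
    import boto3  # third-party package: pip install boto3

    # With no explicit keys passed, boto3 reads credentials from the
    # environment or from ~/.aws/credentials.
    client = boto3.client("s3", endpoint_url=ENDPOINT_URL)
    count = 0
    for file_path in sorted(local_root.rglob("*")):
        if file_path.is_file():
            key = object_key(local_root, file_path, prefix)
            client.upload_file(str(file_path), BUCKET, key)
            count += 1
    return count
```

Calling `upload_directory(Path("/data/recordings"), prefix="2024")` would, under these assumptions, copy each file into the bucket under a key that mirrors its local folder structure, which keeps recordings from different sites and dates easy to find later.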
The default bucket setting for UW–Madison Research Object Storage (S3) accounts restricts access to connections from the UW–Madison campus network or from off campus through a VPN. However, other bucket settings can be requested that allow access by individuals beyond the campus network and VPN, which is useful if you are hosting content that needs to be reachable from across the internet.
That is one of the ways that the State Cartographer’s Office, for example, has created a database of aerial photography and allows people to access the images directly from weblinks.
Zuzana Buřivalová, Nelson Institute professor and assistant professor of environmental studies and researcher in the Center for Sustainability and the Global Environment with affiliations in the College of Agricultural and Life Sciences Department of Forest and Wildlife Ecology, relies on UW data storage options for her research with her Sound Forest Lab, which uses soundscapes – recordings of sounds that animals make – to investigate how tropical forests can stay safe and sound.
The Sound Forest Lab uses Research Object Storage (S3) to host audio files for the Geospatial Soundscape Database. They use Sharedigm (created by Abe Megahed, software developer from the Data Science Institute) as an S3 front end for the application (learn more about the project: https://dsi.wisc.edu/tools/soundforestlab/).
“Research Object Storage (S3) may not be appropriate for all research groups,” Schacher says. “Researchers should discuss it as an option with their IT support to determine if it is the right fit for their data.”
Research Object Storage (S3) is currently not available for restricted data, but work is underway to join it with the UW’s data transfer and sharing service, Globus. This integration with Globus will pave the way for support for restricted data types, such as ePHI.
Given the massive amount of data the Peery Lab collects each year, Winiarski says they are seeing a significant benefit, roughly half the cost, from moving the acoustic recordings to Research Object Storage (S3) once spotted owls and other species are identified by BirdNET.
“Compared to storing the data using other means, such as external hard drives, Research Object Storage (S3) provides a very accessible and reliable way of archiving the data over the long-term, allowing us to ask more questions or re-analyze the audio recordings with new tools as they become available,” Winiarski says. “Research Object Storage (S3) offers a cost-effective solution to data storage, which would otherwise be a barrier to leveraging these acoustic technologies and conducting biodiversity monitoring and research at this scale.”
Workshop dates:
- Research Object Storage (S3): An overview for researchers & support staff (FREE and online)
  - Date: Thu, February 27, 2025 from 10am to 11am CST
  - Where: Virtual via Zoom
  - More information: https://it.wisc.edu/research-ci/research-object-storage-s3-presentation/
- Research Object Storage (S3): A deep dive for technical support staff (FREE and online)
  - Date: Thu, March 6, 2025 from 10am to 12pm CST
  - Where: Virtual via Zoom
For more information:
Eligible researchers may request an account by completing the Research Object Storage (S3) Request Form.
More information about Research Object Storage (S3), including considerations for ensuring appropriate access to data, is available at: https://kb.wisc.edu/researchdata/search.php?cat=14553
By Natasha Kassulke, natasha.kassulke@wisc.edu
###