Welcome!
The California Cooperative Oceanic Fisheries Investigations (CalCOFI) program has been conducting marine ecosystem surveys in the California Current since 1949. More information about the program can be found on the CalCOFI website.
The purpose of this Shiny app is to provide scientists with an interactive tool for visualizing marine mammal data collected on CalCOFI cruises. Here we integrate multiple data streams, highlighting how marine mammal visual sightings and eDNA detections are distributed through time and space. Acoustic detections will be added in a future release. By integrating visual, acoustic, and genetic sampling methods, we hope to better understand how well each method detects marine mammals in their environment.
How is CalCOFI Data Collected?
Below is an interactive infographic that details the sampling processes for collecting visual, eDNA, and bioacoustic data.
Click or hover on different parts of the visual to learn more about them!
Interactive Species Map
CalCOFI Bioacoustics
CalCOFI sonobuoy deployment
Bioacoustics is the study of the sounds produced by marine mammals and of their underwater acoustic environment. Bioacoustic studies involve collecting, analyzing, and interpreting these sounds to gain insight into marine mammal behavior, ecology, and conservation.
Spectrogram with all identifiable calls present (blue whale A, B, and D calls / fin whale 20 Hz and 40 Hz calls)
How is Bioacoustic Data Collected and Visualized?
CalCOFI ships deploy sonobuoys at varying depths to record marine mammal sounds for hours at a time. These devices typically sample the incoming sound waves 2,000 times per second, and the recordings are saved as wav files for further processing. A series of Fourier transforms is then applied to short, overlapping segments of the recorded signal to convert the data from the time domain (amplitude varying with time) to the frequency domain, from which additional features such as the signal's frequency spectrum and magnitude can be extracted. Using this information, a spectrogram is constructed by piecing the transformed segments together into one continuous plot that gives a time-varying representation of the signal's frequency content: time is on the x-axis, frequency is on the y-axis, and magnitude is represented by color. The end product is a colorful visual capture of the recorded sound, containing anything from whale songs to white noise from ships, as seen in the figure above.
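As a rough illustration of that process (not the project's production code), a spectrogram can be computed from a wav file with SciPy's short-time Fourier transform utilities; the file name and window parameters below are assumptions for the example.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

# hypothetical sonobuoy recording sampled at ~2,000 samples per second
fs, audio = wavfile.read('sonobuoy_recording.wav')

# short, overlapping Fourier transforms convert the signal from the
# time domain to the frequency domain
freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=1024, noverlap=512)

# time on the x-axis, frequency on the y-axis, magnitude (in dB) as color
plt.pcolormesh(times, freqs, 10 * np.log10(Sxx + 1e-12), shading='gouraud')
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.colorbar(label='Power (dB)')
plt.show()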
Listen to actual recordings by pressing the play buttons!
Faster R-CNN Deep Learning Model
A Faster R-CNN ResNet-50 model was built to automatically identify and classify whale calls, taking as input spectrograms generated from the wav files of collected bioacoustic data. Although functional, the model's accuracy suffered, most likely because of the abundance of noise and irrelevant signals scattered throughout the data, as shown below. These artifacts made it difficult for the model to distinguish actual calls from unwanted visual features.
Noisy spectrogram containing blue whale B call
The solution was to build a preprocessing pipeline that eliminates this noise and, consequently, improves both model runtime efficiency and classification accuracy. The following is a small-scale demonstration of the final preprocessing pipeline, followed by training of the Faster R-CNN ResNet-50 model, with code and output examples:
Step 1: Preprocessing
Step 1a: Organizing the Data
We first iterate through our training set of spectrogram images, flattening each one into a 1-dimensional array of pixel intensity values. The arrays are then stacked vertically to create a single matrix containing all of the data. An example of an unmodified spectrogram from the original dataset is shown below.
# annotations saved in the 'unique_annotation' DataFrame
# image files located via the 'spectrogram_path' column
import numpy as np
from PIL import Image

data_matrix = []
for index, row in unique_annotation.iterrows():
    image = Image.open(row['spectrogram_path'])
    pixel_values = np.array(image).flatten()   # 1-D array of pixel intensities
    data_matrix.append(pixel_values)

# stack all flattened images into one (n_images x n_pixels) matrix
original_data = np.vstack(data_matrix)
Step 1b: Principal Component Analysis (PCA)
To denoise the spectrograms and cut down on computation time, we performed PCA on the training set. Singular value decomposition (SVD) factors the data matrix into three sub-matrices whose product reconstructs the original data: U (left singular vectors), S (singular values), and T (right singular vectors, which serve as the principal-component features). Below is an example of a few features extracted from matrix T. In theory, recombining those 10 components, each weighted by the corresponding entries of U and S, would reconstruct something very closely resembling the original observation.
import matplotlib.pyplot as plt

# factor the data matrix with SVD ('scaler' was fit to the data earlier, not shown)
U, S, T = np.linalg.svd(original_data, full_matrices=False)
US = U * S              # weights of each component for each observation
svd_data = US @ T       # reconstruction from all components
svd_data_scaled = scaler.inverse_transform(svd_data)

# display the first 10 components as images
for i in range(0, 10):
    one_component = T[i]
    plt.subplot(2, 5, i + 1)
    draw_img_single(one_component)
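The draw_img_single helper used above is defined elsewhere in the project; a minimal stand-in, assuming each flattened component corresponds to a 141 x 601 pixel spectrogram (the shape used in later steps), might look like this:

def draw_img_single(flat_component, shape=(141, 601)):
    # reshape a flattened component back to image dimensions and display it
    plt.imshow(flat_component.reshape(shape), cmap='gray', aspect='auto')
    plt.axis('off')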
Step 1c: Noise Reduction
With the artifacts and white noise now separated from the signals of interest, we perform a column-wise background subtraction on the component images, along with median blurring, for further noise reduction. This removes the unwanted vertical columns and residual noise from the spectrograms while accentuating the stronger, brighter whale calls. We can then use these filtered principal components to reconstruct much cleaner spectrograms.
from scipy.ndimage import median_filter

signal_enhanced_features = np.zeros_like(T)
for i in range(len(T)):
    # reshape the flattened component back to image dimensions
    feature = np.copy(T[i].reshape((141, 601)))
    feature = median_filter(feature, size=3)   # median blur to suppress speckle noise
    # column-wise background subtraction: subtract each column's 10th percentile
    for j in range(feature.shape[1]):
        column = feature[:, j]
        percentile_value = np.percentile(column, 10)
        feature[:, j] = column - percentile_value
        feature[:, j][feature[:, j] < 0] = 0   # clip negative values
    signal_enhanced_features[i] = feature.flatten()
# display the first 10 noise-reduced components
for i in range(0, 10):
    one_component = signal_enhanced_features[i]
    plt.subplot(2, 5, i + 1)
    draw_img_single(one_component)
Step 1d: Reconstruction
We reconstruct the full spectrogram images by combining the U and S matrices from Step 1b with the 'signal_enhanced_features' created in Step 1c. This time, however, we keep only the first 150 principal components, since those were sufficient to reconstruct the original images almost perfectly. The resulting reconstructions contain almost no vertical artifacts and give the model significantly fewer extraneous features to analyze.
# reconstruct using only the first 150 noise-reduced principal components
matrix = US[:, 0:150] @ signal_enhanced_features[0:150, :]
matrix_scaled = scaler.inverse_transform(matrix)
matrix_scaled = np.where(matrix_scaled < 0, 0, matrix_scaled)   # clip negative values
Step 1e: Noise Reduction on Reconstructions
Lastly, we apply another simple column-wise subtraction to these reconstructions to produce the finalized preprocessed images. Below is the fin whale 40 Hz pulse image from Step 1a, shown before and after preprocessing.
from pathlib import Path

matr_sub = np.zeros_like(matrix_scaled)
for i in range(len(matrix_scaled)):
    spec = np.copy(matrix_scaled[i].reshape((141, 601)))
    # column-wise background subtraction: subtract each column's 60th percentile
    for j in range(spec.shape[1]):
        column = spec[:, j]
        percentile_value = np.percentile(column, 60)
        spec[:, j] = column - percentile_value
        spec[:, j][spec[:, j] < 0] = 0   # clip negative values
    matr_sub[i] = spec.flatten()

# convert the processed arrays back to grayscale images and save them
for i in range(len(matr_sub)):
    processed_image = matr_sub[i].reshape(141, 601)
    image = Image.fromarray(processed_image.astype(np.uint8), 'L')
    image.save(Path(directory_path) / Path(filenames[i]))
The final preprocessed arrays are converted back into images and saved to the directory path specified by the user, ready to be used for model training.
Step 2: Training the Model
We first read our preprocessed spectrogram images for training and validation into data loaders.
import torch
from torch.utils.data import DataLoader

train_d1 = DataLoader(AudioDetectionData(csv_file='../labeled_data/train_val_test_annotations/train.csv'),
                      batch_size=16,
                      shuffle=True,
                      collate_fn=custom_collate,
                      pin_memory=True if torch.cuda.is_available() else False)

val_d1 = DataLoader(AudioDetectionData(csv_file='../labeled_data/train_val_test_annotations/val.csv'),
                    batch_size=1,
                    shuffle=False,
                    collate_fn=custom_collate,
                    pin_memory=True if torch.cuda.is_available() else False)
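AudioDetectionData and custom_collate are project-specific helpers not shown here. A minimal sketch of what they might look like, assuming each CSV row holds a spectrogram path, one bounding box, and a class label (the column names below are illustrative assumptions):

import pandas as pd
from PIL import Image
from torchvision.transforms.functional import to_tensor

class AudioDetectionData(torch.utils.data.Dataset):
    # hypothetical dataset: one annotation per row of the CSV
    def __init__(self, csv_file):
        self.annotations = pd.read_csv(csv_file)

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, idx):
        row = self.annotations.iloc[idx]
        # Faster R-CNN expects 3-channel input by default
        img = to_tensor(Image.open(row['spectrogram_path']).convert('RGB'))
        boxes = torch.tensor([[row['xmin'], row['ymin'], row['xmax'], row['ymax']]],
                             dtype=torch.float32)
        labels = torch.tensor([row['label']], dtype=torch.int64)
        return img, {'boxes': boxes, 'labels': labels}

def custom_collate(batch):
    # keep variable-sized (image, target) pairs as a list instead of stacking
    return list(batch)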
Next we load the pre-trained Faster R-CNN model with a ResNet-50 backbone and a Feature Pyramid Network (FPN), replacing the final classification layer (box predictor) so it outputs the five call classes plus a background class (six classes total).
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
num_classes = 6   # 5 whale call classes + 1 background class
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
The pretrained Faster R-CNN architecture comes from PyTorch's torchvision package; below we specify its training parameters. The optimizer used to minimize the loss is Stochastic Gradient Descent with a learning rate of 0.001, momentum of 0.9, and weight decay of 0.0005, and the model is trained for 20 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
num_epochs = 20
With our model ready, we can finally start training, during which it iteratively learns to correctly identify and label the whale calls in a given spectrogram. Training loss is logged after every epoch to track how well the model is fitting as its parameters are updated based on the loss function.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

for epoch in range(num_epochs):
    model.train()
    epoch_train_loss = 0
    for data in train_d1:
        # move each image and its target annotations to the device
        imgs = []
        targets = []
        for d in data:
            imgs.append(d[0].to(device))
            targ = {}
            targ['boxes'] = d[1]['boxes'].to(device)
            targ['labels'] = d[1]['labels'].to(device)
            targets.append(targ)
        # in training mode the model returns a dictionary of losses
        loss_dict = model(imgs, targets)
        loss = sum(v for v in loss_dict.values())
        epoch_train_loss += loss.cpu().detach().numpy()
        # backpropagate and update the model's parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f'training loss: {epoch_train_loss}')
Now that the model is trained, we can use our validation dataset to evaluate the performance of the model using the mean Average Precision (mAP) score.
# evaluate on the validation set with a helper that computes mAP
with torch.no_grad():
    map_value = validation(val_d1, device, model)
print(f'Validation epoch {epoch} mAP: {map_value}')
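The validation helper above is defined elsewhere in the project. One way such a function could compute mAP, for example with the torchmetrics package (an assumption, not necessarily what the project uses), is sketched below; the caller is expected to wrap it in torch.no_grad() as above.

from torchmetrics.detection import MeanAveragePrecision

def validation(loader, device, model):
    # hypothetical sketch: run the model in evaluation mode and accumulate mAP
    model.eval()
    metric = MeanAveragePrecision()
    for data in loader:
        imgs = [d[0].to(device) for d in data]
        targets = [d[1] for d in data]          # ground-truth boxes and labels (CPU)
        preds = model(imgs)                     # in eval mode the model returns detections
        preds = [{k: v.cpu() for k, v in p.items()} for p in preds]
        metric.update(preds, targets)
    return metric.compute()['map']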
The results on our test data can be visualized with a precision-recall curve, which describes how well the model classifies each call type; a sketch of how such a curve can be generated follows the figure below.
Precision-Recall curve of preprocessed data
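As a rough illustration, per-class precision and recall can be computed with scikit-learn's precision_recall_curve, assuming each detection of a given call type has a confidence score and a flag for whether it matched a ground-truth box (e.g. at IoU >= 0.5). The arrays below are placeholders, and a full detection evaluation would also count missed ground-truth boxes toward recall.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

# placeholder detections for one call type: model confidences and whether
# each detection matched a ground-truth box
scores  = np.array([0.95, 0.90, 0.80, 0.65, 0.40, 0.30])
matched = np.array([1, 1, 0, 1, 0, 1])

precision, recall, thresholds = precision_recall_curve(matched, scores)

plt.plot(recall, precision)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall curve (illustrative)')
plt.show()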
CalCOFI Marine Mammal Visual Survey Data
Marine mammal visual line-transect surveys have been conducted on quarterly CalCOFI cruises since 2004. Visual surveys are conducted during daylight hours while the ship is in transit between CalCOFI stations. More information about the visual survey protocol can be found in Campbell et al. (2015). Per-cruise marine mammal visual survey effort can be displayed by clicking ‘Display Visual Effort’. Additionally, sighting group-size estimates can be shown by selecting a species from the drop-down menu; circle size on the map is proportional to group size. Only cetacean sightings are included in this interactive map. Selecting a sighting on the map opens a pop-up with more information about that specific sighting.
CalCOFI Marine Mammal eDNA Data
The NOAA CalCOFI Genomic Program (NCOG) has collected environmental DNA (eDNA) samples since 2014. Here, metabarcoding assays are used to detect cetacean species from water samples collected at 10, 20, or 40 meters depth. The ‘Display eDNA’ function plots eDNA sampling effort as opaque black circles and eDNA detections as green DNA helices. Selecting a detection on the map opens a pop-up with more information about that specific detection.
UCSB Data Science Capstone
The Data Science Capstone is a three-course sequence at the University of California, Santa Barbara (UCSB) in which students engage in project-based learning with data-intensive methodologies, with the aim of making a positive impact on the world. For their project, seven students from the program upgraded this Shiny app to incorporate new data, improve functionality, and enhance the user experience.
Ocean Observing in California Conference
Celebrate the Past, Showcase the Present, Envision the Future
This conference celebrated the rich history of ocean observation, marking the 75th anniversary of CalCOFI and the 20th anniversaries of CeNCOOS and SCCOOS. Its goals were to share insights on the California Current, showcase successful collaborations between scientists and data users, and strengthen relationships across the ocean community.
The poster can be accessed here.
Affiliations
1UC Santa Barbara, Santa Barbara, California 93106.
2Scripps Institution of Oceanography, 8622 Kennel Way, La Jolla, CA 92037.
Funding Sources
This material is based upon research supported by the Office of Naval Research under Award Number (N00014-22-1-2719).
Office of Naval Research, US Navy Pacific Fleet