Tutorial 2
Retrieving data from the database¶
In this tutorial, we show how to retrieve data from a (rather minimal) Korus example database. Jump to the next tutorial (#3), to see how the database was created and populated with data.
Connecting to the database ¶
We begin by importing the necessary modules and opening a connection to the example database,
from korus.database import SQLiteDatabase
db = SQLiteDatabase("db_t3.sqlite")
Viewing annotation taxonomies ¶
As you may recall from Tutorial 1, the taxonomy used by the acoustic analyst for labelling sound sources and sound types, is saved to the Korus database. The taxonomy can be readily retrieved with,
tax = db.taxonomy.current #retrieve the latest version of the taxonomy
The show method helps us visualize the node structure of the taxonomy,
tax.show(append_name=True) #sound-source tree
Unknown
├── Anthro [Anthropogenic]
│ └── Boat
│ ├── Engine
│ └── Prop [Propeller]
└── Bio [Biological]
└── Whale
├── HW [Humpback whale]
└── NARW [North Atlantic right whale]
Retrieving annotations ¶
Korus contains a filter function for retrieving annotations based on various search criteria. For example, we can use the filter function to retrieve annotations with a specific (sound-source, sound-type) label,
# it's always a good idea to call reset_filter() before running a new search
indices = db.annotation.reset_filter().filter(select=("NARW","Upcall")).indices
print(indices)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
We now have the indices of the annotations that match our search criteria. We can use the get method to obtain the data associated with these annotations as a Pandas DataFrame,
df = db.annotation.get(indices, fields=["start_utc", "duration", "label"], as_pandas=True)
print(df.to_string())
start_utc duration label 0 2013-06-23 08:18:47.456000+00:00 3.0 (NARW, LU) 1 2013-06-23 08:19:12.142000+00:00 3.0 (NARW, LU) 2 2013-06-23 08:19:55.394000+00:00 3.0 (NARW, LU) 3 2013-06-23 08:20:26.258000+00:00 3.0 (NARW, LU) 4 2013-06-23 08:22:36.797000+00:00 3.0 (NARW, LU) 5 2013-06-23 08:23:56.098000+00:00 3.0 (NARW, LU) 6 2013-06-23 08:24:47.904000+00:00 3.0 (NARW, LU) 7 2013-06-23 08:25:10.286000+00:00 3.0 (NARW, LU) 8 2013-06-23 08:25:29.211000+00:00 3.0 (NARW, LU) 9 2013-06-23 08:25:35.196000+00:00 3.0 (NARW, LU) 10 2013-06-23 08:28:32.988000+00:00 3.0 (NARW, LU) 11 2013-06-23 08:29:26.867000+00:00 3.0 (NARW, LU) 12 2013-06-23 08:29:36.451000+00:00 3.0 (NARW, LU) 13 2013-06-23 08:31:06.766000+00:00 3.0 (NARW, LU) 14 2013-06-23 08:41:27.124000+00:00 3.0 (NARW, LU) 15 2013-06-23 08:42:34.557000+00:00 3.0 (NARW, LU) 16 2013-06-23 08:42:49.057000+00:00 3.0 (NARW, LU) 17 2013-06-24 08:01:07.471000+00:00 3.0 (NARW, LU) 18 2013-06-24 08:01:44.545000+00:00 3.0 (NARW, LU) 19 2013-06-24 08:17:35.633000+00:00 3.0 (NARW, LU) 20 2013-06-24 08:17:45.900000+00:00 3.0 (NARW, LU) 21 2013-06-24 08:21:29.181000+00:00 3.0 (NARW, LU) 22 2013-06-24 08:22:57.573000+00:00 3.0 (NARW, LU) 23 2013-06-24 08:23:47.266000+00:00 3.0 (NARW, LU) 24 2013-06-24 08:27:42.240000+00:00 3.0 (NARW, LU) 25 2013-06-24 08:27:55.300000+00:00 3.0 (NARW, LU)
Note that although we filtered on the sound type 'Upcall', as defined in 2nd version of the taxonomy, the filter function still helped us find all the annotations that had the "old" sound-type label, 'LU', used in the 1st version of the taxonomy as a label for 'loud' upcalls.
Creating selection windows ¶
Korus also has a function for creating uniform-size windows from a set of annotations, which is useful for creating training sets for machine learning models.
df = db.annotation.create_selections(indices, window=3.0)
print(df.to_string())
sel_id filename start end annot_id 0 0 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1127.340 1130.340 0 1 1 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1152.026 1155.026 1 2 2 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1195.278 1198.278 2 3 3 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1226.142 1229.142 3 4 4 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1356.681 1359.681 4 5 5 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1435.982 1438.982 5 6 6 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1487.788 1490.788 6 7 7 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1510.170 1513.170 7 8 8 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1529.095 1532.095 8 9 9 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1535.080 1538.080 9 10 10 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1712.872 1715.872 10 11 11 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1766.751 1769.751 11 12 12 /home/mrmoose/acoustic-data/20130623/audio_20130623T080000.116Z.flac 1776.335 1779.335 12 13 13 /home/mrmoose/acoustic-data/20130623/audio_20130623T083000.117Z.flac 66.649 69.649 13 14 14 /home/mrmoose/acoustic-data/20130623/audio_20130623T083000.117Z.flac 687.007 690.007 14 15 15 /home/mrmoose/acoustic-data/20130623/audio_20130623T083000.117Z.flac 754.440 757.440 15 16 16 /home/mrmoose/acoustic-data/20130623/audio_20130623T083000.117Z.flac 768.940 771.940 16 17 17 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 67.353 70.353 17 18 18 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 104.427 107.427 18 19 19 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1055.515 1058.515 19 20 20 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1065.782 1068.782 20 21 21 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1289.063 1292.063 21 22 22 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1377.455 1380.455 22 23 23 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1427.148 1430.148 23 24 24 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1662.122 1665.122 24 25 25 /home/mrmoose/acoustic-data/20130624/audio_20130624T080000.118Z.flac 1675.182 1678.182 25
The table returned by the create_selections method can readily be ingested into Ketos!
Ending the session ¶
As the last step of the tutorial, we close the connection to the database.
db.backend.close()