Tutorial 1
Taxonomies
Introduction
Annotation/classification taxonomies are based on a simple, tree-like data structure in which smaller categories derive from larger ones, or more specific concepts derive from more general ones, etc. For example, a killer whale is a particular type of toothed whale, which in turn is a type of whale, which is a type of marine mammal, etc.
Such taxonomies are useful for two reasons. First, they provide a standard vocabulary for labelling sounds, which ensures that labels are consistent across annotation efforts (e.g. killer whales are consistently tagged as 'KW' rather than a mixture of 'killer whale', 'orca', 'KW', etc.). Second, their hierarchical structure provides a recipe for combining sets of annotations that employ different levels of specificity (e.g. 'killer whale' and 'toothed whale').
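To make the second point concrete, here is a minimal sketch in plain Python (it does not use Korus) of how a hierarchy lets annotations of different specificity be combined. The `PARENT` mapping, the tags other than 'KW', and the helper names `ancestors` and `roll_up` are illustrative assumptions, not part of the Korus API.

```python
# Hypothetical child -> parent mapping mirroring the whale example above.
PARENT = {
    "KW": "TW",      # killer whale -> toothed whale
    "TW": "Whale",   # toothed whale -> whale
    "Whale": "MM",   # whale -> marine mammal
    "MM": None,      # top of this toy hierarchy
}

def ancestors(tag):
    """Return the chain of tags from `tag` up to the root, inclusive."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = PARENT[tag]
    return chain

def roll_up(tag, target):
    """Map `tag` to `target` if `target` is `tag` itself or one of its ancestors."""
    return target if target in ancestors(tag) else None

# Two annotation efforts that used different levels of specificity
labels_a = ["KW", "KW", "TW"]
labels_b = ["TW", "Whale"]

# Express every label at the toothed-whale level where that is possible
combined = [roll_up(tag, "TW") for tag in labels_a + labels_b]
print(combined)
```

In this sketch, a label that is already more general than the target (here 'Whale') cannot be made more specific and maps to `None`; rolling everything up to 'Whale' instead would retain all five labels.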
When annotating acoustic data, it is customary to use not one, but two 'tags' to label every sound: one tag to specify the sound's source (e.g. a killer whale) and another tag to specify the type of sound (e.g. a tonal call). A taxonomy for acoustic data needs to address both of these. In Korus, we do not enforce a 'universal' taxonomy of sound types shared by all sound sources. Instead, every sound source can have its own taxonomy of sound types.
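The two-tag idea can be sketched in plain Python as follows. This is only an illustration of the concept: the `Annotation` class, the `SOUND_TYPES` mapping, the `is_valid` helper, and the tags shown are assumptions made for this example and are not taken from Korus.

```python
from dataclasses import dataclass

# Hypothetical per-source vocabularies of sound types (illustrative tags only).
SOUND_TYPES = {
    "KW": {"TC", "Click"},   # killer whale: tonal call, echolocation click
    "Boat": {"Hum"},         # boat: engine hum
}

@dataclass(frozen=True)
class Annotation:
    source: str       # what produced the sound
    sound_type: str   # what kind of sound it is

def is_valid(annotation):
    """True if the sound type belongs to the vocabulary of the given source."""
    return annotation.sound_type in SOUND_TYPES.get(annotation.source, set())

print(is_valid(Annotation("KW", "TC")))    # True: TC is defined for KW
print(is_valid(Annotation("Boat", "TC")))  # False: TC is not defined for Boat
```

Because each source carries its own vocabulary, the same sound-type tag can be meaningful for one source and invalid for another.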
Getting ready
We begin by importing the necessary modules, classes, functions, etc.
from korus.tax import AcousticTaxonomy
Creating a taxonomy
The first step is to create an instance of the AcousticTaxonomy class. To do so, we must give the taxonomy a name and specify the path to the Korus database where we want to store it. This can be an existing Korus database, already containing some data, or it can be a new, empty database.
path_db = "tax_t1.sqlite" #filename for a new, empty database
Now we can create an instance of the AcousticTaxonomy class, specifying a name for our taxonomy.
tax = AcousticTaxonomy(name="my-first-taxonomy", path=path_db, overwrite=True)
The database path and the taxonomy name are stored as attributes of the taxonomy object, and can be viewed at any time with
print(tax.name)
print(tax.path)
my-first-taxonomy
tax_t1.sqlite
Adding sound sources
We begin by adding a top-level node to the taxonomy tree, encompassing all biological sound sources,
node = tax.create_sound_source("Bio", name="Biological", description="Any sound-producing animal")
The first argument is the tag that will be used for annotating sounds from the source. Ideally, the tag should be as short as possible while still readily intelligible. In addition to the tag, we also specify the full name of the sound source and provide a slightly more wordy description. Both the name and description arguments are optional while the tag argument is required.
The create_sound_source method returns the 'node' object just created. Let us take a look at some of its attributes,
print(node.tag)
print(node.data)
Bio
{'name': 'Biological', 'description': 'Any sound-producing animal', 'sound_types': <korus.tree.KTree object at 0x7fc63aa62ac0>}
Next, let us add some more specific biological sound sources
tax.create_sound_source("Whale", parent="Bio")
tax.create_sound_source("NARW", parent="Whale", name="North Atlantic right whale")
tax.create_sound_source("HW", parent="Whale", name="Humpback whale")
Node(tag=HW, identifier=dcb5aafa-2cd0-11ee-a5ac-f1d0a21c7fd4, data={'name': 'Humpback whale', 'sound_types': <korus.tree.KTree object at 0x7fc63aa7a4f0>})
Note how we use the parent argument to indicate the relationship between the various sound-source categories. For example, Whale is a particular instance of the more general category Bio, while NARW and HW in turn are particular instances of Whale.
Let us add another branch to the taxonomy, for anthropogenic sound sources,
tax.create_sound_source("Anthro", name="Anthropogenic", description="Any sound-producing human activity or artefact")
tax.create_sound_source("Boat", parent="Anthro")
tax.create_sound_source("Engine", parent="Boat")
tax.create_sound_source("Prop", parent="Boat", name="Propeller")
Node(tag=Prop, identifier=dcb5aafe-2cd0-11ee-a5ac-f1d0a21c7fd4, data={'name': 'Propeller', 'sound_types': <korus.tree.KTree object at 0x7fc698eda820>})
Finally, let us take a look at the taxonomy that we have just created,
tax.show(append_name=True)
Unknown
├── Anthro [Anthropogenic]
│   └── Boat
│       ├── Engine
│       └── Prop [Propeller]
└── Bio [Biological]
    └── Whale
        ├── HW [Humpback whale]
        └── NARW [North Atlantic right whale]
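To illustrate how parent links alone determine the shape of such a tree, here is a small stand-alone sketch in plain Python. It is not the Korus implementation: the `children` dictionary and the `create` and `render` helpers are hypothetical. Attaching parentless nodes to a root called 'Unknown' and sorting siblings alphabetically are assumptions chosen to mimic the output shown above.

```python
# Toy stand-in for the sound-source tree; NOT the Korus implementation.
ROOT = "Unknown"
children = {ROOT: []}   # tag -> list of child tags

def create(tag, parent=ROOT):
    """Attach `tag` under `parent` (under the root if no parent is given)."""
    if parent not in children:
        raise KeyError(f"unknown parent: {parent}")
    if tag in children:
        raise ValueError(f"duplicate tag: {tag}")
    children[parent].append(tag)
    children[tag] = []

def render(tag=ROOT):
    """Return the lines of a text drawing of the subtree rooted at `tag`."""
    lines = [tag]

    def walk(node, prefix):
        kids = sorted(children[node])   # assumption: siblings shown alphabetically
        for i, kid in enumerate(kids):
            last = i == len(kids) - 1
            lines.append(prefix + ("└── " if last else "├── ") + kid)
            walk(kid, prefix + ("    " if last else "│   "))

    walk(tag, "")
    return lines

# Same sequence of sound sources as in the tutorial, tags only
create("Bio")
create("Whale", parent="Bio")
create("NARW", parent="Whale")
create("HW", parent="Whale")
create("Anthro")
create("Boat", parent="Anthro")
create("Engine", parent="Boat")
create("Prop", parent="Boat")

print("\n".join(render()))
```

The order in which the nodes are created does not matter in this sketch, as long as each parent exists before its children are added.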
Adding sound types
Having created a small taxonomy of sound sources, it is now time to define some sound types. As mentioned in the introduction, every sound source can have its own taxonomy of sound types.
We begin by creating a sound type named 'Tonal call',
node = tax.create_sound_type("TC", source_tag="Whale", name="Tonal call", description="A sound with tonal components")
Note that in order to create the sound type, we had to associate it with a sound source. Here, we chose to associate the sound type 'Tonal call' with the sound source 'Whale' (and all its descendants).
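The phrase "and all its descendants" can be pictured with a short plain-Python sketch. The names `SOURCE_PARENT`, `OWN_TYPES`, and `sound_types_for` are hypothetical, and 'GS' stands in for a sound type defined only for NARW; how Korus propagates sound types internally is not shown here.

```python
# Hypothetical source hierarchy (child -> parent); NOT the Korus implementation.
SOURCE_PARENT = {"Bio": None, "Whale": "Bio", "NARW": "Whale", "HW": "Whale"}

# Sound types defined directly on a given source
OWN_TYPES = {"Whale": {"TC"}, "NARW": {"GS"}}

def sound_types_for(source):
    """Collect sound types defined on `source` and on all of its ancestors."""
    types = set()
    while source is not None:
        types |= OWN_TYPES.get(source, set())
        source = SOURCE_PARENT[source]
    return types

print(sorted(sound_types_for("NARW")))  # TC inherited from Whale, GS defined on NARW
print(sorted(sound_types_for("HW")))    # TC inherited from Whale
print(sorted(sound_types_for("Bio")))   # nothing defined at or above Bio
```

In this picture a sound type flows downwards only: defining 'TC' on Whale makes it available to NARW and HW, but not to Bio.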
Let us add a few more sound types, specific to the North Atlantic right whale,
node = tax.create_sound_type("LU", source_tag="NARW", parent="TC", name="Loud Upcall", description="A loud upsweep with a typical duration of around 1 s and frequency range of 100-200 Hz")
node = tax.create_sound_type("FU", source_tag="NARW", parent="TC", name="Faint Upcall", description="A faint upcall with a typical duration of around 1 s and frequency range of 100-200 Hz")
node = tax.create_sound_type("GS", source_tag="NARW", name="Gun shot")
Finally, let's inspect the NARW sound-type taxonomy that we have just created
tax.sound_types("NARW").show(append_name=True)
Unknown
├── GS [Gun shot]
└── TC [Tonal call]
    ├── FU [Faint Upcall]
    └── LU [Loud Upcall]
Saving the taxonomy
Now that we are happy with our taxonomy, it is time to save it!
tax.save(comment="this is the first version") # version 1
Modifying or expanding a taxonomy
At a later time, we may decide to expand or modify our taxonomy. For example, we may have realized that faint upcalls (FU) and loud upcalls (LU) are in fact the same call type, produced by individuals at different distances from our hydrophone. Therefore, we now want to merge these two sound types into a single, common node.
We begin by loading the taxonomy from the database file,
tax = AcousticTaxonomy.load(path="tax_t1.sqlite", name="my-first-taxonomy", version=1)
and reminding ourselves what the current tree structure looks like
tax.show(append_name=True) # print all sound sources
Unknown
├── Anthro [Anthropogenic]
│   └── Boat
│       ├── Engine
│       └── Prop [Propeller]
└── Bio [Biological]
    └── Whale
        ├── HW [Humpback whale]
        └── NARW [North Atlantic right whale]
tax.sound_types("NARW").show(append_name=True) # print sound types associated with NARW
Unknown
├── GS [Gun shot]
└── TC [Tonal call]
    ├── FU [Faint Upcall]
    └── LU [Loud Upcall]
Let us now merge the faint upcall and loud upcall nodes into a single node,
# merge the two nodes
tax.merge_sound_types(
    tag="Upcall", #tag for the new, merged node
    source_tag="NARW", #sound source
    children=["FU", "LU"], #the nodes to be merged
    remove=True, #whether to remove the child nodes after merging
)
# view the result
tax.sound_types("NARW").show(append_name=True)
Unknown
├── GS [Gun shot]
└── TC [Tonal call]
    └── Upcall
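The effect of the merge can be pictured with a toy child-to-parent mapping in plain Python. This is a sketch of the idea only, not the Korus implementation: the `parent` dictionary and the `merge` helper are hypothetical, and the behaviour shown for `remove=False` (keeping the original nodes as children of the merged node) is an assumption inferred from the argument names.

```python
# Toy sound-type tree as child -> parent links; NOT the Korus implementation.
parent = {"TC": "Unknown", "GS": "Unknown", "FU": "TC", "LU": "TC"}

def merge(tag, children, remove=True):
    """Create `tag` under the children's common parent.

    With remove=True the merged nodes are dropped from the tree; otherwise
    (an assumption) they are kept as children of the new node. Returns a
    mapping old tag -> new tag that could be used to relabel annotations.
    """
    parents = {parent[c] for c in children}
    if len(parents) != 1:
        raise ValueError("nodes to be merged must share a parent")
    parent[tag] = parents.pop()
    for c in children:
        if remove:
            del parent[c]
        else:
            parent[c] = tag
    return {c: tag for c in children}

relabel = merge("Upcall", ["FU", "LU"], remove=True)
print(parent)
print(relabel)
```

The returned mapping hints at why merging differs from simply deleting nodes: labels made with the old tags remain interpretable because each one has a well-defined replacement.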
Finally, we must not forget to save the modified taxonomy.
tax.save(comment="merged FU and LU sound types for NARW into a single Upcall sound type") # version 2
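Each call to save produces a new version, while earlier versions remain loadable (as the version=1 argument used earlier suggests). The sketch below illustrates that idea in plain Python with in-memory snapshots. How Korus actually stores versions in the database is not described in this tutorial, so the `VersionedTree` class and its methods are purely hypothetical.

```python
import copy

class VersionedTree:
    """Toy container that keeps one frozen snapshot per saved version."""

    def __init__(self):
        self.tree = {}        # working copy: child -> parent
        self._versions = []   # list of (comment, snapshot)

    def save(self, comment=""):
        """Store a snapshot of the working copy and return its version number."""
        self._versions.append((comment, copy.deepcopy(self.tree)))
        return len(self._versions)   # version numbers start at 1

    def load(self, version):
        """Return a copy of the tree as it was when `version` was saved."""
        _comment, snapshot = self._versions[version - 1]
        return copy.deepcopy(snapshot)

vt = VersionedTree()
vt.tree.update({"TC": "Unknown", "FU": "TC", "LU": "TC"})
v1 = vt.save("this is the first version")

# Modify the working copy, then save again
del vt.tree["FU"], vt.tree["LU"]
vt.tree["Upcall"] = "TC"
v2 = vt.save("merged FU and LU")

print(v1, vt.load(v1))   # version 1 still contains FU and LU
print(v2, vt.load(v2))   # version 2 contains Upcall instead
```

Keeping old versions intact matters for annotations that were made against an earlier version of the taxonomy, since their tags can still be looked up in the tree they were written for.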
This concludes the tutorial.