Class: Spotted::Models::AudioAnalysisRetrieveResponse::Segment

Inherits:
Internal::Type::BaseModel
Defined in:
lib/spotted/models/audio_analysis_retrieve_response.rb

Instance Attribute Summary

Method Summary

Methods inherited from Internal::Type::BaseModel

==, #==, #[], coerce, #deconstruct_keys, #deep_to_h, dump, fields, hash, #hash, inherited, #initialize, inspect, #inspect, known_fields, optional, recursively_to_h, required, #to_h, #to_json, #to_s, to_sorbet_type, #to_yaml

Methods included from Internal::Type::Converter

#coerce, coerce, #dump, dump, #inspect, inspect, meta_info, new_coerce_state, type_info

Methods included from Internal::Util::SorbetRuntimeSupport

#const_missing, #define_sorbet_constant!, #sorbet_constant_defined?, #to_sorbet_type, to_sorbet_type

Constructor Details

This class inherits a constructor from Spotted::Internal::Type::BaseModel

Instance Attribute Details

#confidence ⇒ Float?

The confidence, from 0.0 to 1.0, of the reliability of the segmentation. Segments of the song that are difficult to segment logically (e.g., noise) may correspond to low values in this field.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 281

optional :confidence, Float
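
Because `confidence` is optional and may be nil, code that filters on it needs to decide how to treat missing values. The following is a minimal sketch, not part of the library: `ConfidenceSegment` is a hypothetical stand-in struct for the real model, and `confident_segments` and its 0.5 default threshold are illustrative assumptions.

```ruby
# Hypothetical stand-in for the Segment model; only the fields used here.
ConfidenceSegment = Struct.new(:start, :confidence, keyword_init: true)

# Illustrative helper (an assumption, not a library method): keep segments
# whose confidence is present and at least `min_confidence`.
def confident_segments(segments, min_confidence: 0.5)
  segments.select { |s| !s.confidence.nil? && s.confidence >= min_confidence }
end

segments = [
  ConfidenceSegment.new(start: 0.0, confidence: 0.9),
  ConfidenceSegment.new(start: 0.25, confidence: 0.1),
  ConfidenceSegment.new(start: 0.75, confidence: nil)
]

kept = confident_segments(segments)
puts kept.map(&:start).inspect # => [0.0]
```

Dropping nil values is only one possible policy; a caller could equally treat them as zero confidence or keep them.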

#duration ⇒ Float?

The duration (in seconds) of the segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 287

optional :duration, Float

#loudness_end ⇒ Float?

The offset loudness of the segment in decibels (dB). This value should be equivalent to the loudness_start of the following segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 294

optional :loudness_end, Float
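
The stated relationship between `loudness_end` and the next segment's `loudness_start` can be checked over a list of segments. This is a sketch under assumptions: `LoudnessSegment` is a hypothetical stand-in struct, and `loudness_gaps` with its 0.01 dB tolerance is an illustrative helper rather than anything the library provides.

```ruby
# Hypothetical stand-in for the Segment model; only the fields used here.
LoudnessSegment = Struct.new(:loudness_start, :loudness_end, keyword_init: true)

# Illustrative helper (an assumption): returns each index i where segment i's
# loudness_end differs from segment i+1's loudness_start by more than
# `tolerance` dB. Pairs with a nil value on either side are skipped.
def loudness_gaps(segments, tolerance: 0.01)
  segments.each_cons(2).with_index.filter_map do |(current, following), i|
    next if current.loudness_end.nil? || following.loudness_start.nil?

    i if (current.loudness_end - following.loudness_start).abs > tolerance
  end
end

loudness_segments = [
  LoudnessSegment.new(loudness_start: -60.0, loudness_end: -20.0),
  LoudnessSegment.new(loudness_start: -20.0, loudness_end: -25.0),
  LoudnessSegment.new(loudness_start: -30.0, loudness_end: -40.0)
]

puts loudness_gaps(loudness_segments).inspect # => [1]
```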

#loudness_max ⇒ Float?

The peak loudness of the segment in decibels (dB). Together with `loudness_start` and `loudness_max_time`, it can be used to describe the “attack” of the segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 302

optional :loudness_max, Float

#loudness_max_time ⇒ Float?

The segment-relative offset of the segment's peak loudness, in seconds. Together with `loudness_start` and `loudness_max`, it can be used to describe the “attack” of the segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 310

optional :loudness_max_time, Float

#loudness_start ⇒ Float?

The onset loudness of the segment in decibels (dB). Together with `loudness_max` and `loudness_max_time`, it can be used to describe the “attack” of the segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 318

optional :loudness_start, Float
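
The documentation does not define a formula for the “attack”. One plausible way to combine the three loudness fields is the rise in dB from onset to peak divided by the time taken to reach the peak. The sketch below assumes that definition; `AttackSegment` is a hypothetical stand-in struct and `attack_slope` is an illustrative helper, not a library method.

```ruby
# Hypothetical stand-in for the Segment model; only the fields used here.
AttackSegment = Struct.new(:loudness_start, :loudness_max, :loudness_max_time,
                           keyword_init: true)

# Illustrative measure (an assumption, not defined by the API): attack slope
# in dB per second. Returns nil when a field is missing or the peak occurs
# at offset zero, where the slope is undefined.
def attack_slope(segment)
  values = [segment.loudness_start, segment.loudness_max, segment.loudness_max_time]
  return nil if values.any?(&:nil?)
  return nil unless segment.loudness_max_time.positive?

  (segment.loudness_max - segment.loudness_start) / segment.loudness_max_time
end

sharp = AttackSegment.new(loudness_start: -30.0, loudness_max: -10.0, loudness_max_time: 0.05)
soft  = AttackSegment.new(loudness_start: -30.0, loudness_max: -10.0, loudness_max_time: 0.5)

puts attack_slope(sharp) > attack_slope(soft) # => true
```

Both segments rise by the same 20 dB, so the one that peaks sooner gets the steeper slope.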

#pitches ⇒ Array<Float>?

Pitch content is given by a “chroma” vector, corresponding to the 12 pitch classes C, C#, D to B, with values ranging from 0 to 1 that describe the relative dominance of every pitch in the chromatic scale. For example, a C major chord would likely be represented by large values of C, E and G (i.e., classes 0, 4, and 7).

Vectors are normalized to 1 by their strongest dimension. Noisy sounds are therefore likely to be represented by values that are all close to 1, while pure tones are described by one value at 1 (the pitch) and others near 0. As can be seen below, the 12 vector indices are a combination of low-power spectrum values at their respective pitch frequencies. ![pitch vector](/assets/audio/Pitch_vector.png)

Returns:

  • (Array<Float>, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 334

optional :pitches, Spotted::Internal::Type::ArrayOf[Float]
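
To illustrate the index-to-pitch-class mapping described above, the sketch below picks the strongest entries of a chroma vector and names them. The `PITCH_CLASSES` table (using sharps for the accidentals), the `dominant_pitch_classes` helper, and the sample vector are assumptions made for illustration; they are not part of the library.

```ruby
# Pitch class names by chroma index (0 = C ... 11 = B); sharp spellings are
# an assumption, the API only numbers the classes.
PITCH_CLASSES = %w[C C# D D# E F F# G G# A A# B].freeze

# Illustrative helper (an assumption): names of the `count` strongest pitch
# classes, strongest first. Returns [] when the vector is nil.
def dominant_pitch_classes(pitches, count: 3)
  return [] if pitches.nil?

  pitches.each_with_index
         .max_by(count) { |value, _index| value }
         .map { |_value, index| PITCH_CLASSES[index] }
end

# Made-up vector shaped like a C major chord: strong at indices 0, 4, and 7.
c_major_like = [1.0, 0.1, 0.2, 0.1, 0.8, 0.2, 0.1, 0.9, 0.1, 0.2, 0.1, 0.1]

puts dominant_pitch_classes(c_major_like).inspect # => ["C", "G", "E"]
```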

#start ⇒ Float?

The starting point (in seconds) of the segment.

Returns:

  • (Float, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 340

optional :start, Float
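
Since `start` and `duration` are both expressed in seconds, a segment can be treated as the half-open interval from `start` to `start + duration`. The sketch below uses that reading to look up the segment covering a timestamp; `TimedSegment` is a hypothetical stand-in struct and `segment_at` is an illustrative helper, not a library method.

```ruby
# Hypothetical stand-in for the Segment model; only the fields used here.
TimedSegment = Struct.new(:start, :duration, keyword_init: true)

# Illustrative helper (an assumption): first segment whose interval
# [start, start + duration) contains `time`, or nil if none does.
def segment_at(segments, time)
  segments.find do |s|
    next false if s.start.nil? || s.duration.nil?

    time >= s.start && time < s.start + s.duration
  end
end

timed_segments = [
  TimedSegment.new(start: 0.0, duration: 0.5),
  TimedSegment.new(start: 0.5, duration: 0.25)
]

puts segment_at(timed_segments, 0.6).start # => 0.5
```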

#timbre ⇒ Array<Float>?

Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments or voices. It is a complex notion also referred to as sound color, texture, or tone quality, and is derived from the shape of a segment’s spectro-temporal surface, independently of pitch and loudness. The timbre feature is a vector of 12 unbounded values roughly centered around 0. Those values are high-level abstractions of the spectral surface, ordered by degree of importance.

For completeness, however, the first dimension represents the average loudness of the segment; the second emphasizes brightness; the third is more closely correlated to the flatness of a sound; the fourth to sounds with a stronger attack; etc. The image below represents the 12 basis functions (i.e., template segments). ![timbre basis functions](/assets/audio/Timbre_basis_functions.png)

The actual timbre of the segment is best described as a linear combination of these 12 basis functions weighted by the coefficient values: timbre = c1 × b1 + c2 × b2 + … + c12 × b12, where c1 to c12 represent the 12 coefficients and b1 to b12 the 12 basis functions shown in the image. Timbre vectors are best used in comparison with each other.

Returns:

  • (Array<Float>, nil)


# File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 364

optional :timbre, Spotted::Internal::Type::ArrayOf[Float]
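
Because timbre vectors are meant to be compared with each other, a distance between two vectors is a natural starting point. The documentation does not prescribe a metric, so the sketch below assumes plain Euclidean distance; `timbre_distance` and the sample vectors are illustrative, not part of the library.

```ruby
# Illustrative helper (an assumption): Euclidean distance between two timbre
# vectors of equal length. Smaller values mean more similar timbre under
# this particular metric.
def timbre_distance(a, b)
  raise ArgumentError, "timbre vectors must have the same length" unless a.length == b.length

  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Made-up 12-value vectors that differ only in the first two coefficients.
timbre_a = Array.new(12, 0.0)
timbre_b = [3.0, 4.0] + Array.new(10, 0.0)

puts timbre_distance(timbre_a, timbre_b) # => 5.0
```

Since the first coefficient tracks average loudness, a caller comparing tone color alone might choose to drop it before measuring distance; that is a design choice rather than something the API specifies.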