Class: Spotted::Models::AudioAnalysisRetrieveResponse::Segment

Inherits: Internal::Type::BaseModel

Object
Internal::Type::BaseModel
Spotted::Models::AudioAnalysisRetrieveResponse::Segment

Defined in: lib/spotted/models/audio_analysis_retrieve_response.rb

Instance Attribute Summary

#confidence ⇒ Float?

The confidence, from 0.0 to 1.0, of the reliability of the segmentation.

#duration ⇒ Float?

The duration (in seconds) of the segment.

#loudness_end ⇒ Float?

The offset loudness of the segment in decibels (dB).

#loudness_max ⇒ Float?

The peak loudness of the segment in decibels (dB).

#loudness_max_time ⇒ Float?

The segment-relative offset of the segment peak loudness in seconds.

#loudness_start ⇒ Float?

The onset loudness of the segment in decibels (dB).

#pitches ⇒ Array<Float>?

Pitch content is given by a “chroma” vector, corresponding to the 12 pitch classes C, C#, D to B, with values ranging from 0 to 1 that describe the relative dominance of every pitch in the chromatic scale.

#start ⇒ Float?

The starting point (in seconds) of the segment.

#timbre ⇒ Array<Float>?

Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments, or voices.

Constructor Details

This class inherits a constructor from Spotted::Internal::Type::BaseModel

Instance Attribute Details

#confidence ⇒ Float?

The confidence, from 0.0 to 1.0, of the reliability of the segmentation. Segments of the song that are difficult to segment logically (e.g., noise) may correspond to low values in this field.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 281
    optional :confidence, Float
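Because low confidence values flag segments that were hard to delimit, a caller may want to drop them before further processing. The following is a minimal sketch, not part of the SDK: `ConfidenceSegment` is a hypothetical stand-in for the model, and both `reliable_segments` and the 0.5 threshold are illustrative assumptions.

```ruby
# Hypothetical stand-in for the SDK model; only the readers used here.
ConfidenceSegment = Struct.new(:start, :confidence, keyword_init: true)

# Keep segments whose confidence is present and at least `min_confidence`.
# A nil confidence (the attribute is optional) is treated as unknown and dropped.
def reliable_segments(segments, min_confidence: 0.5)
  segments.select { |s| !s.confidence.nil? && s.confidence >= min_confidence }
end

segments = [
  ConfidenceSegment.new(start: 0.0, confidence: 0.9),
  ConfidenceSegment.new(start: 0.25, confidence: 0.1),
  ConfidenceSegment.new(start: 0.75, confidence: nil)
]

puts reliable_segments(segments).map(&:start).inspect # prints [0.0]
```

The right threshold depends on the material being analyzed, so it is left as a keyword argument.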

#duration ⇒ Float?

The duration (in seconds) of the segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 287
    optional :duration, Float

#loudness_end ⇒ Float?

The offset loudness of the segment in decibels (dB). This value should be equivalent to the `loudness_start` of the following segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 294
    optional :loudness_end, Float
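The stated relationship between one segment's `loudness_end` and the next segment's `loudness_start` can be checked on a list of segments. This is a sketch under assumptions: `LoudnessSegment` is a hypothetical stand-in for the model, `loudness_gaps` and its tolerance are illustrative, and `filter_map` requires Ruby 2.7 or later.

```ruby
# Hypothetical stand-in for the SDK model; only the readers used here.
LoudnessSegment = Struct.new(:loudness_start, :loudness_end, keyword_init: true)

# Return each index i where segment i's loudness_end differs from segment
# i + 1's loudness_start by more than `tolerance` dB.
# Pairs with a missing (nil) value are skipped.
def loudness_gaps(segments, tolerance: 0.001)
  segments.each_cons(2).with_index.filter_map do |(current, following), index|
    next if current.loudness_end.nil? || following.loudness_start.nil?

    index if (current.loudness_end - following.loudness_start).abs > tolerance
  end
end

segments = [
  LoudnessSegment.new(loudness_start: -30.0, loudness_end: -20.0),
  LoudnessSegment.new(loudness_start: -20.0, loudness_end: -25.0),
  LoudnessSegment.new(loudness_start: -10.0, loudness_end: -15.0)
]

puts loudness_gaps(segments).inspect # prints [1]
```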

#loudness_max ⇒ Float?

The peak loudness of the segment in decibels (dB). Together with `loudness_start` and `loudness_max_time`, it can be used to describe the “attack” of the segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 302
    optional :loudness_max, Float

#loudness_max_time ⇒ Float?

The segment-relative offset of the segment peak loudness in seconds. Together with `loudness_start` and `loudness_max`, it can be used to describe the “attack” of the segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 310
    optional :loudness_max_time, Float

#loudness_start ⇒ Float?

The onset loudness of the segment in decibels (dB). Together with `loudness_max` and `loudness_max_time`, it can be used to describe the “attack” of the segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 318
    optional :loudness_start, Float
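One way to turn the three loudness attributes into a single “attack” figure is the average rate of loudness rise from onset to peak. The documentation does not define such a measure, so the sketch below is only one possible reading; `attack_slope` is a hypothetical helper that takes the three values as plain floats.

```ruby
# Average loudness rise in dB per second between the segment onset and its peak.
# Returns nil when any input is missing or when the peak coincides with the
# onset (loudness_max_time of zero), since the slope is undefined there.
def attack_slope(loudness_start:, loudness_max:, loudness_max_time:)
  return nil if [loudness_start, loudness_max, loudness_max_time].any?(&:nil?)
  return nil unless loudness_max_time.positive?

  (loudness_max - loudness_start) / loudness_max_time
end

slope = attack_slope(loudness_start: -60.0, loudness_max: -20.0, loudness_max_time: 0.125)
puts slope # prints 320.0
```

A larger value indicates a sharper attack, for example a percussive hit as opposed to a slowly swelling pad.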

#pitches ⇒ Array<Float>?

Pitch content is given by a “chroma” vector, corresponding to the 12 pitch classes C, C#, D to B, with values ranging from 0 to 1 that describe the relative dominance of every pitch in the chromatic scale. For example, a C major chord would likely be represented by large values of C, E and G (i.e. classes 0, 4, and 7).

Vectors are normalized to 1 by their strongest dimension, so noisy sounds are likely represented by values that are all close to 1, while pure tones are described by one value at 1 (the pitch) and others near 0. As can be seen below, the 12 vector indices are a combination of low-power spectrum values at their respective pitch frequencies.

![pitch vector](/assets/audio/Pitch_vector.png)

Returns:

(Array<Float>, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 334
    optional :pitches, Spotted::Internal::Type::ArrayOf[Float]
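To make the index-to-pitch-class mapping concrete, the sketch below names the strongest entries of a chroma vector. `PITCH_CLASSES` and `dominant_pitch_classes` are hypothetical names, the sharp spelling of the intermediate classes (D#, F#, and so on) is an assumption, and the sample vector is invented for illustration.

```ruby
# Index 0 is C and index 11 is B, following the order given in the description.
PITCH_CLASSES = %w[C C# D D# E F F# G G# A A# B].freeze

# Return the names of the `count` strongest pitch classes, strongest first.
# A nil vector (the attribute is optional) yields an empty list.
def dominant_pitch_classes(pitches, count: 3)
  return [] if pitches.nil?

  pitches.each_with_index
         .max_by(count) { |value, _index| value }
         .map { |_value, index| PITCH_CLASSES[index] }
end

# Invented chroma vector resembling a C major chord (classes 0, 4, and 7).
c_major = [1.0, 0.1, 0.2, 0.1, 0.8, 0.2, 0.1, 0.9, 0.1, 0.2, 0.1, 0.1]
puts dominant_pitch_classes(c_major).inspect # prints ["C", "G", "E"]
```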

#start ⇒ Float?

The starting point (in seconds) of the segment.

Returns:

(Float, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 340
    optional :start, Float
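`start` and `duration` together locate a segment on the track timeline. As a rough sketch, the segment that covers a given playback position could be looked up as follows; `TimedSegment` and `segment_at` are hypothetical, and treating each segment as the half-open interval from `start` up to (but not including) `start + duration` is an assumption.

```ruby
# Hypothetical stand-in for the SDK model; only the readers used here.
TimedSegment = Struct.new(:start, :duration, keyword_init: true)

# Return the first segment whose [start, start + duration) interval contains
# `time` (in seconds), or nil when no segment covers that position.
def segment_at(segments, time)
  segments.find do |s|
    next false if s.start.nil? || s.duration.nil?

    time >= s.start && time < s.start + s.duration
  end
end

segments = [
  TimedSegment.new(start: 0.0, duration: 0.5),
  TimedSegment.new(start: 0.5, duration: 0.25)
]

puts segment_at(segments, 0.6).start # prints 0.5
```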

#timbre ⇒ Array<Float>?

Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments, or voices. It is a complex notion also referred to as sound color, texture, or tone quality, and is derived from the shape of a segment’s spectro-temporal surface, independently of pitch and loudness. The timbre feature is a vector that includes 12 unbounded values roughly centered around 0. Those values are high-level abstractions of the spectral surface, ordered by degree of importance.

For completeness, however: the first dimension represents the average loudness of the segment; the second emphasizes brightness; the third is more closely correlated to the flatness of a sound; the fourth to sounds with a stronger attack; etc. The image below represents the 12 basis functions (i.e. template segments).

![timbre basis functions](/assets/audio/Timbre_basis_functions.png)

The actual timbre of the segment is best described as a linear combination of these 12 basis functions weighted by the coefficient values: timbre = c1 × b1 + c2 × b2 + … + c12 × b12, where c1 to c12 represent the 12 coefficients and b1 to b12 the 12 basis functions shown in the image above. Timbre vectors are best used in comparison with each other.

Returns:

(Array<Float>, nil)

    # File 'lib/spotted/models/audio_analysis_retrieve_response.rb', line 364
    optional :timbre, Spotted::Internal::Type::ArrayOf[Float]
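Since timbre vectors are most useful when compared with each other, a simple distance function is a natural starting point. The description does not prescribe a metric; Euclidean distance is used here as an assumption, and `timbre_distance` is a hypothetical helper operating on plain arrays of floats.

```ruby
# Euclidean distance between two timbre vectors of equal length.
# Smaller values suggest more similar timbre under this (assumed) metric.
def timbre_distance(a, b)
  raise ArgumentError, "timbre vectors must have the same length" unless a.length == b.length

  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

puts timbre_distance([0.0, 3.0], [4.0, 0.0]) # prints 5.0
```

Because the first coefficient tracks average loudness, callers interested in tone color alone might compare only the remaining eleven coefficients, for example with `a.drop(1)` and `b.drop(1)`.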
