Module: Familia::Features::Relationships::Indexing::MultiIndexGenerators
Defined in: lib/familia/features/relationships/indexing/multi_index_generators.rb
Overview
Generators for multi-value index (1:many) methods.
Multi-value indexes use the UnsortedSet DataType to group objects by field value. Each field value gets its own set of object identifiers.
Example: multi_index :department, :dept_index, within: Company
Generates on Company (destination):
- company.sample_from_department(dept, count=1)
- company.find_all_by_department(dept)
- company.dept_index_for(dept_value)
- company.rebuild_dept_index
Generates on Employee (self):
- employee.add_to_company_dept_index(company)
- employee.remove_from_company_dept_index(company)
- employee.update_in_company_dept_index(company, old_dept)
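To make the "one set per field value" layout concrete, here is a minimal in-memory sketch in plain Ruby. It does not use Familia or Redis: `ToyCompany`, `ToyEmployee`, and the `"dept_index:<value>"` key format are hypothetical stand-ins chosen for illustration, not the library's classes or key scheme.

```ruby
require 'set'

# Hypothetical stand-in for an indexed object (not a Familia::Horreum).
ToyEmployee = Struct.new(:identifier, :department)

# Hypothetical stand-in for the scope class. Each distinct field value
# gets its own set of identifiers, mirroring the 1:many layout above.
class ToyCompany
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Analogue of company.dept_index_for(dept_value)
  def dept_index_for(dept_value)
    @sets["dept_index:#{dept_value}"]
  end
end

company = ToyCompany.new
staff = [
  ToyEmployee.new('emp1', 'engineering'),
  ToyEmployee.new('emp2', 'engineering'),
  ToyEmployee.new('emp3', 'sales'),
]

# Analogue of employee.add_to_company_dept_index(company)
staff.each { |emp| company.dept_index_for(emp.department).add(emp.identifier) }

engineering_ids = company.dept_index_for('engineering').to_a.sort
sales_ids = company.dept_index_for('sales').to_a
```

In this toy model, looking up a department returns only identifiers; the real query methods additionally load each object by identifier.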
Class Method Summary
- .generate_factory_method(scope_class, index_name) ⇒ Object
  Generates the DataType factory method on the scope class (Company when within: Company), e.g. company.dept_index_for(field_value). Always needed.
- .generate_mutation_methods_self(indexed_class, field, scope_class, index_name) ⇒ Object
  Generates the mutation methods on the indexed class (Employee): add_to_company_dept_index, remove_from_company_dept_index, and update_in_company_dept_index.
- .generate_query_methods_destination(indexed_class, field, scope_class, index_name) ⇒ Object
  Generates the query methods on the scope class (Company when within: Company): sample_from_department (random sampling), find_all_by_department (all objects), and rebuild_dept_index (rebuild the index).
- .setup(indexed_class:, field:, index_name:, within:, query:) ⇒ Object
  Main setup method that orchestrates multi-value index creation.
Class Method Details
.generate_factory_method(scope_class, index_name) ⇒ Object
Generates the factory method ON THE SCOPE CLASS (Company when within: Company):
- company.<index_name>_for(field_value), e.g. company.dept_index_for(dept_value) - DataType factory (always needed)
This method is required by the mutation methods even when query: false.
    # File 'lib/familia/features/relationships/indexing/multi_index_generators.rb', line 75

    def generate_factory_method(scope_class, index_name)
      actual_scope_class = Familia.resolve_class(scope_class)

      actual_scope_class.class_eval do
        # Helper method to get index set for a specific field value
        # This acts as a factory for field-value-specific DataTypes
        define_method(:"#{index_name}_for") do |field_value|
          # Return properly managed DataType instance with parameterized key
          index_key = Familia.join(index_name, field_value)
          Familia::UnsortedSet.new(index_key, parent: self)
        end
      end
    end
.generate_mutation_methods_self(indexed_class, field, scope_class, index_name) ⇒ Object
Generates mutation methods ON THE INDEXED CLASS (Employee):
- employee.add_to_company_dept_index(company)
- employee.remove_from_company_dept_index(company)
- employee.update_in_company_dept_index(company, old_dept)
    # File 'lib/familia/features/relationships/indexing/multi_index_generators.rb', line 268

    def generate_mutation_methods_self(indexed_class, field, scope_class, index_name)
      scope_class_config = scope_class.config_name
      indexed_class.class_eval do
        method_name = :"add_to_#{scope_class_config}_#{index_name}"
        Familia.debug("[MultiIndexGenerators] #{name} method #{method_name}")

        define_method(method_name) do |scope_instance|
          return unless scope_instance

          field_value = send(field)
          return unless field_value

          # Use helper method on scope instance instead of manual instantiation
          index_set = scope_instance.send("#{index_name}_for", field_value)

          # Use UnsortedSet DataType method (no scoring)
          index_set.add(identifier)
        end

        method_name = :"remove_from_#{scope_class_config}_#{index_name}"
        Familia.debug("[MultiIndexGenerators] #{name} method #{method_name}")

        define_method(method_name) do |scope_instance|
          return unless scope_instance

          field_value = send(field)
          return unless field_value

          # Use helper method on scope instance instead of manual instantiation
          index_set = scope_instance.send("#{index_name}_for", field_value)

          # Remove using UnsortedSet DataType method
          index_set.remove(identifier)
        end

        method_name = :"update_in_#{scope_class_config}_#{index_name}"
        Familia.debug("[MultiIndexGenerators] #{name} method #{method_name}")

        define_method(method_name) do |scope_instance, old_field_value = nil|
          return unless scope_instance

          new_field_value = send(field)

          # Use Familia's transaction method for atomicity with DataType abstraction
          scope_instance.transaction do |_tx|
            # Remove from old index if provided - use helper method
            if old_field_value
              old_index_set = scope_instance.send("#{index_name}_for", old_field_value)
              old_index_set.remove(identifier)
            end

            # Add to new index if present - use helper method
            if new_field_value
              new_index_set = scope_instance.send("#{index_name}_for", new_field_value)
              new_index_set.add(identifier)
            end
          end
        end
      end
    end
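The move performed by the update method (remove from the old value's set, then add to the current value's set) can be sketched without Redis. The example below is an illustration only: `ToyScope`, `ToyMember`, and the shortened method names are hypothetical, and a plain Ruby `Set` stands in for the UnsortedSet DataType, so there is no transaction.

```ruby
require 'set'

# Hypothetical in-memory scope object: one Set per field value.
class ToyScope
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def dept_index_for(value)
    @sets[value.to_s]
  end
end

# Hypothetical indexed object mirroring the three generated mutation methods.
class ToyMember
  attr_reader :identifier
  attr_accessor :department

  def initialize(identifier, department)
    @identifier = identifier
    @department = department
  end

  def add_to_index(scope)
    return unless scope && department

    scope.dept_index_for(department).add(identifier)
  end

  def remove_from_index(scope)
    return unless scope && department

    scope.dept_index_for(department).delete(identifier)
  end

  # Remove from the old value's set (only if given), then add to the current one.
  def update_in_index(scope, old_department = nil)
    return unless scope

    scope.dept_index_for(old_department).delete(identifier) if old_department
    scope.dept_index_for(department).add(identifier) if department
  end
end

scope = ToyScope.new
member = ToyMember.new('emp1', 'sales')
member.add_to_index(scope)

old_department = member.department
member.department = 'engineering'
member.update_in_index(scope, old_department)
```

As in the listing above, omitting the old value skips the removal step, so the identifier would stay in its previous set. Capturing the old field value before changing the field is therefore part of calling the update method correctly.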
.generate_query_methods_destination(indexed_class, field, scope_class, index_name) ⇒ Object
Generates query methods ON THE SCOPE CLASS (Company when within: Company):
- company.sample_from_department(dept, count=1) - random sampling
- company.find_all_by_department(dept) - all objects
- company.rebuild_dept_index - rebuild index
    # File 'lib/familia/features/relationships/indexing/multi_index_generators.rb', line 98

    def generate_query_methods_destination(indexed_class, field, scope_class, index_name)
      # Resolve scope class using Familia pattern
      actual_scope_class = Familia.resolve_class(scope_class)

      # Get scope_class_config for method naming (needed for rebuild methods)
      scope_class_config = actual_scope_class.config_name

      # Generate instance sampling method (e.g., company.sample_from_department)
      actual_scope_class.class_eval do
        define_method(:"sample_from_#{field}") do |field_value, count = 1|
          index_set = send("#{index_name}_for", field_value) # i.e. UnsortedSet

          # Get random members efficiently (O(1) via SRANDMEMBER with count)
          # Returns array even for count=1 for consistent API
          index_set.sample(count).map do |id|
            indexed_class.find_by_identifier(id)
          end
        end

        # Generate bulk query method (e.g., company.find_all_by_department)
        define_method(:"find_all_by_#{field}") do |field_value|
          index_set = send("#{index_name}_for", field_value) # i.e. UnsortedSet

          # Get all members from set
          index_set.members.map { |id| indexed_class.find_by_identifier(id) }
        end

        # Generate method to rebuild the multi-value index for this parent instance
        #
        # Multi-indexes create separate sets for each field value, requiring a three-phase approach:
        # 1. Loading: Load all objects once and cache them (discovers field values simultaneously)
        # 2. Clearing: Remove all existing index sets using SCAN
        # 3. Rebuilding: Rebuild index from cached objects (no reload needed)
        #
        # @param batch_size [Integer] Number of identifiers to process per batch
        # @yield [progress] Optional block called with progress updates
        # @yieldparam progress [Hash] Progress information with keys:
        #   - :phase [Symbol] Current phase (:loading, :clearing, :rebuilding)
        #   - :current [Integer] Current item count
        #   - :total [Integer] Total items (when known)
        #   - :field_value [String] Current field value being processed
        #
        # @example Basic rebuild
        #   company.rebuild_dept_index
        #
        # @example With progress monitoring
        #   company.rebuild_dept_index do |progress|
        #     puts "#{progress[:phase]}: #{progress[:current]}/#{progress[:total]}"
        #   end
        #
        # @example Memory-conscious rebuild for large collections
        #   # Process in smaller batches to reduce memory footprint
        #   company.rebuild_dept_index(batch_size: 50)
        #
        # @note Memory Considerations:
        #   This method caches all objects in memory during rebuild to avoid duplicate
        #   database loads. For very large collections (>100k objects), monitor memory usage
        #   and consider processing in chunks or using a streaming approach if memory
        #   constraints are encountered. The batch_size parameter controls Redis I/O
        #   batching but does not affect memory usage since all objects are cached.
        #
        define_method(:"rebuild_#{index_name}") do |batch_size: 100, &progress_block|
          # PHASE 1: Find the collection containing the indexed objects
          # Look for a participation relationship where indexed_class participates in this scope_class
          collection_name = nil

          # Check if indexed_class has participation to this scope_class
          if indexed_class.respond_to?(:participation_relationships)
            participation = indexed_class.participation_relationships.find do |rel|
              rel.target_class == self.class
            end
            collection_name = participation&.collection_name if participation
          end

          # Get the collection DataType if we found a participation relationship
          collection = collection_name ? send(collection_name) : nil

          if collection
            # PHASE 2: Load objects once and cache them for both discovery and rebuilding
            # This avoids duplicate load_multi calls (previous approach loaded twice)
            progress_block&.call(phase: :loading, current: 0, total: collection.size)

            field_values = Set.new
            cached_objects = []
            processed = 0

            collection.members.each_slice(batch_size) do |identifiers|
              # Load objects in batches - SINGLE LOAD for both phases
              objects = indexed_class.load_multi(identifiers).compact
              cached_objects.concat(objects)

              objects.each do |obj|
                value = obj.send(field)
                # Only track non-nil, non-empty field values
                field_values << value.to_s if value && !value.to_s.strip.empty?
              end

              processed += identifiers.size
              progress_block&.call(phase: :loading, current: processed, total: collection.size)
            end

            # PHASE 3: Clear all existing field-value-specific index sets
            # Use SCAN to find all existing index keys (including orphaned ones from deleted field values)
            progress_block&.call(phase: :clearing, current: 0, total: field_values.size)

            # Get the base pattern for this index by creating a sample index set
            # The "*" creates a wildcard pattern like "company:123:dept_index:*" for SCAN
            sample_index = send(:"#{index_name}_for", "*")
            index_pattern = sample_index.dbkey

            # Find all existing index keys using SCAN
            cleared_count = 0
            dbclient.scan_each(match: index_pattern) do |key|
              dbclient.del(key)
              cleared_count += 1
              progress_block&.call(phase: :clearing, current: cleared_count, total: field_values.size, key: key)
            end

            # PHASE 4: Rebuild index from cached objects (no reload needed)
            progress_block&.call(phase: :rebuilding, current: 0, total: cached_objects.size)

            processed = 0
            cached_objects.each_slice(batch_size) do |objects|
              transaction do |_tx|
                objects.each do |obj|
                  # Use the generated add_to method to maintain consistency
                  # This ensures the same logic is used as during normal operation
                  obj.send(:"add_to_#{scope_class_config}_#{index_name}", self)
                end
              end

              processed += objects.size
              progress_block&.call(phase: :rebuilding, current: processed, total: cached_objects.size)
            end

            Familia.info "[Rebuild] Multi-index #{index_name} rebuilt: #{field_values.size} field values, #{processed} objects"

            processed # Return count of processed objects
          else
            # No participation relationship found - warn and suggest alternative
            Familia.warn <<~WARNING
              [Rebuild] Cannot rebuild multi-index #{index_name}: no participation relationship found

              Multi-index rebuild requires a participation relationship to find objects.
              Add a participation relationship to #{indexed_class.name}:

                class #{indexed_class.name} < Familia::Horreum
                  participates_in #{self.class.name}, :collection_name, score: :field
                end

              Then access the collection via: #{self.class.config_name}.collection_name
            WARNING
            nil
          end
        end
      end
    end
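The load, clear, rebuild sequence used by the rebuild method can be traced on plain Ruby data. The sketch below is an in-memory analogue under stated assumptions: `rebuild_toy_index` and `ToyRecord` are hypothetical names, a Hash of Sets replaces the Redis keys, and `Hash#clear` replaces the SCAN and DEL loop. It is not how the library talks to Redis.

```ruby
require 'set'

# Hypothetical stand-in for an indexed object.
ToyRecord = Struct.new(:identifier, :department)

# `index` maps a field value to a Set of identifiers; `records` plays the role
# of the collection reached through the participation relationship.
def rebuild_toy_index(index, records, batch_size: 100)
  # Loading: cache every object once and note the field values in use.
  cached = []
  field_values = Set.new
  records.each_slice(batch_size) do |batch|
    cached.concat(batch)
    batch.each do |rec|
      value = rec.department
      field_values << value.to_s if value && !value.to_s.strip.empty?
    end
    yield({ phase: :loading, current: cached.size, total: records.size }) if block_given?
  end

  # Clearing: drop every existing set, including orphaned ones whose
  # field value no longer appears on any object.
  cleared = index.size
  index.clear
  yield({ phase: :clearing, current: cleared, total: field_values.size }) if block_given?

  # Rebuilding: re-add each cached object under its current field value.
  processed = 0
  cached.each_slice(batch_size) do |batch|
    batch.each do |rec|
      next unless rec.department

      (index[rec.department.to_s] ||= Set.new) << rec.identifier
    end
    processed += batch.size
    yield({ phase: :rebuilding, current: processed, total: cached.size }) if block_given?
  end
  processed
end

index = { 'defunct' => Set['emp9'] } # orphaned set left by a removed value
records = [
  ToyRecord.new('emp1', 'engineering'),
  ToyRecord.new('emp2', 'sales'),
  ToyRecord.new('emp3', nil),
]
phases = []
count = rebuild_toy_index(index, records, batch_size: 2) { |progress| phases << progress[:phase] }
```

The returned count includes objects whose field is nil, matching the listing above, where the processed total counts every cached object even though the add_to method skips nil field values. Clearing everything first is what removes sets for values that no object holds any longer.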
.setup(indexed_class:, field:, index_name:, within:, query:) ⇒ Object
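Main setup method that orchestrates multi-value index creation. It records an IndexingRelationship (cardinality: :multi) on the indexed class, then generates the factory method, the query methods (only when query is truthy), and the mutation methods. The factory and query methods are generated only when within is a Class.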
    # File 'lib/familia/features/relationships/indexing/multi_index_generators.rb', line 39

    def setup(indexed_class:, field:, index_name:, within:, query:)
      # Multi-index always requires a scope context
      scope_class = within
      resolved_class = Familia.resolve_class(scope_class)

      # Store metadata for this indexing relationship
      indexed_class.indexing_relationships << IndexingRelationship.new(
        field: field,
        scope_class: scope_class,
        within: within,
        index_name: index_name,
        query: query,
        cardinality: :multi,
      )

      # Always generate the factory method - required by mutation methods
      if scope_class.is_a?(Class)
        generate_factory_method(resolved_class, index_name)
      end

      # Generate query methods on the scope class (optional)
      if query && scope_class.is_a?(Class)
        generate_query_methods_destination(indexed_class, field, resolved_class, index_name)
      end

      # Generate mutation methods on the indexed class
      generate_mutation_methods_self(indexed_class, field, resolved_class, index_name)
    end
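The effect of the query flag can be illustrated with a small, self-contained sketch of the same define_method pattern. `ToyMultiIndex` and the two company classes are hypothetical, the sets live in an instance variable instead of Redis, and the query method returns identifiers where the real one loads objects.

```ruby
require 'set'

# Hypothetical miniature of the setup flow: the factory method is always
# defined, the query method only when `query` is truthy.
module ToyMultiIndex
  def self.setup(scope_class:, field:, index_name:, query: true)
    scope_class.class_eval do
      # Factory: one in-memory Set per field value, created on first access.
      define_method(:"#{index_name}_for") do |field_value|
        @toy_sets ||= Hash.new { |hash, key| hash[key] = Set.new }
        @toy_sets["#{index_name}:#{field_value}"]
      end

      if query
        # Query method built on top of the factory.
        define_method(:"find_all_by_#{field}") do |field_value|
          send(:"#{index_name}_for", field_value).to_a
        end
      end
    end
  end
end

class ToyCompanyWithQueries; end
class ToyCompanyWithoutQueries; end

ToyMultiIndex.setup(scope_class: ToyCompanyWithQueries, field: :department, index_name: :dept_index)
ToyMultiIndex.setup(scope_class: ToyCompanyWithoutQueries, field: :department, index_name: :dept_index,
                    query: false)
```

With query: false, instances still respond to dept_index_for, which is what lets mutation methods keep working, but they do not respond to find_all_by_department.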