-
Notifications
You must be signed in to change notification settings - Fork 1
besquared/quotient-cube
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Quotient Cube is an algorithm and strategy for storing and querying data cubes.
This is a pure ruby implementation of that algorithm.
Requires tokyocabinet, autobots-transform
It is currently powering a personal analytics project and I think it is stable
enough (under some assumptions) for test and personal usage. Can also be used
as a tool for learning how this particular algorithm works.
Right now there is no support for incremental updates so everything is done
in batch. Smaller tables say with <10 dimensions can be built relatively
quickly (a few thousand rows/second) so this is appropriate for daily batches
consisting of several million rows. Once the cube is pre-computed querying
is very fast. Point queries on a small tree can be done at the rate of 4000-5000
per second. While range queries can be performed at rate of about 1000 per second.
Example:
require 'tokyocabinet'
require 'quotient-cube'
# 1) Base Data Table
@table = Table.new(
:column_names => [
'store', 'product', 'season', 'sales'
], :data => [
['S1', 'P1', 's', 6],
['S1', 'P2', 's', 12],
['S2', 'P1', 'f', 9]
]
)
@measures = ['sales[sum]', 'sales[avg]']
@dimensions = ['store', 'product', 'season']
# 2) Build quotient cube data structure
cube = QuotientCube::Base.build(@table, @dimensions, @measures) do |table, pointers|
sum = 0
pointers.each do |pointer|
sum += table[pointer]['sales']
end
[sum, sum / pointers.length]
end
# 3) Write out Quotient Cube Tree (QC-Tree) to disk
database = BDB.new
database.open(data_path, BDB::OWRITER | BDB::OCREAT)
Tree::Builder.new(database, cube).build
database.close
# Querying the QC-Tree
database.open
tree = Tree::Base.new(database)
# Point Query
@tree.find('sales[avg]', :conditions => {'product' => 'P1', 'season' => 'f'})
#=> {'product' => 'P1', 'season' => 'f', 'sales[avg]' => 9.0}
# Range Query
@tree.find('sales[avg]', :conditions => {'product' => ['P1', 'P2', 'P3']})
#=> [{'product' => 'P1', 'sales[avg]' => 7.5},
{'product' => 'P2', 'sales[avg]' => 12.0}]
# Range Query
@tree.find(:all, :conditions => {'product' => :all})
#=> [
{'product' => 'P1', 'sales[avg]' => 7.5, 'sales[sum]' => 15.0},
{'product' => 'P2', 'sales[avg]' => 12.0, 'sales[sum]' => 12.0}
]About
An implementation of the quotient cube algorithm
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published