54 commits
8335f11
new shiny create_model code that is more memory efficient
spreeker Oct 28, 2020
69f6ade
enable mistaken disabled code
spreeker Oct 28, 2020
d7bdcab
update readme
spreeker Oct 28, 2020
a34870a
add column validation
spreeker Nov 2, 2020
841bf49
reduce count was not registered fixed in template
spreeker Nov 12, 2020
aaa23b0
fix validation
spreeker Nov 19, 2020
a1bd933
start work on geo s2 stuff
spreeker Nov 19, 2020
622f46e
wip s2 test
spreeker Nov 27, 2020
9512c74
first working geo-selection
spreeker Dec 3, 2020
613bdff
working geo-selection
spreeker Dec 3, 2020
8ab6bb8
add curl test case for geo query
spreeker Dec 3, 2020
93f801d
wip test with labels
spreeker Dec 7, 2020
63d3429
geosearch works
spreeker Dec 8, 2020
c370cef
improve geojson handling
spreeker Dec 9, 2020
1562973
show http table
spreeker Dec 14, 2020
5059fef
keep lambda working for data without geometry
spreeker Jan 12, 2021
2ec0762
create model now has ingore and geocolumn options
spreeker Jan 14, 2021
b5c3a2f
use gzipped csv
spreeker Oct 8, 2020
0cc33c6
handle review remarks
spreeker Oct 27, 2020
507f79b
add groupby / reduce to query and validate parameters
spreeker Jan 19, 2021
7b83294
groupby cache experiment
spreeker Jan 19, 2021
8a59246
start factoring bitarray into templateable code
spreeker Jan 26, 2021
6b7be31
fix bugs using multiple bitarray keys
spreeker Jan 26, 2021
0d224b9
fix cache headers
spreeker Jan 26, 2021
b4be4ae
update model template
spreeker Jan 26, 2021
1fbfaf5
start with bitarray templateing
spreeker Jan 27, 2021
197e3dc
bit array model code generation is working now
spreeker Feb 3, 2021
fda371f
add custom groupby
spreeker Feb 4, 2021
0d525a5
add missing return
spreeker Feb 4, 2021
b893f60
validated and fixed reason for missing schools
spreeker Feb 10, 2021
280b26a
add woning equivalent reduce
spreeker Feb 15, 2021
f6e924f
try pgzip
spreeker Feb 15, 2021
ba495a2
added bouwjaar
spreeker Feb 17, 2021
fff9bcb
added readlock, moved custom code
spreeker Feb 17, 2021
aac4b78
update readme
spreeker Mar 9, 2021
041b486
fix bug returning raw item json
spreeker Mar 9, 2021
1c9f9d1
allow reduce without groupby
spreeker Mar 10, 2021
05a3627
add header column to csv
spreeker Mar 15, 2021
c2ff2d7
working merged build. labeledItems renamed to Items
spreeker Mar 22, 2021
d4a1ad6
remove merge mistake
spreeker Mar 22, 2021
ff690a7
remove merge mistake
spreeker Mar 22, 2021
7faf26b
wip working new storage / retrieve methods
spreeker Apr 19, 2021
a7fc206
first working tests
spreeker Apr 20, 2021
e74c533
first geo testing wip
spreeker Apr 20, 2021
c87ba16
working geojson tests, removed some code duplication
spreeker Apr 21, 2021
76c0d6b
add storage test, create example requests
spreeker Apr 21, 2021
cc247ce
wip: rewrite model creation, code, added column.go code
spreeker Apr 26, 2021
4a306f6
wip: fix model creation after rewrite
spreeker Apr 26, 2021
e6f1e4f
done: code generation now works correctly, added new model and model_…
spreeker Apr 27, 2021
5733650
fix: colom test
spreeker Apr 27, 2021
b51468a
docs: column.go
spreeker Apr 27, 2021
f8cb26e
docs: model.go creation
spreeker Apr 27, 2021
29daf04
production version, improved error reporting about bitarray usage
spreeker May 4, 2021
bf9d65e
added huisnummer / toevoegingen many small fixes to code generation
spreeker May 10, 2021
3 changes: 3 additions & 0 deletions .dockerignore
@@ -0,0 +1,3 @@
*.csv
*.csv2
.git
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
extras/model.go
testdata/*
*.gz
lambdadb
3 changes: 2 additions & 1 deletion Dockerfile
@@ -5,7 +5,7 @@ RUN apk update && apk add --no-cache git
RUN apk --no-cache add ca-certificates

WORKDIR /app
COPY . /app/
COPY *.go /app/

# Fetch dependencies.
RUN go get -d -v
@@ -23,6 +23,7 @@ COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
#COPY --from=builder /app/files/ITEMS.txt.gz /app/files/ITEMS.txt.gz

WORKDIR /app

# Run the binary.

ENV http_db_host "0.0.0.0:8000"
58 changes: 52 additions & 6 deletions README.md
@@ -1,32 +1,78 @@
# LambdaDB
In-memory database that uses filters to get the data you need.
LambdaDB has a tiny codebase that does a lot.
LambdaDB is not meant as persistent storage or a replacement for a traditional
database, but as a fast analytics engine and cache representation engine.

Adapt it to your needs by changing the models.go file.
powers: https://dego.vng.nl

## Properties:

- Insanely fast API. 1ms responses
- Fast to setup.
- Easy to deploy.
- Easy to customize.
- Easy to export data

- Implement custom authorized filters.

## Indexes

- S2 geoindex for fast point lookup
- Bitarrays
- Mapping

- Your own special needs indexes!

## Flow:

Generate a model and load your data.
The API is generated from your model.
Deploy.

Condition: Your dataset must fit in memory.

Adapt it to your needs by changing the `models.go` file:
create and register the functionality that you need.


### Steps
You can start the database with only a CSV file.
Follow the steps below and see the result in your browser.

1. Place the CSV file in the `extras` directory.
2. `python3 create_model.py > ../model.go`
3. cd ../
4. go fmt
2. `python3 create_model_.py` and answer the questions.
3. go fmt model.go
4. mv model.go ../
5. go build
6. ./lambda --help
7. ./lambda --csv assets/items.csv or `python3 ingestion.py -b 1000`
9. curl 127.0.0.1:8128/help/
10. browser 127.0.0.1:8128/


11. instructions curl 127.0.0.1:8128/help/ | python -m json.tool



### Running

sudo docker-compose up --no-deps --build

promql {instance="lambdadb:8000"}

python3 extras/ingestion.py -f movies_subset.tsv -format tsv -dbhost 127.0.0.1:8000
=======

1. instructions curl 127.0.0.1:8000/help/ | python -m json.tool

### Questions



### TODO

- load data directly from a database (periodic)
- document the `create_model.py` questions
- use a remote data source
- use some more efficient storage method (done)
- generate swagger API
- Add more tests
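
The `Bitarrays` entry under Indexes in the README above can be illustrated with a small sketch. It assumes the index keeps one bit set per column value, with bit n set when the item with label n has that value, so that a filter on two columns becomes a bitwise AND. The `bitset` and `index` types and the sample rows are hypothetical stand-ins written with the standard library only; they are not the Workiva `bitarray` package that column.go imports.

```go
package main

import "fmt"

// bitset is a hypothetical dense stand-in for a bit array:
// bit n is set when the item with label n matches.
type bitset []uint64

// set marks item label n, growing the slice when needed.
func (b *bitset) set(n uint) {
	word := n / 64
	for uint(len(*b)) <= word {
		*b = append(*b, 0)
	}
	(*b)[word] |= 1 << (n % 64)
}

// and returns the intersection of two bitsets.
func (b bitset) and(o bitset) bitset {
	size := len(b)
	if len(o) < size {
		size = len(o)
	}
	out := make(bitset, size)
	for i := range out {
		out[i] = b[i] & o[i]
	}
	return out
}

// labels lists the set bits in ascending order.
func (b bitset) labels() []uint {
	labels := []uint{}
	for i, word := range b {
		for bit := uint(0); bit < 64; bit++ {
			if word&(1<<bit) != 0 {
				labels = append(labels, uint(i)*64+bit)
			}
		}
	}
	return labels
}

// index maps column name -> column value -> bitset of item labels.
type index map[string]map[string]*bitset

// add records that the item with this label has value in column.
func (ix index) add(column, value string, label uint) {
	if ix[column] == nil {
		ix[column] = make(map[string]*bitset)
	}
	if ix[column][value] == nil {
		ix[column][value] = &bitset{}
	}
	ix[column][value].set(label)
}

func main() {
	ix := make(index)
	// Illustrative rows only; the column names and values are made up.
	ix.add("city", "Utrecht", 0)
	ix.add("city", "Utrecht", 2)
	ix.add("city", "Delft", 1)
	ix.add("type", "school", 1)
	ix.add("type", "school", 2)

	// Items that are in Utrecht AND are schools.
	hits := ix["city"]["Utrecht"].and(*ix["type"]["school"])
	fmt.Println(hits.labels()) // [2]
}
```

A dense slice keeps the sketch short; a sparse representation, as column.go requests with `bitarray.NewSparseBitArray()`, is likely the better fit when most values match only a few items.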
148 changes: 148 additions & 0 deletions column.go
@@ -0,0 +1,148 @@
package main

import (
"errors"
"fmt"
"github.com/Workiva/go-datastructures/bitarray"
"log"
"strings"
)

type fieldIdxMap map[string]uint32
type IdxFieldMap map[uint32]string

// MappedColumn maps each distinct field value of a column to a compact
// integer index (Idx) and back again (Field).
type MappedColumn struct {
Idx fieldIdxMap // stores field to int values
Field IdxFieldMap // stores int to field values to recover actual fields
IdxTracker uint32
Name string
}

type ColumnRegister map[string]MappedColumn

var RegisteredColumns ColumnRegister

func init() {
RegisteredColumns = make(ColumnRegister)
}

func NewReapeatedColumn(column string) MappedColumn {
m := MappedColumn{
make(fieldIdxMap),
make(IdxFieldMap),
0,
column,
}
RegisteredColumns[column] = m
return m
}

// Store field name as idx value and idx as field value
func (m *MappedColumn) Store(field string) {

if _, ok := m.Idx[field]; !ok {
m.Idx[field] = m.IdxTracker
m.Field[m.IdxTracker] = field
m.IdxTracker++
}
}

// StoreArray stores every element of a Postgres array literal such as "{a,b}"
// and returns the indexes of those elements.
func (m *MappedColumn) StoreArray(field string) []uint32 {

fieldsArray := make([]uint32, 0)

// parsing {a, b} array values
// string should be at least 2 example "{}" == size 2
if len(field) > 2 {
fields, err := ParsePGArray(field)

if err != nil {
log.Fatal("error parsing array: ", err)
}

for _, v := range fields {
// Store and look up through the receiver so this works for any
// column, not only the global Gebruiksdoelen column.
m.Store(v)
fieldsArray = append(fieldsArray, m.GetIndex(v))
}
}
return fieldsArray
}

func (m *MappedColumn) GetValue(idx uint32) string {
return m.Field[idx]
}

func (m *MappedColumn) GetArrayValue(idxs []uint32) string {

result := make([]string, 0)
for _, v := range idxs {
vs := m.GetValue(v)
result = append(result, vs)
}
return strings.Join(result, ", ")
}

func (m *MappedColumn) GetIndex(s string) uint32 {
return m.Idx[s]
}

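// SetBitArray sets the bit for item label in the bitarray that belongs to
// indexed value i of column, creating the bitarray when it does not exist yet.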
func SetBitArray(column string, i uint32, label int) {

var ba bitarray.BitArray
var ok bool

// make sure a map of bitarrays exists for column
var map_ba fieldBitarrayMap

if map_ba, ok = BitArrays[column]; !ok {
map_ba = make(fieldBitarrayMap)
BitArrays[column] = map_ba
}

// check for existing bitarray for i value
ba, ok = map_ba[i]

if !ok {
ba = bitarray.NewSparseBitArray()
map_ba[i] = ba
}
// set bit for item label.
ba.SetBit(uint64(label))
}

func GetBitArray(column, value string) (bitarray.BitArray, error) {

var ok bool

if _, ok = BitArrays[column]; !ok {
return nil, errors.New("no bitarray filter found for " + column)
}

// convert string value to actual indexed int.
i, ok := RegisteredColumns[column].Idx[value]

if !ok {
msg := fmt.Sprintf("no indexed int value found for %s %s", column, value)
return nil, errors.New(msg)
}

ba, ok := BitArrays[column][i]

if !ok {
msg := fmt.Sprintf("no bitarray found for %s %s %d", column, value, i)
return nil, errors.New(msg)
}

return ba, nil
}
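
To make the value-to-index mapping in `MappedColumn` concrete, here is a minimal, self-contained sketch. `dictColumn`, `store`, and `lookup` are hypothetical names, and the sample values are illustrative only. The sketch also shows one design point that may be worth considering: a plain map read such as `m.Idx[s]` yields 0 for an unknown string, which is also the index of the first stored field, so a comma-ok lookup is used to tell the two cases apart.

```go
package main

import "fmt"

// dictColumn is a hypothetical, simplified stand-in for MappedColumn.
type dictColumn struct {
	idx   map[string]uint32 // field -> integer code
	field map[uint32]string // integer code -> field
	next  uint32            // next unused code
}

func newDictColumn() *dictColumn {
	return &dictColumn{idx: map[string]uint32{}, field: map[uint32]string{}}
}

// store gives an unseen field the next code and returns the field's code.
func (d *dictColumn) store(f string) uint32 {
	if code, ok := d.idx[f]; ok {
		return code
	}
	code := d.next
	d.idx[f] = code
	d.field[code] = f
	d.next++
	return code
}

// lookup returns the code of f and whether f was ever stored.
func (d *dictColumn) lookup(f string) (uint32, bool) {
	code, ok := d.idx[f]
	return code, ok
}

func main() {
	col := newDictColumn()
	codes := []uint32{}
	// Repeated values share one code, so each item stores a small integer.
	for _, f := range []string{"wonen", "kantoor", "wonen"} {
		codes = append(codes, col.store(f))
	}
	fmt.Println(codes)               // [0 1 0]
	fmt.Println(col.field[codes[1]]) // kantoor

	_, ok := col.lookup("winkel")
	fmt.Println(ok) // false
}
```

`GetBitArray` in the diff already uses the comma-ok form when it reads `RegisteredColumns[column].Idx[value]`; the same check could be applied wherever `GetIndex` may receive a value that was never stored.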