Commit a761108

Adding record column (#73)
This PR contains some refactoring and a new `record` column that lets you store any type implementing `BinaryMarshaler` and `BinaryUnmarshaler`. As such, it supports types that implement this standard way of encoding, for example `time.Time`.

```go
col := NewCollection()
col.CreateColumn("timestamp", ForRecord(func() *time.Time {
	return new(time.Time)
}, nil))

// Insert the time, it implements binary marshaler
idx, _ := col.Insert(func(r Row) error {
	now := time.Unix(1667745766, 0)
	r.SetRecord("timestamp", &now)
	return nil
})

// We should be able to read back the time
col.QueryAt(idx, func(r Row) error {
	ts, ok := r.Record("timestamp")
	assert.True(t, ok)
	assert.Equal(t, "November", ts.(*time.Time).UTC().Month().String())
	return nil
})
```
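The record column relies on the standard `encoding.BinaryMarshaler` and `encoding.BinaryUnmarshaler` interfaces. The following is a minimal standard-library sketch of that round trip for `time.Time`, without the column library; the helper name `roundTripMonth` is hypothetical and only illustrates what a record column would presumably do when writing and reading a value.

```go
package main

import (
	"encoding"
	"fmt"
	"time"
)

// Compile-time checks: *time.Time satisfies both standard interfaces.
var (
	_ encoding.BinaryMarshaler   = (*time.Time)(nil)
	_ encoding.BinaryUnmarshaler = (*time.Time)(nil)
)

// roundTripMonth (hypothetical helper) encodes a Unix timestamp with
// MarshalBinary, decodes the bytes into a fresh time.Time with
// UnmarshalBinary and returns the UTC month name of the decoded value.
func roundTripMonth(sec int64) (string, error) {
	in := time.Unix(sec, 0).UTC()
	raw, err := in.MarshalBinary()
	if err != nil {
		return "", err
	}

	// Same shape as the factory passed to ForRecord in the description.
	out := new(time.Time)
	if err := out.UnmarshalBinary(raw); err != nil {
		return "", err
	}
	return out.UTC().Month().String(), nil
}

func main() {
	month, err := roundTripMonth(1667745766)
	if err != nil {
		panic(err)
	}
	fmt.Println(month)
}
```

The timestamp is the one used in the description above, which falls on 6 November 2022 (UTC).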
1 parent dbb0148 commit a761108

28 files changed, +1133 -590 lines changed

README.md

Lines changed: 63 additions & 79 deletions
@@ -40,53 +40,44 @@ The general idea is to leverage cache-friendly ways of organizing data in [struc
4040
- [Expiring Values](#expiring-values)
4141
- [Transaction Commit and Rollback](#transaction-commit-and-rollback)
4242
- [Using Primary Keys](#using-primary-keys)
43+
- [Storing Binary Records](#storing-binary-records)
4344
- [Streaming Changes](#streaming-changes)
4445
- [Snapshot and Restore](#snapshot-and-restore)
45-
- [Complete Example](#complete-example)
46+
- [Examples](#examples)
4647
- [Benchmarks](#benchmarks)
4748
- [Contributing](#contributing)
4849

4950
## Collection and Columns
5051

51-
In order to get data into the store, you'll need to first create a `Collection` by calling `NewCollection()` method. Each collection requires a schema, which can be either specified manually by calling `CreateColumn()` multiple times or automatically inferred from an object by calling `CreateColumnsOf()` function.
52-
53-
In the example below we're loading some `JSON` data by using `json.Unmarshal()` and auto-creating colums based on the first element on the loaded slice. After this is done, we can then load our data by inserting the objects one by one into the collection. This is accomplished by calling `InsertObject()` method on the collection itself repeatedly.
52+
In order to get data into the store, you'll first need to create a `Collection` by calling the `NewCollection()` function. Each collection requires a schema, which is either specified by calling `CreateColumn()` multiple times or automatically inferred from an object by calling the `CreateColumnsOf()` method. In the example below we create a new collection with several columns.
5453

5554
```go
56-
data := loadFromJson("players.json")
57-
58-
// Create a new columnar collection
59-
players := column.NewCollection()
60-
players.CreateColumnsOf(data[0])
61-
62-
// Insert every item from our loaded data
63-
for _, v := range data {
64-
players.InsertObject(v)
65-
}
66-
```
67-
68-
Now, let's say we only want specific columns to be added. We can do this by calling `CreateColumn()` method on the collection manually to create the required columns.
69-
70-
```go
71-
// Create a new columnar collection with pre-defined columns
55+
// Create a new collection with some columns
7256
players := column.NewCollection()
7357
players.CreateColumn("name", column.ForString())
7458
players.CreateColumn("class", column.ForString())
7559
players.CreateColumn("balance", column.ForFloat64())
7660
players.CreateColumn("age", column.ForInt16())
61+
```
7762

78-
// Insert every item from our loaded data
79-
for _, v := range loadFromJson("players.json") {
80-
players.InsertObject(v)
81-
}
63+
Now that we have created a collection, we can insert a single record by using the `Insert()` method on the collection. In this example we're inserting a single row and manually specifying its values. Note that this function returns an `index` that indicates the row index of the inserted row.
64+
65+
```go
66+
index, err := players.Insert(func(r column.Row) error {
67+
r.SetString("name", "merlin")
68+
r.SetString("class", "mage")
69+
r.SetFloat64("balance", 99.95)
70+
r.SetInt16("age", 107)
71+
return nil
72+
})
8273
```
8374

84-
While the previous example demonstrated how to insert many objects, it was doing it one by one and is rather inefficient. This is due to the fact that each `InsertObject()` call directly on the collection initiates a separate transacion and there's a small performance cost associated with it. If you want to do a bulk insert and insert many values, faster, that can be done by calling `Insert()` on a transaction, as demonstrated in the example below. Note that the only difference is instantiating a transaction by calling the `Query()` method and calling the `txn.Insert()` method on the transaction instead the one on the collection.
75+
While the previous example demonstrated how to insert a single row, inserting multiple rows this way is rather inefficient. This is because each `Insert()` call made directly on the collection initiates a separate transaction, and there's a small performance cost associated with it. If you want to bulk insert many values faster, you can call `Insert()` on a transaction, as demonstrated in the example below. Note that the only difference is instantiating a transaction by calling the `Query()` method and calling the `txn.Insert()` method on the transaction instead of the one on the collection.
8576

8677
```go
8778
players.Query(func(txn *column.Txn) error {
88-
for _, v := range loadFromJson("players.json") {
89-
txn.InsertObject(v)
79+
for _, v := range myRawData {
80+
txn.Insert(...)
9081
}
9182
return nil // Commit
9283
})
@@ -356,6 +347,49 @@ players.QueryKey("merlin", func(r column.Row) error {
356347
})
357348
```
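The README text in this diff says that every `Insert()` call on the collection opens its own transaction, while inserts inside a single `Query()` share one. The toy model below illustrates that cost difference using only the standard library. The `store` and `txn` types are hypothetical stand-ins, not the library's implementation, and the lock counter is only an assumed proxy for the per-transaction overhead.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a collection (not the library's type).
type store struct {
	mu    sync.Mutex
	rows  []string
	locks int // number of transactions started, used as a proxy for overhead
}

// txn buffers rows until the surrounding transaction commits.
type txn struct{ pending []string }

// Insert buffers a single row inside the transaction.
func (t *txn) Insert(v string) { t.pending = append(t.pending, v) }

// Query runs fn in one transaction and commits only if fn returns nil.
func (s *store) Query(fn func(*txn) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.locks++

	t := &txn{}
	if err := fn(t); err != nil {
		return err // rollback: buffered rows are dropped
	}
	s.rows = append(s.rows, t.pending...)
	return nil
}

// Insert adds one row in its own transaction.
func (s *store) Insert(v string) error {
	return s.Query(func(t *txn) error {
		t.Insert(v)
		return nil
	})
}

// lockCounts inserts n rows one by one and then in a single batch, and
// reports how many transactions each approach started.
func lockCounts(n int) (oneByOne, batched int) {
	a, b := &store{}, &store{}
	for i := 0; i < n; i++ {
		a.Insert(fmt.Sprint("row-", i))
	}
	b.Query(func(t *txn) error {
		for i := 0; i < n; i++ {
			t.Insert(fmt.Sprint("row-", i))
		}
		return nil
	})
	return a.locks, b.locks
}

func main() {
	fmt.Println(lockCounts(1000))
}
```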
358349

350+
## Storing Binary Records
351+
352+
If you find yourself in need of encoding a more complex structure as a single column, you may do so by using the `column.ForRecord()` function. This allows you to specify a `BinaryMarshaler` / `BinaryUnmarshaler` type that will get automatically encoded as a single column. In the example below we are creating a `Location` type that implements the required methods.
353+
354+
```go
355+
type Location struct {
356+
X float64 `json:"x"`
357+
Y float64 `json:"y"`
358+
}
359+
360+
func (l Location) MarshalBinary() ([]byte, error) {
361+
return json.Marshal(l)
362+
}
363+
364+
func (l *Location) UnmarshalBinary(b []byte) error {
365+
return json.Unmarshal(b, l)
366+
}
367+
```
368+
369+
Now that we have a record implementation, we can create a column for this struct by using `ForRecord()` function as shown below.
370+
371+
```go
372+
players.CreateColumn("location", ForRecord(func() *Location {
373+
return new(Location)
374+
}, nil)) // no merging
375+
```
376+
377+
In order to manipulate the record, we can use the appropriate `Record()`, `SetRecord()` methods of the `Row`, similarly to other column types.
378+
379+
```go
380+
// Insert a new location
381+
idx, _ := players.Insert(func(r Row) error {
382+
r.SetRecord("location", &Location{X: 1, Y: 2})
383+
return nil
384+
})
385+
386+
// Read the location back
387+
players.QueryAt(idx, func(r Row) error {
388+
location, ok := r.Record("location")
389+
return nil
390+
})
391+
```
392+
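The `Location` type from this section can be exercised without the column library. The sketch below encodes a value with `MarshalBinary` and decodes it into a fresh pointer with `UnmarshalBinary`, which is presumably what happens when a record column is written and read back. The `roundTrip` helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is the record type from the README section above.
type Location struct {
	X float64 `json:"x"`
	Y float64 `json:"y"`
}

// MarshalBinary encodes the location as JSON bytes.
func (l Location) MarshalBinary() ([]byte, error) {
	return json.Marshal(l)
}

// UnmarshalBinary decodes JSON bytes into the receiver.
func (l *Location) UnmarshalBinary(b []byte) error {
	return json.Unmarshal(b, l)
}

// roundTrip (hypothetical helper) encodes a location and decodes it into a
// newly allocated value, mirroring a write followed by a read.
func roundTrip(x, y float64) (Location, error) {
	raw, err := Location{X: x, Y: y}.MarshalBinary()
	if err != nil {
		return Location{}, err
	}

	out := new(Location) // same shape as the factory passed to ForRecord
	if err := out.UnmarshalBinary(raw); err != nil {
		return Location{}, err
	}
	return *out, nil
}

func main() {
	loc, err := roundTrip(1, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", loc)
}
```

Note that in the README read-back example, `location` and `ok` are assigned but never used, which the Go compiler rejects. A caller would normally check `ok` and type-assert the value, as the commit description does with `ts.(*time.Time)`.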
359393
## Streaming Changes
360394

361395
This library also supports streaming out all transaction commits consistently, as they happen. This allows you to implement your own change data capture (CDC) listeners, stream data into kafka or into a remote database for durability. In order to enable it, you can simply provide an implementation of a `commit.Logger` interface during the creation of the collection.
@@ -429,59 +463,9 @@ if err != nil {
429463
err := players.Restore(src)
430464
```
431465

432-
## Complete Example
466+
## Examples
433467

434-
```go
435-
func main(){
436-
437-
// Create a new columnar collection
438-
players := column.NewCollection()
439-
players.CreateColumn("serial", column.ForKey())
440-
players.CreateColumn("name", column.ForEnum())
441-
players.CreateColumn("active", column.ForBool())
442-
players.CreateColumn("class", column.ForEnum())
443-
players.CreateColumn("race", column.ForEnum())
444-
players.CreateColumn("age", column.ForFloat64())
445-
players.CreateColumn("hp", column.ForFloat64())
446-
players.CreateColumn("mp", column.ForFloat64())
447-
players.CreateColumn("balance", column.ForFloat64())
448-
players.CreateColumn("gender", column.ForEnum())
449-
players.CreateColumn("guild", column.ForEnum())
450-
451-
// index on humans
452-
players.CreateIndex("human", "race", func(r column.Reader) bool {
453-
return r.String() == "human"
454-
})
455-
456-
// index for mages
457-
players.CreateIndex("mage", "class", func(r column.Reader) bool {
458-
return r.String() == "mage"
459-
})
460-
461-
// index for old
462-
players.CreateIndex("old", "age", func(r column.Reader) bool {
463-
return r.Float() >= 30
464-
})
465-
466-
// Load the items into the collection
467-
loaded := loadFixture("players.json")
468-
players.Query(func(txn *column.Txn) error {
469-
for _, v := range loaded {
470-
txn.InsertObject(v)
471-
}
472-
return nil
473-
})
474-
475-
// Run an indexed query
476-
players.Query(func(txn *column.Txn) error {
477-
name := txn.Enum("name")
478-
return txn.With("human", "mage", "old").Range(func(idx uint32) {
479-
value, _ := name.Get()
480-
println("old mage, human:", value)
481-
})
482-
})
483-
}
484-
```
468+
Multiple complete usage examples of this library can be found in the [examples](https://github.com/kelindar/column/tree/main/examples) directory in this repository.
485469

486470
## Benchmarks
487471

codegen/main.go

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ func main() {
2020
panic(err)
2121
}
2222

23-
dst, err := os.OpenFile("column_numbers.go", os.O_RDWR|os.O_CREATE, os.ModePerm)
23+
dst, err := os.OpenFile("column_numbers.go", os.O_RDWR|os.O_CREATE|os.O_TRUNC, os.ModePerm)
2424
defer dst.Close()
2525
if err != nil {
2626
panic(err)
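The added `os.O_TRUNC` flag matters when the generated file already exists and the new output is shorter than the old one. The sketch below, using a temporary file and hypothetical helper names, shows how stale bytes would otherwise survive after the new content.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rewrite writes a long body and then a short body to the same file, opening
// it with the given flags each time, and returns the final file content.
func rewrite(flags int) (string, error) {
	dir, err := os.MkdirTemp("", "trunc-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "out.go")

	for _, body := range []string{"package generated // long", "package x"} {
		f, err := os.OpenFile(path, flags, 0o644)
		if err != nil {
			return "", err
		}
		_, writeErr := f.WriteString(body)
		closeErr := f.Close()
		if writeErr != nil {
			return "", writeErr
		}
		if closeErr != nil {
			return "", closeErr
		}
	}

	b, err := os.ReadFile(path)
	return string(b), err
}

// withoutTrunc uses the flags from before this commit.
func withoutTrunc() (string, error) { return rewrite(os.O_RDWR | os.O_CREATE) }

// withTrunc uses the flags from after this commit.
func withTrunc() (string, error) {
	return rewrite(os.O_RDWR | os.O_CREATE | os.O_TRUNC)
}

func main() {
	before, _ := withoutTrunc()
	after, _ := withTrunc()
	fmt.Printf("without O_TRUNC: %q\nwith O_TRUNC:    %q\n", before, after)
}
```

The sketch also checks the error before using the file handle, whereas the hunk above defers `dst.Close()` before checking `err`.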

codegen/numbers.tpl

Lines changed: 9 additions & 9 deletions
@@ -33,27 +33,27 @@ func make{{.Name}}s(opts ...func(*option[{{.Type}}])) Column {
3333
)
3434
}
3535

36-
// {{.Type}}Writer represents a read-write accessor for {{.Type}}
37-
type {{.Type}}Writer struct {
38-
numericReader[{{.Type}}]
36+
// rw{{.Name}} represents a read-write cursor for {{.Type}}
37+
type rw{{.Name}} struct {
38+
rdNumber[{{.Type}}]
3939
writer *commit.Buffer
4040
}
4141

4242
// Set sets the value at the current transaction cursor
43-
func (s {{.Type}}Writer) Set(value {{.Type}}) {
43+
func (s rw{{.Name}}) Set(value {{.Type}}) {
4444
s.writer.Put{{.Name}}(commit.Put, s.txn.cursor, value)
4545
}
4646

4747
// Merge atomically merges a delta to the value at the current transaction cursor
48-
func (s {{.Type}}Writer) Merge(delta {{.Type}}) {
48+
func (s rw{{.Name}}) Merge(delta {{.Type}}) {
4949
s.writer.Put{{.Name}}(commit.Merge, s.txn.cursor, delta)
5050
}
5151

5252
// {{.Name}} returns a read-write accessor for {{.Type}} column
53-
func (txn *Txn) {{.Name}}(columnName string) {{.Type}}Writer {
54-
return {{.Type}}Writer{
55-
numericReader: numericReaderFor[{{.Type}}](txn, columnName),
56-
writer: txn.bufferFor(columnName),
53+
func (txn *Txn) {{.Name}}(columnName string) rw{{.Name}} {
54+
return rw{{.Name}}{
55+
rdNumber: readNumberOf[{{.Type}}](txn, columnName),
56+
writer: txn.bufferFor(columnName),
5757
}
5858
}
5959
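To see what the renamed template produces, here is a small sketch that renders a fragment of it with `text/template`. The `numericType` struct, the `{{range}}` wrapper and the sample type list are assumptions for illustration; only the `{{.Name}}` and `{{.Type}}` fields and the `rw{{.Name}}` naming come from the diff.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// numericType carries the two fields the template refers to (assumed shape).
type numericType struct {
	Name string // used in identifiers, e.g. "Int32"
	Type string // the Go type, e.g. "int32"
}

// tpl is a fragment modeled on the new numbers.tpl; it is emitted as text
// and never compiled here, so rdNumber and commit.Buffer need not exist.
const tpl = `{{range .}}
// rw{{.Name}} represents a read-write cursor for {{.Type}}
type rw{{.Name}} struct {
	rdNumber[{{.Type}}]
	writer *commit.Buffer
}
{{end}}`

// render executes the template for every numeric type in the list.
func render(types []numericType) (string, error) {
	t, err := template.New("numbers").Parse(tpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, types); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// renderDemo renders two sample types.
func renderDemo() (string, error) {
	return render([]numericType{
		{Name: "Int32", Type: "int32"},
		{Name: "Float64", Type: "float64"},
	})
}

// hasType reports whether the generated source declares the given struct.
func hasType(src, ident string) bool {
	return strings.Contains(src, "type "+ident+" struct")
}

func main() {
	out, err := renderDemo()
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

With this naming, the generated cursor types are unexported (`rwInt32`, `rwFloat64`), whereas the old `{{.Type}}Writer` pattern would have produced names such as `int32Writer`.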

collection.go

Lines changed: 3 additions & 25 deletions
@@ -17,9 +17,6 @@ import (
1717
"github.com/kelindar/smutex"
1818
)
1919

20-
// Object represents a single object
21-
type Object = map[string]interface{}
22-
2320
const (
2421
expireColumn = "expire"
2522
rowColumn = "row"
@@ -127,25 +124,6 @@ func (c *Collection) findFreeIndex(count uint64) uint32 {
127124
return idx
128125
}
129126

130-
// InsertObject adds an object to a collection and returns the allocated index.
131-
func (c *Collection) InsertObject(obj Object) (index uint32) {
132-
c.Query(func(txn *Txn) error {
133-
index, _ = txn.InsertObject(obj)
134-
return nil
135-
})
136-
return
137-
}
138-
139-
// InsertObjectWithTTL adds an object to a collection, sets the expiration time
140-
// based on the specified time-to-live and returns the allocated index.
141-
func (c *Collection) InsertObjectWithTTL(obj Object, ttl time.Duration) (index uint32) {
142-
c.Query(func(txn *Txn) error {
143-
index, _ = txn.InsertObjectWithTTL(obj, ttl)
144-
return nil
145-
})
146-
return
147-
}
148-
149127
// Insert executes a mutable cursor transactionally at a new offset.
150128
func (c *Collection) Insert(fn func(Row) error) (index uint32, err error) {
151129
err = c.Query(func(txn *Txn) (innerErr error) {
@@ -181,9 +159,9 @@ func (c *Collection) createColumnKey(columnName string, column *columnKey) error
181159
return nil
182160
}
183161

184-
// CreateColumnsOf registers a set of columns that are present in the target object.
185-
func (c *Collection) CreateColumnsOf(object Object) error {
186-
for k, v := range object {
162+
// CreateColumnsOf registers a set of columns that are present in the target map.
163+
func (c *Collection) CreateColumnsOf(value map[string]any) error {
164+
for k, v := range value {
187165
column, err := ForKind(reflect.TypeOf(v).Kind())
188166
if err != nil {
189167
return err
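`CreateColumnsOf` now takes a plain `map[string]any` and derives each column from `reflect.TypeOf(v).Kind()`. The sketch below shows that inference step in isolation. The `kindsOf` helper and its nil check are assumptions for illustration, since the diff does not show how the library handles nil values.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// kindsOf (hypothetical helper) reports "column=kind" for every key of the
// map, sorted by column name so that the output is deterministic.
func kindsOf(value map[string]any) ([]string, error) {
	out := make([]string, 0, len(value))
	for k, v := range value {
		t := reflect.TypeOf(v)
		if t == nil {
			// reflect.TypeOf(nil) returns a nil Type, so calling Kind()
			// on it would panic.
			return nil, fmt.Errorf("column %q: cannot infer a kind from nil", k)
		}
		out = append(out, k+"="+t.Kind().String())
	}
	sort.Strings(out)
	return out, nil
}

// demo infers kinds for a sample row.
func demo() ([]string, error) {
	return kindsOf(map[string]any{
		"name":    "merlin",
		"balance": 99.95,
		"age":     int16(107),
	})
}

func main() {
	kinds, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(kinds)
}
```

Keep in mind that values decoded by `json.Unmarshal` into `any` arrive as `float64` for every JSON number, so a map loaded from JSON would yield `float64` kinds even for integer-looking fields.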
