Recently I had to do some advanced filtering using the database/sql
package found in the Golang standard library. In this article I share how I handled advanced filtering. If you are a beginner in Golang wanting to learn how to improve your understanding of querying the database, this article is for you.
This article is broken up as follows:
- Context
- Setting up the problem in Python
- Setting up the problem in Golang
- Understand the problem
- First attempt (Static filtering)
- Second attempt (Too complicated)
- The “variadic function” to the rescue
- The Solution
- Extra Solutions
Let’s begin!
Context
Before using Golang to write web-applications, I’ve was using Django Web Framework
which is written in Python. Django provides a object relational-mapping (ORM) system to handle your python code interacting with the database. Through all my years, I still think the Django ORM is the best I have ever seen.
Now that I switched my programming from Python to Golang, I no longer have access to this phenomenal library. There are other ORM solutions but none that I’ve found to my liking; as a result, I am using only the database/sql
package.
The transition from Django to Golang has not been easy due to fact of how much Django spoiled me. Django has abstracted away a lot algorithms and computer science theory for me to use in a easy to use plug-in-play code! With the transition I am going through, this situation reminds me of a great article called Is Software Abstraction Killing Civilization? in which it details talks about how programmers lose certain knowledge through time because we keep using abstracted code; as a result, I’d I am writing this article to detail my learning.
Setting up the problem in Django
Let’s assume we have the following data-structure and you populated the table with some dummy data:
CREATE TABLE things (
tenant_id BIGINT NOT NULL,
id BIGSERIAL PRIMARY KEY,
manufacturer_id BIGINT NOT NULL,
part_id BIGINT NOT NULL,
state SMALLINT NOT NULL,
name VARCHAR (127) NOT NULL
);
If we want to lookup “3D Printer” in the name
then we would write:
from django.db.models import Q
series = Thing.objects.filter(name="3D Printer")
Or if we want to lookup a manufacturer_id
which does NOT have a particular part_id
, we would write:
from django.db.models import Q
series = Thing.objects.filter(
Q(manufacturer_id=1) &
~Q(part_id=444)
)
And how about if we lookup all the things with the name
of 3D printers with a particular part_id
?
from django.db.models import Q
series = Thing.objects.filter(
Q(name="3D Printer") &
Q(part_id=123456789)
)
So the Django ORM is quite good at being flexible enough to handle whatever filtering you can throw at it. How do we build something so flexible in Golang?
Setting up the problem in Golang
I will provide important code snippets to help demonstrate. Please note that this code is following the repository pattern so if you don’t understand then please read up before proceeding further.
When following the repository pattern, I break up my code into the models and repositories packages.
- models - Is responsible strictly for the data structure.
- repositories - Will implement the interface from models.
The code is as follows:
models/thing.go
// github.com/bartmika/hello-server/internal/models/thing.go
package models
import (
"context"
"time"
null "gopkg.in/guregu/null.v4"
)
// Structure used to hold filtering request by the user. If using `null` package
// then filter is optional, else the value is required!
type ThingFilter struct {
TenantId uint64 `json:"tenant_id"`
ManufacturerId null.Int `json:"manufacturer_id"`
PartId null.Int `json:"part_id"`
State null.Int `json:"state"`
LastSeenId uint64 `json:"last_seen_id"`
Limit uint64 `json:"limit"`
}
// The interface we will be using for our `Thing` model. Please note we will be
// using the `repository pattern` so we are adding `Repo` suffix to remind us.
type ThingRepo interface {
ListByFilter(ctx context.Context, filter *ThingFilter) ([]*Thing, error)
CountByFilter(ctx context.Context, filter *ThingFilter) (uint64, error)
// ...
}
// The model we will be using.
type Thing struct {
TenantId uint64 `db:"tenant_id" json:"tenant_id"`
Id uint64 `db:"id" json:"id"`
ManufacturerId uint64 `db:"manufacturer_id" json:"manufacturer_id"`
PartId uint64 `db:"part_id" json:"part_id"`
State uint `db:"state" json:"state"`
Name string `db:"state" json:"name"`
}
repositories/thing.go
// github.com/bartmika/hello-server/internal/repositories/thing.go
package repositories
import (
"context"
"database/sql"
"time"
"strconv"
// null "gopkg.in/guregu/null.v4"
"github.com/bartmika/hello-server/internal/models"
)
type ThingRepoImpl struct {
db *sql.DB
}
func NewThingRepoImpl(db *sql.DB) *ThingRepoImpl {
return &ThingRepoImpl{db: db}
}
func (s *ThingRepoImpl) ListByFilter(ctx context.Context, filter *models.ThingFilter) ([]*models.Thing, error) {
// TODO
return nil, nil
}
func (s *ThingRepoImpl) CountByFilter(ctx context.Context, filter *models.ThingFilter) (uint64, error) {
// TODO
return 0, nil
}
Example Usage
Filter only one column
Here is how we lookup a single column:
// Assume you setup the context
// ctx := ...
// Assume you connected to the database
// db := ...
thingRepo := repositories.NewThingRepoImpl(db)
// Lookup all the `things` for a particular manufacturer.
thingFilter := models.ThingFilter{
ManufacturerId: 1,
}
things, err := thingRepo.ListByFilter(ctx, thingFilter)
log.Println(things, err)
Filter more then one column
Here is how we lookup three columns:
// Assume you setup the context
// ctx := ...
// Assume you connected to the database
// db := ...
thingRepo := repositories.NewThingRepoImpl(db)
// Lookup all the `things` for a particular manufacturer and part
thingFilter := models.ThingFilter{
ManufacturerId: 1,
PartId: 123,
State: 1,
}
things, err := thingRepo.ListByFilter(ctx, thingFilter)
log.Println(things, err)
Now that we’ve seen filtering code prototype and usage, let’s discuss implementation of the TODO
items in our repository.
Understand the problem
If we look at our ThingFilter
structure we see that the user has the option of using three optional filters and 3 required:
- TenantId -> Required
- ManufacturerId -> Optional
- PartId -> Optional
- State -> Optional
- LastSeenId -> Required
- Limit -> Required
Therefore our filtering code should handle these cases.
First attempt (Static filtering)
My first attempt at solving the problem was to create the dedicated functions for whatever filtering we will have. Take a look at the repositories/thing.go
file:
// github.com/bartmika/hello-server/internal/repositories/thing.go
// ...
func (s *ThingRepoImpl) ListAll(ctx context.Context) ([]*models.Thing, error) {
// ...
}
func (s *ThingRepoImpl) ListByStateId(ctx context.Context, stateId uint64) ([]*models.Thing, error) {
// ...
}
func (s *ThingRepoImpl) ListByManufacturerId(ctx context.Context, manufacturerId uint64) ([]*models.Thing, error) {
// ...
}
func (s *ThingRepoImpl) ListByManufacturerIdAndState(ctx context.Context, manufacturerId uint64, stateId uint64) ([]*models.Thing, error) {
// ...
}
func (s *ThingRepoImpl) ListByManufacturerIdAndPartId(ctx context.Context, manufacturerId uint64, partId uint64) ([]*models.Thing, error) {
// ...
}
func (s *ThingRepoImpl) ListByManufacturerIdAndPartIdAndStateId(ctx context.Context, manufacturerId uint64, partId uint64, stateId uint64) ([]*models.Thing, error) {
// ...
}
// Etc, etc
// ...
As you can see there are a number of problems to take into consideration:
- Every time you want to add a new column to filter, a lot more functions need to be created.
- It’s not practical - The more filters you want to do, the more functions you have to write to maintain.
What pattern do you notice? This pattern does not work, an alternate solutions needs to exist.
Second attempt (Too complicated)
My second attempt involved creating a unified function which encapsulates all the filtering capability.
Take a look at the repositories/thing.go
file:
// github.com/bartmika/hello-server/internal/repositories/thing.go
// ...
func (s *ThingRepoImpl) queryRowsWithFilter(ctx context.Context, querySelect string, filter *models.ThingFilter) (*sql.Rows, error) {
if filter.ManufacturerId.IsZero() {
if filter.PartId.IsZero() {
if filter.State.IsZero() {
log.Println("Filtering by: Nothing")
query := querySelect + `
WHERE
tenant_id = $1
AND
id > $2
ORDER BY
id
DESC LIMIT $3`
return s.db.QueryContext(ctx, query, filter.TenantId, filter.LastSeenId, filter.Limit)
} else {
log.Println("Filtering by: Only State")
query := querySelect + `
WHERE
tenant_id = $1
AND
id > $2
AND
state = $3
ORDER BY
id
DESC LIMIT $4`
return s.db.QueryContext(ctx, query, filter.TenantId, filter.State, filter.LastSeenId, filter.Limit)
}
} else {
if filter.State.IsZero() {
// Not writing code ...
} else {
// Not writing code ...
}
}
} else {
if filter.PartId.IsZero() {
if filter.PartId.State() {
// Not writing code ...
} else {
// Not writing code ...
}
} else {
if filter.PartId.State() {
// Not writing code ...
} else {
// Not writing code ...
}
}
}
}
func (s *ThingRepoImpl) ListByFilter(ctx context.Context, filter *models.ThingFilter) ([]*models.Thing, error) {
ctx, cancel := context.WithTimeout(ctx, 7*time.Second)
defer cancel()
querySelect := `
SELECT
tenant_id,
id,
manufacturer_id,
part_id,
state,
name
FROM
things`
rows, err := s.queryRowsWithFilter(ctx, querySelect, filter)
if err != nil {
return nil, err
}
var arr []*models.Thing
defer rows.Close()
for rows.Next() {
m := new(models.Thing)
err := rows.Scan(
&m.TenantId,
&m.Id,
&m.ManufacturerId,
&m.PartId,
&m.State,
&m.Name,
)
if err != nil {
return nil, err
}
arr = append(arr, m)
}
err = rows.Err()
return arr, err
}
As you can see there are a number of problems to take into consideration:
- Every time you want to add a new column to filter, the code get’s super complicated quickly with repetition.
- It’s complicated
- It’s not practical - The more filters you want to do, the more complex it becomes
What pattern do you notice? The parameters we want to filter needs to change every time for the QueryContext
function and we need to update the sql statement
as well. The SQL statement is a string so we can easily modify the string, but how do we call the QueryContext
function multiple times with different parameters?
The “variadic function” to the rescue
In Golang, we have a variadic function which can be used to load up different parameters into the QueryContext
function.
The Solution
The solution which works is to utilize variadic functions and our code looks as follows.
repositories/thing.go
// github.com/bartmika/hello-server/internal/repositories/thing.go
package repositories
import (
"context"
"database/sql"
"time"
"strconv"
"github.com/bartmika/hello-server/internal/models"
)
type ThingRepoImpl struct {
db *sql.DB
}
func NewThingRepoImpl(db *sql.DB) *ThingRepoImpl {
return &ThingRepoImpl{db: db}
}
func (s *ThingRepoImpl) queryRowsWithFilter(ctx context.Context, query string, filter *models.ThingFilter) (*sql.Rows, error) {
// Array will hold all the unique values we want to add into the query.
var filterValues []interface{}
// The SQL query statement we will be calling in the database, start
// by setting the `tenant_id` placeholder and then append our value to
// the array.
filterValues = append(filterValues, filter.TenantId)
query += `WHERE tenant_id = $` + strconv.Itoa(len(filterValues))
//
// The following code will add our OPTIONAL filters
//
if !filter.ManufacturerId.IsZero() {
filterValues = append(filterValues, filter.ManufacturerId)
query += `AND manufacturer_id = $` + strconv.Itoa(len(filterValues))
}
if !filter.PartId.IsZero() {
filterValues = append(filterValues, filter.PartId)
query += `AND part_id = $` + strconv.Itoa(len(filterValues))
}
if !filter.State.IsZero() {
filterValues = append(filterValues, filter.State)
query += `AND state = $` + strconv.Itoa(len(filterValues))
}
//
// The following code will add our REQUIRED filters
// (Please note we use these for pagination purposes!)
//
if filter.LastSeenId > 0 {
filterValues = append(filterValues, filter.LastSeenId)
query += `AND id < $` + strconv.Itoa(len(filterValues))
}
query += `ORDER BY id `
filterValues = append(filterValues, filter.Limit)
query += `DESC LIMIT $` + strconv.Itoa(len(filterValues))
//
// Execute our custom built SQL query to the database.
// (Notice our usage of the `variadic function`?)
//
return s.db.QueryContext(ctx, query, filterValues...)
}
func (s *ThingRepoImpl) ListByFilter(ctx context.Context, filter *models.ThingFilter) ([]*models.Thing, error) {
ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
querySelect := `
SELECT
tenant_id,
id,
manufacturer_id,
part_id,
state,
name
FROM
things`
rows, err := s.queryRowsWithFilter(ctx, querySelect, filter)
if err != nil {
return nil, err
}
var arr []*models.Thing
defer rows.Close()
for rows.Next() {
m := new(models.Thing)
err := rows.Scan(
&m.TenantId,
&m.Id,
&m.ManufacturerId,
&m.PartId,
&m.State,
&m.Name,
)
if err != nil {
return nil, err
}
arr = append(arr, m)
}
err = rows.Err()
return arr, err
}
func (s *ThingRepoImpl) CountByFilter(ctx context.Context, filter *models.ThingFilter) (uint64, error) {
ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
// The result we are looking for.
var count uint64
// Array will hold all the unique values we want to add into the query.
var filterValues []interface{}
// The SQL query statement we will be calling in the database, start
// by setting the `tenant_id` placeholder and then append our value to
// the array.
filterValues = append(filterValues, filter.TenantId)
query := `
SELECT COUNT(id) FROM
things
WHERE
tenant_id = $` + strconv.Itoa(len(filterValues))
//
// The following code will add our filters
//
if !filter.ManufacturerId.IsZero() {
filterValues = append(filterValues, filter.ManufacturerId)
query += `AND manufacturer_id = $` + strconv.Itoa(len(filterValues))
}
if !filter.PartId.IsZero() {
filterValues = append(filterValues, filter.PartId)
query += `AND part_id = $` + strconv.Itoa(len(filterValues))
}
if !filter.State.IsZero() {
filterValues = append(filterValues, filter.State)
query += `AND state = $` + strconv.Itoa(len(filterValues))
}
//
// Execute our custom built SQL query to the database.
//
err := s.db.QueryRowContext(ctx, query, filterValues...).Scan(&count)
return count, err
}
Extra Solutions
2022-02-28 UPDATE: Recently I had someone email me the following question in regards to this article:
How do you apply that approach to check for other than “=” conditions?
I think that’s a great question! As a result, I’d like to expand this article to answer.
If you dear reader have any questions, don’t hesitate to reach out and contact me!
How do I filter by Date/time?
Let’s setup the following SQL structure. Let’s assume the metric_id
points to another table table with sensors (ex: UV Index), the implementation of that table doesn’t matter. Here is our time_series_data
table:
CREATE TABLE time_series_data (
metric_id BIGINT NOT NULL,
timestamp TIMESTAMPTZ NOT NULL,
value DOUBLE PRECISION NULL,
FOREIGN KEY (metric_id) REFERENCES metrics(id) ON DELETE CASCADE,
PRIMARY KEY(timestamp, metric_id)
);
If we want to lookup all records before 2022/02/28 date then we would write:
import datetime
series = TimeSeriesData.objects.filter(timestamp__lte=datetime.date(2022, 02, 28))
Furthermore if we want to find data within a range we can write the follow:
from datetime import date, timedelta
startdate = date.today()
enddate = startdate + timedelta(days=6)
series = TimeSeriesData.objects.filter(timestamp__range=[startdate, enddate])
Given the above cases, how do we support it? Let’s apply what we learned so far with a few additions. In Django’s ORM, if you want to filter, you have the following options:
lte
- Less then or equal to.lt
- Less then.gte
- Greater then or equal.gt
- Greater then.
As a result, we’ll create fields for each filter case so the code will look as follows:
models/time_series_data.go
// github.com/bartmika/hello-server/internal/models/time_series_data.go
package models
import (
"context"
"time"
null "gopkg.in/guregu/null.v4"
)
type TimeSeriesDataFilter struct {
// MetricID is used for filtering data from a particular sensor / instrument.
MetricID uint64 `json:"metric_id"`
// SortOrder is used for you to pick which ordering to use. Either use `asc` or `desc`.
SortOrder string `json:"sort_order"`
// SortField is used for the specific field to order by.
SortField string `json:"sort_field"`
Offset uint64 `json:"offset"`
Limit uint64 `json:"limit"`
// The following filters are to be used for specific date/time.
TimestampGreaterThen null.Time `json:"timestamp_gt,omitempty"`
TimestampGreaterThenOrEqual null.Time `json:"timestamp_gte,omitempty"`
TimestampLessThen null.Time `json:"timestamp_lt,omitempty"`
TimestampLessThenOrEqual null.Time `json:"timestamp_lte,omitempty"`
}
// TimeSeriesDatum is a row in the `time_series_data` table.
type TimeSeriesDatum struct {
MetricID uint64 `json:"metric_id"`
Timestamp time.Time `json:"timestamp"`
Value null.Float `json:"value"`
}
type TimeSeriesDatumRepository interface {
ListByFilter(ctx context.Context, filter *TimeSeriesDataFilter) ([]*TimeSeriesDatum, error)
}
repositories/time_series_data.go
// github.com/bartmika/hello-server/internal/repositories/time_series_data.go
package repositories
import (
"context"
"database/sql"
"strconv"
"time"
"github.com/bartmika/hello-server/internal/models"
)
type TimeSeriesDatumRepo struct {
db *sql.DB
}
func NewTimeSeriesDatumRepo(db *sql.DB) *TimeSeriesDatumRepo {
return &TimeSeriesDatumRepo{
db: db,
}
}
func (s *TimeSeriesDatumRepo) listQueryRowsWithFilter(ctx context.Context, query string, f *models.TimeSeriesDataFilter) (*sql.Rows, error) {
// Array will hold all the unique values we want to add into the query.
var filterValues []interface{}
// The SQL query statement we will be calling in the database, start
// by setting the `tenant_id` placeholder and then append our value to
// the array.
filterValues = append(filterValues, f.MetricID)
query += ` WHERE metric_id = $` + strconv.Itoa(len(filterValues))
//
// The following code will add our filters
//
if !f.TimestampGreaterThenOrEqual.IsZero() {
filterValues = append(filterValues, f.TimestampGreaterThenOrEqual)
query += ` AND timestamp >= $` + strconv.Itoa(len(filterValues))
}
if !f.TimestampGreaterThen.IsZero() {
filterValues = append(filterValues, f.TimestampGreaterThen)
query += ` AND timestamp > $` + strconv.Itoa(len(filterValues))
}
if !f.TimestampLessThenOrEqual.IsZero() {
filterValues = append(filterValues, f.TimestampLessThenOrEqual)
query += ` AND timestamp <= $` + strconv.Itoa(len(filterValues))
}
if !f.TimestampLessThen.IsZero() {
filterValues = append(filterValues, f.TimestampLessThen)
query += ` AND timestamp < $` + strconv.Itoa(len(filterValues))
}
//
// The following code will add our pagination.
//
if f.Offset > 0 {
// This step is necessary so please do not delete.
f.Offset = f.Offset - 1
}
query += ` ORDER BY ` + f.SortField + ` ` + f.SortOrder
filterValues = append(filterValues, f.Limit)
query += ` LIMIT $` + strconv.Itoa(len(filterValues))
filterValues = append(filterValues, f.Offset)
query += ` OFFSET $` + strconv.Itoa(len(filterValues))
//
// Execute our custom built SQL query to the database.
//
// // For debugging purposes only.
// log.Println("TimeSeriesDatumRepo | query:", query)
// log.Println("TimeSeriesDatumRepo | filterValues:", filterValues)
// log.Println("TimeSeriesDatumRepo | f:", f)
return s.db.QueryContext(ctx, query, filterValues...)
}
func (s *TimeSeriesDatumRepo) ListByFilter(ctx context.Context, filter *models.TimeSeriesDataFilter) ([]*models.TimeSeriesDatum, error) {
ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
querySelect := `
SELECT
metric_id,
timestamp,
value
FROM
time_series_data
`
rows, err := s.listQueryRowsWithFilter(ctx, querySelect, filter)
if err != nil {
return nil, err
}
var arr []*models.TimeSeriesDatum
defer rows.Close()
for rows.Next() {
m := new(models.TimeSeriesDatum)
err := rows.Scan(
&m.MetricID,
&m.Timestamp,
&m.Value,
)
if err != nil {
return nil, err
}
arr = append(arr, m)
}
err = rows.Err()
if err != nil {
return nil, err
}
if arr == nil {
return []*models.TimeSeriesDatum{}, nil
}
return arr, err
}
Example of Filtering with Dates
Filter by Less than or Equal to
The following example is how you would get all the data from the beginning to yesterdays date for a particular metric:
// ...
var mid uint64 = 1 // Pretend its a UV Index meter.
var nowDT time.Time = time.Now()
var yesterdayDT time.Time = time.Date(nowDT.Year(), nowDT.Month(), nowDT.Day()-1, 0, 0, 0, 0, nowDT.Location())
f := models.TimeSeriesDataFilter{
MetricID: mid,
SortField: "timestamp",
SortOrder: "DESC",
Offset: 0,
Limit: 100,
TimestampLessThenOrEqual: yesterdayDT,
}
arr, err := TimeSeriesDatumRepo.ListByFilter(ctx, &f)
if err != nil {
return err
}
log.Println(arr) // Return your data.
Filter by Range
The following example is how you would get all the data from yesterday to today (a.k.a. range):
// ...
var mid uint64 = 1 // Pretend its a UV Index meter.
var nowDT time.Time = time.Now()
var yesterdayDT time.Time = time.Date(nowDT.Year(), nowDT.Month(), nowDT.Day()-1, 0, 0, 0, 0, nowDT.Location())
f := models.TimeSeriesDataFilter{
MetricID: mid,
SortField: "timestamp",
SortOrder: "DESC",
Offset: 0,
Limit: 100,
TimestampGreaterThen: yesterdayDT,
TimestampLessThenOrEqual: nowDT,
}
arr, err := TimeSeriesDatumRepo.ListByFilter(ctx, &f)
if err != nil {
return err
}
log.Println(arr) // Return your data.