WIP: database processor #441

Draft: wants to merge 60 commits into base: master

60 commits
4c32d87
man update
rusq Feb 20, 2025
afc9502
sqlite repo, first draft
rusq Feb 9, 2025
3377166
fix query building
rusq Feb 9, 2025
979d490
experimental flate
rusq Feb 9, 2025
1034e19
decouple worker logic from directory controller
rusq Feb 10, 2025
4c3d48c
split logically into files
rusq Feb 10, 2025
1606b03
rename residue
rusq Feb 10, 2025
ff6945d
Decouple directory storage from search workers
rusq Feb 11, 2025
5f6fefc
Fix panic in session.Insert
rusq Feb 11, 2025
8b4d6d6
fix marshalling in tests
rusq Feb 11, 2025
b133c97
move dbproc under chunk
rusq Feb 11, 2025
f748d59
move dbproc under chunk
rusq Feb 11, 2025
2462694
db command and db controller
rusq Feb 11, 2025
69b1bca
files and avatar download for db target
rusq Feb 12, 2025
ec2ba5a
add username and displayname fields to users
rusq Feb 12, 2025
cdbd509
fix database structure
rusq Feb 14, 2025
df4c5b0
implementing generic getter methods
rusq Feb 15, 2025
2235e73
source skeleton
rusq Feb 15, 2025
a7eff63
rename readonly to source.go
rusq Feb 15, 2025
2aeeff6
tests
rusq Feb 15, 2025
ac19f40
non-pointer type params
rusq Feb 15, 2025
7d0da37
fix generic tests
rusq Feb 15, 2025
3bd0d73
split repository into composable interfaces
rusq Feb 15, 2025
688a349
fix repository tests
rusq Feb 16, 2025
a08b21a
implementing sourcer + tests
rusq Feb 16, 2025
383abc8
backport #449
rusq Feb 17, 2025
f7adce9
fix spelling mistakes
rusq Feb 17, 2025
412604b
Implementing workspace and source
rusq Feb 17, 2025
45b26fa
user keys, add file columns
rusq Feb 18, 2025
62fe625
fix subtle bug where half of the SQL statement is ignored :-D
rusq Feb 19, 2025
58d6067
fix repository tests and channel getter
rusq Feb 19, 2025
8913f70
cleanup of a duplicate test
rusq Feb 19, 2025
7cd9a0d
add repo basic fn test
rusq Feb 19, 2025
7eb2724
adding tests
rusq Feb 19, 2025
f93e0d6
more repository tests
rusq Feb 20, 2025
75829ec
extractiong options
rusq Feb 21, 2025
cdd7a46
brush up
rusq Feb 21, 2025
e9f34cc
Source updated to use iterators
rusq Feb 21, 2025
257889f
generalising
rusq Feb 21, 2025
f96c187
universal converter
rusq Feb 22, 2025
09d6341
AllMessages to use "Sorted" method
rusq Feb 22, 2025
2f1ed49
cherrypick changes to file and chunk from i174-multichunk
rusq Feb 22, 2025
124fc3d
variadic chunkID
rusq Feb 22, 2025
ba7a064
fix export conversion and viewing
rusq Feb 22, 2025
7a4912d
Fix export
rusq Feb 22, 2025
506838e
bug fixes and performance optimisation
rusq Feb 23, 2025
7430b0d
errgroup with Context
rusq Feb 24, 2025
4fbc5d6
organise controller functions, reusable options
rusq Feb 24, 2025
09aaa5d
Feature parity between db and dir controllers + linter complaints
rusq Feb 24, 2025
01f002a
db backend for export
rusq Feb 25, 2025
b49a6de
align export timezone to slack (America/Los_Angeles)
rusq Feb 25, 2025
5370883
fix source detection, add tests, convert dump to universal source
rusq Feb 26, 2025
4bd7766
bump gha go versions
rusq Feb 27, 2025
3702fd3
make codespell happy
rusq Feb 27, 2025
31ac30d
fix the missing parent for the thread in cthreadmessages
rusq Feb 27, 2025
60aa1fd
Unify dump converter, partial.
rusq Feb 27, 2025
720f522
cleanup
rusq Feb 27, 2025
4934a32
prepare to replace the dump controller
rusq Feb 28, 2025
be1128c
make linter happier
rusq Feb 28, 2025
3fa44f9
make linter happier
rusq Feb 28, 2025
2 changes: 1 addition & 1 deletion .codespellrc
@@ -4,4 +4,4 @@ skip = .git,go.sum,.codespellrc,.goreleaser.yaml
check-hidden = true
# ignore-regex =
# ignore some variable names
-ignore-words-list = ser,ans,auther,Nd,FO,Nexted
+ignore-words-list = ser,ans,auther,Nd,FO,Nexted,flate,efore,aci,fter
4 changes: 2 additions & 2 deletions .github/workflows/go.yml
@@ -16,7 +16,7 @@ jobs:
 - name: Set up Go
 uses: actions/setup-go@v5
 with:
-go-version: "1.23"
+go-version: "1.24"

 - name: Build
 run: go build -v ./...
@@ -36,7 +36,7 @@ jobs:
 - name: Set up Go
 uses: actions/setup-go@v5
 with:
-go-version: "1.23"
+go-version: "1.24"

 - name: Build
 run: go build -v ./...
6 changes: 6 additions & 0 deletions .gitignore
@@ -66,3 +66,9 @@ debug.test*
!internal/viewer/templates/*.html
!internal/viewer/renderer/templates/*.html
!internal/fixtures/**/*.json
*.sqlite*
*.db
# profiler files
*.pprof
*.prof
*.pf
5 changes: 4 additions & 1 deletion Makefile
@@ -17,7 +17,7 @@ DISTFILES=README.md LICENSE
ZIPFILES=$(foreach s,$(OSES),$(OUTPUT)-$s.zip)


-.PHONY: dist all test
+.PHONY: dist all

# special guest.
$(OUTPUT)-windows.zip: EXECUTABLE=$(OUTPUT).exe
@@ -57,9 +57,12 @@ arm_%:

clean:
-rm slackdump slackdump.exe $(wildcard *.zip)
-rm -rf slackdump_$(shell date +%Y)*
.PHONY: clean

test:
go test -race -cover ./...
.PHONY: test

aurtest:
GOFLAGS="-buildmode=pie -trimpath -ldflags=-linkmode=external -mod=readonly -modcacherw" go build -o 'deleteme' ./cmd/slackdump
1 change: 0 additions & 1 deletion channels.go
@@ -69,7 +69,6 @@ func (s *Session) getChannels(ctx context.Context, chanTypes []string, cb func(t
chans, nextcur, err = s.client.GetConversationsContext(ctx, params)
})
return err

}); err != nil {
return err
}
17 changes: 17 additions & 0 deletions cmd/slackdump/instruments_test.go
@@ -0,0 +1,17 @@
package main

import (
"path/filepath"
"testing"

"github.com/stretchr/testify/assert"
)

func Test_initTrace(t *testing.T) {
t.Run("initialises trace file", func(t *testing.T) {
testTraceFile := filepath.Join(t.TempDir(), "trace.out")
stop := initTrace(testTraceFile)
t.Cleanup(stop)
assert.FileExists(t, testTraceFile)
})
}
118 changes: 115 additions & 3 deletions cmd/slackdump/internal/archive/archive.go
@@ -4,9 +4,14 @@ import (
"context"
_ "embed"
"errors"
"io"
"log/slog"
"os"
"path/filepath"
"strings"
"time"

"github.com/jmoiron/sqlx"
"github.com/rusq/fsadapter"

"github.com/rusq/slackdump/v3"
@@ -15,6 +20,8 @@ import (
"github.com/rusq/slackdump/v3/cmd/slackdump/internal/golang/base"
"github.com/rusq/slackdump/v3/internal/chunk"
"github.com/rusq/slackdump/v3/internal/chunk/control"
"github.com/rusq/slackdump/v3/internal/chunk/dbproc"
"github.com/rusq/slackdump/v3/internal/chunk/dbproc/repository"
"github.com/rusq/slackdump/v3/internal/chunk/transform/fileproc"
"github.com/rusq/slackdump/v3/internal/structures"
"github.com/rusq/slackdump/v3/stream"
@@ -40,6 +47,14 @@ func init() {
var errNoOutput = errors.New("output directory is required")

func RunArchive(ctx context.Context, cmd *base.Command, args []string) error {
if cfg.UseChunkFiles {
return runChunkArchive(ctx, cmd, args)
} else {
return runDBArchive(ctx, cmd, args)
}
}

func runChunkArchive(ctx context.Context, _ *base.Command, args []string) error {
start := time.Now()
list, err := structures.NewEntityList(args)
if err != nil {
@@ -67,7 +82,50 @@ func RunArchive(ctx context.Context, cmd *base.Command, args []string) error {
base.SetExitStatus(base.SApplicationError)
return err
}
-cfg.Log.Info("Recorded workspace data", "filename", cd.Name(), "took", time.Since(start))
+cfg.Log.Info("Recorded workspace data", "directory", cd.Name(), "took", time.Since(start))
return nil
}

func runDBArchive(ctx context.Context, _ *base.Command, args []string) error {
start := time.Now()
list, err := structures.NewEntityList(args)
if err != nil {
base.SetExitStatus(base.SUserError)
return err
}
sess, err := bootstrap.SlackdumpSession(ctx)
if err != nil {
base.SetExitStatus(base.SInitializationError)
return err
}

dirname := cfg.StripZipExt(cfg.Output)
if err := os.MkdirAll(dirname, 0o755); err != nil {
return err
}

conn, err := sqlx.Open(repository.Driver, filepath.Join(dirname, "slackdump.sqlite"))
if err != nil {
return err
}
defer conn.Close()

ctrl, err := DBController(ctx, conn, sess, dirname)
if err != nil {
return err
}

defer func() {
if err := ctrl.Close(); err != nil {
slog.ErrorContext(ctx, "unable to close database controller", "error", err)
}
}()

if err := ctrl.Run(ctx, list); err != nil {
base.SetExitStatus(base.SApplicationError)
return err
}
cfg.Log.Info("Recorded workspace data", "directory", dirname, "took", time.Since(start))

return nil
}
@@ -87,9 +145,63 @@ func NewDirectory(name string) (*chunk.Directory, error) {
return cd, nil
}

func DBController(ctx context.Context, conn *sqlx.DB, sess *slackdump.Session, dirname string, opts ...stream.Option) (RunCloser, error) {
lg := cfg.Log
dbp, err := dbproc.New(ctx, conn, dbproc.SessionInfo{
FromTS: &time.Time{},
ToTS: &time.Time{},
FilesEnabled: cfg.DownloadFiles,
AvatarsEnabled: cfg.DownloadAvatars,
Mode: "archive",
Args: strings.Join(os.Args, "|"),
})
if err != nil {
return nil, err
}
sopts := []stream.Option{
stream.OptLatest(time.Time(cfg.Latest)),
stream.OptOldest(time.Time(cfg.Oldest)),
stream.OptResultFn(resultLogger(lg)),
}
sopts = append(sopts, opts...)
// start attachment downloader
dl := fileproc.NewDownloader(
ctx,
cfg.DownloadFiles,
sess.Client(),
fsadapter.NewDirectory(dirname),
lg,
)
// start avatar downloader
avdl := fileproc.NewDownloader(
ctx,
cfg.DownloadAvatars,
sess.Client(),
fsadapter.NewDirectory(dirname),
lg,
)

ctrl, err := control.NewDB(
ctx,
sess.Stream(sopts...),
dbp,
control.WithFiler(fileproc.New(dl)),
control.WithAvatarProcessor(fileproc.NewAvatarProc(avdl)),
)
if err != nil {
return nil, err
}
return ctrl, nil
}

type RunCloser interface {
Run(context.Context, *structures.EntityList) error
io.Closer
}

// ArchiveController returns the default archive controller initialised based
// on global configuration parameters.
-func ArchiveController(ctx context.Context, cd *chunk.Directory, sess *slackdump.Session, opts ...stream.Option) (*control.Controller, error) {
+func ArchiveController(ctx context.Context, cd *chunk.Directory, sess *slackdump.Session, opts ...stream.Option) (*control.DirController, error) {
lg := cfg.Log

sopts := []stream.Option{
@@ -116,7 +228,7 @@ func ArchiveController(ctx context.Context, cd *chunk.Directory, sess *slackdump
lg,
)

-ctrl := control.New(
+ctrl := control.NewDir(
cd,
sess.Stream(sopts...),
control.WithLogger(lg),
30 changes: 16 additions & 14 deletions cmd/slackdump/internal/archive/search.go
@@ -44,7 +44,7 @@ var cmdSearchMessages = &base.Command{
Long: `Searches for messages matching criteria.`,
RequireAuth: true,
FlagMask: flagMask | cfg.OmitRecordFilesFlag,
-Run: runSearchFn((*control.Controller).SearchMessages),
+Run: runSearchFn((*control.DirController).SearchMessages),
PrintFlags: true,
}

@@ -54,7 +54,7 @@ var cmdSearchFiles = &base.Command{
Long: `Searches for messages matching criteria.`,
RequireAuth: true,
FlagMask: flagMask,
-Run: runSearchFn((*control.Controller).SearchFiles),
+Run: runSearchFn((*control.DirController).SearchFiles),
PrintFlags: true,
}

@@ -64,7 +64,7 @@ var cmdSearchAll = &base.Command{
Long: `Records search message and files results matching the given query`,
RequireAuth: true,
FlagMask: flagMask,
-Run: runSearchFn((*control.Controller).SearchAll),
+Run: runSearchFn((*control.DirController).SearchAll),
PrintFlags: true,
}

@@ -78,7 +78,7 @@ func init() {

var ErrNoQuery = errors.New("missing query parameter")

-func runSearchFn(fn func(*control.Controller, context.Context, string) error) func(context.Context, *base.Command, []string) error {
+func runSearchFn(fn func(*control.DirController, context.Context, string) error) func(context.Context, *base.Command, []string) error {
return func(ctx context.Context, cmd *base.Command, args []string) error {
if len(args) == 0 {
base.SetExitStatus(base.SInvalidParameters)
@@ -104,7 +104,11 @@ func runSearchFn(fn func(*control.Controller, context.Context, string) error) fu
if err != nil {
return err
}
-defer ctrl.Close()
+defer func() {
+if err := ctrl.Close(); err != nil {
+cfg.Log.Error("error closing controller", "err", err)
+}
+}()
defer stop()

query := strings.Join(args, " ")
@@ -116,7 +120,7 @@ func runSearchFn(fn func(*control.Controller, context.Context, string) error) fu
}
}

-func searchController(ctx context.Context, cd *chunk.Directory, sess *slackdump.Session, terms []string) (*control.Controller, func(), error) {
+func searchController(ctx context.Context, cd *chunk.Directory, sess *slackdump.Session, terms []string) (*control.DirController, func(), error) {
if len(terms) == 0 {
base.SetExitStatus(base.SInvalidParameters)
return nil, nil, errors.New("missing query parameter")
@@ -147,14 +151,12 @@ func searchController(ctx context.Context, cd *chunk.Directory, sess *slackdump.
sopts = append(sopts, stream.OptFastSearch())
}

-var (
-ctrl = control.New(
-cd,
-sess.Stream(sopts...),
-control.WithLogger(lg),
-control.WithFiler(fileproc.New(dl)),
-control.WithFlags(control.Flags{RecordFiles: cfg.RecordFiles}),
-)
+ctrl := control.NewDir(
+cd,
+sess.Stream(sopts...),
+control.WithLogger(lg),
+control.WithFiler(fileproc.New(dl)),
+control.WithFlags(control.Flags{RecordFiles: cfg.RecordFiles}),
)
return ctrl, func() { pb.Finish() }, nil
}
39 changes: 39 additions & 0 deletions cmd/slackdump/internal/bootstrap/database.go
@@ -0,0 +1,39 @@
package bootstrap

import (
"os"
"path/filepath"
"strings"
"time"

"github.com/jmoiron/sqlx"

"github.com/rusq/slackdump/v3/cmd/slackdump/internal/cfg"
"github.com/rusq/slackdump/v3/internal/chunk/dbproc"
"github.com/rusq/slackdump/v3/internal/chunk/dbproc/repository"
)

const defFilename = "slackdump.sqlite"

// Database returns the initialised database connection open for writing.
func Database(dir string, mode string) (*sqlx.DB, dbproc.SessionInfo, error) {
dbfile := filepath.Join(dir, defFilename)
// wconn is the writer connection
wconn, err := sqlx.Open(repository.Driver, dbfile)
if err != nil {
return nil, dbproc.SessionInfo{}, err
}
return wconn, sessionInfo(mode), nil
}

func sessionInfo(mode string) dbproc.SessionInfo {
si := dbproc.SessionInfo{
FromTS: (*time.Time)(&cfg.Oldest),
ToTS: (*time.Time)(&cfg.Latest),
FilesEnabled: cfg.DownloadFiles,
AvatarsEnabled: cfg.DownloadAvatars,
Mode: mode,
Args: strings.Join(os.Args, "|"),
}
return si
}