Merge pull request #5 from buildnn/v0.1.2
V0.1.2
ggbaro authored Apr 7, 2020
2 parents 1134b1f + f2cc333 commit 31be59d
Showing 8 changed files with 255 additions and 178 deletions.
66 changes: 66 additions & 0 deletions Makefile
@@ -0,0 +1,66 @@
.PHONY: build_wheel

## build package wheel
build_wheel:
	python3 setup.py bdist_wheel --universal

#################################################################################
# Self Documenting Commands #
#################################################################################

.DEFAULT_GOAL := help

# Inspired by <http://marmelab.com/blog/2016/02/29/auto-documented-makefile.html>
# sed script explained:
# /^##/:
# * save line in hold space
# * purge line
# * Loop:
# * append newline + line to hold space
# * go to next line
# * if line starts with doc comment, strip comment character off and loop
# * remove target prerequisites
# * append hold space (+ newline) to line
# * replace newline plus comments by `---`
# * print line
# Separate expressions are necessary because labels cannot be delimited by
# semicolon; see <http://stackoverflow.com/a/11799865/1968>
.PHONY: help
help:
	@echo "$$(tput bold)Available rules:$$(tput sgr0)"
	@echo
	@sed -n -e "/^## / { \
		h; \
		s/.*//; \
		:doc" \
		-e "H; \
		n; \
		s/^## //; \
		t doc" \
		-e "s/:.*//; \
		G; \
		s/\\n## /---/; \
		s/\\n/ /g; \
		p; \
	}" ${MAKEFILE_LIST} \
	| LC_ALL='C' sort --ignore-case \
	| awk -F '---' \
		-v ncol=$$(tput cols) \
		-v indent=19 \
		-v col_on="$$(tput setaf 6)" \
		-v col_off="$$(tput sgr0)" \
	'{ \
		printf "%s%*s%s ", col_on, -indent, $$1, col_off; \
		n = split($$2, words, " "); \
		line_length = ncol - indent; \
		for (i = 1; i <= n; i++) { \
			line_length -= length(words[i]) + 1; \
			if (line_length <= 0) { \
				line_length = ncol - indent - length(words[i]) - 1; \
				printf "\n%*s ", -indent, " "; \
			} \
			printf "%s ", words[i]; \
		} \
		printf "\n"; \
	}' \
	| more $(shell test $(shell uname) = Darwin && echo '--no-init --raw-control-chars')
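Review note: the `help` target above collects every `## ` comment that sits directly before a target and prints it next to the target name. The same idea can be sketched in Python, which may be easier to follow than the sed/awk pipeline. The function names `collect_rules` and `format_help` and the `SAMPLE` text are illustrative assumptions, not part of this commit, and the sketch leaves out colours and line wrapping.

```python
import re

# Matches a rule line such as "build_wheel:" and captures the target name.
TARGET = re.compile(r"^([A-Za-z0-9_.-]+)\s*:")


def collect_rules(makefile_text):
    """Return {target: description} for targets preceded by '## ' comments."""
    rules = {}
    doc_lines = []
    for line in makefile_text.splitlines():
        if line.startswith("## "):
            # Accumulate consecutive doc-comment lines.
            doc_lines.append(line[3:].strip())
            continue
        match = TARGET.match(line)
        if match and doc_lines:
            rules[match.group(1)] = " ".join(doc_lines)
        # Any other line breaks the link between comment and target.
        doc_lines = []
    return rules


def format_help(rules, indent=19):
    """Format rules as left-aligned 'name  description' lines, sorted by name."""
    ordered = sorted(rules.items(), key=lambda item: item[0].lower())
    return "\n".join(f"{name:<{indent}} {doc}" for name, doc in ordered)


SAMPLE = """\
.PHONY: build_wheel

## build package wheel
build_wheel:
\tpython3 setup.py bdist_wheel --universal

.PHONY: help
help:
\t@echo hi
"""

if __name__ == "__main__":
    print(format_help(collect_rules(SAMPLE)))
```

As in the Makefile, a target without a `## ` comment (here `help` itself) does not appear in the listing.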
41 changes: 23 additions & 18 deletions README.md
@@ -1,8 +1,8 @@
# Welcome to `dsdb`

> Dsdb is a [Buildnn](https://www.buildnn.com) open source project.
Dsdb is a [Buildnn](https://www.buildnn.com) open source project.

Tired of having to manage thousands of unstructured .csv outputs for your small Data Science experiments? Would you like to experience a real SQL-like data management of yout datasets with a real database?
Tired of having to manage thousands of unstructured .csv outputs for your small Data Science experiments? Would you like to experience a real SQL-like data management of yout datasets with a real database?

Take a look at what you can do with [Postgres](https://www.pgadmin.org/screenshots/#7).

@@ -30,30 +30,31 @@ with DsDbConnect() as con:
    df.to_sql_table('table', con=con, if_exist='append')

```

and... that's it. To load data from the db:

```python
with DsDbConnect() as con:
    df_read = pd.read_sql_table('test', con)
```


## Quickstart using docker-compose

The following workflow launches a dockerized `jupyter` server with an underlying db.
Firs, retrieve our pre-made `docker-compose.yml` file:
Firs, retrieve our pre-made `docker-compose.yml` file:

```bash
$ cd my-project-dir
$ wget https://raw.githubusercontent.com/buildnn/dsdb/master/docker-compose.yml
$ wget https://raw.githubusercontent.com/buildnn/dsdb/master/notebooks/dsdb_test.ipynb
$ touch .env
cd my-project-dir
wget https://raw.githubusercontent.com/buildnn/dsdb/master/docker-compose.yml
wget https://raw.githubusercontent.com/buildnn/dsdb/master/notebooks/dsdb_test.ipynb
touch .env
```

Open the `.env` file and place the following text, filling the `{text under curly brackets}` as suggested:

_content of the `.env` file -->_

```
```ini
DSDB_USER=datascientist
DSDB_PASSWORD={your password}
DSDB_DB=dsdb
@@ -65,36 +66,40 @@ POSTGRES_DB=mydb
PGADMIN_DEFAULT_EMAIL={your email}
PGADMIN_DEFAULT_PASSWORD={another different password}
```
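Review note: because the `.env` template ships with `{...}` placeholders, a quick check before `docker-compose up` can catch values that were never filled in. The sketch below is a hypothetical helper, not part of `dsdb`; it assumes one `KEY=value` pair per line and would also flag a real value that happens to contain a brace pair.

```python
import re

# A value still holding "{something}" is treated as an unfilled placeholder.
PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def unfilled_keys(env_text):
    """Return the keys whose values still contain a {placeholder}."""
    missing = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if PLACEHOLDER.search(value):
            missing.append(key.strip())
    return missing


if __name__ == "__main__":
    sample = (
        "DSDB_USER=datascientist\n"
        "DSDB_PASSWORD={your password}\n"
        "DSDB_DB=dsdb\n"
    )
    print(unfilled_keys(sample))
```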

And then start the game
```
$ docker-compose up

```bash
docker-compose up
```

And... **that should be it**.

Visit:

* `https://localhost:8888` to see jupyter
* `https://localhost:5050` to visit the pgadmin panel (use the credentials in .env)


## Pip Installation

To pip-install this repo:

```bash
$ pip install git+https://github.com/buildnn/dsdb.git
pip install dsdb
```

## Connection to a custom DB server
## Connection to a custom DB server

`dsdb.DsDbConnect` uses a `DsDb`
object to connect to your db. It loads some
**environment variables** and uses them to perform
the connection. these are

* `DSDB_USER`: your username in the DB
* `DSDB_PASSWORD`: your password to access the DB
* `DSDB_DB`: The name of the DB
* `DSDB_DB`: The name of the DB
* `DSDB_HOST`: The address of the DB server
* `DSDB_DRIVER`: The driver. E.g. `'postgres+psycopg2'` for a standard postgres.
* `DSDB_DRIVER`: The driver. E.g. `'postgres+psycopg2'` for a standard postgres.

The following is a quick way to create
them directly inside yout python script:
@@ -110,7 +115,7 @@ os.environ['DSDB_HOST'] = 'localhost:5432' # server address
os.environ['DSDB_DRIVER'] = 'postgres+psycopg2'

...
```
```
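Review note: judging by the `"{}://{}:{}@{}/{}".format(driver, usr, pwd, host, db)` call in `dsdb/_utils_dsdb.py`, the five variables are joined into a single connection URL. The sketch below reproduces that composition with the standard library only. The helper name `build_db_url` is hypothetical, and percent-encoding the user and password is an addition of this sketch (the diff shows no escaping), intended for passwords that contain characters such as `@` or `/`. Recent SQLAlchemy releases (1.4 and later, as far as I know) accept only the `postgresql` dialect name, so `postgres+psycopg2` may need to become `postgresql+psycopg2` there.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=None):
    """Compose driver://user:password@host/database from DSDB_* variables."""
    env = os.environ if env is None else env
    driver = env["DSDB_DRIVER"]
    # Percent-encode credentials so reserved characters do not break the URL.
    user = quote_plus(env["DSDB_USER"])
    password = quote_plus(env["DSDB_PASSWORD"])
    host = env["DSDB_HOST"]
    database = env["DSDB_DB"]
    return f"{driver}://{user}:{password}@{host}/{database}"


if __name__ == "__main__":
    example = {
        "DSDB_DRIVER": "postgres+psycopg2",
        "DSDB_USER": "datascientist",
        "DSDB_PASSWORD": "p@ss/word",  # illustrative value only
        "DSDB_HOST": "localhost:5432",
        "DSDB_DB": "dsdb",
    }
    print(build_db_url(example))
```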

another option is to create a custom `dsdb.DsDb` object
to pass to `dsdb.DsDbConnect`:
@@ -128,4 +133,4 @@ db = dsdb._utils_dsdb.DsDb(
with dsdb.DsDbConnect(db=db) as con:
    df.to_sql_table('table', con=con)
...
```
```
2 changes: 0 additions & 2 deletions build_wheel.sh

This file was deleted.

10 changes: 7 additions & 3 deletions docker-compose.yml
@@ -4,18 +4,21 @@ services:
    image: jupyter/scipy-notebook:latest
    container_name: jupyter-dsdb
    environment:
      # Jupyter container variables
      JUPYTER_ENABLE_LAB: 1
      # DsDb variables
      DSDB_USER: ${DSDB_USER}
      DSDB_PASSWORD: ${DSDB_PASSWORD}
      DSDB_DB: ${DSDB_DB}
      DSDB_HOST: db:5432
      DSDB_DRIVER: postgres+psycopg2
    command: [
      "/bin/bash", "-c", "conda install --yes --quiet psycopg2 sqlalchemy && pip install -q git+https://bitbucket.org/buildnn/dsdb.git && start-notebook.sh --notebook-dir /local_directory --ip 0.0.0.0 --no-browser"]
      "/bin/bash", "-c", "conda install --yes --quiet psycopg2 sqlalchemy && pip install -q dsdb && start-notebook.sh --notebook-dir ~/local_directory --ip 0.0.0.0 --no-browser"]
    ports:
      - 8888:8888
    volumes:
      - ./:/local_directory
      - ./:/home/jovyan/local_directory
      - jupyter_conda:/opt/conda
    depends_on:
      - db

@@ -45,4 +48,5 @@ services:
      - db

volumes:
  postgres_data:
  postgres_data:
  jupyter_conda:
19 changes: 11 additions & 8 deletions dsdb/_utils_dsdb.py
@@ -7,15 +7,18 @@
import attr
from sqlalchemy import create_engine


def return_pwd(*args):
    return "pwd is hidden"


@attr.s
class DsDb(object):
    usr = attr.ib(default=None)
    db = attr.ib(default=None)
    host = attr.ib(default=None)
    driver = attr.ib(default=None)
    hide_parameters = attr.ib(default=True)
    pwd = attr.ib(default=None, repr=False)

    def create_engine(self):
@@ -27,13 +30,10 @@ def create_engine(self):
        pwd = self.pwd if self.pwd else os.getenv("DSDB_PASSWORD")

        self.engine = create_engine(
            '{}://{}:{}@{}/{}'.format(
                driver,
                usr,
                pwd,
                host,
                db,
            ), echo=False)
            "{}://{}:{}@{}/{}".format(driver, usr, pwd, host, db,),
            echo=False,
            hide_parameters=self.hide_parameters
        )
        return self.engine

    def connect(self):
@@ -47,8 +47,11 @@ def close(self):
        self.con.close()
        return self


@contextlib.contextmanager
def DsDbConnect(db=DsDb(), buf=print):
def DsDbConnect(db=DsDb(), buf=print, hide_parameters=True):
    if not db:
        db = DsDb(hide_parameters=hide_parameters)
    buf("connecting to DSDB...")
    t0 = datetime.now()
    yield db.connect()
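
Review note: in the new signature `db` still defaults to `DsDb()`, an instance created once when the function is defined. An `attrs` instance with no `__bool__` or `__len__` is truthy, so `if not db` appears to fire only when a caller passes `db=None` explicitly. Otherwise the new `hide_parameters` argument has no effect. A minimal sketch of the `None`-default alternative follows. `FakeDb` and `connect` are hypothetical stand-ins so the example runs without a database, and the `try`/`finally` is this sketch's own choice, since the rest of the original function is not visible in the hunk.

```python
import contextlib


class FakeDb:
    """Hypothetical stand-in for DsDb; it only records its settings."""

    def __init__(self, hide_parameters=True):
        self.hide_parameters = hide_parameters
        self.connected = False

    def connect(self):
        self.connected = True
        return self

    def close(self):
        self.connected = False


@contextlib.contextmanager
def connect(db=None, buf=print, hide_parameters=True):
    # A None default defers object creation to call time, so the
    # hide_parameters argument is honoured whenever no db is supplied.
    if db is None:
        db = FakeDb(hide_parameters=hide_parameters)
    buf("connecting...")
    con = db.connect()
    try:
        yield con
    finally:
        # Release the connection even if the with-body raises.
        db.close()


if __name__ == "__main__":
    with connect(hide_parameters=False) as con:
        print(con.hide_parameters)
```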