Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Dockerfile to containerize peepdb #68

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bobo333
Copy link

@bobo333 bobo333 commented Feb 12, 2025

Happy to make adjustments if requested. I only pinned the version of peepdb itself, as the installation instructions don't mention specific versions of other dependencies.

Also happy to add docs for building if desired. I built this with docker build . -t peepdb:0.1.4. To specify a different version of peepdb it can be done as docker build --build-arg version=0.1.3 . -t peepdb:0.1.3

I did some testing with docker-compose and afaict it works

$ docker exec -it peepdb-peepdb-1 bash
root@b70985d82d30:/# peepdb list
Saved connections:
- mysql1 (mysql)
- postgres1 (postgres)
root@b70985d82d30:/# peepdb view postgres1
INFO:PostgreSQLDatabase:Connected to PostgreSQL database: steve1
Fetching all tables
Tables fetched: ['accounts']
INFO:PostgreSQLDatabase:Disconnected from PostgreSQL database: steve1
Table: accounts
+------+------------+-------+
|   id | username   |   age |
+======+============+=======+
|    1 | steve      |    36 |
+------+------------+-------+
|    2 | bob        |    38 |
+------+------------+-------+
|    3 | jude       |     1 |
+------+------------+-------+
Page 1 of 1 (Total rows: 3)

The ~/.peepdb/ directory will have to be volume mounted into the container to not lose saved connections across containers, but that's more of a user preference, rather than part of building the docker image itself. I have a docker-compose file and setup that I was testing with if that would be helpful to include.

I tried to keep the image as small as I could, but some of the python packages, as well as the libmariadb3 libmariadb-dev requirements add a good amount of size (~300MB).

$ docker images
REPOSITORY      TAG                  IMAGE ID       CREATED          SIZE
peepdb          0.1.4                ac619680def2   9 minutes ago    422MB
python          3.12-slim-bookworm   6f3c6367c5a3   7 days ago       124MB

One of the alpine images could possibly be used if there's a strong desire to make the image smaller, but considering the installed packages themselves account for most of the space, I'm not sure how beneficial that would be. It also uses apk instead of apt and I didn't look into how the mariadb dependencies map to that package manager.

Thanks!

resolves #48

@Aherontas
Copy link
Collaborator

Hello @bobo333, thanks a lot for the effort you put to write this detailed comment. We also wanted to containerize the peepdb as it would be more plug n play, so yes we could proceed with that feature. I have some comments from my side:

1) The ~/.peepdb/ directory will have to be volume mounted into the container to not lose saved connections across containers, but that's more of a user preference, rather than part of building the docker image itself. I have a docker-compose file and setup that I was testing with if that would be helpful to include.

  • I agree it should be mounted as a volume, but we can create a cli or a mechanism to ask the user if a volume is his preference or not. (in case he only wants it for ephimeral usage, there is no reason to create a volume)

2) I tried to keep the image as small as I could, but some of the python packages, as well as the libmariadb3 libmariadb-dev requirements add a good amount of size (~300MB).

  • This is a problem in general we shouldn't need to install each time mariadb cmds. Same goes for oracle and firebase-admin packages. We must either force the user to install them and update our readme inside the container if they need to use that specific packages or think of an alternative way including them. (I agree the size of the container is really big due to them and that's an issue). A possible solution could be to ask the user what DBs he want to use and create a dynamic docker-compose for each case.

3) One of the alpine images could possibly be used if there's a strong desire to make the image smaller, but considering the installed packages themselves account for most of the space, I'm not sure how beneficial that would be. It also uses apk instead of apt and I didn't look into how the mariadb dependencies map to that package manager.

  • I also agree that moving to a smaller OS will not fix the image size.

@Aherontas Aherontas added the feature A big implementation label Feb 12, 2025
include information about creating a volume for persisting connection data across containers
@bobo333
Copy link
Author

bobo333 commented Feb 13, 2025

Hi @Aherontas thanks for taking the time to review. I'll go in order of your comments

  1. I added some information to the readme, including instructions about how to create and use a volume to persist connections if the user wants. I think that might be easier than building a custom cli tool, but ultimately up to you!

2 and 3. The way the package is built right now, pip won't even install it unless the mariadb libraries are available on the host. One option would be to move the mariadb python dependencies to be part of an [extras] install, and then users could use that if they were connecting to mariadb, but wouldn't need the extra size if they weren't. That seems like pretty big changes to the library itself though, and I wonder if that would be better handled separately from this work to dockerize the library as it currently exists.

That being said, even if it was only included as part of the [extras], once in a docker container that would either require the user to build an additional container to add in mariadb support, which might be kind of clunky, especially for people not that familiar with docker. Or what might be a more user friendly option is to provide 2 docker images, one with extras and one (much smaller) without.

Ultimately up to you, my personal recommendation is containerize it as is (despite the large size) and then create a separate issue to work on reducing the size of the image by making mariadb an optional/extras part of the install. (Or ideally find a way to support mariadb without requiring those extra libraries to be on the host, it seems kind of odd that even client code requires all the binaries, but I am not a mariadb expert so maybe there's a reason)

peepDB is also available as a `docker` image

```
docker pull [docker repo]/peepdb:0.1.4
Copy link
Author

@bobo333 bobo333 Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Aherontas I'm not sure what docker repo you plan to use, I can update this accordingly if you let me know what it should be

@Aherontas
Copy link
Collaborator

Thank you @bobo333 for the detailed answer, I tottally agree with the following:

Ultimately up to you, my personal recommendation is containerize it as is (despite the large size) and then create a separate issue to work on reducing the size of the image by making mariadb an optional/extras part of the install. (Or ideally find a way to support mariadb without requiring those extra libraries to be on the host, it seems kind of odd that even client code requires all the binaries, but I am not a mariadb expert so maybe there's a reason)

We may have to also rethink the way DB dependencies are handled. Probably we could do what you said, making a big container in size for now and think if MariaDB could be removed or be handled in another more efficient way. Let's wait also for @evangelosmeklis opinion on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A big implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Put it all in docker so it is easier for people to set it up
2 participants