Updates the readme to show how to pip install and run a transform. #928

Open
wants to merge 25 commits into
base: dev

daw3rd
Member

@daw3rd daw3rd commented Jan 8, 2025

Why are these changes needed?

To show users how easy it is to run a transform.

Related issue number (if any).

#872

daw3rd added 19 commits August 13, 2024 17:56
Signed-off-by: David Wood <[email protected]>
@shahrokhDaijavad
Member

@daw3rd I like the way you have written the "Run a transform at the command line" section and moved the "Create a Virtual Environment" material to the README in the examples folder. I think the paragraph titled "Fastest way to experience Data Prep Kit in a Notebook" should come first in the Getting Started section, i.e., ahead of running from the command line.

Collaborator

@touma-I touma-I left a comment

@daw3rd Let's open an issue and get peer input on the best way to solve this problem. We seem to have different stakeholders arguing for different things, and the more you put in the root README, the harder it is to maintain. cc: @shahrokhDaijavad

@daw3rd
Member Author

daw3rd commented Jan 8, 2025

No consensus on the utility of a CLI-based example, for now.

@daw3rd daw3rd closed this Jan 8, 2025
@touma-I touma-I reopened this Jan 9, 2025
@daw3rd
Member Author

daw3rd commented Jan 9, 2025

@shahrokhDaijavad I have reordered the Getting Started sections.

@daw3rd daw3rd changed the title from "Updates the readme to show to to pip install and run a transform." to "Updates the readme to show how to pip install and run a transform." Jan 9, 2025
Signed-off-by: David Wood <[email protected]>
@shahrokhDaijavad
Member

Thanks, @daw3rd. This updated README is definitely better than what we have now. Let's wait for others to chime in.

@shahrokhDaijavad
Member

@daw3rd I tested your run-transform.ipynb on Google Colab successfully by uncommenting the pip install of data-prep-toolkit in the first line. Nice job! Clicking on the Open in Colab icon will only work once this notebook has been merged into dev.

@Bytes-Explorer
Collaborator

Bytes-Explorer commented Jan 10, 2025

An alternative thought for consideration:
With this PR, we are doing a dpk install for a specific transform. But the README is supposed to help you get accustomed to the toolkit, so that you can use any transform by learning from a given example. Is there a reason we don't install the full dpk? Ideally, a one-liner like

pip install dpk
two line example on how to use a given transform

would make the experience simple and easy.

The conda setup, etc., helps a user avoid struggling with install issues or Python version mismatch errors. Removing it may add debugging steps for the user, which would work against the whole effort of simplification. What are we trying to achieve by removing the steps from the README? Understanding that perspective can help us come up with other solutions as well.

@sujee @agoyal26 Since this will directly impact first-time users, you should also chime in.


4 participants