-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds Data Team Request form under SQL Migrations section #7
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple questions about this process change - curious to hear your thoughts, Jack, @ParryChen, and/or anyone else on @nicheinc/data-enablement!
@@ -33,12 +33,13 @@ | |||
<!-- | |||
Migration Script: <link to migration script file> | |||
Description: <a description of the specific schema or data changes made by this migration script> | |||
|
|||
Keep this tag - the data enablement team will be notified via email of any changes: @nicheinc/data-enablement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓️ Wasn't the purpose of this tag to notify @nicheinc/data-enablement (which includes @ParryChen) that there's been a schema migration that might need to be reflected in Snowflake? I'm not necessarily opposed to submitting a Data Team Request in these cases, but it would be slightly more cumbersome for the PR author, vs. just linking the migration and uncommenting a tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey so I think the structure of the data-enablement team has changed a lot, where I'm not part of the data enablement team anymore and these changes are now being shifted to the DEs(whom get all their work tracked through the data team request form). So I think these make sense!
<!-- Does this migration affect tables in Snowflake? If yes, submit a Data Team Request to have the schema migration reflected in Snowflake. | ||
https://form.asana.com/?k=uNnfTyZZVsxifMCOIPdwiw&d=684757491145461 --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓️/🔧 Is the expectation that not all schema migrations that add/remove tables need to be reflected in Snowflake - and that it's up to the dev to decide on a per-migration basis? Whether we're tagging a GH team or opening an Asana ticket, IMO it should still be the Data Team's call, and in that case, I think it might make more sense for this callout to be grouped with the schema migration comment above. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so schema changes don't necessarily need to happen in Snowflake if the affected columns/schemas only serve backend services and not Snowflake stakeholders.
I think we could get more eyes on this from backend and data leadership so they can decide what's needed and what's not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a strong reason not to say "all schema migrations should be reflected in snowflake, even if they're trivial"? I feel like I'm missing something here.
It seems like we're setting ourselves up for future problems if we decide there are migrations that don't need to be reflected in snowflake because:
- We can always be wrong when we make this decision - regardless of who is involved or how many people or automated systems we use
- Adding more layers of review will slow down velocity, either of updates to snowflake (best case) or actual deployments (will make everyone sad).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so the schema migration process in snowflake often times involves re-snapshotting a table to backfill data from added columns. This re-snapshotting process needs to happen manually if the column data is populated with default values, because the Postgres WAL doesn't identify those as DML, and Streamkap won't pick these up as new events. It often takes some time to process/takes snowflake compute resources that cost money.
So depending on how often schema migrations occur, @jack-mcquown said it can happen a lot for certain services. It may be inefficient to add all schema migrations to Snowflake. Unless we add some note in backend schema migrations to use DML to update new columns rather than using default values.
@jack-mcquown @wmarshall @ParryChen, this PR has been hanging out for a while. What are folks thinking here? Do we need to have a wider discussion between BE and DE to hash out these organizational details? |
Providing some more context as we've learned more about how schema migrations work for Snowflake under Streamkap. So basically, there are two instances where manual work is currently necessary on our side.
We don't really have a way right now to automate the above instances. So that is a potential blocker for having all schema migrations be reflected in Snowflake(assuming there's a lot). |
Dependencies
Documentation
Description
Adds a new comment about submitting a Data Team Request for having a schema migration reflected in Snowflake.
This change is based on the conversation in this data team request related to getting a new column into a
funnel
Snowflake table.Testing Considerations
SQL Migrations
Deployment
Versioning