Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seemingly pointless insertion of DB records in Contentful sync worker #564

Open
5 tasks done
hirurana opened this issue May 19, 2024 · 0 comments
Open
5 tasks done

Comments

@hirurana
Copy link

hirurana commented May 19, 2024

Describe the bug

Was looking through the contentful async worker code (packages/workers/content/services.py) and noticed that the sync of contentful entries seems to add new entries twice (if I understand this correctly).

The worker will call the sync() method within the ContentfulSync class which calls self.sync_entries().

def sync(self):
        self.sync_entries()
        self.unpublish_missing_entries()

This is defined as:

def sync_entries(self):
        skip = 0
        limit = 1000

        instances = []
        while True:
            entries = self.client.entries({'skip': skip, 'limit': limit, 'include': 0})
            skip += 1

            if not entries:
                break

            page_instances = [self.sync_entry(entry) for entry in entries]
            instances += list(filter(lambda x: x is not None, page_instances))

        self.session.add_all(instances)

we can see that this is essentially a wrapper to add multiple new entries into the different content models as it calls self.sync_entry(entry)

But this class method is defined as:

def sync_entry(self, entry: contentful.Entry):
        Model = self.get_db_model(entry.content_type)
        if Model is None:
            return None

        fields = {}
        for field_name, field in entry.fields().items():
            if isinstance(field, contentful.Link):
                fields[field_name] = field.sys
            else:
                fields[field_name] = field

        instance = Model(id=entry.sys['id'], fields=fields, is_published=True)
        instance = self.session.merge(instance)
        self.session.commit()
        self.entry_ids.add(entry.id)
        return instance

So question here is isn't the self.session.add_all(instances) in sync_entries() redundant as these records are already being added in sync_entry()?

Steps to reproduce

Go to file packages/workers/content/services.py within ContentfulSync

System Info

n/a

Logs

No response

Validations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant