Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contact deletion may delete a contact with existing messages #4792

Open
link2xt opened this issue Oct 5, 2023 · 10 comments
Open

Contact deletion may delete a contact with existing messages #4792

link2xt opened this issue Oct 5, 2023 · 10 comments

Comments

@link2xt
Copy link
Collaborator

link2xt commented Oct 5, 2023

Here SQL code checks that the contact is not a member of any chat:

let deleted_contacts = transaction.execute(
"DELETE FROM contacts WHERE id=?
AND (SELECT COUNT(*) FROM chats_contacts WHERE contact_id=?)=0;",
(contact_id, contact_id),
)?;

However, even if the contact is not a member of the chat, they may be a former member of the chat and have messages there. In this case these messages will start pointing to non-existent contact. The contact ID may still be used in msgs.from_id, msgs.to_id, locations.from_id, msgs_mdns.contact_id. Address may still be in acpeerstates.addr and acpeerstates.verifier.

See previous comment here: #4791 (review)

Maintaining the code checking that contact is not referenced from anywhere is error-prone. Trying to delete the contact in housekeeping as proposed in #4775 may never actually delete the contact if they sent us a message in another chat, tried to add us to the group, sent us a location etc.

As an alternative solution, maybe we should actually never delete an account, but anonymize it instead, similarly to how Discourse and GitHub do this when user deletes an account? As long as we are not going to write to this contact, we can replace its address, name, authname and hide it.

@iequidoo
Copy link
Collaborator

iequidoo commented Oct 6, 2023

I'm unsure we should never delete garbage from the db. Better to change all places in the code so that they expect that a contact could be already deleted. F.e. for messages referencing a deleted account DC could show "Deleted-<contact_id>" as a contact name so that a user can see which messages are from the same contact, and so on. Do we really need some contact info to be preserved after that contact deletion? If so, maybe it's better to store it in other tables where it's actually needed, instead of contact ID?

@r10s
Copy link
Member

r10s commented Oct 6, 2023

Better to change all places in the code so that they expect that a contact could be already deleted

that sounds like a big refactoring, including new and other issues.

(housekeeping as such was introduced to avoid re-downloading of just deleted or not shown message - and then, as it is there, used for other things)

wondering what the goal of the physical contact deletion is - eg. the deletion of an entry in "new chat suggestion" (or the "address list" if you will) does not imply ux-wise that all older messages are deleted or their contact data are anonymized. this was never the idea.
(@Septias pointed out similar things at #4791 (comment) )

so, maybe there is just no reasonable actionale item :)

@Septias
Copy link
Collaborator

Septias commented Oct 6, 2023

I believe, for the deleting problem the proposed solution from @link2xt is pretty good as there are very few things that can break with this approach (even though we need to think about how to handle the verifier information as users may still want to see the formerly deleted email in verifier information).
As to if this is actually needed, I think the answer may well be not. The initial problem with this issue is that the bot needs to have a way to erase all user information as part of the privacy notice, maybe I can rigorously delete all the possible references on the bot side because the bot only needs a subset of the deltachat provided functions. That would be a good per use-case solution without needing a general solution for this very specific problem.

@link2xt
Copy link
Collaborator Author

link2xt commented Oct 7, 2023

so, maybe there is just no reasonable actionale item :)

If we do nothing, there is a real bug remaining:

  1. Receive a message from user A in a group.
  2. Remove A from all groups and delete 1:1 chat with A.
  3. Remove contact A. DELETE from contacts gets executed.
  4. Open group where message from A was received.
    What should be shown as the avatar and name of the user if contact ID in the from_id is not valid anymore?

I would expect either contact A getting only hidden in this case, or become an anonymous/ghost/tombstone contact, like what happens when a user deletes their account on GitHub or gets anonymized in Discourse forum.

@link2xt
Copy link
Collaborator Author

link2xt commented Oct 7, 2023

The initial problem with this issue is that the bot needs to have a way to erase all user information as part of the privacy notice, maybe I can rigorously delete all the possible references on the bot side because the bot only needs a subset of the deltachat provided functions. That would be a good per use-case solution without needing a general solution for this very specific problem.

Maybe there should be a separate core API, like forget_contact that deletes the contact 1:1 chat, locally removes the contact from all groups without sending any message, removes all associated messages, removes associated peerstate and locations and then deletes the contact.

@Septias
Copy link
Collaborator

Septias commented Oct 8, 2023

I think we should only have two functions: hide_contact which only sets origin to Origin::Hidden so it will not be displayed and then delete_contact or forget_contact to remove one contact completely, also deleting all messages which were sent by this contact.

@iequidoo
Copy link
Collaborator

iequidoo commented Oct 8, 2023

Open group where message from A was received.
What should be shown as the avatar and name of the user if contact ID in the from_id is not valid anymore?

I'd prefer to see "Deleted-<from_id>" (in grayish) as the name and some hash image like GitHub's identicon as the avatar.

Maybe there should be a separate core API, like forget_contact that deletes the contact 1:1 chat, locally removes the contact from all groups without sending any message, removes all associated messages, removes associated peerstate and locations and then deletes the contact.

I'm not sure that deleting a contact should delete the 1:1 chat. At least at the core level these can be separate entities. And as for removing the contact from groups, is it possible at all? What if new message arrives, of course containing the contact address in "To"? Why do we want to remove all associated messages? Anyway messages can be quoted

@link2xt
Copy link
Collaborator Author

link2xt commented Oct 8, 2023

And as for removing the contact from groups, is it possible at all? What if new message arrives, of course containing the contact address in "To"?

forget_contact is only for the bots which only chat in 1:1 chats and want to forget about users who have not interacted with them for more than 90 days or so: https://codeberg.org/webxdc/xstore/pulls/286
If this breaks group chats etc. we do not care that much.
This is not a function for the UI.
We should probably track this forget_contact in a separate issue.

I think we should only have two functions: hide_contact which only sets origin to Origin::Hidden so it will not be displayed and then delete_contact or forget_contact to remove one contact completely, also deleting all messages which were sent by this contact.

UIs already use delete_contact, do you propose to add two options in the UI (hide_contact and delete_contact)? I would rather keep only the delete_contact API and fix what it does.

@iequidoo
Copy link
Collaborator

iequidoo commented Oct 9, 2023

forget_contact is only for the bots which only chat in 1:1 chats and want to forget about users who have not interacted with them for more than 90 days or so: https://codeberg.org/webxdc/xstore/pulls/286

Then it's indeed better to delete the contact completely from the db to avoid db bloat for bots. And just to fix all places in the code that misbehave if contact_id refers to an unexistent contact. Even if this takes some time, it's the problem for bots, not for all users.

@Septias
Copy link
Collaborator

Septias commented Oct 9, 2023

UIs already use delete_contact, do you propose to add two options in the UI (hide_contact and delete_contact)? I would rather keep only the delete_contact API and fix what it does.

In the UI, I would only show the hide_contact but with a label that tells the user this is a "deleting" action. The core should then only reset the origin to the most basic case as if we just saw some message from him in some random chat. This way, there are no messages with removed contact_ids and it should still handle the use case deleting a contact for a user. (This opinion might be based upon some misconceptions about how the core handles contacts though)

The other API delet_contact should be the bot-only API that actually removes every trace of the contact_id so that we can confidently say, that we deleted all user data in our privacy notice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants