Done
Details
Priority
Assignee: Charles Langlois
Reporter: Charles Langlois
Pair: Sébastien Duthil
Fix versions
Sprint
None
Story Points
0.5
Zendesk Support
Created November 21, 2023 at 3:04 PM
Updated December 13, 2023 at 7:49 PM
Resolved December 8, 2023 at 10:18 AM
A syncdb implementation meant to clean up deleted tenants and their associated data in call-logd introduced a regression that deleted call-logd data from all tenants.
This was caused by a data type mismatch when comparing UUID values in a set difference operation (“comparing apples to oranges”). One set contained UUIDs as strings, while the other contained them as Python `UUID` objects. Even though some values were semantically equivalent, the difference in representation (data type) meant they never compared equal, so some UUIDs were misclassified as belonging to deleted tenants. Those tenants and their associated data (through database foreign key constraints) were then promptly deleted. Some of that data (e.g. call logs) could be regenerated, but some could not (e.g. recordings).
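The mismatch can be reproduced in a few lines. This is a minimal sketch, not the call-logd code: the variable names are hypothetical, and which of the two sets held strings is an assumption made for illustration.

```python
import uuid

tenant = uuid.UUID("11111111-2222-3333-4444-555555555555")

# Assumption for illustration: local tenants are stored as strings,
# while the tenants reported by the auth service are uuid.UUID objects.
local_tenants = {str(tenant)}
auth_tenants = {tenant}

# A str never compares equal to a uuid.UUID, even when both denote the
# same UUID, so the set difference removes nothing.
presumed_deleted = local_tenants - auth_tenants

# The existing tenant is wrongly reported as deleted.
print(presumed_deleted)
```

Because Python sets hold elements of any type, this comparison raises no error at runtime; it silently yields the wrong result.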
Causes & factors
Limited automated and manual testing during development and review (covering only the intended behavior and not considering unintended consequences).
Missing type-level feedback during development (part of the code involved relies on opaque interfaces that leave the developer guessing at the data types involved, unless time is taken to investigate beyond abstraction boundaries, in separate libraries) and missing automated type analysis (this bug could have been caught by static type analysis if the interfaces involved had been properly typed).
Limited regression testing during the release phase (was the syncdb code run in test environments during release testing, and was the impact on call log data investigated?).
The database design does not support any recoverable “soft delete”, which would limit the severity of such regressions.
The syncdb implementation relies on comparing against data from another service, wazo-auth, as the source of truth. If wazo-auth fails to expose all existing tenants as expected (because of an API bug, a permission issue, or a badly formulated request), this can result in deleting data that should not be deleted.
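To illustrate the static typing factor above, here is a hedged sketch of how type annotations at the comparison boundary would let a checker such as mypy flag the mismatch before the code runs. The function `find_deleted_tenants` and its parameters are hypothetical and do not come from call-logd.

```python
import uuid


def find_deleted_tenants(local: set[str], remote: set[str]) -> set[str]:
    """Return tenant UUIDs known locally but absent from the source of truth."""
    return local - remote


# Hypothetical data: the remote side delivers uuid.UUID objects.
remote_uuids: set[uuid.UUID] = {uuid.UUID(int=1)}
local_uuids: set[str] = {str(uuid.UUID(int=1))}

# A static type checker would reject the call below, because
# set[uuid.UUID] is not compatible with the declared set[str]:
# find_deleted_tenants(local_uuids, remote_uuids)

# Converting explicitly satisfies the annotation and gives the right answer.
deleted = find_deleted_tenants(local_uuids, {str(u) for u in remote_uuids})
```

The annotations only help if the interfaces that produce both sets are typed as well; an untyped return value would be treated as `Any` and pass unchecked.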
Resolution
The direct fix for the superficial cause is trivial: convert data from one data type (the Python `uuid.UUID` type) to the other (`str`) to allow a consistent “apples to apples” comparison.
A proper resolution must include automated test coverage that considers the regression scenario and verifies that the `syncdb` execution does not affect data from any tenants other than those expected to be affected.
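A regression test of this kind could look like the sketch below. It is an illustration under stated assumptions, not the actual call-logd test suite: `purge_deleted_tenants` is a hypothetical in-memory stand-in for the syncdb cleanup, and a plain dict replaces the database.

```python
import uuid


def purge_deleted_tenants(call_logs, auth_tenant_uuids):
    """Hypothetical stand-in for the syncdb cleanup.

    call_logs: dict mapping a tenant UUID (str) to its list of records.
    auth_tenant_uuids: iterable of uuid.UUID reported by the auth service.
    Returns the set of tenant UUIDs (str) that were purged.
    """
    # Normalize to str so both sides of the difference share one type.
    known = {str(u) for u in auth_tenant_uuids}
    deleted = set(call_logs) - known
    for tenant_uuid in deleted:
        del call_logs[tenant_uuid]
    return deleted


def test_purge_only_affects_deleted_tenants():
    kept = uuid.uuid4()
    removed = uuid.uuid4()
    call_logs = {
        str(kept): ["call-1", "call-2"],
        str(removed): ["call-3"],
    }

    deleted = purge_deleted_tenants(call_logs, [kept])

    # Only the tenant missing from the auth service is purged...
    assert deleted == {str(removed)}
    # ...and the data of the remaining tenant is left untouched.
    assert call_logs == {str(kept): ["call-1", "call-2"]}


test_purge_only_affects_deleted_tenants()
```

The second assertion is the important one: it checks the data that must survive, not just the data that should be removed.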