-
Notifications
You must be signed in to change notification settings - Fork 1
Add a Synapse Quarantined Media exchange #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The significant change is in the Dockerfile, where we install poetry so we can install the new exchange (and possibly future ones as well). We try to keep the number of layers relatively small to make the image easier to download/less bloated. There's also some indirection via Docker Compose Secrets here for local testing/debugging. Essentially, we're just trying to run the environment variable from the machine down to the container, which requires secrets for reasons. Unfortunately, secrets can only write to files inside the container, so we have to extract the contents of that file back into the environment. That extraction is done in `startup.sh`.
H-Shay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in a perfect pythonic world it would be cool to see docstrings but it's not a huge deal
|
I've added some basic docstrings and fixed the |
|
|
||
| # Now process remote media | ||
| remote_token = checkpoint.remote_from_token if checkpoint else 0 | ||
| remote_media_mxcs, next_batch = self._fetch(False, remote_token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since _fetch (and _hash) can raise an error, would it make sense to try/catch those calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's not much we can do if we do catch it, so we let the fetcher inside HMA catch+log it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good!
|
Just one question, otherwise looks good, thanks for the docstrings! |
Reviewable commit-by-commit (recommended - some commit messages have additional context)
Requires element-hq/synapse#19308 on the server.
The ExchangeAPI comes from python-threatexchange, which is the underlying code behind HMA. The interface is a bit awkward at first glance, but the short version is:
fetchwhenever it wants more data. We don't want to block this for too long, but as long as required is fine._hashand_fetchMuch of the code is inspired by existing exchanges in python-threatexchange.
This has been tested locally. One day we should figure out how to put CI on this.