Refactor Crawler Errors #46

@ddxv

The current logic is quite basic and grew over time, so it needs a proper refactor and better error handling.

A few desires:

  1. The same crawl_result table is used for App-Ads.txt (crawling domains), downloading apps, unpacking SDKs, running Waydroid and querying the app stores. These are very different tasks. Sharing one table has been simple, but maybe the logical mess of "404 / not found" meaning something different in each situation is not worth it?

Regardless of that, these would be easier fixes:

  1. Change crawl_result 4 to 404 (app not found). That is easier to remember when it shows up in logs and makes logical sense, so it would be nice to resolve.
  2. 401 or other error from the app store: this perhaps should NOT be logged as a crawl error? For example, if adscrawler hits a rate limit while crawling a healthy app, it should not affect that app's status. The app still needs to be crawled and its record should not be updated.
  3. Apps marked 5 ("to be deleted") are actually Mac or ChromeOS apps. These should be added to stores and fixed on the fly. For now they would 'never' be crawled again, but that is better than removing them since they do exist, just not in the way adscrawler expects.
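
A rough sketch of how fixes 1 and 2 could fit together, not the actual adscrawler code. `CrawlResult`, `result_to_record` and the `SUCCESS` value are hypothetical names and values made up for illustration; only 404 and the legacy codes 4 and 5 come from the list above. The idea is that a real "not found" gets recorded, while a 401, rate limit or other store-side error returns `None` so the caller leaves the app's status untouched and retries later.

```python
from enum import IntEnum
from typing import Optional


class CrawlResult(IntEnum):
    """Hypothetical result codes for app store crawls."""

    SUCCESS = 1  # placeholder value, an assumption (not from the issue)
    UNSUPPORTED_PLATFORM = 5  # legacy "to be deleted": Mac / ChromeOS apps
    APP_NOT_FOUND = 404  # proposed replacement for legacy code 4


def result_to_record(http_status: int) -> Optional[CrawlResult]:
    """Map a store HTTP status to the crawl result that should be saved.

    Returns None when nothing should be written, so the app keeps its
    previous status and stays in the queue for the next crawl.
    """
    if http_status == 200:
        return CrawlResult.SUCCESS
    if http_status == 404:
        return CrawlResult.APP_NOT_FOUND
    # 401, 429, 5xx, ...: a problem with the request or the store,
    # not with the app itself, so do not record a crawl error.
    return None


if __name__ == "__main__":
    for status in (200, 404, 401, 429):
        print(status, "->", result_to_record(status))
```

Whether transient errors should still be logged somewhere else (for monitoring rate limits) is a separate question that this sketch leaves open.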
