Description
Summary: The BioCyc data that I have (MetaCyc version 25.1; HumanCyc
version 23.0) is inconsistent, or is not being used correctly. I do not
know how to fix the issue properly, so I need to rely on you to decide
how to deal with it.
Bug description: When rendering a result for a pathway containing
RIBOSE-5P, the renderer crashes with a KeyError (failed dictionary
lookup) in Dries' rendering code, and my rendering code crashes on a
failed defensive assertion. As a result, no pathway image is generated.
Details: The failing pathway defines a reaction:
| INPUTS | REACTION | OUTPUTS |
|---|---|---|
| CARBON-DIOXODE | RIB5PISOM-RXN | RIBOSE-5P |
| RIBULOSE-5P | | |
Data from the HumanCyc compounds.dat file is used to map the left and
right sides of reactions. The ONLY reference to RIBOSE-5P in
HumanCyc/compounds.dat is:
UNIQUE-ID - CPD-15895
TYPES - RIBOSE-5P
So your function generate_reaction_instances() replaces the
left-side inputs ['RIBOSE-5P'] with ['CPD-15895'], which has no
overlap with the pathway (from the attached pathway file). As a
result, NO reaction instances are generated for ANY reaction
containing RIBOSE-5P in determine_instance_reaction_path().
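To make the failure mode concrete, here is a minimal sketch of what I think is happening. It is a hypothetical reconstruction and not the project's actual code: `instances_by_type`, `pathway_compounds`, and `expand()` are names I made up, and I am assuming the substitution works roughly like a class-to-instances lookup.

```python
# Hypothetical reconstruction; NOT the project's actual code.
# Class -> instances map, as it would be built from the single
# HumanCyc record that mentions RIBOSE-5P (see the record above).
instances_by_type = {"RIBOSE-5P": ["CPD-15895"]}

# Compound IDs named by the pathway for RIB5PISOM-RXN (from the table above).
pathway_compounds = {"CARBON-DIOXODE", "RIBULOSE-5P", "RIBOSE-5P"}


def expand(compound_ids):
    """Replace each class ID by its known instances; keep other IDs as-is."""
    expanded = []
    for cid in compound_ids:
        expanded.extend(instances_by_type.get(cid, [cid]))
    return expanded


side = expand(["RIBOSE-5P"])
overlap = set(side) & pathway_compounds

print(side)     # the substituted reaction side
print(overlap)  # IDs the substituted side still shares with the pathway
```

If the real code behaves like this, the substituted side shares no ID with the pathway, which would explain why no reaction instance is produced.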
Questions:
- Is this just a BioCyc curation issue?
- If so, can the integrity of the BioCyc data be verified by checking
  that every compound referenced in a 'TYPES' field also has an
  associated 'UNIQUE-ID' entry?
- Why not use the more complete MetaCyc version of compounds.dat? (I
  tried doing this naively, but apparently you generate some reaction
  instances manually. Is that the case for every reaction involving
  RIBOSE-5P, and for any other compound with no entry in compounds.dat?)
- How should such data inconsistencies be handled? Use MetaCyc? Look up
  data in MetaCyc when it is missing from HumanCyc? Something else? I
  would like to catch these kinds of errors as early in the processing
  pipeline as possible.
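As a starting point for the integrity check asked about above, here is a rough sketch of an early validation step. It assumes the usual attribute-value layout of the .dat files (`ATTRIBUTE - value` lines, records terminated by `//`, `#` comment lines, `/` continuation lines). The names `parse_dat()` and `type_only_ids()` are hypothetical, and the second record in the sample is invented purely for illustration. Since 'TYPES' values can also name genuine classes that are defined elsewhere, the check is restricted to IDs that the pathway actually references.

```python
def parse_dat(text):
    """Parse 'ATTRIBUTE - value' records separated by '//' lines."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.strip() == "//":
            if current:
                records.append(current)
            current = {}
            continue
        if line.startswith("/"):
            continue  # continuation of a previous value; ignored in this sketch
        key, sep, value = line.partition(" - ")
        if sep:
            current.setdefault(key.strip(), []).append(value.strip())
    if current:
        records.append(current)
    return records


def type_only_ids(records, referenced):
    """Return referenced IDs that occur only as TYPES values, never as UNIQUE-ID.

    The result maps each such ID to the UNIQUE-IDs of the records that
    name it in their TYPES field.
    """
    defined = {uid for rec in records for uid in rec.get("UNIQUE-ID", [])}
    suspicious = {}
    for rec in records:
        for type_id in rec.get("TYPES", []):
            if type_id in referenced and type_id not in defined:
                suspicious.setdefault(type_id, []).extend(rec.get("UNIQUE-ID", []))
    return suspicious


# First record is the one quoted in this issue; the second is made up.
sample = """\
UNIQUE-ID - CPD-15895
TYPES - RIBOSE-5P
//
UNIQUE-ID - RIBULOSE-5P
TYPES - Compounds
//
"""

records = parse_dat(sample)
result = type_only_ids(records, referenced={"RIBOSE-5P", "RIBULOSE-5P"})
print(result)
```

Running a check like this right after loading compounds.dat, before any reaction instances are generated, would surface RIBOSE-5P at load time instead of at render time. The same function could be pointed at the MetaCyc file to see whether the entry exists there.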