Help Understanding Character-to-Issue Linking in GCD Schema

26 views
Skip to first unread message

Joerdie Driscoll

unread,
May 8, 2025, 2:23:40 PMMay 8
to gcd-tech

Hi all,
I've been importing the full GCD SQLite schema into SQL Server for a personal project, and I've run into a wall trying to extract a complete list of appearances for a given character — in this case, Charles Xavier.

I've read the GCD Django wiki and the visual schema diagram. I understand that:

  • gcd_character_name_detail contains character names, each tied to a character_id.

  • gcd_story_character.character_id refers to a gcd_character.id.

  • gcd_story_character links characters to stories, and those stories link to issues.

  • Multiple rows in gcd_character_name_detail can share the same name but have different character_ids.

  • A single character_id can appear in multiple character_name_detail rows.

All clear so far.

❓ The Problem

I'm trying to get a complete and accurate list of issues where Charles Xavier appears, including:

  • Tie-ins

  • Cameos

  • Crossovers

  • Appearances outside of "X-Men"-titled series (which is the most important part — I don't want to hardcode series filters)

Here’s what I’ve tried:

✅ Attempted queries:
  • Filtering gcd_story_character.character_id IN (808, 6269) — these two seem to be associated with Charles Xavier based on gcd_character_name_detail, but they return mostly garbage: Disney, Western Comics, etc.

  • Filtering by gcd_character_name_detail.name = 'Charles Xavier' (regardless of character_id) — still mostly junk, still missing 90% of known appearances.

  • Joining story_character → story → issue → series cleanly — same outcome.

  • Even trying all character_ids associated with name = 'Charles Xavier' — still only ~139 issues returned, many incorrect.

I expect ~1000 appearances based on manually reviewed data. These queries give me about 10% of that, and most aren't even Marvel books.

🔍 What I’m NOT Trying To Do
  • I'm not trying to limit the query to a list of "X-Men" series.

  • I'm not trying to build a curated list by hand.

  • I'm not assuming the data is bad — I’m assuming I don’t understand how the character linkage should be used.


🧠 What I Think Might Be Happening
  • gcd_character.character_id values are reused across unrelated characters with the same name.

  • There’s no canonical “mainline” Xavier that I can lock onto.

  • Contributors may create a new character_id for every context where Charles Xavier appears (e.g., different publisher, language edition, AU).

  • There may not be a way to get a full, reliable set of appearances just by querying character_id or name.


❓ My Questions
  1. Is there a canonical/primary character entry for someone like Charles Xavier? If yes, how do I find it?

  2. Is the character_id supposed to uniquely represent a single character concept across all appearances?

  3. If not, how do contributors link a name like “Charles Xavier” to the correct appearances in Spider-Man, New Mutants, etc.?

Is there any way — short of manually maintaining a character-to-issue mapping — to get reliable data for this?
Example Query(s) That Fail(s)

SELECT id, character_id, name

FROM gcd_character_name_detail

WHERE name = 'Charles Xavier'


SELECT DISTINCT

i.id AS IssueId,

i.number,

s.name AS SeriesTitle,

i.publication_date

FROM gcd_story_character sc

JOIN gcd_character_name_detail n ON sc.id = n.id

JOIN gcd_story st ON sc.story_id = st.id

JOIN gcd_issue i ON st.issue_id = i.id

JOIN gcd_series s ON i.series_id = s.id

WHERE n.name = 'Charles Xavier'

ORDER BY s.name, i.number




EXEC sp_help 'gcd_story_character'


SELECT DISTINCT

i.id AS IssueId,

i.number,

s.name AS SeriesTitle,

i.publication_date

FROM gcd_story_character sc

JOIN gcd_story st ON sc.story_id = st.id

JOIN gcd_issue i ON st.issue_id = i.id

JOIN gcd_series s ON i.series_id = s.id

WHERE sc.character_id IN (808, 6269)

ORDER BY s.name, i.number



SELECT DISTINCT

i.id AS IssueId,

i.number,

s.name AS SeriesTitle,

i.publication_date,

n.character_id

FROM gcd_character_name_detail n

JOIN gcd_story_character sc ON sc.character_id = n.character_id

JOIN gcd_story st ON sc.story_id = st.id

JOIN gcd_issue i ON st.issue_id = i.id

JOIN gcd_series s ON i.series_id = s.id

WHERE n.name = 'Charles Xavier'

ORDER BY s.name, i.number

Mike O'Regan

unread,
May 9, 2025, 2:32:03 PMMay 9
to gcd-tech
Working on a project you care about is a great opportunity to contribute missing data and links!

> gcd_story_character.character_id refers to a gcd_character.id.

gcd_story_character.character_id actually referrs to gcd_character_name_detail.id. That might be your only technical problem. (I ran into the same thing when figuring out the schema.)
Once you've identified a character (i.e. a gcd_character.id), it's easy to find all appearances:  gcd_character -> gcd_character_name_detail -> gcd_story_character -> gcd_story -> gcd_issue. The interesting part is discovering all the Charles Xavier characters. I.e. identifying every gcd_character.id you care about.

To fully achieve your goal, you're probably going to have to do at least some curation. Discovery via code is always going to be imperfect, so I would expect to need to include and exclude some characters by hand, and to adjust those tweaks a bit over time. (Thankfully, lots of the work can, indeed, be done in SQL, like excluding languages or universes you don't want.)
For instance, if you query for "Charles Xavier", one of the characters you'll get is from a reality in What If? https://www.comics.org/character/807/ .  If you don't want that one, you'll have to exclude it, either by id, by universe_id, or some other mechanism that gets you the results you want.

One way to discover characters that you care about without hand-tweaking is to walk the some of the gcd_character_relation trees: generalization of/specialization of (type 6), younger version of/older version of (type 5), alternate universe version of (type 3). If you want to find the character in other languages check out the republished as/originally published as (type 1) relation. I don't know whether the alias of/secret identity of (type 2) relation will help you in your quest or not.

You'll soon have to decide whether you want to exclude trade paperbacks, omnibuses, and the like. (Attributes in gcd_series will probably help.)

You'll find that the data has more gaps than you expect. For instance, characters appearing in much of the the 1991 X-Men series do not show up in gcd_story_character. (In GCD parlance, they are not "linked".) They only appear as free text in gcd_story.characters.

> I expect ~1000 appearances based on manually reviewed data

It looks like there are more than 500 *series*, let alone issues :-)

Enjoy your search,

Mike

Jochen G.

unread,
May 10, 2025, 6:31:22 AMMay 10
to gcd-...@googlegroups.com
gcd_story_character.character_id refers to a gcd_character_name_detail.id

gcd_story_character.character.character_id IN (808, 6269)
should do what you want

The python/django could give you pointers on how the tables are used.


Am 08.05.25 um 20:23 schrieb Joerdie Driscoll:
> Hi all,
> I've been importing the full GCD SQLite schema into SQL Server for a
> personal project, and I've run into a wall trying to extract a complete
> list of appearances for a given character — in this case, Charles Xavier.
>
> I've read theGCD Django wiki <https://github.com/GrandComicsDatabase/
> gcd-django/wiki>and the visual schema diagram. I understand that:
>
> *
>
> gcd_character_name_detailcontains character names, each tied to a
> character_id.
>
> *
>
> gcd_story_character.character_idrefers to a gcd_character.id.
>
> *
>
> gcd_story_characterlinks characters to stories, and those stories
> link to issues.
>
> *
>
> Multiple rows in gcd_character_name_detailcan share the same namebut
> have different character_ids.
>
> *
>
> A single character_idcan appear in multiple character_name_detailrows.
>
> All clear so far.
>
> ❓ The Problem
>
> I'm trying to get a complete and accurate list of issues where Charles
> Xavier appears, including:
>
> *
>
> Tie-ins
>
> *
>
> Cameos
>
> *
>
> Crossovers
>
> *
>
> Appearances outside of "X-Men"-titled series (which is the most
> important part — I don't want to hardcode series filters)
>
> Here’s what I’ve tried:
>
> ✅ Attempted queries:
>
> *
>
> Filtering gcd_story_character.character_id IN (808, 6269)— these two
> seem to be associated with Charles Xavier based on
> gcd_character_name_detail, but they return mostly garbage: Disney,
> Western Comics, etc.
>
> *
>
> Filtering by gcd_character_name_detail.name = 'Charles
> Xavier'(regardless of character_id) — still mostly junk, still
> missing 90% of known appearances.
>
> *
>
> Joining story_character → story → issue → seriescleanly — same outcome.
>
> *
>
> Even trying all character_ids associated with name = 'Charles
> Xavier'— still only ~139 issues returned, many incorrect.
>
> I expect ~1000 appearances based on manually reviewed data. These
> queries give me about 10% of that, and most aren't even Marvel books.
>
> 🔍 What I’m NOT Trying To Do
>
> *
>
> I'm nottrying to limit the query to a list of "X-Men" series.
>
> *
>
> I'm nottrying to build a curated list by hand.
>
> *
>
> I'm notassuming the data is bad — I’m assuming I don’t understand
> how the character linkage shouldbe used.
>
> ------------------------------------------------------------------------
>
> 🧠 What I ThinkMight Be Happening
>
> *
>
> gcd_character.character_idvalues are reused across unrelated
> characters with the same name.
>
> *
>
> There’s no canonical “mainline” Xavier that I can lock onto.
>
> *
>
> Contributors may create a new character_idfor every context where
> Charles Xavier appears (e.g., different publisher, language edition,
> AU).
>
> *
>
> There may not be a way to get a full, reliable set of appearances
> just by querying character_idor name.
>
> ------------------------------------------------------------------------
>
> ❓ My Questions
>
> 1.
>
> Is there a canonical/primary character entry for someone like
> Charles Xavier?If yes, how do I find it?
>
> 2.
>
> Is the character_idsupposed to uniquely represent a single character
> concept across all appearances?
>
> 3.
>
> If not, how docontributors link a name like “Charles Xavier” to the
> --
> You received this message because you are subscribed to the Google
> Groups "gcd-tech" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to gcd-tech+u...@googlegroups.com <mailto:gcd-
> tech+uns...@googlegroups.com>.
> To view this discussion visit https://groups.google.com/d/msgid/gcd-
> tech/dc8dccc8-7610-4869-8819-ee480b8a7695n%40googlegroups.com <https://
> groups.google.com/d/msgid/gcd-tech/dc8dccc8-7610-4869-8819-
> ee480b8a7695n%40googlegroups.com?utm_medium=email&utm_source=footer>.

Joerdie Driscoll

unread,
May 10, 2025, 12:10:00 PMMay 10
to gcd-tech
Thank you for this explanation. It was super helpful and I think I much further along. What's odd is how there are multiple "Charles Xavier" even in the main 616 Marvel universe. 2 that I can see are ID 6269 and 808. The below query gets me MUCH further than I was. It seems to work perfectly with 6269. After some filtering I am seeing exactly what I am looking for. But 808, no rows. I am still playing with my query to find where the fall off happens. It just seems to not be as bulletproof as I thought it would be using the data!

SELECT
-- gcd_character_name_detail d
    d.id AS d_id,
    d.character_id AS d_character_id,

-- gcd_story_character sc
    sc.id AS sc_id,
    sc.character_id AS sc_character_id,
    sc.story_id AS sc_story_id,

-- gcd_story st
    st.id AS st_id,
    st.issue_id AS st_issue_id,

-- gcd_issue i
    i.id AS i_id,
    i.number AS i_number,
    i.volume AS i_volume,
    i.series_id AS i_series_id,
    i.on_sale_date AS i_on_sale_date,

-- gcd_series s
    s.id AS s_id,
    s.name AS s_name,
    s.year_began AS s_year_began,
    s.first_issue_id AS s_first_issue_id,
    -- REPLACED: s.publisher_id,
    p.name AS publisher_name,
    -- REPLACED: s.country_id,
    s.issue_count AS s_issue_count

FROM gcd_character c
JOIN gcd_character_name_detail d ON d.character_id = c.id
JOIN gcd_story_character sc ON sc.character_id = d.id
JOIN gcd_story st ON st.id = sc.story_id
JOIN gcd_issue i ON i.id = st.issue_id
JOIN gcd_series s ON s.id = i.series_id
LEFT JOIN gcd_publisher p ON s.publisher_id = p.id
LEFT JOIN stddata_country ctry ON s.country_id = ctry.id

WHERE c.id = 6269
  AND s.language_id = 25
  AND i.variant_of_id IS NULL
  AND ctry.name IN ('United States', 'United Kingdom')
  AND s.publishing_format NOT IN ('collected edition', 'collected editions')
  AND s.binding <> 'trade paperback'
  AND s.name NOT LIKE 'True Believers:%'
  AND s.name NOT LIKE 'Toyfare: The Toy Magazine%'
  AND s.name NOT LIKE 'Official Handbook%'
ORDER BY i.publication_date;
Reply all
Reply to author
Forward
0 new messages