Crime Incidents - Duplicate DC_KEY Records


Brian Blacker

Oct 24, 2022, 12:06:31 PM
to Open Data Philly
Hello-
We are regular users of the Crime Incidents data and have been using this data for a number of years. We have always used [DC_KEY] as a unique id/primary key on this dataset without issues. However, a recent download of the full Crime Incidents dataset has ~43 DC_KEY ids with multiple records. I've attached them here as a csv.

Some of these look like they may just be duplicate records as the rest of the incident details appear to be the same - for example DC_KEY = '202235038769'. Others have incident dates that are a couple days apart. And then others seem to be quite different with incidents that appear to be in similar locations but years apart - for example DC_KEY = '201025043366'. Spot checking most of these, it seems that most are Homicides.
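
A minimal sketch of the check described above, for anyone who wants to reproduce it: group rows by key, keep the keys that occur more than once, and label each group by whether its rows are fully identical or differ in some field. The column names (`dc_key`, `dispatch_date`, `text_general_code`) and the sample rows are assumptions made up for illustration, not values from the attached file.

```python
from collections import defaultdict

def group_duplicate_keys(rows, key="dc_key"):
    """Return {key value: [rows]} for every key value that appears more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {k: g for k, g in groups.items() if len(g) > 1}

def classify(group):
    """'identical' if every row in the group matches the first on all fields, else 'differs'."""
    first = group[0]
    return "identical" if all(r == first for r in group[1:]) else "differs"

# Hypothetical sample rows; the keys and column names are placeholders.
sample = [
    {"dc_key": "K1", "dispatch_date": "2022-01-01", "text_general_code": "Homicide"},
    {"dc_key": "K1", "dispatch_date": "2022-01-01", "text_general_code": "Homicide"},
    {"dc_key": "K2", "dispatch_date": "2010-05-01", "text_general_code": "Homicide"},
    {"dc_key": "K2", "dispatch_date": "2014-08-09", "text_general_code": "Homicide"},
    {"dc_key": "K3", "dispatch_date": "2022-02-02", "text_general_code": "Theft"},
]

report = {k: classify(g) for k, g in group_duplicate_keys(sample).items()}
print(report)
```

With a real download, the rows could come from `csv.DictReader` instead of the inline list, assuming the export is a CSV with a header row.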

Thanks,
Brian Blacker
CrimeIncidents_Dup_DC_KEY.xlsx

Linn, Justin A.

Oct 24, 2022, 12:27:24 PM
to opendat...@googlegroups.com
Hi Brian:

It is interesting that you brought this up today.

This morning, while researching a double homicide, I noticed that there were two records in the dataset for the homicide. Again, this in fact was a double homicide.

I've also been a long time user of the dataset and haven't noticed this before.

Sincerely,

Justin Linn


Robert Cheetham

Oct 24, 2022, 12:30:36 PM
to opendat...@googlegroups.com
Brian,

This does indeed sound like an unusual situation. I have forwarded to some folks on the data analysis team at the Phila PD and will post here if I hear back from them.

Best,

Robert

Linn, Justin A.

Oct 24, 2022, 12:46:10 PM
to opendat...@googlegroups.com
Hello Brian and Robert:

To confirm this, I looked up the only triple homicide that I could think of, which was on 3/5/2022 at 6900 Cedar Park Ave, to see if there were three records for this incident.

There are indeed three records, but one of them has a typo in its key: there are two entries for 202214010740 and one for 202214010470.
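
The two keys above differ only by a pair of swapped adjacent digits (74 versus 47). A rough sketch of how such near-miss keys could be flagged across a list of keys follows; the function names are hypothetical, and a flagged pair is only a candidate for manual review, since two valid keys can legitimately differ this way.

```python
def transposed_neighbors(key):
    """All strings obtained from `key` by swapping one pair of unequal adjacent characters."""
    out = set()
    for i in range(len(key) - 1):
        if key[i] != key[i + 1]:
            out.add(key[:i] + key[i + 1] + key[i] + key[i + 2:])
    return out

def find_transposed_pairs(keys):
    """Return sorted pairs of distinct keys that differ by one adjacent swap."""
    seen = set(keys)
    pairs = set()
    for k in seen:
        for n in transposed_neighbors(k):
            if n in seen:
                pairs.add(tuple(sorted((k, n))))
    return sorted(pairs)

# Keys quoted in this thread; repeated values are fine.
keys = ["202214010740", "202214010740", "202214010470", "202235038769"]
print(find_transposed_pairs(keys))
```

Generating the swapped variants of each key and looking them up in a set avoids comparing every key against every other key, which matters on a full download.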

Sincerely,

Justin Linn



Robert Cheetham

Oct 28, 2022, 5:51:50 PM
to opendat...@googlegroups.com
I received a reply from the Phila PD this week. 

The Phila PD is transitioning to a new reporting standard, known as NIBRS. One of the consequences is that the DC_KEY field will no longer be unique for homicide incidents in which there are multiple victims.
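
For anyone whose pipeline relied on the key being unique, one possible workaround is sketched here: derive a per-row identifier by numbering rows within each key, and count incidents by distinct key rather than by row. This is an assumption-laden illustration, not something the Phila PD recommended. The `row_key` field and both function names are hypothetical, and the numbering depends on row order, so it is only stable if each download is sorted the same way.

```python
from collections import Counter

def add_row_key(rows, key="dc_key"):
    """Return copies of rows with a 'row_key' of the form '<key>-<n>', n counting from 1 per key."""
    counts = Counter()
    out = []
    for row in rows:
        counts[row[key]] += 1
        out.append({**row, "row_key": f"{row[key]}-{counts[row[key]]}"})
    return out

def count_incidents(rows, key="dc_key"):
    """Number of distinct incidents, treating rows that share a key as one incident."""
    return len({row[key] for row in rows})

# Hypothetical rows: K1 stands in for a two-victim incident.
rows = [{"dc_key": "K1"}, {"dc_key": "K1"}, {"dc_key": "K2"}]
keyed = add_row_key(rows)
print([r["row_key"] for r in keyed], count_incidents(rows))
```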

Robert
 

---------- Forwarded message ---------

Brian,

 

Due to our transition to NIBRS, DC Numbers are no longer a primary key; multiple victim homicide incidents will have multiple, duplicated DC numbers.

 

Michael Urciuoli

GIS Specialist

215.897.0813

michael....@phila.gov

Delaware Valley Intelligence Center (DVIC)

2810 S 20th St

Building #6

Philadelphia, PA 19145

 

From: Robert Cheetham <chee...@azavea.com>
Sent: Monday, October 24, 2022 12:12 PM
To: Michael Urciuoli <Michael....@phila.gov>; Kevin Thomas <Kevin....@Phila.gov>
Subject: Fwd: [OpenDataPhilly] Crime Incidents - Duplicate DC_KEY Records

 



Another data issue report, this time with the Incidents data set.


Best,


Robert

 

--

Alex Arafat

Dec 26, 2024, 9:56:03 AM
to Open Data Philly

It sounds like the recent discrepancies in the Crime Incidents dataset with duplicate or varying records under the same DC_KEY could be due to a mix of data entry errors or updates to older records. For example, cases like DC_KEY '201025043366' showing years-apart incidents in similar locations might hint at historical record adjustments or merging of related cases.

I came across a helpful resource while researching similar issues in crime datasets. Websites like https://emilyandblair.com/webcrims-nassau/ often provide insights into tracking and managing criminal case records, which could be beneficial for understanding such anomalies in datasets.
