2  Data Description

We analyze four datasets collected from multiple sources: Hate Crime by County and Bias Type, Adult Arrests (18 and older) by County, Index Crimes by County and Agency, and Jail Population by County. Through these lenses, we seek to unravel the complexities of crime, justice, and societal bias in the state of New York.

2.1 Technical Description

2.1.1 Data Dimensions

Index Crimes by County and Agency:

The crime data is categorized into violent and property crimes, providing insights into different aspects of criminal activity. The “Region” column classifies counties into either New York City or Non-New York City, allowing for regional comparisons.

Dimension Count
No of Observations 22048
No of Variables 15

Adult Arrests (18 and older) by County:

The dataset consists of arrest data categorized into felonies and misdemeanors, detailing counts for various offense types, including drug-related offenses and Driving While Intoxicated (DWI). It spans multiple years and provides a comprehensive view of arrests across these categories in each county. Each row represents a specific county and year combination, and the columns contain the arrest counts for each category.

Dimension Count
No of Observations 3286
No of Variables 13

Hate Crime by County and Bias Type:

The Hate Crime by County and Bias Type dataset includes information on hate crime incidents reported across 62 unique counties in New York from 2010 to 2022. It covers a range of attributes, such as crime type, race, color, national origin, ancestry, gender, religion, age, disability, and sexual orientation, providing detailed insights into the characteristics of reported incidents. The dataset records total incidents, victims, and offenders, offering a comprehensive view of hate crimes and bias types in the specified regions over the given period.

Dimension Count
No of Observations 822
No of Variables 44

Jail Population by County:

The Jail Population by County dataset spans 67 unique facility entries (the Facility Name (ORI) field, which is why the count exceeds New York's 62 counties) from 1997 to 2022, providing insights into the average daily census, boarding status, and inmate categorization. Attributes like Sentenced, Civil, Federal, Technical Parole Violators, State Readies, and Other Unsentenced describe the different inmate populations and their statuses within county facilities. This dataset supports an analysis of how correctional populations have changed over the years.

Dimension Count
No of Observations 1723
No of Variables 12

2.1.2 Data Frequency

Each dataset, including Hate Crime by County and Bias Type, Adult Arrests by County, Index Crimes by County and Agency, and Jail Population by County, is updated on a yearly basis, providing an annual snapshot of the respective categories. This consistent yearly frequency enables a longitudinal analysis, offering insights into the trends and variations in these datasets over time by county.

2.1.3 Data Format

Note: For details on the format of each dataset, refer to the metadata at the following link: Data set Description

2.2 Research Plan

Regional Disparities: Understanding the significant variations in crime rates across different counties and regions in New York.

Crime Type Analysis: Identifying which crime types show notable trends or fluctuations. Understanding the prevalence of different offenses, such as violent crimes, property crimes, or specific criminal activities, is crucial for targeted intervention.

Law Enforcement Effectiveness: How effective have law enforcement agencies been in curbing specific types of crimes? This draws on the felony and misdemeanor breakdowns, including the type of felony (Drug, Violent, DWI, Other) and misdemeanor.

Jail Population Analysis: Examining any correlations involving the Sentenced, Civil, Federal, Technical Parole Violators, and State Readies categories.

Hate Crimes: Includes analysis of demographics, religion, ethnicity, sexual orientation, disability, incidents, and victims.

Impact of Societal Changes: Are there correlations between crime rates and broader societal changes, economic conditions, or demographic shifts?

2.3 Our Consideration of Period

Now, let’s dive into the nitty-gritty of our project. Imagine a time machine whisking us through the years from 1970 to 2022, exploring how things have changed on the crime scene. Picture this: we’re in 1996, peeking into why adults are getting into trouble and landing in jail more often. Fast forward to 2022, and we’re mixing and matching different time periods to uncover how adult arrests and jail populations have evolved.

But our journey doesn’t stop there: we’re also digging into how hate crimes fit into the whole puzzle, especially during the years when the arrest and crime data overlap. No fancy jargon, just a look at how things have shifted over the years. So, join us as we untangle this web of data, piece by piece!
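The overlapping periods can be worked out from the year ranges reported in the dataset summaries in Section 2.4.1 (Adult Arrests 1970 to 2022, Index Crimes 1990 to 2022, Jail Population 1997 to 2022, Hate Crime 2010 to 2022). The following is a minimal Python sketch of that calculation; the dictionary keys and the helper name `common_period` are illustrative assumptions, not part of the original analysis.

```python
# Year ranges (min, max) as reported by the dataset summaries in Section 2.4.1.
# The short keys are illustrative labels, not names used in the source data.
PERIODS = {
    "adult_arrests": (1970, 2022),
    "index_crimes": (1990, 2022),
    "jail_population": (1997, 2022),
    "hate_crime": (2010, 2022),
}


def common_period(names):
    """Return the (start, end) years shared by the named datasets, or None."""
    start = max(PERIODS[n][0] for n in names)
    end = min(PERIODS[n][1] for n in names)
    return (start, end) if start <= end else None


arrests_jail = common_period(["adult_arrests", "jail_population"])
all_four = common_period(list(PERIODS))
print(arrests_jail)  # (1997, 2022)
print(all_four)      # (2010, 2022)
```

Under these ranges, any comparison that includes hate crimes is limited to 2010 onward, while arrests and jail population can be compared from 1997.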

2.4 Missing Value Analysis

The major issue we faced was missing values:

In Index Crimes by County and Agency, some observations lack the months-reported data (the Months.Reported field). We address this by excluding those observations.

2.4.1 Handling the missing values

We drop observations with missing values. This keeps the data handling transparent and leaves a dataset without gaps, which preserves the integrity of the temporal and categorical dimensions of the data for the analyses that follow.
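As an illustration of dropping observations with missing values, here is a minimal Python sketch using only the standard library. The sample rows, the agency names, and the helper `drop_missing` are made up for this example, and it assumes missing entries appear as an empty field or the string "NA"; it is not the code used in the original analysis.

```python
import csv
import io

# Toy rows in the shape of Index Crimes by County and Agency. All values are
# invented for illustration; "NA" or an empty field marks a missing entry.
SAMPLE = """County,Agency,Year,Months.Reported,Index.Total
Albany,Agency A,1990,NA,120
Albany,Agency A,2000,12,95
Erie,Agency B,1995,,40
Erie,Agency B,2005,12,33
"""

MISSING = {"", "NA"}


def drop_missing(rows, column):
    """Keep only rows whose value in `column` is present (listwise deletion)."""
    return [r for r in rows if r[column].strip() not in MISSING]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = drop_missing(rows, "Months.Reported")
print(len(rows), len(kept))  # 4 2
```

Dropping whole rows keeps every remaining observation complete, at the cost of losing the other fields recorded on the excluded rows.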

2.4.1.1 Jail Population by County

Character column: Facility.Name..ORI. (length 1723).

| Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| Year | 1997 | 2003 | 2010 | 2010 | 2016 | 2022 |
| Census | 0.0 | 66.0 | 123.0 | 480.5 | 330.0 | 17016.0 |
| Boarded.Out | 0.00 | 0.00 | 1.00 | 11.24 | 5.00 | 612.00 |
| Boarded.In | 0.00 | 0.00 | 1.00 | 11.16 | 7.00 | 619.00 |
| In.House.Census | 0.0 | 66.5 | 127.0 | 480.4 | 329.0 | 17020.0 |
| Sentenced | 0.0 | 19.0 | 34.0 | 130.6 | 76.0 | 5436.0 |
| Civil | 0.000 | 0.000 | 0.000 | 2.449 | 2.000 | 109.000 |
| Federal | 0.00 | 0.00 | 1.00 | 30.12 | 18.00 | 1344.00 |
| Technical.Parole.Violators | 0.00 | 2.00 | 6.00 | 28.83 | 17.00 | 1275.00 |
| State.Readies | 0.00 | 1.00 | 2.00 | 10.75 | 6.00 | 1288.00 |
| Other.Unsentenced | -45.0 | 31.0 | 64.0 | 277.6 | 173.0 | 10154.0 |

2.4.1.2 Adult Arrests (18 and older) by County

Character column: County (length 3286).

| Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| Year | 1970 | 1983 | 1996 | 1996 | 2009 | 2022 |
| Total | 22.0 | 867.2 | 1516.0 | 6572.3 | 4198.2 | 107786.0 |
| Felony.Total | 6 | 191 | 368 | 2250 | 1102 | 44632 |
| Drug.Felony | 0.0 | 19.0 | 49.5 | 492.8 | 183.8 | 17442.0 |
| Violent.Felony | 0.0 | 39.0 | 77.0 | 688.2 | 264.8 | 16217.0 |
| DWI.Felony | 0.00 | 17.00 | 37.00 | 66.91 | 76.00 | 613.00 |
| Other.Felony | 1.0 | 104.0 | 197.5 | 1002.5 | 579.8 | 15467.0 |
| Misdemeanor.Total | 7 | 662 | 1180 | 4322 | 3034 | 73365 |
| Drug.Misdemeanor | 0.0 | 25.0 | 72.0 | 859.7 | 252.8 | 29471.0 |
| DWI.Misdemeanor | 0.0 | 178.0 | 328.0 | 616.4 | 666.0 | 8954.0 |
| Property.Misdemeanor | 0.0 | 154.0 | 322.0 | 1317.2 | 836.8 | 33334.0 |
| Other.Misdemeanor | 1 | 241 | 454 | 1529 | 1167 | 24875 |

2.4.1.3 Hate Crime by County and Bias Type

Character columns: County (length 822), Crime.Type (length 822).

| Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| Year | 2010 | 2013 | 2016 | 2016 | 2020 | 2022 |
| Anti.Male | 0.000000 | 0.000000 | 0.000000 | 0.006083 | 0.000000 | 1.000000 |
| Anti.Female | 0.0000 | 0.0000 | 0.0000 | 0.0219 | 0.0000 | 6.0000 |
| Anti.Transgender | 0.0000 | 0.0000 | 0.0000 | 0.1436 | 0.0000 | 8.0000 |
| Anti.Gender.Non.Conforming | 0.00000 | 0.00000 | 0.00000 | 0.03893 | 0.00000 | 3.00000 |
| Anti.Age. | 0.0000 | 0.0000 | 0.0000 | 0.0365 | 0.0000 | 9.0000 |
| Anti.White | 0.000 | 0.000 | 0.000 | 0.365 | 0.000 | 16.000 |
| Anti.Black | 0.000 | 0.000 | 1.000 | 1.765 | 2.000 | 18.000 |
| Anti.American.Indian.Alaskan.Native | 0.000000 | 0.000000 | 0.000000 | 0.006083 | 0.000000 | 1.000000 |
| Anti.Asian | 0.0000 | 0.0000 | 0.0000 | 0.4501 | 0.0000 | 68.0000 |
| Anti.Native.Hawaiian.Pacific.Islander | 0 | 0 | 0 | 0 | 0 | 0 |
| Anti.Multi.Racial.Groups | 0.00000 | 0.00000 | 0.00000 | 0.07908 | 0.00000 | 4.00000 |
| Anti.Other.Race | 0 | 0 | 0 | 0 | 0 | 0 |
| Anti.Jewish | 0.000 | 0.000 | 0.000 | 4.062 | 3.000 | 90.000 |
| Anti.Catholic | 0.0000 | 0.0000 | 0.0000 | 0.2092 | 0.0000 | 12.0000 |
| Anti.Protestant | 0.00000 | 0.00000 | 0.00000 | 0.01582 | 0.00000 | 1.00000 |
| Anti.Islamic..Muslim. | 0.000 | 0.000 | 0.000 | 0.382 | 0.000 | 10.000 |
| Anti.Multi.Religious.Groups | 0.00000 | 0.00000 | 0.00000 | 0.05475 | 0.00000 | 10.00000 |
| Anti.Atheism.Agnosticism | 0 | 0 | 0 | 0 | 0 | 0 |
| Anti.Religious.Practice.Generally | 0.000000 | 0.000000 | 0.000000 | 0.009732 | 0.000000 | 2.000000 |
| Anti.Other.Religion | 0.00000 | 0.00000 | 0.00000 | 0.07056 | 0.00000 | 4.00000 |
| Anti.Buddhist | 0.00000 | 0.00000 | 0.00000 | 0.00365 | 0.00000 | 1.00000 |
| Anti.Eastern.Orthodox..Greek..Russian..etc.. | 0.00000 | 0.00000 | 0.00000 | 0.00365 | 0.00000 | 1.00000 |
| Anti.Hindu | 0.00000 | 0.00000 | 0.00000 | 0.01095 | 0.00000 | 3.00000 |
| Anti.Jehovahs.Witness | 0.00000 | 0.00000 | 0.00000 | 0.00365 | 0.00000 | 1.00000 |
| Anti.Mormon | 0.000000 | 0.000000 | 0.000000 | 0.001216 | 0.000000 | 1.000000 |
| Anti.Other.Christian | 0.00000 | 0.00000 | 0.00000 | 0.02433 | 0.00000 | 4.00000 |
| Anti.Sikh | 0.000000 | 0.000000 | 0.000000 | 0.006083 | 0.000000 | 3.000000 |
| Anti.Hispanic | 0.0000 | 0.0000 | 0.0000 | 0.3151 | 0.0000 | 17.0000 |
| Anti.Arab | 0.00000 | 0.00000 | 0.00000 | 0.08151 | 0.00000 | 4.00000 |
| Anti.Other.Ethnicity.National.Origin | 0.0000 | 0.0000 | 0.0000 | 0.3078 | 0.0000 | 21.0000 |
| Anti.Non.Hispanic. | 0 | 0 | 0 | 0 | 0 | 0 |
| Anti.Gay.Male | 0.00 | 0.00 | 0.00 | 1.26 | 1.00 | 36.00 |
| Anti.Gay.Female | 0.0000 | 0.0000 | 0.0000 | 0.1764 | 0.0000 | 8.0000 |
| Anti.Gay..Male.and.Female. | 0.0000 | 0.0000 | 0.0000 | 0.1095 | 0.0000 | 6.0000 |
| Anti.Heterosexual | 0.000000 | 0.000000 | 0.000000 | 0.001216 | 0.000000 | 1.000000 |
| Anti.Bisexual | 0.000000 | 0.000000 | 0.000000 | 0.004866 | 0.000000 | 2.000000 |
| Anti.Physical.Disability | 0.00000 | 0.00000 | 0.00000 | 0.01217 | 0.00000 | 1.00000 |
| Anti.Mental.Disability | 0.000000 | 0.000000 | 0.000000 | 0.006083 | 0.000000 | 1.000000 |
| Total.Incidents | 1.00 | 1.00 | 3.00 | 10.04 | 9.00 | 148.00 |
| Total.Victims | 1.00 | 1.00 | 3.00 | 10.41 | 10.00 | 148.00 |
| Total.Offenders | 1.00 | 1.00 | 3.00 | 11.33 | 11.00 | 160.00 |

2.4.1.4 Index Crimes by County and Agency

Character columns: County (length 22048), Agency (length 22048), Region (length 22048).

| Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | NA's |
|---|---|---|---|---|---|---|---|
| Year | 1990 | 1997 | 2006 | 2006 | 2014 | 2022 | 0 |
| Months.Reported | 1.0 | 12.0 | 12.0 | 11.9 | 12.0 | 12.0 | 9557 |
| Index.Total | 1 | 33 | 139 | 1282 | 492 | 217786 | 0 |
| Violent.Total | 0.0 | 2.0 | 10.0 | 201.1 | 40.0 | 63087.0 | 0 |
| Murder | 0.000 | 0.000 | 0.000 | 2.099 | 0.000 | 786.000 | 0 |
| Rape | 0.000 | 0.000 | 1.000 | 9.987 | 4.000 | 1159.000 | 0 |
| Robbery | 0.00 | 0.00 | 1.00 | 79.26 | 5.00 | 36341.00 | 0 |
| Aggravated.Assault | 0.0 | 1.0 | 6.0 | 109.8 | 27.0 | 24828.0 | 0 |
| Property.Total | 0 | 29 | 127 | 1081 | 445 | 173352 | 0 |
| Burglary | 0.0 | 4.0 | 20.0 | 199.9 | 84.0 | 39041.0 | 0 |
| Larceny | 0.0 | 22.0 | 99.0 | 769.5 | 336.0 | 122704.0 | 0 |
| Motor.Vehicle.Theft | 0.0 | 0.0 | 4.0 | 111.2 | 16.0 | 50300.0 | 0 |
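To put the missing-value count in context, the short Python sketch below works out how much of this dataset is affected, using the 22048 observations and the 9557 NA's reported for Months.Reported in the summary above. It assumes that every row with a missing Months.Reported value is dropped, as described in Section 2.4.1; the variable names are illustrative.

```python
total_rows = 22048      # observations reported in the summary above
missing_months = 9557   # NA's reported for Months.Reported

# Assumption: every row with a missing Months.Reported value is dropped.
remaining_rows = total_rows - missing_months
missing_share = missing_months / total_rows

print(remaining_rows)          # 12491
print(f"{missing_share:.1%}")  # 43.3%
```

Under that assumption, roughly 43% of the observations are excluded, which is worth keeping in mind when comparing index crime counts across years or agencies.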

2.4.1.5 100% Non Missing Values

2.4.1.6 Index Crimes by County and Agency