Late to the party: the str and summary functions in R

I admit I am very late to the party with these functions, but after dealing with enough messy data sets I now run them before any analysis. They help you understand the underlying structure of your data. Below I use a data set of NFL stats to show how easy they are to use and what the output looks like.

The str and summary functions

# Read the data into R
Nfl <- read.csv("PtsLM.csv")
# Explore your data. The str function tells you the type of each variable,
# which is handy when you can't figure out why a variable is
# giving you a hard time.
str(Nfl)
## 'data.frame':    64 obs. of  13 variables:
##  $ Year        : num  2012 2011 2012 2011 2012 ...
##  $ Number      : num  1 1 2 2 3 3 4 4 5 5 ...
##  $ Team        : Factor w/ 32 levels "Arizona Cardinals",..: 1 1 2 2 3 3 4 4 5 5 ...
##  $ G           : num  16 16 16 16 16 16 16 16 16 16 ...
##  $ Pts.G       : num  15.6 19.5 26.2 25.1 24.9 23.6 21.5 23.2 22.3 25.4 ...
##  $ TotPts      : num  250 312 419 402 398 378 344 372 357 406 ...
##  $ Yds.G       : num  263 324 369 377 352 ...
##  $ Yds.P       : num  4.1 5.2 5.8 5.6 5.4 5.2 5.6 5.7 5.8 6.2 ...
##  $ X1st.G      : num  15.4 17.9 21.4 21.8 19.6 19.5 18.8 19.6 20.5 21.6 ...
##  $ PassingAvg  : num  5.6 7.2 7.7 7.3 7.1 6.7 6.7 6.7 8 7.9 ...
##  $ PassYds.G   : num  188 223 282 262 234 ...
##  $ RusingAvg   : num  3.4 4.2 3.7 4 4.3 4.3 5 4.9 4.5 5.4 ...
##  $ RushingYds.G: num  75.2 101.6 87.3 114.6 118.8 ...
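
To show the kind of "hard time" str helps diagnose, here is a minimal sketch using a made-up data frame rather than the NFL file. The `raw` data frame and its "n/a" entry are assumptions for illustration only. A single stray text value is enough to make a numeric column load as character, and str makes that visible right away.

```r
# Hypothetical toy data (not from PtsLM.csv): one "n/a" entry forces
# the whole Pts.G column to be stored as text instead of numbers.
raw <- data.frame(
  Team  = c("A", "B", "C"),
  Pts.G = c("15.6", "n/a", "26.2"),
  stringsAsFactors = FALSE
)

# str reports Pts.G as chr, which explains why mean(raw$Pts.G) would fail.
str(raw)

# Convert to numeric; the "n/a" entry becomes NA. The coercion warning
# is suppressed here because the NA is expected.
raw$Pts.G <- suppressWarnings(as.numeric(raw$Pts.G))

# Now str reports num, and numeric functions work once NAs are skipped.
str(raw)
mean(raw$Pts.G, na.rm = TRUE)
```

Running str again after the conversion is a cheap way to confirm the fix took effect.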

# The names function provides you with column/variable names.
names(Nfl)
##  [1] "Year"         "Number"       "Team"         "G"           
##  [5] "Pts.G"        "TotPts"       "Yds.G"        "Yds.P"       
##  [9] "X1st.G"       "PassingAvg"   "PassYds.G"    "RusingAvg"   
## [13] "RushingYds.G"
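
names also works on the left side of an assignment, so it can be used to rename a column. The sketch below is only an illustration on a made-up two-column data frame (`toy` is hypothetical); it renames a column called "RusingAvg", the spelling that appears in the output above, to "RushingAvg".

```r
# Hypothetical toy data frame that reuses one column name from the output above.
toy <- data.frame(PassingAvg = c(5.6, 7.2), RusingAvg = c(3.4, 4.2))

# Find the column by its current name and assign a new name to that position.
names(toy)[names(toy) == "RusingAvg"] <- "RushingAvg"

# Check the result.
names(toy)
```

Matching on the name rather than a position number means the line still does the right thing if the column order changes.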
# The summary function gives a summary of every variable in the data
# frame, numeric or not: quartiles and the mean for numeric variables,
# and counts per level for factors.
summary(Nfl)
##       Year          Number                     Team          G     
##  Min.   :2011   Min.   : 1.00   Arizona Cardinals: 2   Min.   :16  
##  1st Qu.:2011   1st Qu.: 8.75   Atlanta Falcons  : 2   1st Qu.:16  
##  Median :2012   Median :16.50   Baltimore Ravens : 2   Median :16  
##  Mean   :2012   Mean   :16.50   Buffalo Bills    : 2   Mean   :16  
##  3rd Qu.:2012   3rd Qu.:24.25   Carolina Panthers: 2   3rd Qu.:16  
##  Max.   :2012   Max.   :32.00   Chicago Bears    : 2   Max.   :16  
##                                 (Other)          :52               
##      Pts.G          TotPts        Yds.G         Yds.P          X1st.G    
##  Min.   :12.1   Min.   :193   Min.   :259   Min.   :4.10   Min.   :15.4  
##  1st Qu.:19.2   1st Qu.:307   1st Qu.:314   1st Qu.:5.00   1st Qu.:17.9  
##  Median :22.8   Median :364   Median :343   Median :5.35   Median :19.3  
##  Mean   :22.5   Mean   :360   Mean   :347   Mean   :5.42   Mean   :19.6  
##  3rd Qu.:24.9   3rd Qu.:399   3rd Qu.:375   3rd Qu.:5.80   3rd Qu.:21.3  
##  Max.   :35.0   Max.   :560   Max.   :467   Max.   :6.70   Max.   :27.8  
##                                                                          
##    PassingAvg     PassYds.G     RusingAvg     RushingYds.G  
##  Min.   :5.40   Min.   :136   Min.   :3.40   Min.   : 75.2  
##  1st Qu.:6.60   1st Qu.:194   1st Qu.:3.90   1st Qu.:100.5  
##  Median :7.05   Median :226   Median :4.20   Median :114.5  
##  Mean   :7.13   Mean   :230   Mean   :4.25   Mean   :116.5  
##  3rd Qu.:7.72   3rd Qu.:256   3rd Qu.:4.50   3rd Qu.:128.5  
##  Max.   :9.30   Max.   :334   Max.   :5.40   Max.   :169.3  
## 
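
summary can also be called on a single column, and it reports missing values when there are any. A minimal sketch on made-up data follows (the `toy` data frame is an assumption, not the NFL file).

```r
# Hypothetical toy data: a factor column and a numeric column with one NA.
toy <- data.frame(
  Team  = factor(c("A", "A", "B", "C")),
  Pts.G = c(15.6, 19.5, NA, 25.1)
)

# For a numeric column, summary gives min, quartiles, median, mean, max,
# plus an "NA's" entry counting the missing values.
num_summary <- summary(toy$Pts.G)
num_summary

# For a factor, summary gives the number of rows in each level.
team_counts <- summary(toy$Team)
team_counts
```

The NA count is the part I find most useful on messy data, since it flags columns that need cleaning before any modeling.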