We will use the kaggleCarAuction.csv dataset,
which comes from a Kaggle competition.
https://jhudatascience.org/intro_to_R_class/data/kaggleCarAuction.csv
cars <- jhur::read_kaggle()
head(cars)
# A tibble: 6 x 34
  RefId IsBadBuy PurchDate Auction VehYear VehicleAge Make  Model Trim  SubModel
  <dbl>    <dbl> <chr>     <chr>     <dbl>      <dbl> <chr> <chr> <chr> <chr>
1     1        0 12/7/2009 ADESA      2006          3 MAZDA MAZD… i     4D SEDA…
2     2        0 12/7/2009 ADESA      2004          5 DODGE 1500… ST    QUAD CA…
3     3        0 12/7/2009 ADESA      2005          4 DODGE STRA… SXT   4D SEDA…
4     4        0 12/7/2009 ADESA      2004          5 DODGE NEON  SXT   4D SEDAN
5     5        0 12/7/2009 ADESA      2005          4 FORD  FOCUS ZX3   2D COUP…
6     6        0 12/7/2009 ADESA      2004          5 MITS… GALA… ES    4D SEDA…
# … with 24 more variables: Color <chr>, Transmission <chr>, WheelTypeID <chr>,
# WheelType <chr>, VehOdo <dbl>, Nationality <chr>, Size <chr>,
# TopThreeAmericanName <chr>, MMRAcquisitionAuctionAveragePrice <chr>,
# MMRAcquisitionAuctionCleanPrice <chr>,
# MMRAcquisitionRetailAveragePrice <chr>,
# MMRAcquisitonRetailCleanPrice <chr>, MMRCurrentAuctionAveragePrice <chr>,
# MMRCurrentAuctionCleanPrice <chr>, MMRCurrentRetailAveragePrice <chr>,
# MMRCurrentRetailCleanPrice <chr>, PRIMEUNIT <chr>, AUCGUART <chr>,
# BYRNO <dbl>, VNZIP1 <dbl>, VNST <chr>, VehBCost <dbl>, IsOnlineSale <dbl>,
# WarrantyCost <dbl>
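
If the jhur package is not installed, the same file can probably be read straight from the URL given above. The following is a minimal sketch, assuming the readr package is available and the URL is still reachable; the names kaggle_url and cars_url are made up for this example and are not part of the course code. Column types are guessed by readr and may not match the output shown above exactly.

```r
# Sketch: read the CSV directly from the course URL with readr
# (assumes readr is installed and the URL is reachable).
library(readr)

# URL as given above
kaggle_url <- "https://jhudatascience.org/intro_to_R_class/data/kaggleCarAuction.csv"

# cars_url is a hypothetical name, chosen so it does not overwrite `cars`
cars_url <- read_csv(kaggle_url)

# Check the dimensions and the first few rows
dim(cars_url)
head(cars_url)
```

Comparing dim(cars_url) with dim(cars) is a quick way to check that both approaches loaded the same data.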