Using Data Analysis to Power the Car-Buying Process
Last year, I researched and purchased my first vehicle. It was a long process that taught me a lot about the used car market. Through my research, I also learned how to use a powerful analytical tool in my notes app to back up my decisions with data. What I learned reduced the anxiety of browsing through endless listings on many sites, not knowing which options were good deals. This was incredibly valuable in taking some of the edge off an inherently stressful decision.
My main criteria for purchasing a car were reliability and utility, with fuel economy a close third. To me, reliability meant that I wouldn’t need to drop lots of cash on expensive repairs within the first three years of owning the vehicle. I also wanted a car that could carry my family and our gear, with my mountain bike being the main item I had in mind. Lastly, fuel economy was important because it would save me money in the long run.
All in all, my search took about a month and a half. I learned a lot along the way, and discovered some interesting ways to use data to gain insights into the car-comparison process. In this article, I’ll walk through some of the resources I used and metrics I leveraged to make my purchase decision.
Researching the available options
My budget changed as I searched for used cars because many of those available had poor reliability ratings or were overpriced. The most valuable resource I had for discovering that information was Consumer Reports. They have comprehensive reliability data on every major make and model on the market. Using that information, I was able to quickly narrow my search to a set of less risky purchases, saving me time and, hopefully, money on repairs down the road.
The two things that I found most useful from Consumer Reports were their reliability rating for each car and the market price range. The reliability rating was summarized with a single number, making it easier to compare different models of the same body type, though they also provide much more detailed information about the points that make up the rating. This helped me quickly identify which models carried the lowest long-term risk. The market price range for each vehicle was incredibly helpful in gauging how reasonable online listings were, and gave me a good idea where to start negotiations when it came time to purchase.
Another helpful resource was carcomplaints.com. Car Complaints compiles publicly available records on reported issues for cars sold in the United States. They have an all-time leaderboard of best and worst vehicles based on National Highway Traffic Safety Administration (NHTSA) complaints. The top three worst cars are the 2002 Ford Explorer, 2003 Honda Accord, and 2019 Toyota RAV4. It’s not surprising to see such old cars at the top of the list, but the Toyota is a surprise because of the brand’s reputation for reliability. While Car Complaints wasn’t my primary resource for reliability statistics, it was a helpful way to confirm what other sources were already telling me, and it provided information on the specific problems being reported for different vehicles.
Tracking and evaluating candidates
Instead of using a spreadsheet, I used the note-taking software Obsidian to track my candidate car purchases. While I certainly could have done the analysis in Excel or Google Sheets, Obsidian’s portability meant that I could edit and view my data on the road more conveniently. This proved useful on-site during the negotiation process.
Obsidian recently added a new feature called Bases, which gives you a way to list and summarize your notes much like a database. Each note becomes a row in a table, and you can do calculations on each file’s metadata to create metrics from basic data points. It’s similar to what you can do in a spreadsheet. Obsidian Bases was powerful enough to provide interesting insights into my candidate vehicles. I’ll go into detail about how I used Obsidian in another article, but regardless of the tool, the ability to list, filter, and summarize information about each vehicle was incredibly helpful during the search.
What I learned about car shopping is that inventory turns over fast. I found that many of the better candidates at the start of my search were gone by the time I was ready to purchase. Some people might say that’s because I’m pokey, but I prefer to think that I’m thorough. At any rate, it meant that I had to be flexible in considering my options, knowing that opportunities come and go quickly.
For each car I wanted to seriously consider, I noted the following details:
- A link to the listing
- Make, model, and year
- The last odometer reading
- Gas mileage
- Asking price
- VIN
I also recorded several numbers from Consumer Reports:
- The minimum and maximum market price
- The reliability score
- Whether the particular model year had the “Consumer Reports Recommended” indication
Finally, if a vehicle history report was available (CARFAX, etc.), I checked it for the following:
- Whether the car was in an accident
- Whether the car was used as a commercial vehicle
- Whether the car had any major or repeated repairs noted
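For readers who prefer code to a notes app, the details listed above could be captured as one record per vehicle. The following is a minimal Python sketch, not the Obsidian setup described in this article; the class name, field names, and types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """One candidate vehicle. All names here are illustrative."""

    # Details from the listing
    listing_url: str
    make: str
    model: str
    year: int
    odometer_miles: int
    mpg: float
    asking_price: float
    vin: str

    # Numbers from Consumer Reports
    market_price_min: float
    market_price_max: float
    reliability_score: float
    cr_recommended: bool

    # Vehicle history report; None means no report was available
    had_accident: Optional[bool] = None
    commercial_use: Optional[bool] = None
    major_or_repeated_repairs: Optional[bool] = None
```

Defaulting the history fields to None keeps "no report available" distinct from "report says no".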
All in all, I collected information on about 30 different vehicles. To help with the evaluation process, I created several metrics.
First, I created a simple indicator to show whether the asking price was above the market price range given by Consumer Reports. This helped me filter out options that would be hard to get a good deal on. It also helped me identify vehicles priced so far below the range that they might be too good to be true.
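A price indicator like this can be sketched in a few lines of Python. This is only an illustration: the function name, the three labels, and the dollar figures in the usage lines are made up, and the exact boundaries are an assumption.

```python
def price_position(asking_price: float, market_min: float, market_max: float) -> str:
    """Classify an asking price against a market price range."""
    if market_min > market_max:
        raise ValueError("market_min must not exceed market_max")
    if asking_price > market_max:
        return "above range"   # likely hard to get a good deal on
    if asking_price < market_min:
        return "below range"   # possibly too good to be true
    return "within range"


# Hypothetical listings against a hypothetical $17,000 to $20,000 range
print(price_position(21_500, 17_000, 20_000))  # above range
print(price_position(15_000, 17_000, 20_000))  # below range
print(price_position(18_200, 17_000, 20_000))  # within range
```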
Second, I looked up the average yearly mileage for a car in Idaho and multiplied it by each car’s age to get an expected mileage, which I compared to the car’s actual mileage. This gave me an indication of how the vehicle was used. If the miles were much higher than expected, then it might be a concern, especially if the vehicle was driven commercially. Without more detailed information, this number served as a directional indicator, but wasn’t necessarily a deal-breaker. Importantly, I used this information during negotiation for a car that had nearly 30% more miles than expected for its age.
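As a rough sketch, the mileage comparison comes down to one division. The function name below is hypothetical, and the 12,000 miles per year in the usage line is a placeholder, not the actual Idaho average; substitute whatever figure applies to your area.

```python
def mileage_vs_expected(
    odometer_miles: float,
    model_year: int,
    current_year: int,
    avg_miles_per_year: float,
) -> float:
    """Return how far actual mileage is above (+) or below (-) expected mileage,
    as a fraction of expected mileage."""
    # Treat a car from the current model year as one year old to avoid dividing by zero.
    age_years = max(current_year - model_year, 1)
    expected_miles = age_years * avg_miles_per_year
    return odometer_miles / expected_miles - 1.0


# Hypothetical: a 5-year-old car with 78,000 miles, assuming 12,000 miles per year
excess = mileage_vs_expected(78_000, 2020, 2025, 12_000)
print(f"{excess:.0%}")  # 30%
```

A positive result means more miles than expected for the car’s age; a negative one means fewer.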
Next, I estimated the remaining useful mileage on each vehicle, and created a metric, price per remaining useful mile, or PRM: the asking price divided by the miles the car has left. For example, my research indicated that an SUV from a reliable brand can be expected to reach close to 200,000 miles before costly repeated repairs outweigh the value of continuing to own the vehicle. Of course, there are many variables that go into how long a vehicle runs, but this was a good ballpark estimate, useful for comparing similar years and body types.
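In code, PRM is the asking price divided by the miles left before the estimated end of useful life. Here is a minimal Python sketch; the function name and the sample price and odometer reading are hypothetical, and the 200,000-mile default is the ballpark mentioned above, not a fixed rule.

```python
from typing import Optional


def price_per_remaining_mile(
    asking_price: float,
    odometer_miles: float,
    expected_lifetime_miles: float = 200_000,
) -> Optional[float]:
    """Return dollars per remaining useful mile, or None if no useful miles remain."""
    remaining_miles = expected_lifetime_miles - odometer_miles
    if remaining_miles <= 0:
        return None  # already past the estimated useful life
    return asking_price / remaining_miles


# Hypothetical: $18,000 asking price with 80,000 miles on the odometer
prm = price_per_remaining_mile(18_000, 80_000)
print(f"${prm:.2f} per remaining mile")  # $0.15 per remaining mile
```

Lower is better, and the number is most meaningful when comparing vehicles of similar age and body type.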
I also created an adjusted PRM based on how each vehicle’s reliability score compared with the average score for all vehicles in my comparison pool. Vehicles with higher-than-average reliability counted as having more usable miles, and vice versa. This way, I was able to factor a vehicle’s expected reliability into the price as a measure of how many miles I could expect to drive it without major issues. Again, this isn’t a concrete prediction, but rather a ballpark estimate that helps compare similar vehicles.
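There are many ways to make this kind of adjustment. The Python sketch below shows one simple possibility, which is an assumption rather than the exact formula used here: it scales the remaining miles by the ratio of a vehicle’s reliability score to the pool average. The function name and all sample numbers are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def adjusted_prm(
    asking_price: float,
    odometer_miles: float,
    reliability_score: float,
    pool_scores: Sequence[float],
    expected_lifetime_miles: float = 200_000,
) -> Optional[float]:
    """Price per remaining mile, with remaining miles scaled by relative reliability.

    Assumed scaling: remaining miles * (vehicle score / average pool score).
    """
    remaining_miles = expected_lifetime_miles - odometer_miles
    pool_average = mean(pool_scores)
    if remaining_miles <= 0 or pool_average <= 0 or reliability_score <= 0:
        return None  # the metric is not meaningful for these inputs
    reliability_factor = reliability_score / pool_average
    return asking_price / (remaining_miles * reliability_factor)


# Hypothetical pool of reliability scores; the average is 80
pool = [60, 80, 100]
print(round(adjusted_prm(18_000, 80_000, 100, pool), 2))  # 0.12
print(round(adjusted_prm(18_000, 80_000, 60, pool), 2))   # 0.2
```

With identical price and mileage, the more reliable vehicle ends up with the lower adjusted PRM, which matches the intent described above.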
Data to the rescue
The data points and metrics I created gave me a lot of decision power. At a glance, I was able to see which vehicles were recommended, which had fewer miles than expected, and how their expected reliability factored into their price. In the end, I was able to use this information to find a vehicle that was high value for money, had good gas mileage, and was less risky to own long-term.
So far, I’m quite content with my purchase. I know that I’ve gotten a fair price on it, and I feel a sense of comfort knowing that the car will likely be reliable in the long run. This is something I wouldn’t be able to say if I hadn’t done the proper research and analysis. While my methods evolved over the course of this first-time purchase, what I’ve learned about the process will help make my next purchase faster and more informed from the start. It’ll make the process even less stressful the next time around, and that’s worth a lot.