
City of Dallas Water Quality Analysis
Spring 2020
Disclaimer
All projects completed under the purview of a business during my tenure as an MSBA student required a Non-Disclosure Agreement. In the interest of serving the public and furthering the Open Data Initiative, the City of Dallas expressly waived this requirement. However, I have chosen to keep the insights and proprietary knowledge private. What follows is a generalized example of the product we created for the City.
Final Product
Excel Data Preparation
It is not uncommon for businesses to hold large quantities of data in Excel workbooks that, while visually appealing, are not conducive to manipulation and analysis in R. We streamlined the data preparation steps using advanced techniques to filter, edit, and validate data in Excel. We also wrote a detailed step-by-step guide so the City can repeat this process in future analyses.
Data Analysis in R
After the data was cleaned, we used R to perform the analysis. Our code computed basic descriptive statistics and fit OLS regression models. We also created our own model to compare watersheds with a single “score.”
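As a rough illustration only, the kind of descriptive statistics and single-predictor OLS fit mentioned above can be sketched as follows. The project itself used R; this sketch uses Python's standard library instead. The function names `describe` and `ols_fit`, the variable names, and the sample readings are all hypothetical and are not taken from the City's data or from our actual code.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up illustrative values, NOT City of Dallas measurements.
rainfall_in = [0.0, 1.0, 2.0, 3.0, 4.0]
turbidity_ntu = [1.0, 3.0, 5.0, 7.0, 9.0]

summary = describe(turbidity_ntu)
slope, intercept = ols_fit(rainfall_in, turbidity_ntu)
print(summary)
print(f"turbidity = {intercept:.2f} + {slope:.2f} * rainfall")
```

A closed-form fit like this only covers one predictor. Models with several predictors would normally rely on a statistics package such as R's `lm`.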
Tableau Visualizations
We built six fully functional dashboards in Tableau to provide a holistic view of the analysis. These included: a GIS map with polygon outlines of each watershed boundary; a KPI chart that updates when the user hovers over a specific watershed; dynamic bar charts to visualize summary statistics; a time series chart with predictions; an anomaly detection chart; and a table of composite scores.