Data Portfolio

This section of my website holds my data work with self-contained projects that showcase several skills in data management, analytics/science and visualization, mostly using R, but there are also projects which use Power BI and Tableau. Use the headers to your right to select different projects.

Most projects have GitHub repositories with open-source code (with Creative Commons licenses), which are reacheable through the included GitHub links. Enjoy!

Sentiment analysis of music lyrics

Part of a personal blog post which uses several musical data sources to perfom analyses of musical data on the dutch symphonic metal band, Epica. I perform tidy text analysis with different lexicons after tokenizing all lyrics from Epica’s discography, sourced from the Genius API. It involved complex data cleaning through table reshaping, record linkage and fuzzy joins.

Superstore Delivery Time Analytics

I give the classic Superstore dataset a twist by analyzing delivery times instead of sales in this Power BI dashboard. Be sure to turn on maps from your security options in Power BI desktop to see the full functionality of the file.

  • Link to Power BI Dashboard files (.pbix) ^[It is not possible to openly deploy Power BI dashboards as it is on Tableau or Shiny. However, to visualize this dashboard you only need to download Power BI desktop on your computer, which is free. The only issue is that it is only available for Windows.

Football statistics from Latin American regional tournaments

Part of a blog post which analyzes scraped data from FBRef and Transfermarkt from LDU Quito’s performance in the Copa Sudamericana and Libertadores, the most important Latin American football tournaments. The post includes a detailed analysis of the team’s performance in the tournaments, using data from the R worldfootballR package.

Clean technology companies in Canada

This is a series of work produced during my time as a Junior Data Analyst in the Clean Technology Economic Analysis Unit team at Natural Resources Canada, which I cannot claim as mine but part of a team effort of highly skilled professionals in NRCan.

Canadian Pureplay Clean Technology Companies Dashboard

A Power BI Dashboard that visualizes the number of Canadian pureplay clean technology companies across provinces and subindustries in the country.

British Columbia Clean Technology Companies Profile

A sector profile on British Columbia’s clean technology sector. Developed in collaboration with Trade & Invest British Columbia.

Analyzing public opinion from survey data

After a series of protests developed in Ecuador in July 2022, I analyzed public opinion variables to try to understand the public sentiment behind the protest. I implement survey design adjustments from the AmericasBarometer survey to correct for oversampling of some regions and to present survey-robust confidence intervals in visualizations.

Note

This post is sourced from my contribution to El Quantificador, a blog on social issues created by LIDE, an Ecuadorian research nonprofit organization. See the original post in Spanish here.

Research: Differences-in-differences with COVID-19 administrative data in Ecuador

- Sánchez Pazmiño, D. H. (2022). Immunizing the Economy: A Causality Discussion on Vaccines and Economic Recovery. X-Pedientes Económicos, 6(14), 65–82.

In 2022 I was able to publish a paper (in English) in an Ecuadorian peer-reviewed journal. The paper used a two-way fixed effects differences-in-differences estimator to estimate the causal effect of COVID-19 vaccination on business creation on Ecuadorian provinces. The key result was that provinces which showed apparent reluctancy to vaccinate saw about 25% less business creation than provinces that didn’t.

Research: Survey-weighted logistic regression to investigate corruption at the individual level

In early 2023, I was able to publish another peer-reviewed paper in Virginia Tech’s peer-reviewed journal, The Philosophy, Politics, and Economics Review. In this paper, I use survey-weighted logistic regression to investigate the drivers of a mysterious increase in corruption tolerance in Ecuador between 2014 and 2016, finding that political identification and presidential approval drove the change.

Web development with R

I have been able to collaborate on a few website development efforts using R. See below for details.

Personal website

This website (daniel-ec.netlify.app) is built using Quarto, next generation’s RMarkdown, and R. I built it from scratch with the help of the great documentation available in the Quarto website and the many examples available in the open source community. The website is hosted through a free Netlify domain. It includes both a typical “website” section and a blog section, which mimics using the blogdown format.

El Quantificador

El Quantificador is a blog on social issues created by the Ecuadorian Development Research Lab (LIDE, for its initials in Spanish), an Ecuadorian research nonprofit. I have been able to collaborate on the development of the blog, which is built using blogdown and a Hugo theme.

The website deploys several RMarkdown files, with complex data analysis and visualization components, including a Shiny app The website is hosted through a free Netlify domain.

LIDE has produced other websites using R and blogdown, where I’ve also collaborated, including its main website, and a project website for New Dimensions.

Projects in Spanish

These projects are short articles with significant data analysis and/or visualization components that I have written in Spanish. They are either published in El Quantificador, a blog on social issues created by the Ecuadorian Research Development Lab, an Ecuadorian research nonprofit or in my personal blog. I have listed the links to the direct text in Spanish and to the GitHub repository where the code is hosted. I plan to work on AI-assisted translations of these articles in the future.

The throne in Ponceano: Liga de Quito’s statistics in R

In late 2023, the best football club in Ecuador, Liga Deportiva Universitaria, was able to pass to the Copa Sudamericana final stage after more than a decade of no international titles. I analyzed the team’s statistics in R to understand how the team has been performing in the last few years, using R to access football statistics providers like Transfermarkt or FBref.

Homosexuality: what do Ecuadorians think about it? (coauthored with Alonso Quijano-Ruiz)

Along with Alonso Quijano, we analyzed public opinion data from the World Values Survey to understand how homosexuality is viewed in Ecuador.

Three problems for the quantitative analysis of femicides in Ecuador (coauthored with Alejandra Marchán)

A quantitative perspective on the Ecuadorian protests

The original version of Section 6, written originally in Spanish for El Quantificador and reposted in Spanish and English in my personal blog.

Footnotes

  1. The publisher’s website is often down, so I include a link to the exact same version of the paper, uploaded to this site.↩︎