Corrections

If you see mistakes, have follow-up questions, or want to suggest changes, please create an issue on the source repository.

Terms of Use

This website and its contents herein, including all data, mapping, and analysis (“Website”), copyright 2020, Texas 2036, all rights reserved, is provided solely for non-profit public health, educational, and academic research purposes. You should not rely on this Website for medical advice or guidance.
Use of the Website by commercial parties and/or in commerce is strictly prohibited. Redistribution of the Website or the aggregated data set underlying the Website is strictly prohibited.
When linking to the website, attribute the Website as the “Texas COVID-19 Data Resource by Texas 2036”.
The Website relies upon publicly available data from multiple sources that do not always agree. Texas 2036 hereby disclaims any and all representations and warranties with respect to the Website, or any content or data thereon, including accuracy, fitness for use, reliability, completeness, and non-infringement of third party rights.
Any use of Texas 2036 names, logos, trademarks, and/or trade dress in a factually inaccurate manner or for marketing, promotional or commercial purposes is strictly prohibited.
These terms and conditions are subject to change. Your use of the Website constitutes your acceptance of these terms and conditions and any future modifications thereof.

Reopening Analysis

Data Sources

Symptom data for COVID-like illness (CLI) and influenza-like illness (ILI), at the state and county level, comes from Texas DSHS.
Hospital data (state and trauma service area level) comes from Texas DSHS.
Daily case data (state and county level) comes from the NYTimes.
Testing data (state only) comes from the COVID Tracking Project.

Statistical Model

Due to potential sources of error in the data, we chose to use a robust regression¹ to model the 14-day linear trend. Robust linear models are well suited to this problem since they can account for outliers in the data. For all of the 14-day trend lines shown in the report, we used the statsmodels² implementation of robust linear models with the Huber M-estimator³. The Huber loss results in a squared error for inliers and an absolute error for outliers (up to a constant factor and as specified by the threshold and scale parameter). We chose this function since it does not completely ignore the effect of outliers (like Tukey’s biweight) but instead just downweights their influence.


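import numpy as np
import statsmodels.api as sm

def fit(X, y):
    # Regress y on the day index 0..len(X)-1, plus an intercept column.
    features = sm.add_constant(np.arange(len(X)))
    # Huber's T norm downweights outliers instead of discarding them.
    rlm_model = sm.RLM(y, features, M=sm.robust.norms.HuberT())
    model = rlm_model.fit()
    # Return the fitted results so the trend (model.params) can be used.
    return model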

We only create models for counties which have had at least 28 reported cases. For counties with low case counts, the fit of a linear trend may not be meaningful.
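
To make the Huber loss described above concrete, here is a minimal pure-Python sketch of the loss and of the weight it implies for each residual. The function names are ours, not part of the report's code. The default threshold of 1.345 is an assumption based on the statsmodels default for `HuberT`, and residuals are assumed to be already divided by the scale estimate.

```python
def huber_loss(residual, threshold=1.345):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    r = abs(residual)
    if r <= threshold:
        return 0.5 * r * r
    return threshold * r - 0.5 * threshold * threshold


def huber_weight(residual, threshold=1.345):
    """Weight a point receives in the iteratively reweighted fit.

    Inliers keep full weight (1.0); outliers are downweighted in
    proportion to their distance, but never dropped to zero.
    """
    r = abs(residual)
    if r <= threshold:
        return 1.0
    return threshold / r


# Illustrative scaled residuals only, not values from the report.
for r in (0.5, 1.0, 3.0, 10.0):
    print(r, huber_loss(r), huber_weight(r))
```

Because the weight shrinks gradually rather than cutting off, a single anomalous reporting day still contributes to the 14-day trend, just less than the other days.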

Sources of Error

Sampling error

Due to a shortage of tests and a backlog of results, the total number of COVID-19 cases may be severely underestimated. However, since at-risk or suggestively symptomatic individuals are prioritized for testing, the percentage of positive tests may be an overestimate. Additionally, the biased sampling of who is tested may skew the demographics of those diagnosed.

Measurement error

Due to inherent uncertainty in RT-PCR⁴ as well as serological antibody⁵ tests, even repeated tests of a positive patient can give an erroneous diagnosis⁶. For COVID-19, the most dangerous of these errors are false negatives⁷, since these individuals may not receive required medical treatment and may also continue to spread the virus.

Asymptomatic and presymptomatic cases

In addition to the methodological sources of error above, reliable COVID-19 diagnoses are complicated by the temporal dynamics of the disease.⁸ Not only can exogenous errors arise during sample collection (nose and throat swabs), especially if a location is understaffed for the number of samples it needs to collect, but the sensitivity of detection can also fluctuate with the time since infection and the severity of a case.

One of the earliest natural experiments for studying COVID-19, the Diamond Princess cruise ship quarantine in Japan⁹ ¹⁰, showed that many individuals who are asymptomatic at the time of testing can in fact be positive and contagious. These silent spreaders often evade testing and can thus have an outsized effect on the spread of the virus.

State Explorer

A Note on Population Estimates and Per Capita

In many of our metrics, we adjust for the size of the Texas population. That adjustment is indicated by the phrase “Per Capita”, which means per 100,000 people wherever it appears. We encountered two major issues while determining how to adjust for population.

First, the standard best practice when calculating population-based figures in the United States is to use the Decennial Census counts, which occur every 10 years, and to use the Census Bureau’s American Community Survey estimates for anything related to social and economic characteristics of that population. Given that the state has grown significantly ¹¹ in the past decade and that the next Decennial count is occurring in 2020, we felt that using the 2010 count for Texas (roughly 25 million) would skew a “Per Capita” figure.

Second, when we began sourcing data from widely cited COVID-19 datasets, we realized that raw metrics (such as confirmed cases, deaths, recovered, and tested) were almost always synchronized with the Texas Department of State Health Services’ own numbers ¹². However, some derived metrics that attempted to create a per capita rate appeared to be using older population counts for Texas¹³. In the case of something like “Testing Per Capita”, this made it appear that our testing rate was much higher than it actually is. In one instance, the discrepancy at the time of discovery produced a difference of upwards of 200 tests per capita. Even though typical convention suggests using the decennial population counts, we felt that doing so risked overestimating what was really happening.

In the end, we opted to use the population estimate utilized and recommended by the U.S. Census Bureau’s own COVID-19 hub as any denominator when calculating a “Per Capita” rate of anything. That population data from the 2018 American Community Survey 5-year series lives here. In the following equations, you will see this figure referenced as \(\text{Population}_\text{ACS_2018}\).

Current Case Data

The total cases, deaths, and active cases metrics are reported “as is” from the Johns Hopkins University dataset. The Case Growth Rate chart is derived from the NYTimes time series data: we divide the current day’s reported case count for Texas by the previous day’s reported case count, subtract 1, and convert the result to a percentage. The daily growth rate is given by the following equation:

\[\text{Case Growth Rate}=((\dfrac{cases_\text{today}}{cases_\text{yesterday}})-1)*100 \]

We also filtered to start the chart at March 6th, because prior to that date, there was hardly any data available to generate a case growth rate. Once we had established the case growth rates for each day, we attempted to generate trend lines using a rolling 7-day average of Case Growth Rates in Texas.
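
The growth rate equation and the March 6th cutoff can be sketched in a few lines of Python. This is an illustrative sketch rather than the code behind the chart: the function name, the input layout (a list of date and cumulative count pairs) and the choice to skip days whose previous count is zero are all assumptions.

```python
from datetime import date


def case_growth_rates(series, start=date(2020, 3, 6)):
    """Return (day, percent growth) pairs from cumulative case counts.

    `series` is a list of (date, cumulative_cases) tuples sorted by date.
    Days before `start`, and days whose previous count is zero, are
    skipped because no meaningful rate can be computed for them.
    """
    rates = []
    for (_, prev_cases), (day, cases) in zip(series, series[1:]):
        if day < start or prev_cases == 0:
            continue
        rates.append((day, (cases / prev_cases - 1) * 100))
    return rates


# Made-up counts for illustration only, not real Texas data.
sample = [
    (date(2020, 3, 5), 4),
    (date(2020, 3, 6), 5),
    (date(2020, 3, 7), 8),
]
rates = case_growth_rates(sample)
for day, rate in rates:
    print(day.isoformat(), round(rate, 1))
```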

Current Testing Data

Metric	Equation
`test_per_capita`	\[\text{Test Per Capita}=(\dfrac{\text{COVID-19 Tests}_\text{tot}}{\text{Population}_\text{ACS_2018}})*\text{100,000} \]
`daily_test_per_capita`	\[\text{Daily Test Per Capita}=(\dfrac{\text{COVID-19 Tests}_\text{daily_tot}}{\text{Population}_\text{ACS_2018}})*\text{100,000} \]
`daily_test_pos_rate`	\[\text{Test Positive Rate}=(\dfrac{\text{+ COVID-19 Tests}_\text{daily_increase}}{\text{All COVID-19 Tests}_\text{daily_increase}})*100 \]
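
As a rough illustration of the three testing formulas above, the sketch below computes a per-100,000 rate and a daily positivity rate. The helper names are ours, and the population constant is a round placeholder, not the ACS 2018 figure; substitute the actual estimate when reproducing the metrics.

```python
PER_CAPITA_BASE = 100_000

# Placeholder only: NOT the ACS 2018 estimate used in the report.
ILLUSTRATIVE_POPULATION = 28_000_000


def per_capita(count, population):
    """Scale a raw count to a rate per 100,000 people."""
    return count / population * PER_CAPITA_BASE


def positive_rate(new_positives, new_tests):
    """Share of the day's new tests that were positive, as a percentage.

    Returns None when no new tests were reported, since the rate is
    undefined in that case.
    """
    if new_tests == 0:
        return None
    return new_positives / new_tests * 100


# Made-up inputs for illustration.
print(per_capita(280_000, ILLUSTRATIVE_POPULATION))  # cumulative tests
print(per_capita(14_000, ILLUSTRATIVE_POPULATION))   # one day's tests
print(positive_rate(50, 1_000))
```

The same `per_capita` helper covers both `test_per_capita` and `daily_test_per_capita`; only the numerator changes.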

Current Syndromic Data

All data visualized here is reported as is from the Texas Department of State Health Services (DSHS). No derived metrics were generated from this data for the charts seen here.

Current Hospital Data

Hospital data shown here are derived metrics using hospital capacity data produced by the Texas Department of State Health Services (DSHS). They were calculated as follows:

Metric	Equation
`pct_hospitalized`	\[\text{Pct. Hospitalized}=(\dfrac{\text{COVID-19}_\text{Hospitalized}}{\text{COVID-19}_\text{Active Cases}})*100 \]
`gen_bed_avail_rate`	\[\text{General Bed Availability}=(\dfrac{\text{General Beds}_\text{available}}{\text{General Beds}_\text{Total}})*100 \]
`icu_bed_avail_rate`	\[\text{ICU Bed Availability}=(\dfrac{\text{ICU Beds}_\text{available}}{\text{ICU Beds}_\text{Total}})*100 \]
`vent_avail_rate`	\[\text{Ventilator Availability}=(\dfrac{\text{Ventilators}_\text{available}}{\text{Ventilators}_\text{Total}})*100 \]
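
The four hospital metrics share one shape: a numerator divided by a total, times 100. A minimal sketch follows, assuming a single day's record stored as a dictionary. The dictionary keys are hypothetical and do not reflect the DSHS column names.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if denominator == 0:
        return None
    return numerator / denominator * 100


def hospital_metrics(record):
    """Compute the four derived hospital metrics from one day's record.

    The keys read from `record` are hypothetical names chosen for
    this example.
    """
    return {
        "pct_hospitalized": percent(record["hospitalized"], record["active_cases"]),
        "gen_bed_avail_rate": percent(record["gen_beds_available"], record["gen_beds_total"]),
        "icu_bed_avail_rate": percent(record["icu_beds_available"], record["icu_beds_total"]),
        "vent_avail_rate": percent(record["vents_available"], record["vents_total"]),
    }


# Made-up numbers for illustration only.
example = {
    "hospitalized": 50, "active_cases": 1_000,
    "gen_beds_available": 250, "gen_beds_total": 1_000,
    "icu_beds_available": 30, "icu_beds_total": 120,
    "vents_available": 0, "vents_total": 0,
}
metrics = hospital_metrics(example)
print(metrics)
```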

Trends Over Time

The trends over time section takes the daily increase of each topic (cases, tests, and deaths) and charts it alongside a rolling 7-day average. As the charts indicate, the data sourced for them is the time series dataset from the New York Times.
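
A minimal sketch of that calculation, assuming the input is a list of cumulative counts ordered by day. The function names are ours, and the use of a trailing window (rather than a centered one) is an assumption, since the report does not say which it uses.

```python
def daily_increases(cumulative):
    """Turn cumulative counts into day-over-day increases."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Trailing rolling mean; None until a full window is available."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Made-up cumulative counts for illustration only.
cumulative_cases = [0, 1, 3, 6, 10, 15, 21, 28, 36]
increases = daily_increases(cumulative_cases)
print(increases)
print(rolling_mean(increases))
```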

Economy + Society

All data visualized here is reported as is from the Bureau of Labor Statistics and Homebase. No derived metrics were generated from this data for the charts seen here.

Context About Homebase Data

Homebase is an incredible group that has provided their data for the benefit of the small businesses they serve. Their dataset is derived from over 5,000 small businesses in Texas that use their services. We mention this because that context is critical to drawing meaningful takeaways from their data. First, by “small business”, we mean that Homebase mostly serves businesses of 100 employees or fewer, which makes it an incredibly valuable asset for understanding small business dynamics. Second, Homebase’s metrics shown here are marked as “estimates” because they do not reflect the entire universe of small businesses in Texas, even though they have a sufficient sample size to represent small business dynamics in Texas.

Metric	Description
`Est. Local Businesses Open`	This represents the change in the number of businesses open compared to the beginning of January, which is what Homebase uses as their benchmark to calculate the change figures. The number below shows the change relative to a week ago.
`Est. Reduction In Hours Worked`	This represents the change in the number of hours worked by hourly employees compared to the beginning of January, which is what Homebase uses as their benchmark to calculate the change figures. The number below shows the change relative to a week ago.
`Est. Hourly Employees Working`	This represents the change in the number of hourly employees working compared to the beginning of January, which is what Homebase uses as their benchmark to calculate the change figures. The number below shows the change relative to a week ago.

County Explorer

Coming Soon.

¹ Robust Statistics, Peter J. Huber. John Wiley and Sons, Inc. 1981.
² Statsmodels: Econometric and Statistical Modeling with Python, Skipper Seabold and Josef Perktold. Proceedings of the 9th Python in Science Conference. 2010.
³ Robust Estimation of a Location Parameter, Peter J. Huber. Annals of Mathematical Statistics. 1964.
⁴ Stability Issues of RT‐PCR Testing of SARS‐CoV‐2 for Hospitalized Patients Clinically Diagnosed with COVID‐19, Li, Yafang, Lin Yao, Jiawei Li, Lei Chen, Yiyan Song, Zhifang Cai, and Chunhua Yang. Journal of Medical Virology. March 26, 2020.
⁵ Test performance evaluation of SARS-CoV-2 serological assays, Whitman, Jeffrey D., Joseph Hiatt, Cody T. Mowery, Brian R. Shy, Ruby Yu, Tori N. Yamamoto, Ujjwal Rathore et al. medRxiv. April 29, 2020.
⁶ False‐negative of RT‐PCR and prolonged nucleic acid conversion in COVID‐19: Rather than recurrence, Xiao, Ai Tang, Yi Xin Tong, and Sheng Zhang. Journal of Medical Virology. April 9, 2020.
⁷ A case report of COVID-19 with false negative RT-PCR test: necessity of chest CT, Feng, Hao, Yujian Liu, Minli Lv, and Jianquan Zhong. Japanese Journal of Radiology. April 7, 2020.
⁸ Temporal dynamics in viral shedding and transmissibility of COVID-19, He, Xi, Eric HY Lau, Peng Wu, Xilong Deng, Jian Wang, Xinxin Hao, Yiu Chung Lau et al. Nature Medicine. April 15, 2020.
⁹ Chronology of COVID-19 Cases on the Diamond Princess Cruise Ship and Ethical Considerations: A Report From Japan, Nakazawa, Eisuke, Hiroyasu Ino, and Akira Akabayashi. Disaster Medicine and Public Health Preparedness. March 24, 2020.
¹⁰ Public health responses to COVID-19 outbreaks on cruise ships—worldwide, February–March 2020, Moriarty LF, Plucinski MM, Marston BJ, et al. MMWR Morbidity and Mortality Weekly Report. March 27, 2020.
¹¹ In this September 2019 brief, the state demographer’s projections suggest definitive growth in Texas, regardless of the migration scenario, including zero net migration.
¹² Even though most raw numbers usually synced, the figures were usually reported on a lag. For example, if the state publishes new numbers on the evening of May 1st, intended to reflect data for May 1st, those numbers do not get published in these datasets until May 2nd, even though the state had already published them on its own website.
¹³ This GitHub issue has a clear explanation: https://github.com/CSSEGISandData/COVID-19/issues/2185

About Our Data
