VDH released the LTCF Names of COVID-19 Outbreaks dataset on in an executive press release data dump. The data is available only as a Tableau dashboard, which exports only to PDF, image, and PowerPoint and offers no machine-readable download. It is also curiously buried on Virginia’s Long Term Care Task Force page, and is neither visible on nor linked from VDH’s Coronavirus landing page.
I mistakenly only downloaded the Current Outbreaks dataset on , and it didn’t really dawn on me that this data would be updating daily in this format, as it’s entirely different from the updates VDH has been doing so far. Checking back on it was clear to me that not only had I grabbed only part of the data, but that it is updating daily. Moving forward, I’ll be collecting this data daily as PDF from the dashboard and converting it to CSV so the data is actually usable and useful. This Google spreadsheet is where I’ll be tracking these until I come up with a better solution. I would like to automate the data collection and scrape the Tableau deck; I’m just not sure it’s worth my time to do so, considering how often VDH changes its data procedures and protocols.
Tableau is consistently making its case to be the worst data platform for users, and I’ve found new lows with it while wrangling the LTCF data. There’s an error that occurs while exporting the data to PDF, wherein the characters “ti” (in that order) are swapped out for some other entity. I think it’s a character entity issue, but I’m not 100% sure. Perhaps they are using OCR for export? I really don’t know, nor do I care enough to look into it. Tableau consistently puts DX > UX and this is just par for the course. Add manual data cleanup to the daily COVID data task list under LTCF data tasks. LTCF data tasks would not be a thing if VDH offered this data in a machine-readable format, and I’m not very excited about spending 20-40 minutes a day cleaning up third-party vendor garbage output.
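The manual cleanup for the “ti” swap could be scripted along these lines. This is only a minimal sketch: the stand-in character Tableau actually emits is not identified above, so the `placeholder` default (U+FFFD) and the function name `restore_ti` are assumptions to be replaced once the real character has been found by inspecting the exported text.

```python
def restore_ti(text, placeholder="\ufffd"):
    """Put the letters 'ti' back wherever the stand-in character appears.

    `placeholder` is an assumption: inspect the exported text to find the
    character that really replaced "ti" and pass it in here.
    """
    return text.replace(placeholder, "ti")


# Made-up example string, not taken from the real dataset.
garbled = "Example Rehabilita\ufffdon Center"
print(restore_ti(garbled))
```

A plain string replacement like this is only safe if the stand-in character never appears legitimately elsewhere in the data, which is worth checking before trusting the output.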
This dataset is also essentially non-archivable using web tools: the Wayback Machine doesn’t handle Tableau at all; archive.is does, and I have been using it to archive VHHA’s dashboards, but those are simple, static datasets. The LTCF dataset’s default view is not set to show all, so archiving it with archive.is only saves the Ongoing Outbreaks dataset. If you needed yet another reason to never use Tableau and to avoid it like the anti-web tool that it is, here it is.
As with the rest of the COVID-19 data collections I am undertaking, this dataset will be archived and will live in my Virginia COVID-19 repository.
: Tableau Deck Offers Data Download
In the process of doing my daily COVID-19 data collection duties, I noticed that VDH was gracious enough to offer this dataset as a data download option in their Tableau deck. There is still no CSV option for the rest of their datasets, which just shows how inconsistent VDH has been across the board with data releases and follow-through.
While this makes data collection easier, it doesn’t make up for the time I’ve spent performing data ops on this dataset over five days. In discussions within the COVID Tracking Project yesterday, I pitched creating a scraper to automate collecting this data; I spent about an hour and a half on it before having to move on to something else. In said discussion, I specifically pointed out that I was hesitant to do anything because there is no rhyme or reason to VDH’s data policies and protocols; it is incredibly tiresome, frustrating, and annoying to spend time on solutions that can and will be disregarded the next time VDH decides to do whatever they feel like. I’ve implemented the data download option into this dataset’s workflow, but it still provides neither an archiving solution nor direct, machine-readable access to the data.
Tableau Data Download Inconsistencies
Don’t get too excited about VDH flipping the data download switch on; it’s not a direct download. Moreover, downloading it adds three new columns: “Blank”, “Number of Records” and “Report Date”. “Blank” is exactly what you think it is. “Report Date” is already reflected in my archive’s file naming conventions: if you have a daily dataset, including the date in the name offers an easy, sortable delimiter. I don’t want to give VDH any credit here, so I’m assuming the date column is a Tableau “feature”. “Number of Records” has zero meaning in this context. It is not visible in the Tableau dashboard, and every table data cell value is “1”; perhaps this is another Tableau “feature”? Whatever it is, it is most certainly not helpful, descriptive, or informative, and its presence actually makes this data less valuable.
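Dropping the three added columns and carrying the report date into the file name can be sketched roughly as follows. The column names come from the export described above; everything else is an assumption for illustration: the sample rows are made up, the `ltcf-outbreaks` prefix and both function names are hypothetical, and the date is assumed to already be in a filename-safe form.

```python
import csv
import io

# Columns the Tableau download adds, as named in the export.
EXTRA_COLUMNS = {"Blank", "Number of Records", "Report Date"}


def strip_extra_columns(csv_text):
    """Return (rows without the extra columns, set of report dates seen)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    report_dates = set()
    for row in reader:
        # Remember the date before discarding the column it lives in.
        report_dates.add(row.get("Report Date", ""))
        rows.append({k: v for k, v in row.items() if k not in EXTRA_COLUMNS})
    return rows, report_dates


def dated_filename(report_date, prefix="ltcf-outbreaks"):
    """Build a sortable file name; the prefix is a hypothetical choice.

    Assumes report_date is already filename-safe (for example ISO 8601).
    """
    return f"{prefix}-{report_date}.csv"


# Made-up sample in the shape of the export, not real VDH data.
sample = (
    "Facility,Blank,Number of Records,Report Date\n"
    "Example Facility A,,1,2020-01-01\n"
)
rows, dates = strip_extra_columns(sample)
print(rows)
print(dated_filename(sorted(dates)[0]))
```

Collecting the report dates into a set also gives a cheap sanity check: a single daily export should yield exactly one date.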
Tableau Data Download is Invalid
This dataset exports from Tableau as .csv, even though Tableau’s UI refers to it as a text file. This is a minute detail, but it does help to paint the bigger picture of what a bad platform Tableau is for users. The real issue with this particular dataset export from Tableau is that it doesn’t export as a valid .csv.
Tableau apparently cannot even properly encase table data cells for export to .csv; this is Data Ops 101. It is ridiculous that an organization that sells services built around data visualization and data consumption has not crossed this path yet. I’m willing to bet that, as an organization, they should have already, and that this issue comes down to no one at Tableau dogfooding their dashboards and exporting datasets.
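For contrast, here is roughly what properly encased cells look like. This is a sketch using Python’s standard `csv` module, assuming the problem is cells that contain the delimiter (such as a comma inside a facility name) being written without quotes. The sample rows are made up and `to_csv` is a hypothetical helper name.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows to CSV text, quoting any cell that needs it."""
    buffer = io.StringIO()
    # QUOTE_MINIMAL wraps a cell in quotes only when it contains the
    # delimiter, the quote character, or a line break.
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows; the comma in the second row's first cell forces quoting.
rows = [
    ["Facility", "Locality"],
    ["Example Home, Inc.", "Example County"],
]
text = to_csv(rows)
print(text)

# Reading the text back yields the original cells, which is the point.
assert list(csv.reader(io.StringIO(text))) == rows
```

The round trip at the end is the practical test of validity: if a parser cannot recover the same cells that went in, the export is not usable as CSV.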