
Latest daily coronavirus data for each Brazilian municipality


Note: this post will be updated as we have more details about this dataset.

This year the Lei de Acesso à Informação (Brazil's freedom of information law) turned eight, and we still have serious problems with the publication of open data. The coronavirus data released by the Ministry of Health is not enough for us to act locally and effectively, because:

  • The update process is slow and infrequent;
  • The site frequently goes offline;
  • The data is not structured.

To contain the epidemic we need very specific local action; in other words, we have to act at the municipal and state level, as quickly as possible. To fix these data problems I started a manual cataloguing effort and asked volunteers for help. This weekend, together with 32 other incredible people, we manually catalogued the data in hundreds of epidemiological bulletins from the state health departments, including the historical record (this intense collaborative process deserves a dedicated post, plus a special thank-you to all the volunteers: coming soon).

We have not yet finished every state (still missing: AM and TO), but we already have current data for each municipality (in the states we have managed to catalogue), and our figures are already larger and more current than those released by the Ministry of Health (1,550 versus 1,546 in the version we published on the morning of 23 March).

To help us with this work, you can:

If you work at a health department, see these examples of what not to do:

Other links that may be useful:

Oh, and a reminder: after the pandemic we need to return to the issue of the Access to Information Law (sites offline, official information missing from official sites, unstructured data, incomplete data). But until the pandemic ends, our volunteer team faces a daily battle: updating this data, manually.


Local Government Employee Fined For Illegally Deleting Item Requested Under Freedom Of Information Act


Techdirt writes about freedom of information matters often enough. Sadly, many of the stories are about governments and other official bodies refusing to comply with local Freedom of Information Act (FOIA) laws for various reasons, and using a variety of tricks. In other words, rights to FOI may exist in theory, but the practice falls woefully short. That makes the following story from the UK a welcome exception.

It concerns Nicola Young, a local government employee in the English market town of Whitchurch, in Shropshire. Part of her job as town clerk was to handle FOIA requests for the local council. One such request asked for a copy of the audio recording of a council meeting. Apparently the person requesting the file believed that the written minutes of the meeting had been fabricated, and wanted to check them against the recording. However, the reply came back that the file had already been deleted, as was required by the official council policy.

Undeterred, the person requesting the file sent a complaint to the UK's main Information Commissioner's Office (ICO), which carried out an investigation. The ICO discovered that the town clerk had not only claimed that the audio file had already been deleted when it actually existed, but that she personally deleted it a few days after the FOI request was made. Quite why is not clear, but as a result:

On Wednesday 11 March, Young, of Shrewsbury Street, Whitchurch, Shropshire, was convicted at Crewe Magistrates after pleading guilty to blocking records with the intention of preventing disclosure and was fined £400 [about $490], ordered to pay costs of £1,493 [$1,835] and a victim surcharge £40 [$50].

In its press release on the case, the ICO comments that it "marks the first ever successful conviction under the [UK's] FOIA". It may be a small victory, but we'll take it.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.


Data package is valid!


This blog is the second in a series by the Frictionless Data Fellows, discussing how they created Frictionless Data Packages with their research data. Learn more about the Fellows Programme at http://fellows.frictionlessdata.io/.

By Ouso Daniel

The last few months have been exciting, to say the least. I dug deep into understanding how to minimise friction in data workflows and how to promote openness and reproducibility. Through the FD Field Guide I got to know various Frictionless Data (FD) software tools for improving data publishing workflows. We looked at a number of case studies where FD worked well for reproducibility, one example being the eLife study. We also looked at contributing and coding best practices. Moreover, I found Understanding JSON Schema (by json-schema.org) a great guide to the data package schema, which is JSON-based. It all culminated in the creation of a data package, an experience I now want to share.

To quality-check the integrity of your data package, you must validate it before downloading it for sharing. The best message you can get from that process is "Data package is valid!" But what happens before then?

Data package

Simply put, it is data coupled with its associated attributes in JSON format. To marry the data to its attributes you will need an FD tool. Here is the one I created.

Data Package Creator (DPC)

The DPC gives you a data package. The good news is that it caters for both kinds of users: programmers and GUI users. I will describe the latter case. It is a web app with three main components: the Metadata pane on the left; the Resources pane (one resource per data file) in the middle; and the Schema pane on the right (usually hidden, but it can be exposed by clicking the three-dots-in-curly-brackets icon).

The Data

I used data from my project evaluating the application of a molecular technique, high-resolution melting analysis, to the identification of wildlife species illegally targeted as bushmeat. I had two files containing tabular data: one with information on the samples analysed and the sequences deposited in GenBank, and the other on the blind validation of species identification across three mitochondrial markers. My data package thus had two resources. The data lived in my local repository, but I pushed it to GitHub in CSV format for easy accessibility.

Creating the Data Package

You may follow along, in detail, with the data package specifications. On the Resources pane, from left to right, I entered a name for my resource and its path. I pasted the raw GitHub link to my data into the path field and clicked the Load button to the right. For local data, you may click the load button, which will open your local file system. The DPC automatically inferred the data structure and prompted me to load the inferred fields (columns). I double-checked that the data types for each field were correctly inferred, and added titles and descriptions. The data format for each field was left at its default. From the gear-wheel (settings) icon in the resource tab, I gave each of the two resources a title, description, format and encoding. The resource profile is also inferred automatically. All the field and resource metadata I entered is optional, unless we intentionally want to be reproducible and open. On the other hand, some metadata for the overall data package, in the Metadata pane, is compulsory: the name and title. Be sure to get the name right: it must match the pattern ^([-a-z0-9._/])+$ for the data package to be valid, and this is the most likely error you will encounter.
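Since the name pattern is the most likely stumbling block, you can check a candidate name before you even open the DPC. Here is a minimal Python sketch; the helper name and example package names are my own illustrations, not part of the DPC:

```python
import re

# The pattern required for a valid data package name, as given above:
# only lowercase letters, digits, '-', '.', '_' and '/' are allowed.
NAME_PATTERN = re.compile(r"^([-a-z0-9._/])+$")

def is_valid_name(name: str) -> bool:
    """Return True if `name` is acceptable as a data package name."""
    return NAME_PATTERN.match(name) is not None

print(is_valid_name("bushmeat-hrm-data"))  # True
print(is_valid_name("My Data Package"))    # False: uppercase letters and spaces
```

The DPC performs this check for you during validation; running it yourself just saves a round trip.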

The data package provides very rich metadata capture, which is one of its strengths for data reusability. There are three metadata categories, which should not be confused: data package metadata, resource metadata and field (column) metadata, nested in that order. After entering all the necessary details in the DPC, you have to validate your data package before downloading it. The two buttons for these purposes are at the bottom of the Metadata pane. Any errors will be captured and described at the very top of the Resources pane. Otherwise, you will see the message in the title of this post, at which point you can download your data package and rename it as you like, retaining the .json extension.


I applied the DPC first-hand to my research, and so can you. We created a data package starting from and ending with two of the most widely used data organisation formats, CSV and JSON respectively (interoperability). We gave it enough metadata to allow a stranger to comfortably make sense of the data (reusability) and provided licence information, CC-BY-SA-4.0 (accessibility). The data package is also uniquely identified and made available in a public repository on GitHub (findability). A FAIR data package. Moreover, the data package is very light (portable), which makes it easy to share, open and reproducible. The package is holistic, containing metadata, data and a schema (a blueprint of the data structure and metadata). How do I use the data package, you may ask?

Way forward

Keep the term goodtables in mind; I will tell you how it is useful with the data package we just created. Until then you can keep in touch by reading the periodic blogs about the Frictionless Data fellowship, where you will also find work by my colleagues Sele, Monica and Lily. Follow me or OKF on Twitter for flash updates.


Open Data Day 2020: shall we talk about open data?


The tenth edition of Open Data Day is almost here, and it brings some news!

Once again, people interested in discussing and analysing open datasets will gather in cities around the world for a full day of activities. The date chosen for Open Data Day 2020 is 7 March.

In Brazil, ODD 2020 will be joined by the network of Civic Innovation Ambassadors (Embaixadoras de Inovação Cívica), who are organizing some of the events. After all, the use of open data is a fundamental part of the project, which seeks to promote public-interest solutions through technology across the country.

Why participate?

Taking part in an edition of ODD is inspiring for several reasons: engagement with the local technology and public-policy community, tackling open-data demands alongside a qualified team, hands-on practice, learning, and much more. We discuss all of this in the Guia Embaixadoras (Ambassadors' Guide) for Open Data Day 2020.

Where will it happen?

You can already check the list of host cities and find out how to participate. The Brazilian cities hosting ODD 2020 events are listed below:




Salvador

Event: Open Street Map Data Day – UFBA: Comunidades Mapeando Comunidades*

Organized by: Universidade Federal da Bahia (research groups)



Belo Horizonte

Event: Open Data Day – Belo Horizonte

Organized by: Alexandre Gomes (civic innovation ambassador)




Curitiba

Event: Code for Curitiba Open Data Day 2020

Organized by: Code for Curitiba




Recife

Event: Open Data Day Recife

Organized by: Women in Data Science Recife and IP.Rec

Event: Open Data Day Recife: Dados Abertos para um Futuro Melhor

Organized by: Pernambuco Transparente



Rio de Janeiro

Event: Cerveja com Dados #08 – Edição Open Data Day

Organized by: Escola de Dados and Marcus Vinicius Roque (civic innovation ambassador)

Event: Dia dos Dados Abertos no Arquivo Nacional – Open Data Day 2020

Organized by: Arquivo Nacional




Natal

Event: Open Data Day Natal

Organized by: Dados Abertos RN and Tiago José (civic innovation ambassador)



Porto Alegre

Event: Dia dos Dados Abertos POA 2020*

Organized by: Afonte Jornalismo de Dados




Event: Open Data Dextra Day

Organized by: Dextra Digital

Santos

Event: OpenDataDay Santos 2020

Organized by: Py013 – the Python developer community of the Baixada Santista

São Paulo

Event: Open Data Day São Paulo – Explorando dados do Legislativo local

Organized by: Open Knowledge Brasil


* The editions organized by the UFBA research groups, in Salvador, and by the Afonte initiative, in Porto Alegre, were awarded Open Knowledge Foundation mini-grants, receiving up to US$300 to help run their local events.


Know of an event that isn't on the list? Register it on the official ODD 2020 map; it only takes a minute. We will keep this list updated over the coming days, so stay tuned.




Combating other people’s data


This blog is the first in a series by the Frictionless Data Fellows, discussing how they created Frictionless Data Packages with their research data. Learn more about the Fellows Programme at http://fellows.frictionlessdata.io/.

By Monica Granados

Follow the hashtag #otherpeoplesdata on Twitter and you will find a trove of data users trying to make sense of data they did not collect. While the data may be open, having no metadata or information about what the variables mean doesn't make it very accessible. As a staunch advocate of open science, I have made my data open and accessible by providing context along with the data on GitHub.

A screengrab of my GitHub repository with data files

Data files are separate in GitHub.

And while all the data and a ReadMe are available, and the R code lets you download the data through the R console (a perfect setting for reproducing the analysis), the reuse of the data is questionable. Without definitions and an explanation of the data, taking it out of the context of my experiment and adding it to something like a meta-analysis is difficult. Enter data packages.

What are Data Packages?

A data package is a format created by the Open Knowledge Foundation for bundling your raw data with your metadata so that the data becomes more usable and shareable.

For my first data package I will use data from my paper in Freshwater Biology on variation in benthic algal growth rate in experimental mesocosms with native and non-native omnivores. I will use the Data Package Creator online tool to create this package; the second package will be done in the R programming language.

Presently, the data is distributed in my GitHub repo but the Data Package Creator will allow me to combine the algae, snail and tile sampling data together in one place.

Write a Table Schema

A schema is a blueprint that tells us how your data is structured and what type of content to expect in it. I will start by loading the algae data. The data is already available on GitHub, so I will use the hyperlink option in the Data Package Creator and load the data via its Raw link from GitHub. Since my data has the column headings in the first row, the Data Package Creator recognizes them. Once the data is loaded, I can add additional information about my variables in the "Title" and "Description" fields. For example, for the variable "Day" I added the more explicit "Experimental day" in the Title, and more information about the length of the experiment in the Description.

To add the snail and tile sampling datasets I will click on “Add a resource” for each and add the titles and descriptions.

Add dataset’s metadata

Next I added the dataset's metadata: a title, author and description. I chose the tabular data package profile, since it's just CSV files, and also added a CC-BY license so that anyone can use the data.
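Put together, the descriptor that the Data Package Creator downloads looks roughly like the sketch below, built by hand in Python here. This is a hypothetical, minimal example: the package name, path, and field list are illustrative, not the actual contents of my package.

```python
import json

# Hypothetical minimal datapackage.json descriptor: package-level metadata,
# one tabular resource, and a schema with per-field titles and descriptions.
descriptor = {
    "profile": "tabular-data-package",
    "name": "mesocosm-omnivore-data",          # illustrative name
    "title": "Benthic algal growth with native and non-native omnivores",
    "licenses": [{"name": "CC-BY-4.0"}],
    "resources": [
        {
            "profile": "tabular-data-resource",
            "name": "algae",
            "path": "https://raw.githubusercontent.com/.../algae.csv",  # placeholder path
            "format": "csv",
            "schema": {
                "fields": [
                    {
                        "name": "Day",
                        "type": "integer",
                        "title": "Experimental day",
                        "description": "Day of the experiment.",
                    }
                ]
            },
        }
    ],
}

# Serialize the descriptor the way the Creator would save it.
print(json.dumps(descriptor, indent=2))
```

The three metadata levels are visible in the nesting: package metadata at the top, resource metadata inside "resources", and field metadata inside "schema".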

Then I validated the data (see note below) and downloaded the package which is available here.

Tu data es mi data

The Golden Rule states: do unto others as you would have them do unto you. I think we have all been subjected to other people's data, and to the frustration and disappointment that follow when we determine that the data is unusable. By adopting data packages, you can make your data more re-usable and accessible and, most importantly, prevent another #otherpeoplesdata tweet.

A screen grab of the pilot project blog

Find out more about the scientist’s pilot.

Are you interested in learning more about data packages and Frictionless Data? The Open Knowledge Foundation is looking for pilot collaborations with scientists now. Find out more here.


Gopher: When Adversarial Interoperability Burrowed Under the Gatekeepers' Fortresses


When Apple's App Store launched in 2008, it was widely hailed as a breakthrough in computing, a "curated experience" that would transform the chaos of locating and assessing software and replace it with a reliable one-stop-shop where every app would come pre-tested and with a trusted seal of approval.

But app stores are as old as consumer computing. From the moment that timeshare computers started to appear in research institutions, college campuses, and large corporations, the systems' administrators saw the "curation" of software choices as a key part of their duties.

And from the very start, users chafed against these limitations, and sought out ways to express their desire for technological self-determination. That self-determination was hard to express in the locked-down days of the mainframe, but as personal computers started to appear in university labs, and then in students' dorm rooms, there was a revolution.

The revolution began in 1991, in the very birthplace of the supercomputer: Minneapolis-St Paul. It was named after the University of Minnesota's (UMN) mascot, the gopher.


In the early 1990s, personal computers did not arrive in an "Internet-ready" state. Before students could connect their systems to UMN's network, they needed to install basic networking software that allowed their computers to communicate over TCP/IP, as well as dial-up software for protocols like PPP or SLIP. Some computers needed network cards or modems, and their associated drivers.

That was just for starters. Once the students' systems were ready to connect to the Internet, they still needed the basic tools for accessing distant servers: FTP software, a Usenet reader, a terminal emulator, and an email client, all crammed onto a floppy disk (or two). The task of marshalling, distributing, and supporting these tools fell to the university's Microcomputer Center.

For the university, the need to get students these basic tools was a blessing and a curse. It was labor-intensive work, sure, but it also meant that the Microcomputer Center could ensure that the students' newly Internet-ready computers were also configured to access the campus network and its resources, saving the Microcomputer Center thousands of hours talking students through the configuration process. It also meant that the Microcomputer Center could act like a mini App Store, starting students out on their online journeys with a curated collection of up-to-date, reliable tools.

That's where Gopher comes in. While the campus mainframe administrators had plans to selectively connect their systems to the Internet through specialized software, the Microcomputer Center had different ideas. Years before the public had heard of the World Wide Web, the Gopher team sought to fill the same niche, by connecting disparate systems to the Internet and making them available to those with little-to-no technical expertise—with or without the cooperation of the systems they were connecting.

Gopher used text-based menus to navigate "Gopherspace" (all the world's public Gopher servers). The Microcomputer Center team created Gopher clients that ran on Macs, DOS, and in Unix-based terminals. The original Gopher servers were a motley assortment of used Macintosh IIci systems running A/UX, Apple's flavor of Unix. The team also had access to several NeXT workstations.
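For readers curious what those text-based menus looked like on the wire: each menu item was a single tab-delimited line carrying a type character, a display string, a selector, a host, and a port (a format later codified in RFC 1436). A minimal parsing sketch in Python; the menu line itself is hypothetical, not an actual UMN menu entry:

```python
from typing import NamedTuple

class GopherItem(NamedTuple):
    item_type: str   # '0' = text file, '1' = submenu, '7' = search, ...
    display: str     # text shown to the user in the menu
    selector: str    # opaque string sent back to the server to fetch the item
    host: str
    port: int

def parse_menu_line(line: str) -> GopherItem:
    """Split one tab-delimited Gopher menu line into its fields."""
    type_and_display, selector, host, port = line.rstrip("\r\n").split("\t")
    return GopherItem(type_and_display[0], type_and_display[1:],
                      selector, host, int(port))

item = parse_menu_line("1Weather Underground\t/weather\tgopher.example.edu\t70\r\n")
print(item.display, item.host, item.port)  # Weather Underground gopher.example.edu 70
```

The selector's opacity is what made the gateway tricks described below possible: the server could map any selector onto any back-end action, including a scripted terminal session.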

Gopher had everything a student needed to navigate complex information spaces—except for information! The Gopher team cast about for resources that they could connect to their Gopher servers and thus make available to the entire network. They hit on Apple's Tech Info Library (AKA the "Knowledgebase"), a technical documentation database that came on CD-ROMs that could only be accessed by programmers who physically traveled to the lab where they were kept (or who paid for subscriptions to Apple's Applelink service). The Gopher team also answered student support questions and used Apple's Tech Info Library to do their jobs. Why not make it self-serve? They loaded the Knowledgebase into some NeXT workstations, and realized that they could use NeXT's built-in full-text indexing to make the complete set of documentation both accessible and searchable by anyone connected to a Gopher server.

Full-text indexing via NeXT workstations turned out to be one of Gopher's superpowers: soon, Gopherspace included fully indexed and searchable Usenet feeds, competing with WAIS to bring much-needed search to the Internet's largest, busiest social space. Gopher used the NeXT indexer to ingest massive quantities of recipes, creating the first-ever full-text search for cook-books.

But there were many other tricks up Gopher's sleeve. Many of the Internet's resources were available via text-based terminal connections that could only be accessed if you could remember their addresses and the quirky syntax required by each of these services. The Gopher team brought these resources into Gopherspace through the magic of terminal automation, whereby a terminal program could be programmed to login to a service, execute a command or series of commands, capture the output, format it, and put it in a distant user's Gopher client.

An early case for terminal automation was the Weather Underground service, which would give users who knew its address and syntax a realtime weather report for any place on Earth. The Gopher team created a Weather Underground gateway that used terminal automation to simplify weather retrieval and it quickly became so popular that it overwhelmed the Weather Underground's servers. However, the collegial spirit that prevailed online in those days meant that the Weather Underground's administrators could settle the matter by contacting the Gopher team. (Later on, the Weather Underground's administrators at the University of Michigan asked the Gopher team for usage data so they could include it in their application to renew the NSF grant that funded the project!)

Terminal automation allowed the Gopher team to rip the doors off of every information silo on campus and beyond. Libraries had put their card catalogs online, but few of the library vendors supported Z39.50, the standard for interconnecting these catalogs. Terminal scripting brought all the library catalogs into one searchable interface, and as Gopherspace proliferated to other campuses, it was possible for the first time to search collections of research libraries around the world.

The Gopher team consolidated many of these one-off hacks and bodges into a unified Gopher gateway server, with pre-assembled software ready to be customized and connected to the network by people running their own Gopher servers. These were popping up all over the world by this point, being run by children, universities, hobbyists, corporations, and even MTV's most tech-savvy VJ, Adam Curry. The team called their ethic "Internet duct-tape": a rough-and-ready way to connect all the 'Net's services together.

The expanded universe of Gopher hackers brought even more resources to Gopherspace. Soon, Gopher could be used to search Archie, a tool that indexed the world's public FTP servers, home to all the world's shareware, text-files, free and open source software, and digital miscellanea.

The FTP-Gopher gateway was a godsend for Internet newbies, who struggled with FTP's own obscure syntax. Some FTP servers were so overwhelmed by inbound connections from FTP-Gopher gateways that they scrapped their FTP servers and installed Gopher servers instead!

Soon, researchers at the University of Nevada at Reno had made their own search tool for Gopherspace, called Veronica (Very Easy Rodent-Oriented Net-wide Index to Computer Archives), which crawled every menu of every known Gopher server and allowed users to search all of Gopherspace. Veronica spawned a competing search tool from the University of Utah called Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display), later changed to "Jugtail" after a trademark scare.

The Gopher team made some tentative moves to commercialize their invention, asking for payments from commercial users of the Gopher server software (in practice, these payments were often waived, as they were for Adam Curry after he agreed to wear a Gopher t-shirt during an MTV broadcast).


The Gopher story is a perfect case history for Adversarial Interoperability. The pre-Gopher information landscape was dominated by companies, departments, and individuals who were uninterested in giving users control over their own computing experience and who viewed computing as something that took place in a shared lab space, not in your home or dorm room.

Rather than pursuing an argument with these self-appointed Lords of Computing, the Gopher team simply went around them, interconnecting to their services without asking for permission. They didn't take data they weren't supposed to have—but they did make it much easier for the services' nominal users to actually access them.

And since the Gopher team was working in the early years of the networked world, they had a steady supply of new services to integrate into Gopherspace—so many that other people came and did an Adversarial Interoperability number on them, building multiple, competing search tools to make users' lives easier still.

A modern Gopher project would face innumerable—and possibly insurmountable—legal hurdles. In the early 1990s, violations of terms of service led to friendly negotiations with the likes of Weather Underground. Try to do that today with a big interactive service and you might find yourself charged with multiple felonies. Big, proprietary databases often use "access controls" that can't be bypassed without risking criminal and civil charges, and that goes double for distributing a "gateway server" to make it easier for others to connect their own proprietary resources to an open network.

Today's tech giants—and both their apologists and their critics—insist that their dominance is the inevitable consequence of "network effects," and so nothing we do will recapture the diversity that once defined the Internet. But adversarial interoperability is judo for network effects.

Armed with tools that relied on adversarial interoperability, the Gopher team was able to turn the installed bases of users for each of the services they interconnected into an advantage, merging these constituencies in an ever-larger pool, until Gopher became the most exciting thing on the net, the killer app that every newscast about the exciting new digital realm featured.

Gopher was born before the rise of severe penalties for crossing invisible legal lines, which meant that its creators could experiment with new ways of making information available without worrying that a single misstep would result in their utter ruination.

For example, the Gopher team added support for a protocol called websterd, which allowed remote users to reach into the team's NeXT workstations and query the "DigitalWebster" edition of the Ninth Webster's Dictionary that came bundled with the systems, so that anyone on the Internet could look up English-language dictionary definitions. This led to a complaint from the dictionary's copyright holders; the service was modified to access alternative dictionaries, and it served language-learners and students for years afterward.

Ironically, perhaps, adversarial interoperability was also Gopher's downfall. Even as Gopher was rising to prominence, an English computer scientist at the CERN research institute in Switzerland named Tim Berners-Lee was inventing something called the "World Wide Web," and with it, the first browser. With browsers came URLs, identifiers that could be used to retrieve any document on any Web server in the world. The Gopher team quickly integrated URLs into Gopherspace, adding more flexibility and ease to their service.

But it wasn't enough. The Web proved to be more popular—and doubly so once browser vendors began to build Gopher clients into the browsers themselves, so you could link from any Web page to any Gopher resource and vice-versa. Adversarial Interoperability allowed the Web to treat Gopherspace as a conveniently organized pool of resources to absorb into Webspace, and it made the Web unstoppable.

The Gopher team tried many things to revitalize their service, including a valiant attempt to remake Gopherspace as a low-resolution VR environment, but the writing was on the wall. Gopher, it turned out, was an intermediate phase of our networked world, a booster rocket that the Web used to attain a higher orbit than Gopher could have ever reached on its own.

Today's Web giants want us to believe that they and they alone are suited to take us to wherever we end up next. Having used Adversarial Interoperability as a ladder to attain their rarefied heights, they now use laws to kick the ladder away and prevent the next Microcomputer Center or Tim Berners-Lee from doing to them what the Web did to Gopher, and what Gopher did to mainframes.

Legislation to stem the tide of Big Tech companies' abuses—such as a national consumer privacy bill, an interoperability bill, or a bill making firms liable for data-breaches—would go a long way toward improving the lives of the Internet users held hostage inside the companies' walled gardens.

But far more important than fixing Big Tech is fixing the Internet: restoring the kind of dynamism that made tech firms responsive to their users for fear of losing them, restoring the dynamic that let tinkerers, co-ops, and nonprofits give every person the power of technological self-determination.

(Many thanks to Gopher co-inventor Paul Lindner for invaluable assistance in the research and drafting of this article)
