On the other hand, N3C is audited by, and accountable to, thousands of researchers from hundreds of participating institutions, and it pays close attention to transparency and repeatability. Everything a user does through the interface, which uses Palantir’s GovCloud platform, is carefully preserved, so anyone with access rights can retrace their steps.
“This is not rocket science, and it’s nothing new. It’s just hard work. It’s tedious and must be done carefully. We must verify every step,” said Christopher Chute, a professor of medicine at Johns Hopkins University, who is also a co-lead of N3C. “The worst thing we can do is to methodically turn data into garbage, which will give us the wrong answer.”
Haendel pointed out that these efforts did not come easily. “The diversity of expertise required to achieve this goal (persistence, dedication, and, frankly, brute force) is unprecedented,” she said.
This brute force comes from many different fields, not just medicine.
“It’s really helpful to involve everyone from all aspects of science. People are more willing to cooperate during the pandemic,” said Mary Boland, professor of informatics at the University of Pennsylvania. “You can have engineers, computer scientists, physicists, all these people who don’t usually participate in public health research.”
Boland is a member of a team that uses N3C data to study whether the new coronavirus can increase irregular bleeding in women with polycystic ovary syndrome. She said that under normal circumstances, most researchers must use insurance claims data to obtain a large enough database for population-level analysis.
For example, claims data can answer questions about how effective drugs are in the real world. But these databases lack a lot of information, including laboratory results, symptoms reported by people, and even data on whether patients are alive or dead.
Collect and clean
Apart from insurance claims databases, most health data collaboratives in the United States use a federated model. Participants in these studies all agree to put their own data sets into a common format and then run queries sent by the group, such as the proportion of severe Covid cases by age group. Several international Covid research groups, including Observational Health Data Sciences and Informatics (OHDSI, pronounced “Odyssey”), operate this way, which avoids legal and political issues with cross-border patient data.
Founded in 2014, OHDSI has researchers from 30 countries who together hold 600 million patient records.
“This allows each institution to keep their data behind its own firewall and have its own data protection measures. It does not require any patient data to move back and forth,” Boland said. “This is comforting in many places, especially with all the recent hacking attacks.”
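To make the federated approach concrete, here is a minimal Python sketch of the example query mentioned above, the proportion of severe Covid cases by age group. It is an illustration only, not the software OHDSI or any other group actually runs. The record format (`PatientRecord`), the age buckets, and the function names are all assumptions made up for this example. The point it shows is that each site computes counts locally and only those counts leave the building.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical "common format" that every site agrees to use.
    age: int
    severe: bool


def age_group(age: int) -> str:
    """Bucket an age into a coarse, illustrative age group."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"


def local_counts(records):
    """Runs at each site, behind its own firewall.

    Returns only aggregate counts per age group, never patient rows.
    """
    totals, severe = Counter(), Counter()
    for record in records:
        group = age_group(record.age)
        totals[group] += 1
        if record.severe:
            severe[group] += 1
    return totals, severe


def combine(site_results):
    """Runs at the coordinator, which sees counts but no patient data."""
    totals, severe = Counter(), Counter()
    for site_totals, site_severe in site_results:
        totals.update(site_totals)   # Counter.update adds counts together
        severe.update(site_severe)
    return {group: severe[group] / totals[group] for group in sorted(totals)}


# Two made-up sites, each holding its own records.
site_a = [PatientRecord(70, True), PatientRecord(30, False)]
site_b = [PatientRecord(80, False), PatientRecord(40, True), PatientRecord(10, False)]

proportions = combine([local_counts(site_a), local_counts(site_b)])
print(proportions)
```

A real network would also need safeguards this sketch leaves out, such as suppressing very small counts that could identify individual patients.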