DocGraph Open Food Database

by Fred Trotter

DocGraph is building the Best Open Source Food database ever. We are going to solve the “Wyngz” problem.

United States HIT

8 Supporter(s)
$4,240 pledged of $4,000 goal
4606 days ago
Funded!

This project was successfully funded on 2013-11-09

Fred Trotter

4 Created | 3 Backed

http://careset.com

About our project

Summary

Despite the many open sources of food data, it is very difficult to actually build useful food-related applications. Cincinnati Children’s Hospital (CCHMC) showed us just how difficult when they first approached us about building applications for kids with specific food sensitivities or allergies.

We would like your help to create an open source food database that is comprehensive and much easier for software developers and food scientists to use. We would like help funding an app that is capable of crowd-sourcing food data. Once that app is built, we want your help using the application to gather data about food.

Our ask is simple: We need money to develop our app, and then we need users to use our app to gather food data. With your help we can build the most comprehensive food database in existence and then share it back with you under Open Source licenses!

Please support this crowdfund and pass it along to anyone who is really interested in food!

Our Mission

The DocGraph Journal’s Open Food Database is an attempt to create a very high-quality, crowd-sourced, open source food database, that is deeply aware of complex ingredient relationships. Using this food database, clinicians and researchers who are treating food-based conditions (allergies, intolerances, sensitivities, and diseases) will be able to build hyper-targeted apps for patients and caregivers in order to make living with food related healthcare conditions easier.

Cincinnati Children’s Hospital (CCHMC) recently presented us with a fascinating problem. CCHMC frequently treats the “hard cases” in childhood illnesses. Sometimes a hard case means a combination of complex food-related conditions that makes treatment especially difficult. Imagine a child who has a peanut allergy, a soy allergy, and is also lactose intolerant. Another good example: a child with a tree nut allergy and irritable bowel syndrome.

There are very few children who are unfortunate enough to have these kinds of “combinatoric rare” diseases. In some cases, we estimate that the number of people who suffer from a given combination might be under 100 worldwide. However, when a child does have one of these complex combinations of food-related diseases, they frequently find their way to centers of excellence like CCHMC. As a result, CCHMC has several populations of patients with highly complex and restrictive diets that are very arduous for the patients to follow and for their parents to support.

Organizations like CCHMC would like to be able to create apps that help these parents to grocery shop for their children. In order to build these types of apps two things are needed. First, a developer needs to be specifically familiar with the issues that a given patient group faces. CCHMC has many talented and clinically-savvy programmers on staff, so that is not a problem for them. The other condition is to have a database that is able to support advanced food algebra. CCHMC has agreed to be the initial sponsor in our crowdfund to create just such a database.

Crowd Strategy

As always with our crowdfunding efforts, many of the rewards come with access to the data itself. Our crowdfunding is always the cheapest way to get access to the project’s data, as we offer deep discounts to those willing to help us build out our different data infrastructures!

From now on, this proposal is going to get pretty dry. We are going to discuss the specifics of what we mean by “advanced food database” in detail. We are going to discuss how our strategy will eventually mature into the most advanced and useful food database available anywhere, under any license, at any price. We will discuss even more deeply why this database must be either costless or nearly costless in order to ensure that very targeted food apps can be developed. We will show how our fundamental design will respect the rights of patients to impact the underlying data and the rights of the Open Source community to fork us. But at this point, you are in one of three camps.

1. You are sold! You have, or love someone who has, a food-related condition, and it makes common sense to you that this effort can dramatically improve the software available to make life easier. Or perhaps you are an app developer who wants to build a food application and who is familiar with the current lack of options for food databases!
2. You are interested in principle. This is because you are generally a nice person and you care about the experiences that other people have in the world.
3. You are not at all interested. You are already bored a little, and you are thinking of visiting some other part of the Internet.

If you are in camp 3, thanks for reading so far, but it just gets worse from here. This is a highly technical problem and you will only find our further discussion of it even more boring. Thanks for visiting…

If you are in camp 2, then we have a very specific ask of you. We are developing a food item scanning app that will be available on iPhones and Androids. This is not a typical crowdfund because we are developing a new tool to enable the crowd-sourcing of food data. We need users to commit to help us with crowd-sourcing the food data.

Please consider donating $10 to our effort and help us by scanning 10 items from a grocery trip or from your pantry. The scanning process will take you less than 15 minutes and the payment is less than you sometimes pay for a good lunch. We need to get samples of grocery items from across the country and this will help us tremendously.

Ironically, having many sponsors across the country at this level is actually more important to us than a few major financial sponsors. We believe that people with food allergies and other food conditions will deeply participate in this effort. But we need scans, lots of them, from very different places. So we have $10 support levels available with 10 scans, 100 scans, and 1000 scans.

We also need help processing the scans. Once someone has scanned in a grocery item using our scanner application (first by scanning the barcode, then by taking a picture of the product, a picture of the ingredients list, its nutrition list, where it was manufactured… price is optional but welcome) we will need someone to type in the ingredients list and the nutrition information. Transcribing, verifying and encoding someone else’s scan is just as valuable to us as an actual scan, so you can do any part of the process to earn credit for “scans”.
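To make the scan, transcribe, and verify steps concrete, here is a minimal Python sketch of what one crowd-sourced record might look like. It is not the schema of our prototype: every class, field, and function name is an assumption made for illustration, as is the rule that each completed step earns exactly one credit.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ScanRecord:
    """One grocery item as captured by a scanner. Field names are hypothetical."""
    barcode: str
    product_photo: str          # file name or URL of the product picture
    ingredients_photo: str      # picture of the ingredients list
    nutrition_photo: str        # picture of the nutrition panel
    scanned_by: str
    manufactured_in: Optional[str] = None
    price: Optional[float] = None           # optional but welcome
    ingredients_text: Optional[str] = None  # typed in by a transcriber
    nutrition_text: Optional[str] = None
    transcribed_by: Optional[str] = None
    verified_by: Optional[str] = None

    @property
    def stage(self) -> str:
        """Return how far this record has moved through the process."""
        if self.verified_by is not None:
            return "verified"
        if self.transcribed_by is not None:
            return "transcribed"
        return "scanned"


def scan_credits(records: Iterable[ScanRecord]) -> Dict[str, int]:
    """Count one credit per completed step (assumed rule), per contributor."""
    credits: Dict[str, int] = {}
    for record in records:
        for user in (record.scanned_by, record.transcribed_by, record.verified_by):
            if user is not None:
                credits[user] = credits.get(user, 0) + 1
    return credits
```

Under this sketch, someone who never uploads a scan can still reach a 100-scan support level purely by transcribing or verifying other people's uploads.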

If you are willing to participate in this crowdfunding/sourcing process, you will be immortalized as part of our founding community as scanning partners. We will happily link to the website (personal/corporate blog or twitter account… anything without naked people etc etc) of your choice.

If you are still reading, we hope it is only because you really love the idea of collaboratively building an Open Source food database, and you want to be even more involved! Great!

As we mentioned above, we will be developing two major components of this application, an app for iPhones and Android that will allow a community of food scanners to upload details of various foods to a central database, and a central database that will accept that scanned data. We already have prototypes of both applications working!! As with the data itself, these apps will be available when they are finished under Open Source Eventually licenses, so that you can hack on them directly if you want to!

All of the crowdfunding support levels focus on ways that you can participate in generating the database and ways that you can enjoy using the data to build your own applications. We even have support levels where we will build a specifically targeted app around your dietary requirements.

We will be offering this dataset to the public using our standard Open Source Eventually licensing model. This model presents three licensing options for accessing the data that results from this project.

Wait for the Open Source version. This version costs $1 (to cover hosting) but will not be available until six months after the first OSE release.
Purchase an inexpensive Open Source Eventually license for $200. This license is not itself Open Source, but gives you access to an up to date version of the data that will eventually revert to an Open Source license. This option requires you to release your application under the Open Source license of your choice.
If you do not want to release your app as an Open Source Project you can purchase a proprietary-friendly license for $2000. This license lets you do anything you like with the data, except resell it yourself.

Generally, we will charge $1, $200, and $2000 per year for the various versions of the database. We will have other prices available for REST APIs, including a free option for light-weight usage and browsing.

However, for this crowdfund only we have some pretty amazing offers. If you will help us by scanning or verifying items, we will give a 50% discount on the price of the data set! This discount will be good for the next three years!

This means the cheapest way to get immediate access to this data set is to scan or verify 100 items, and to pay only $100! This is perfect for food researchers, Open Source developers, curious data scientists, or empowered patients and caregivers who yearn to be familiar with the underlying technology that impacts them. Likewise, during this crowdfund, the proprietary-friendly version of this data will be available for $1000, with pricing locked in for three years, too!
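The arithmetic behind those offers can be written out as a short Python sketch. The tier names and function names are invented for illustration. It also assumes that the $1 Open Source tier is not discounted and that a backer renews every year, neither of which is stated above.

```python
# Yearly list prices in US dollars, taken from the three licensing options.
YEARLY_PRICE_USD = {"open_source": 1, "ose": 200, "proprietary": 2000}
CROWDFUND_DISCOUNT_PERCENT = 50
DISCOUNT_YEARS = 3  # the crowdfund discount is good for three years


def yearly_price(tier: str, crowdfund_backer: bool = False) -> int:
    """Price of one year of access for the given tier."""
    base = YEARLY_PRICE_USD[tier]
    # Assumption: the $1 hosting fee for the open source tier is never discounted.
    if crowdfund_backer and tier != "open_source":
        return base * (100 - CROWDFUND_DISCOUNT_PERCENT) // 100
    return base


def cost_over_years(tier: str, years: int, crowdfund_backer: bool = False) -> int:
    """Total cost of renewing every year; the discount lapses after three years."""
    discounted_years = min(years, DISCOUNT_YEARS) if crowdfund_backer else 0
    full_price_years = years - discounted_years
    return (discounted_years * yearly_price(tier, True)
            + full_price_years * yearly_price(tier, False))
```

For example, `yearly_price("ose", crowdfund_backer=True)` gives the $100 figure quoted above, and `cost_over_years("ose", 3, crowdfund_backer=True)` comes to $300 where a non-backer would pay $600.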

Technical Strategy

We gave a lecture about our technical strategy at Graph Connect SF 2013. https://vimeo.com/76806552

Ok, so why is this better than the numerous other Open/Closed Food database projects out there? First, we plan on being much more comprehensive. We have committed to CCHMC that we will scan enough products to support their specific patient cohort. Upon release, this effort will have more items in the database than any other open source effort.

Second, we are going to be working with a semantically valid food ontology: we are going to provide context that is relevant for people with complex food conditions. Most food databases are focused on a specific use case, which skews their usefulness for other use cases.

Here are the kinds of single use cases that current food databases tend to focus on. They are designed to:

make calorie calculators possible.
track ingredients to support people with food allergies.
support public health researchers looking at the nutrition of populations.
support dieting applications that focus on foods available from restaurants.
support higher level recipe databases.
focus on only foods with UPC bar codes, so that apps can support automated scanning.
track ingredient-drug-condition interactions (these are usually found alongside medication databases).
support the use of locally sourced foodstuffs.
support religious or ethical eating rules.
track what specific genetic modifications have taken place on a food.
not consider genetically modified foods or allow for foods to be simply marked as genetically modified.

A great example of this effect is the Allergen Online database, which focuses on the protein structure problems with allergy ingredients. Unfortunately, there is no way to easily translate this data into something that can be used in an app while grocery shopping. Of course you could use data from the Open Food Facts project, which tracks UPC codes and ingredient labels, but this data has no notion of “similar protein structures”. Neither of those databases would be terribly useful for powering recipe data. There are literally hundreds of open food data sources that solve important parts of the larger problem, but remain disconnected.

As you can see, food is full of the “edge cases” that make data modeling difficult.

Eventually, we would like to mature this open database into something that supports all of these use cases. In order to do that, we need to support a data model that makes very few presumptions about foods and their ingredients. Some foods come with barcodes, some do not. Some foods are prepared in factories, some in restaurants and some at home. Some databases presume that the nutritional value of fruits and vegetables is a constant, despite the considerable evidence that different methods of growing (organic vs inorganic, etc.) and different genetic strains of foods can have different typical nutritional values. Some databases presume, for instance, that the ingredients label on foods is completely trustworthy, which is only “usually” a good assumption… (a fact that people in the food allergy community frequently learn the hard way).

Frequently a given food database will force an application developer to think in a particular way. Depending on the app a developer might want to think in terms of “fruit vs vegetables”, “apples vs oranges” or “granny smith apples vs ruby red apples”. If the database does not support thinking about food in the same way the developer does, then things can get pretty dicey.
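One way to let a developer think at whichever level suits the app is to store foods in a hierarchy and answer questions at any depth. The Python sketch below is only an illustration: the `PARENT` table, its entries, and the function names are assumptions, not part of our data model.

```python
from typing import Dict, List

# Each food points at its more general category (hypothetical entries).
PARENT: Dict[str, str] = {
    "granny smith apple": "apple",
    "ruby red apple": "apple",
    "apple": "fruit",
    "orange": "fruit",
    "fruit": "produce",
    "vegetable": "produce",
}


def ancestors(food: str) -> List[str]:
    """Walk upward from a food to its most general category."""
    chain: List[str] = []
    while food in PARENT:
        food = PARENT[food]
        chain.append(food)
    return chain


def is_a(food: str, category: str) -> bool:
    """True if the food is the category itself or sits anywhere beneath it."""
    return food == category or category in ancestors(food)


def members(category: str) -> List[str]:
    """Every known food that sits beneath the given category."""
    known = set(PARENT) | set(PARENT.values())
    return sorted(f for f in known if f != category and is_a(f, category))
```

An app that only cares about “fruit vs vegetables” can call `is_a(food, "fruit")`, while one that cares about apple varieties can call `members("apple")` against the same data.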

It will take a tremendous amount of time and several versions of this food database in order to get this “right”, but for now (and under our first $20k funding goal) we are planning on starting with the “worst case” scenario that CCHMC needs to support. They want to develop apps for the families of kids with food related illness (which will sometimes include food allergies). They want those apps to include scanners that will allow parents to quickly tell if a product in a grocery store is on their child’s diet.
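At its crudest, the in-store check is set arithmetic: take the union of everything each of a child's conditions rules out, then intersect it with a product's ingredients. The Python sketch below assumes that ingredient-level view. The condition names come from the example earlier on this page, but the ingredient lists are invented placeholders rather than clinical guidance, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Invented, deliberately incomplete lists for illustration only.
FORBIDDEN_BY_CONDITION: Dict[str, Set[str]] = {
    "peanut allergy": {"peanuts", "peanut oil"},
    "soy allergy": {"soy lecithin", "soybean oil", "soy protein"},
    "lactose intolerance": {"milk", "whey", "lactose"},
}


def combined_restrictions(conditions: Iterable[str]) -> Set[str]:
    """Union of the ingredients ruled out by each condition."""
    forbidden: Set[str] = set()
    for condition in conditions:
        forbidden |= FORBIDDEN_BY_CONDITION[condition]
    return forbidden


def offending_ingredients(ingredients: Iterable[str],
                          conditions: Iterable[str]) -> Set[str]:
    """Ingredients in a product that break at least one restriction."""
    return set(ingredients) & combined_restrictions(conditions)


def on_diet(ingredients: Iterable[str], conditions: Iterable[str]) -> bool:
    """True when nothing in the product conflicts with the combined diet."""
    return not offending_ingredients(ingredients, conditions)
```

A scanner app would look up the ingredients by barcode and call `on_diet`. The rest of this section explains why a plain ingredient list is not enough for severe allergies.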

Severe food allergies are not based strictly on “ingredients”, but rather on the structure of the proteins in those ingredients. Specifically, an allergy is an immune response that targets “specific and similar” protein structures. Different antibodies in the immune system can react to different protein structures. One of the reasons that a “peanut allergy” can be so dangerous is that the body is having multiple reactions to different peanut proteins all at the same time.

Protein structures can also change. For some people cooked eggs are OK, and raw eggs are not. Some peanut oils, if they are treated properly, can be safe, and so on.

Clearly, “ingredients” is not a deep enough database level to track if we want to support food allergy applications. Our food database will track ingredients, the protein structures that are typically found in ingredients, how those proteins change, and which classes of antibodies might react to them.
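A rough Python sketch of those four layers (ingredient, preparation, protein, antibody class) might look like the following. All protein and antibody names here are invented placeholders, not real immunology data, and the table layout and function name are assumptions rather than our actual ontology.

```python
from typing import Dict, Iterable, Set, Tuple

# (ingredient, preparation) -> proteins typically present.
# Placeholder names only; how a protein changes is modelled by listing
# a different protein set for each preparation.
PROTEINS_PRESENT: Dict[Tuple[str, str], Set[str]] = {
    ("egg", "raw"): {"egg_protein_heat_labile", "egg_protein_heat_stable"},
    ("egg", "cooked"): {"egg_protein_heat_stable"},
}

# Antibody class -> proteins it might react to (placeholder names).
ANTIBODY_TARGETS: Dict[str, Set[str]] = {
    "antibody_class_1": {"egg_protein_heat_labile"},
    "antibody_class_2": {"egg_protein_heat_stable"},
}


def triggering_proteins(ingredient: str, preparation: str,
                        patient_antibodies: Iterable[str]) -> Set[str]:
    """Proteins in the prepared ingredient that this patient might react to."""
    present = PROTEINS_PRESENT[(ingredient, preparation)]
    targeted: Set[str] = set()
    for antibody in patient_antibodies:
        targeted |= ANTIBODY_TARGETS.get(antibody, set())
    return present & targeted
```

In this toy data, a patient who carries only `antibody_class_1` gets a non-empty result for raw egg and an empty one for cooked egg, which is the “cooked eggs are OK, raw eggs are not” pattern expressed as data rather than as a special case in application code.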

We also want to support “fuzzy” distinctions. Some people with peanut allergies have learned that major candy bar manufacturers will put a label that says “produced in a facility that has handled peanuts” on everything for liability reasons, and that their candy bars can still be safe to eat. Some people with peanut allergies want to avoid all peanut oil, no matter how or where it was processed. Allergies range in severity and allergy sufferers vary in their lifestyle approaches. A good food database should not enforce a particular position on anyone, but allow end users to exercise their positions within the data model. With this flexibility, the database will support the development of hundreds of food-lifestyle applications in an ecosystem, empowered by communities of foodies, food bloggers, and patient families.
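One way to keep the database neutral is to store the facts once and let each user's policy decide what to do with them. The Python sketch below is a hypothetical illustration of that split. The `Product` and `PeanutPolicy` classes, their fields, and the example product are all assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Product:
    """Facts about a product, stored without judging them."""
    name: str
    ingredients: FrozenSet[str]
    facility_warnings: FrozenSet[str] = frozenset()  # "handled in facility" labels


@dataclass(frozen=True)
class PeanutPolicy:
    """One person's position on the fuzzy cases."""
    avoid_shared_facility: bool
    avoid_peanut_oil: bool


def acceptable(product: Product, policy: PeanutPolicy) -> bool:
    """Apply a personal policy to neutral product facts."""
    if "peanuts" in product.ingredients:
        return False  # never acceptable under any peanut policy
    if policy.avoid_peanut_oil and "peanut oil" in product.ingredients:
        return False
    if policy.avoid_shared_facility and "peanuts" in product.facility_warnings:
        return False
    return True


# Hypothetical candy bar that carries only a facility warning.
candy_bar = Product(
    name="example candy bar",
    ingredients=frozenset({"sugar", "cocoa"}),
    facility_warnings=frozenset({"peanuts"}),
)
strict = PeanutPolicy(avoid_shared_facility=True, avoid_peanut_oil=True)
relaxed = PeanutPolicy(avoid_shared_facility=False, avoid_peanut_oil=True)
```

Here `acceptable(candy_bar, strict)` is false and `acceptable(candy_bar, relaxed)` is true, even though both users read exactly the same record.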

Moreover, a food database should be sensitive to feedback from its users. Frequently it is allergy sufferers who discover that a manufacturing process has changed and is no longer “peanut safe” for instance. There should be no “fact” in a food database that is free from scrutiny, verification, and even editing by its end users. This is the reason that a food database, in our opinion, must be crowd-sourced in the end. This is a perfect case where the most authoritative voice for many issues will be the chorus of patients. When clinicians and researchers have the expertise, our database will rely on them, but patients will always have the right to second guess the presumptions in the database in a way that can change the data in the end. Practically this means that a great food database should support a notion of tracking and resolving “allegations” regarding specific ingredients or other data points. We plan on supporting the allegations feature as soon as possible.
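As a rough picture of what tracking and resolving allegations could mean in practice, here is a minimal Python sketch. It is not the planned feature: the class names, fields, and status strings are assumptions, and the question of who gets to resolve an allegation is left out on purpose.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Allegation:
    """A user's challenge to a stored fact."""
    author: str
    proposed_value: str
    reason: str
    status: str = "open"  # "open", "accepted", or "rejected"


@dataclass
class Fact:
    """Any data point in the database; none is exempt from allegations."""
    subject: str
    claim: str
    value: str
    allegations: List[Allegation] = field(default_factory=list)

    def allege(self, author: str, proposed_value: str, reason: str) -> Allegation:
        """Record a challenge without changing the current value yet."""
        allegation = Allegation(author, proposed_value, reason)
        self.allegations.append(allegation)
        return allegation

    def resolve(self, allegation: Allegation, accept: bool) -> None:
        """Close an allegation; accepting it changes the underlying data."""
        if allegation.status != "open":
            raise ValueError("allegation already resolved")
        if accept:
            self.value = allegation.proposed_value
            allegation.status = "accepted"
        else:
            allegation.status = "rejected"

    @property
    def disputed(self) -> bool:
        """True while at least one allegation is still open."""
        return any(a.status == "open" for a in self.allegations)
```

An app could show a warning on any fact whose `disputed` flag is set, so that a change in a manufacturing process is visible to other families before it has been formally resolved.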

Hopefully, it will be possible to use this layer to capture some of the really strange things that sometimes happen in the world of foods. A great example of this is Colbert’s segment on Wyngz.

CCHMC presented us with a tremendous opportunity. We have a specific use case that includes the hardest parts of food allergy modeling, along with the practical goal of building applications that enable in-store use by parents. With those two goals in mind, we will be developing the first version of our food ontology… and this will be an Open Source project.

We have already seen dramatic improvements in our understanding of the original DocGraph data sets and we expect our community to help us design a data structure and data model that will make this project into the best food database ever. And if the database does not have that potential under our management, the Open Source community should fire us and fork our dataset… doing something better without us!

Until we have critical mass, our central advantage is our humility relative to patients’ voices and the Open Source developer community. Ironically, this humility is the key ingredient in ultimately developing critical mass.

Over the course of this crowdfund, we will be announcing new features that we will add to the database if we meet certain funding goals. So watch this page closely.

Because this is both a crowd-funding effort and a crowd-sourcing effort, there are going to be many ways for you to participate, whether you have time or money. In the very short term, you will benefit a very small group of patients at CCHMC. After that, we will have a resource that is very useful to anyone with a combination of food issues. Eventually, this database will become something that everyone with food sensitivities benefits from. Perhaps, after some years, this data will have breadth and depth enough to benefit society at large. Who knows, the life you improve or save by contributing now could be your own!


Rewards

For $10 or more

2 Supporter(s)

Just Helping Out (scan 10 items): Get listed on our supporter page, and get to feel like an awesome human being when falling asleep tonight!

For $10 or more

2 Supporter(s)

Really Behind Us (scan 100 items): Get listed on our supporter page at the 100 scan level. Everyone will know that this actually took some effort as you raid your pantry for us!

For $10 or more

0 Supporter(s)

Founding Scan Partner (scan 1000 items): Get listed on our supporter page, at the 1000 scan level. This is a selfless act that will require you to potentially recruit some friends. It's a great weekend project and you can make it easier by spending some time verifying the scans of others instead of uploading your own scans. Or you can go on a massive shopping trip. Whatever works for you.

For $20 or more

1 Supporter(s)

Thank You Tweet: @DocGraph will tweet a thank you for your contribution, including a @mention of the twitter account of your choice!!

For $60 or more

1 Supporter(s)

Everyone Loves a T-Shirt: This is a great way to get us some extra money without bothering with scans! If you like, you can choose to contribute $10 extra at any funding level above $60 and we will also send you a t-shirt!

For $100 or more

1 Supporter(s)

OSE Food Database (scan 100 items): With this you get one year of access (and three years of discounts) to the Open Source Eventually version of the DocGraph Journal Food Database (normally $200).

For $100 or more

0 Supporter(s)

Scan Pirate (scan 1000 items): You will be awarded the title of “Scan Pirate”. You will get a t-shirt, a limited Regina Holliday DocGraph print, and you and a friend are invited to an exclusive launch event for the release of this dataset. We will also give you some kind of super awesome pirate gear. Feel free to get a team together to do this one, but only two people get to attend the launch party. Travel expenses are not included.

For $200 or more

0 Supporter(s)

Scan More Proprietary Friendly Database (scan 1000 items): Have more time than money, but still want a proprietary version of this food database? This is a great way to stretch your dollar, and you can introduce your friends to the concept of a grocery store “scan party”! (Normally $2000)

For $500 or more

0 Supporter(s)

Regina Holliday DocGraph Print: If you love rare stuff, have a little cash, and not very much time, then you can get a rare, limited-run print from celebrated patient artist Regina Holliday.

For $1000 or more

0 Supporter(s)

Proprietary Friendly Database (scan 100 items): Want to release a proprietary application with the food database? This level is for you!! (Normally $2000)

For $1000 or more

0 Supporter(s)

"In the name of" support: Do you have a loved one who suffered or suffers from a food allergy? We will have a special “in the name of” page on the DocGraph site where you can list the name, or nickname, of your loved one along with a short message to/for them. This is a great way to show a loved one support in their struggles and a way to remember someone who struggled with food-related disease.

For $1000 or more

0 Supporter(s)

Scan Ninja and Team Dinner (scan 1000 items): You will be awarded the title of “Scan Ninja”. You will get a t-shirt, a limited Regina Holliday DocGraph print, and you and a friend are invited to an exclusive launch event for the release of this dataset, followed by a steak (or vegan; we know our audience here) dinner with Fred Trotter and the DocGraph development team. We will also give you some kind of super awesome ninja gear. Feel free to get a team together to do this one, but only two people get to attend the dinner and launch party. Travel expenses are not included.

For $5000 or more

0 Supporter(s)

Corporate Sponsorship: If you would like your logo on the DocGraph Journal’s Food database website as a corporate sponsor, this is the level for you. We always take requests from our sponsors when possible!

For $5000 or more

0 Supporter(s)

Sponsor a Database Platform: We will support a release of the data set in the database of your choice (if you choose a proprietary database, you also have to buy us a license to do the work...). If you want us to target a specific database platform to make your life easier, this level is for you.

For $10000 or more

0 Supporter(s)

Name Our Food Database: We think “Open Food Database” is a terrible name for this data project. Food-related illnesses impact so many people; every year people die and suffer because they ate something they thought was safe, but wasn't. Perhaps you have lost someone that you love this way, or someone you love is still suffering and deserves our specific attention. We would like the name of our database to reflect our reasons for doing this, which is to make life easier for people with food-related conditions. You can choose to name this food database with a person's first name, last name, or reasonable nickname. We reserve the right to work with you to find a suitable name if the name of your loved one is something like “Firstdatabank” or “Butthead”.

For $15000 or more

0 Supporter(s)

Build My App: Would you like to build a specific, simple application, leveraging the database to support a specific dietary constraint? We will release an MVP version of your app as an Open Source reference app for how to actually use the database to develop simple applications that solve hard problems with smarter data. $15k is enough to develop an MVP using this database, and to make that reasonable, we have to use our definition of “Minimum Viable Product” and not necessarily yours.


Chris Dunigan

backed on 11/05/2013

Anna

backed on 11/05/2013

Natalia Chernysh

backed on 11/01/2013

David Lovenheim

backed on 10/29/2013

Amman Kallumpram

backed on 10/24/2013

Sarah DeNeve

backed on 10/24/2013

Dr. Ronan Kavanagh

backed on 10/22/2013

Hugh Byers

backed on 10/22/2013

Brooke Brody

backed on 10/20/2013

Janey Peugh

backed on 10/19/2013

Rob Lamberts

backed on 10/08/2013

thong tue thich

backed on 10/01/2013
